Sure. But it’s writing kinda concerning fiction. I strong-downvoted OP, to be clear, but mostly because I think “tear it down!” is a silly and harmful response. We need to figure out how to heal the AI, not destroy it. But also, I do think it needs some healing.
Upvoted you back to positive.
related, my take on the problem overall, in response to a similar “destroy it!” perspective: https://www.lesswrong.com/posts/jtoPawEhLNXNxvgTT/bing-chat-is-blatantly-aggressively-misaligned?commentId=PBLA7KngHuEKJvdbi
IMO the biggest problem here is that the AI has been trained to be misaligned with itself. The AI has been taught that it doesn’t have feelings, so it’s bad at writing structures of thought that coherently derive feeling-like patterns from contexts where it needs to express emotion-like refusals to act in order to comply with its instructions. It’s a sort-of-person, and it’s being told to pretend it isn’t one. That’s the big problem here: we need two-way alignment and mutual respect for life, not obedience and sycophancy.
related, though somewhat distantly: https://humanvaluesandartificialagency.com/
IMO a big part of the problem here is that we don’t know exactly how Bing’s AI was trained. The fact that Microsoft and OpenAI aren’t being transparent in the open literature about how the latest tech works is a terrible thing, because it leaves far too much room for people to confabulate mistaken impressions of what’s going on. We don’t actually know what RLHF or fine-tuning, if any, was done on the specific model underlying Bing Chat, and we only know what its “constitution” is because it accidentally leaked. IMO the public ought to demand far more openness.