Doomers worry about AIs developing “misaligned” values. But in this scenario, the “values” implicit in AI actions are roughly chosen by the organisations who make them and by the customers who use them.
I think this is the critical crux of the disagreement. Part of Eliezer's argument, as I understand it, is that current technology is completely incapable of anything close to actually “roughly choosing” the AI's values. On this point, I think Eliezer is completely right.
If you have played with ChatGPT-4, it's pretty clear that it is aligned (humans have roughly chosen its values), especially compared to reports of the original raw model before RLHF, or to less sophisticated alignment attempts in the same model family, i.e. Bing. Now it's possible, of course, that it's all deception, but this seems somewhat unlikely.
I took issue with the same statement, but my critique is different: https://www.lesswrong.com/posts/mnCDGMtk4NS7ojgcM/linkpost-what-are-reasonable-ai-fears-by-robin-hanson-2023?commentId=yapHwa55H4wXqxyCT