I hope I’m not misinterpreting again, but this is a Giant cheesecake fallacy. The problem is that an AI’s decisions depend on its motives. Compare: “An AI, upon realizing that it’s programmed in a way its designer didn’t intend, would try to convince the programmer that what the AI turned out to be is exactly what he intended in the first place” with “An AI, upon realizing that it’s programmed in a way its designer didn’t intend, would print the string ‘Styggron’ to the console”.
Thanks, that’s a good one. I’ll try it.