Ok, really, all of this has already been answered. These are standard misconceptions about alignment, probably based on some kind of anthropomorphic reasoning.
Where? By whom?
Why would you possibly make this assumption?
Why would you possibly assume that deep, intelligent understanding of life, consciousness, joy and suffering has 0 correlation with caring about these things?
All of the assumptions we make about biological, evolved life do not apply to AI.
But where do valid assumptions about AI come from? Sure, I might be anthropomorphizing AI a bit. I am hopeful that we, biological living humans, do share some common ground with non-biological AGI. But you’re forcefully stating the contrary and claiming that it’s all so obvious—why is that? How do you know that any AGI is blindly bound to a simple utility function that cannot be updated by understanding the world around it?
Why would you possibly assume that deep, intelligent understanding of life, consciousness, joy and suffering has 0 correlation with caring about these things?
The orthogonality thesis says that an AI can have any combination of intelligence and goals, not that P(goal = x | intelligence = y) = P(goal = x) for all x and y. It depends entirely on how the AI is built. People like Rohin Shah assign significant probability to alignment by default, at least last I heard.
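To make that distinction concrete, here is a minimal toy sketch (the numbers and goal labels are invented purely for illustration, not estimates of anything): every intelligence/goal combination can have nonzero probability, and yet the conditional and marginal probabilities can still differ.

```python
# Toy joint distribution over intelligence levels and goals.
# All numbers are made up purely to illustrate the point:
# "every combination is possible" does not imply
# P(goal = x | intelligence = y) = P(goal = x).
joint = {
    ("low",  "human-friendly"): 0.30,
    ("low",  "paperclips"):     0.20,
    ("high", "human-friendly"): 0.10,
    ("high", "paperclips"):     0.40,
}  # sums to 1.0; every combination has nonzero probability

def p_goal(goal):
    # Marginal probability of a goal.
    return sum(p for (_, g), p in joint.items() if g == goal)

def p_goal_given(goal, intel):
    # Conditional probability of a goal given an intelligence level.
    p_intel = sum(p for (i, _), p in joint.items() if i == intel)
    return joint[(intel, goal)] / p_intel

print(p_goal("human-friendly"))                # 0.4 (marginal)
print(p_goal_given("human-friendly", "high"))  # 0.2 (conditional)
# The two differ, so goals and intelligence are correlated in this toy world,
# even though no intelligence/goal combination is ruled out.
```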
It’s worth noting (and the video acknowledges this) that “Maybe it’s more like raising a child than putting a slave to work” is a very, very different statement from “You just have to raise it like a kid”.
In particular, there is no “just” about raising a kid to have good values—especially when the kid isn’t biologically yours and quickly grows to be more intelligent than you are.
Where? By whom?
Why would you possibly assume that deep, intelligent understanding of life, consciousness, joy and suffering has 0 correlation with caring about these things?
But where do valid assumptions about AI come from? Sure, I might be anthropomorphizing AI a bit. I am hopeful that we, biological living humans, do share some common ground with non-biological AGI. But you’re forcefully stating the contrary and claiming that it’s all so obvious—why is that? How do you know that any AGI is blindly bound to a simple utility function that cannot be updated by understanding the world around it?
You know, I’m not sure I remember. You tend to pick this stuff up if you hang around LW long enough.
I’ve tried to find a primer. The Superintelligent Will by Nick Bostrom seems good.
The orthogonality thesis (also part of the paper I linked above).
Edit: also, this video was recommended to me.