Hmm. I guess I start with the knowledge that humans don’t seem to be VNM-consistent, so it’s quite reasonable to start by tabooing “want” and “prefer”, because they don’t apply in the way that’s usually studied and analyzed.
I disagree with steven0461 that “just ask” provides any more information than watching an artificial choice. Both are trying to infer something that doesn’t exist from something easily observable.
For many humans, we CAN say they “currently prefer” the expected outcome of an actual choice they make, but that’s a pretty weak and circular definition.
So—what do you hope to actually model about an individual human that you’re using the word “want” for?
The overarching problem is figuring out human preferences so that AI can fulfill them. We’re all on the same page that humans aren’t VNM-consistent.
Ah, yeah. That’s why I’m not very hopeful about AI alignment. I don’t think anyone’s even defined the problem in a useful way.
Neither humans as a class nor most humans as individuals HAVE preferences that AI is able to fulfill, or even to be compatible with, as they are conceived today. We MAY have mental frameworks that let our preferences evolve to survive well in an AI-containing world.