Really, it’s a meaningless question unless you provide a definition of ‘aligned’. If your definition is ‘a good person’, then the question is just “do you see yourself as moral?”
This is part of what I was getting at in terms of meta-questions. There have been attempts to solve this, of which CEV seems to come up the most; I haven’t personally found any of them strictly compelling. The parent question mixes the self-or-other-human thought experiment with the self-extrapolation part, but I wanted to see what the answers would be like without it. If it’s testing for “does extrapolating yourself misalign you”, and the comparison is meant to be relative, then surely something like a control group should be in play, even if I can’t set one up in a more rigorous fashion.
I think there are really two definitions of alignment here: one is “do you have close to an ideal moral system, such that you would endorse it being scaled up to the whole universe for eternity,” and the other is “would you commit omnicide given the opportunity.” When we talk about AI alignment, we are talking about a machine that will commit omnicide, so it doesn’t matter that much whether it has the right opinions on the trolley problem or the repugnant conclusion or whatever. Hopefully no one here is planning omnicide (and probably no one on this forum would admit to desiring it even if they secretly did), so everyone on this forum (who is not hiding a dark secret) is more aligned than Clippy, and that’s what matters right now.
This is just “are you a good person” with few or no subtle twists, right?