But you seem to mean something broader with “possible worlds.” Something like “in theory, there is a physically possible arrangement of atoms/energy states that would result in an ‘aligned’ AGI, even if that arrangement of states might not be reachable from our current or even a past world.”
–> Am I interpreting you correctly?
Yup, that’s roughly what I meant. However, one caveat would be that I would change “physically possible” to “metaphysically/logically possible” because I don’t know if worlds with different physics could exist, whereas I am pretty sure that worlds with different metaphysical/logical laws couldn’t exist. By that, I mean stuff like the law of non-contradiction and “if a = b, then b = a.”
Your saying this shows how ambiguous it can be to work out what different people mean. One researcher can make a technical claim about the possibility/tractability of “alignment” that is worded much like a technical claim others have made. Yet their meanings of “alignment” could be quite different.
That makes it hard to have a well-argued discussion, because you don’t know whether people are equivocating (i.e., switching between different meanings of the term).
I think the main antidote against this is to ask the person you are speaking with to define the term if they are making claims in which equivocation is especially likely.
The way I deal with the wildly varying uses of the term “alignment” is to use a minimum definition that most of those six interpretations are consistent with.
Yup, that’s roughly what I meant. However, one caveat would be that I would change “physically possible” to “metaphysically/logically possible” because I don’t know if worlds with different physics could exist, whereas I am pretty sure that worlds with different metaphysical/logical laws couldn’t exist. By that, I mean stuff like the law of non-contradiction and “if a = b, then b = a.”
I think the main antidote against this is to ask the person you are speaking with to define the term if they are making claims in which equivocation is especially likely.
Yeah, that’s reasonable.