If people now don’t have strong views about exactly what they want the world to look like in 1000 years, but people in 1000 years do have strong views, then I think we should defer to future people to evaluate the “human utility” of future states. You seem to be suggesting that we should take the views of people today, although I might be misunderstanding.
Edit: or maybe you’re saying that the AGI trajectory will be ~random from the point of view of the human trajectory due to a different ontology. Maybe, but “different ontology → different conclusions” is less obvious to me than “different data → different conclusions.” If there’s almost no mutual information between the different data, then the conclusions have to be different; but sometimes you could come to the same conclusions under different ontologies with data from the same process.
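A toy sketch of that last point (my own construction, not anything from the thread): two observers record the same underlying process under different vocabularies, i.e. different “ontologies.” Because the labels are just relabelings of the same events, the two records share essentially all of their information, and both observers reach the same conclusion about the process (its bias).

```python
import math
import random
from collections import Counter

random.seed(0)

# Underlying process: a biased coin flipped many times.
flips = [random.random() < 0.7 for _ in range(10_000)]

# Ontology A records "H"/"T"; ontology B records "up"/"down" for the
# exact same events -- different vocabulary, same process.
a = ["H" if f else "T" for f in flips]
b = ["up" if f else "down" for f in flips]

def mutual_information(xs, ys):
    """Empirical mutual information (in bits) between two label sequences."""
    n = len(xs)
    px, py = Counter(xs), Counter(ys)
    pxy = Counter(zip(xs, ys))
    return sum(
        (c / n) * math.log2((c / n) / ((px[x] / n) * (py[y] / n)))
        for (x, y), c in pxy.items()
    )

# Both ontologies estimate the same bias for the underlying process...
bias_a = a.count("H") / len(a)
bias_b = b.count("up") / len(b)

# ...and the records share ~H(0.7) ≈ 0.88 bits, i.e. nearly all the
# information a single flip carries.
mi = mutual_information(a, b)
```

The converse case in the comment (almost no mutual information forcing different conclusions) would correspond to `b` being generated independently of `a`, driving `mi` toward zero.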
To the extent people now don’t care about the long-term future, there isn’t much to do in terms of long-term alignment. People right now who care about what happens 2000 years from now probably have roughly similar preferences to people 1000 years from now who aren’t significantly biologically changed or cognitively enhanced, because some component of what people care about is biological.
I’m not saying it would be random so much as not very dependent on the original history of humans used to train early AGI iterations. It would have a different data history, but part of that is because of different measurements, e.g. scientific measuring tools. A different ontology means that value-laden things people might care about, like “having good relationships with other humans,” are not meaningful things to future AIs in terms of their world model: not something they would care much about by default (they aren’t even modeling the world in those terms), and it would be hard to encode a utility function so that they care about it despite the ontological difference.