Vaniver comments on AGI Ruin: A List of Lethalities

Vaniver 8 Jun 2022 18:36 UTC
LW: 7 AF: 2
Why is the process by which humans come to reliably care about the real world
IMO this process seems pretty unreliable and fragile. Drugs are popular; video games are popular; people-in-aggregate put more effort into obtaining imaginary afterlives than into life extension or cryonics.
But also humans have a much harder time ‘optimizing against themselves’ than AIs will, I think. I don’t have a great mechanistic sense of what it will look like for an AI to reliably care about the real world.
- TurnTrout 8 Jun 2022 19:01 UTC
  LW: 3 AF: 2
  One of the problems with English is that it doesn’t natively support orders of magnitude for “unreliable.” Do you mean “unreliable” as in “between 1% and 50% of people end up with part of their values not related to objects-in-reality”, or as in “there is no a priori reason why anyone would ever care about anything not directly sensorially observable, except as a fluke of their training process”? Because the latter is what current alignment paradigms mispredict, and the former might be a reasonable claim about what really happens for human beings.
  EDIT: My reader-model is flagging this whole comment as pedagogically inadequate, so I’ll point to the second half of section 5 in my shard theory document.