I don’t understand what you mean by “deceptive alignment and embeddedness problems” in this context. I’m making an alignment-by-default, or at least plausible-alignment, claim, on the basis of how LLM AGIs specifically could work: as summoned human-like simulacra in a position of running the world too fast for humans to keep up with, with everything else ending up determined by their decisions.
The basic issue is that we assume it’s not spinning up a second optimizer to recursively search. And deceptive alignment is a dangerous state of affairs, since we may not know whether it’s misaligned until it’s too late.
we assume it’s not spinning up a second optimizer to recursively search
You mean we assume that simulacra don’t mishandle their own AI alignment problem? Yes, that’s an issue, hence I made it an explicit assumption in my argument.