CronoDAS comments on AGI Ruin: A List of Lethalities

CronoDAS 7 Jun 2022 2:55 UTC
4 points
0

All of the paths, of course, bottom out in everyone dying, with detailed explanations of why.

A strange game. The only winning move is not to play. ;)
- Thane Ruthenis 7 Jun 2022 4:26 UTC
  4 points
  0
  I guess we should also kidnap people and force them to play it, and if they don’t succeed we kill them? For realism? Wait, there’s something wrong with this plan.
  More seriously, yeah, if you’re implementing it more like a game and less like an interactive article, it’d need to contain some promise of winning. Haven’t considered how to do it without compromising the core message.
  - AdamB 15 Jun 2022 13:23 UTC
    5 points
    0
    What if “winning” consists of finding a new path not already explored-and-foreclosed? For example, each time you are faced with a list of choices of what to do, there’s a final choice “I have an idea not listed here” where you get to submit a plan of action. This goes into a moderation engine where a chain of people get to shoot down the idea or approve it to pass up the chain. If the idea gets convincingly shot down (but still deemed interesting), it gets added to the story as a new branch. If it gets to the top of the moderation chain and makes EY go “Hm, that might work” then you win the game.
    - Thane Ruthenis 15 Jun 2022 23:00 UTC
      4 points
      1
      Mmm. If the CYOA idea is implemented as a quirky-but-primarily-educational article, then sure, integrating the “adapt to feedback” capability like this would be worthwhile. Might also attach a monetary prize to submitting valuable ideas, by analogy to the ELK contest.
      For a game-like implementation, where you’d be playing it partly for the fun/challenge of it, that wouldn’t suffice. The feedback loop’s too slow, and there’d be an ugh-field around the expectation that submitting a proposal would then require arguing with the moderators about it, defending it. It wouldn’t feel like a game.
      It’d make the upkeep cost pretty high, too, without a corresponding increase in the pay-off.
      Just making it open-ended might work, even without the moderation engine? Track how many branches the player has explored; once they’ve explored a lot (i.e., are expected to “get” the full scope of the problem), an option appears for something like “I really don’t know what to do, but we should keep trying”, leading to some appropriately-subtle and well-integrated call to support alignment research?
      Not excited about this approach either.