I might be conflating Richard, Paul, and my own guesses here. But I think part of the argument is about what can happen before AGI that gives us lines of hope to pursue.
Like, my-model-of-Paul wants various tools for amplifying his own thought to (among other things) help think about solving the long-term alignment problem. And the question is whether there are ways of doing that which actually help with the sorts of problems Paul wants to solve. We’ve successfully augmented human arithmetic and chess. Are there tools we actually wish we had that narrow AI meaningfully helps with?
I’m not sure if Richard has a particular strategy in mind, but I assume he’s exploring the broader question of “what useful things can we build that will help navigate x-risk?”
The original dialogues were exploring the concept of pivotal acts that could change humanity’s strategic position. Are there AIs that can execute pivotal acts that are more like calculators and Deep Blue than like autonomous moon-base-builders? (I don’t know if Richard actually shares the pivotal act / acute risk period frame, or was just accepting it for the sake of argument.)
The problem is not whether we call the AI “AGI” or not; it’s whether we can either 1) fully specify our goals in the environment space it’s able to model (or otherwise not care too deeply about that environment space), or 2) verify that the actions it proposes have no disastrous consequences.
To determine whether a tool AI can be used to solve problems Paul wants to solve, or to execute pivotal acts, we need to both 1) determine that the environment is small enough for us to accurately express our goal, and 2) ensure the AI is unable to infer the existence of a broader environment.
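To make those two conditions concrete, here is a minimal toy sketch in Python. Every name in it is hypothetical and invented for illustration (nothing in the original dialogues specifies an implementation): the goal is expressed entirely over a small environment we can model ourselves, and each proposed action is checked by a verifier we supply before anything is executed.

```python
# Toy sketch only — all names are hypothetical; this is not a real safety mechanism.
# Condition 1: `goal` and `simulate` are defined over a small, fully modeled environment.
# Condition 2: `is_safe` is an independent check applied to every proposed action.

from typing import Callable, Iterable, Optional

State = int    # stand-in for a state in a small, fully modeled environment
Action = str   # stand-in for an action the tool AI can propose

def run_tool_ai(
    propose: Callable[[State], Iterable[Action]],   # the tool AI: proposes candidate actions
    simulate: Callable[[State, Action], State],     # our own model of the small environment
    goal: Callable[[State], bool],                  # goal fully specified over states we understand
    is_safe: Callable[[State, Action], bool],       # independent verifier for each action
    start: State,
    max_steps: int = 100,
) -> Optional[list[Action]]:
    """Execute proposed actions only if the verifier approves every step."""
    state, plan = start, []
    for _ in range(max_steps):
        if goal(state):
            return plan
        chosen = None
        for action in propose(state):
            if is_safe(state, action):   # reject anything we cannot verify as harmless
                chosen = action
                break
        if chosen is None:
            return None                  # no verifiably safe action: refuse to act
        state = simulate(state, chosen)
        plan.append(chosen)
    return None
```

The point of the toy is only that `goal`, `simulate`, and `is_safe` all live outside the proposer, which is exactly what condition 1 requires: it only works if the environment is small enough for us to model and express our goal over it. The open problem the comment is pointing at is that for pivotal-act-sized tasks we don’t know how to write those checks.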
(Meta note: I’m making a lot of very confident statements, and very few are of the form “<statement>, unless <other statement>, in which case <statement> may not hold”. This means I am almost certainly overconfident and my model is incomplete, but I’m making the claims anyway so that they can be developed.)