Also note that fundamental variables are not meant to be some kind of “moral speed limits”, prohibiting humans or AIs from acting at certain speeds. Fundamental variables are only needed to figure out what physical things humans can most easily interact with (because those are the objects humans are most likely to care about).
Ok, that clears things up a lot. However, I still worry that if it's at the AI's discretion when and where to sidestep the fundamental variables, we're back at the regular alignment problem. You have to be reasonably certain what the AI is going to do in extremely out-of-distribution scenarios.
The subproblem of environmental goals is just to make AI care about natural enough (from the human perspective) “causes” of sensory data, not to align AI to the entirety of human values. Fundamental variables have no (direct) relation to the latter problem.
However, fundamental variables would be helpful for defining impact measures if we had a principled way to differentiate “times when it’s OK to sidestep fundamental variables” from “times when it’s NOT OK to sidestep fundamental variables”. That’s where the things you’re talking about definitely become a problem. Or maybe I’m confused about your point.
Thanks. That makes sense.