The key point is that, assuming you have avoided the bad takeover behavior, the remaining problems become more like normal scientific or engineering problems, which we have a known track record of solving. In particular, it removes one of the nastiest difficulties in AI safety: that you can't iterate on the system very much, or at all, because the AI will fight you.
So, conditional on getting controlled AI, this makes us much more likely to succeed at solving the problems you've mentioned.
Well, it makes things better. But it doesn't assure humanity's success by any means. Basically I agree, but I'll just redirect you back to my analogy about why a paper titled "How to solve nuclear reactor design" would be strange.
The paper you describe in your comment would have a lot of its details filled in by default by the capabilities people inside an AI lab, and the alignment team would outsource most of those details to the people who want to make the AI go fast.
While I don’t think it would ensure humanity’s success by any means, I do think the alignment field could mostly declare victory and stop working if we knew there were no problems resistant to iterative correction, since other people would solve them for us.