The most obvious-to-me flaw with the plan of “hang out in the slightly superhuman range for a few decades and try to slowly get better alignment work done, possibly with the help of the AI” is that it requires no one ever turning an AI up even a little bit.
That level of coordination isn’t completely infeasible, but it doesn’t seem remotely reliable.
100%, if I thought we had other options I’d obviously choose them.
The only reason this might be even hypothetically possible is self-interest: if we can create really broad social consensus about the difficulty of alignment, restraint becomes the selfish choice. No one is trying to kill themselves.