Yeah, I feel this is quite similar to OpenAI’s plan to defer alignment to future AI researchers, except worse. If we grant that the proposed plan actually made the augmented humans stably aligned with our values, then scalable oversight would be far easier, because we have a bunch of advantages when it comes to controlling AIs: it would be socially acceptable to control AI in ways that wouldn’t be socially acceptable if humans were involved, the incentives to control AI are much stronger than the incentives to control humans, and so on.
I truly feel like Eliezer has reinvented a plan that OpenAI and Anthropic are already pursuing, namely deferring alignment work to future intelligences, except worse, and he doesn’t realize this. So the comments treat it as though it’s something new, rather than an existing plan with AI swapped out for humans.
It’s not just coy; it’s reinventing an idea that already exists, except worse, and he doesn’t tell you that if you swap the humans for AI, it’s already being done.
Link for why AI is easier to control than humans below:
https://optimists.ai/2023/11/28/ai-is-easy-to-control/
fwiw, this seems false to me and not particularly related to what I was saying.