carboniferous_umbraculum comments on Can we get an AI to “do our alignment homework for us”?

carboniferous_umbraculum 29 Feb 2024 9:23 UTC
1 point
I for one would find it helpful if you included a link to at least one place where Eliezer has made this claim, just so we can be sure we’re on the same page.

Roughly speaking, what I have in mind is that there are at least two possible claims. One is that ‘we can’t get AI to do our alignment homework’ because by the time we have an AI powerful enough to solve alignment homework, that AI is already too dangerous for its ability to solve the homework to serve as a safety plan. The other is that there’s some sort of ‘intrinsic’ reason why an AI built by humans could never solve alignment homework.