This isn’t necessarily the ideal setup, but it’s also basically what the ChatGPT team does, so I don’t think there is significant added risk. If one accepts the thesis of your OP that ChatGPT is OK, this seems OK too.
Oh, if we’re assuming this setup doesn’t have to be robust to AutoGPT being superintelligent and deciding to boil the oceans because of a misunderstood instruction, then yeah, that’s fine.
Once AutoGPT is running on a faster processor, I might choose to use it more ambitiously.
That’s the part that would exacerbate the issue where it sometimes misunderstands your instructions. If you’re using it for more ambitious tasks, or more often, then there are more frequent opportunities for misunderstanding, and their consequences are larger-scale. Which means that, to whichever extent it’s prone to misunderstanding you, that gets amplified, as does the damage the misunderstandings cause.
Cool, well maybe we should get alignment people to work at AutoGPT to influence them not to develop dangerous capabilities, by focusing on e.g. imitating experts :-)
Oh, sure, I’m not opposing that. It may not be the highest-value place for a given person to be, but it might be for some.