It’s odd that you understood me as talking about misuse. Well, I guess I’m not sure how you’re using the term “misuse”. If Person X doesn’t follow best practices when training an AI, and they wind up creating an out-of-control misaligned AI that eventually causes human extinction, and if Person X didn’t want human extinction (as most people don’t), then I wouldn’t call that “misuse”. Would you? I would call it a “catastrophic accident” or something like that. I did mention in the OP that some people think human extinction is perfectly fine, and I guess if Person X is one of those people, then it would be misuse. So I suppose I brought up both accidents and misuse.
Perhaps, but I want to draw a distinction between “people train AI to do good things and aren’t able to control the AI for a variety of reasons, and thus humans are extinct, made into slaves, etc.” and “people train AI to do stuff like bio-terrorism, explicitly gaining power, etc., and thus humans are extinct, made into slaves, etc.” The optimal responses look very different in a world where control is easy but preventing misuse is hard versus one where controlling AI is hard in itself. AI safety work as currently done is optimized far more for the case where controlling AI is hard or impossible; if that isn’t the case, then pretty drastic changes would need to be made in how AI safety organizations do their work, especially their nascent activist wing, and they would instead need to focus on different policies.
People who I think are highly prone to not following best practices to keep AI under control, even if such best practices exist, include people like Yann LeCun, Larry Page, Rich Sutton, and Jürgen Schmidhuber, who are either opposed to AI alignment on principle, or are so bad at thinking about the topic of AI x-risk that they spout complete and utter nonsense (example). That’s not a problem solvable by Know Your Customer laws, right? These people (and many more) are not the customers; they are among the ones doing state-of-the-art AI R&D.
In general, the more people are technically capable of making an out-of-control AI agent, the more likely it is that one of them actually will, even if best practices exist to keep AI under control. People like to experiment with new approaches, etc., right? And I expect the number of such people to go up and up as algorithms improve. See here.
I note that your example of them spouting nonsense only has the full force it does if we assume that controlling AI is hard, which is what we are debating right now.
Onto my point here: my fundamental claim is that there’s a counterforce to the trend you describe, where more and more people become able to make an out-of-control AI agent, and that counterforce is the profit motive.
Hear me out, this will actually make sense here.
Basically, the main reason the profit motive is positive for safety is that the negative externalities of an uncontrollable AI are far, far more internalized to the person making the AI: they suffer severe losses in profitability without getting any profit from the AI. This is combined with the fact that they also have a profit incentive to develop safe control techniques, assuming control isn’t very hard, since those techniques will probably be used in government standards for releasing AI, and there are already at least some fairly severe barriers to any release of misaligned AGI, at least assuming that there’s no treacherous turn/deceptive alignment over weeks to months.
Jaime Sevilla has a shorter tweet on why this is the case, and I also responded to Linch making something like the points above:
https://archive.is/wPxUV
https://twitter.com/Jsevillamol/status/1722675454153252940
https://archive.is/3q0RG
https://twitter.com/SharmakeFarah14/status/1726351522307444992