Thomas Kwa comments on Towards more cooperative AI safety strategies

Thomas Kwa Nov 4, 2024, 12:18 AM
4 points
1
Claim 2: The world has strong defense mechanisms against (structural) power-seeking.
I disagree with this claim. It seems pretty clear that the world has defense mechanisms against:
- disempowering other people or groups
- breaking norms in the pursuit of power
But it is possible to be power-seeking in other ways. The Gates Foundation has a lot of money and wants other billionaires’ money for its cause too. It influences technology development. It has to work with dozens of governments, sometimes lobbying them. Normal think tanks exist to gain influence over governments. Harvard University, Jane Street, and Goldman Sachs recruit more elite students than all the EA groups and control more money than OpenPhil. Jane Street and Goldman Sachs guard private information worth billions of dollars. The only one with a negative reputation is Goldman Sachs, and that is due to perceived greed rather than power-seeking per se.
So why is there so much more backlash against AI safety? I think it basically comes down to a few factors:
- We are bending norms (billionaire funding for somewhat nebulous causes) and sometimes breaking them (FTX financial and campaign finance crimes)
- We are not able to credibly signal that we won’t disempower others.
  - MIRI wanted a pivotal act to happen, and under that plan nothing would stop MIRI from being world dictators
  - AI is inherently a technology with world-changing military and economic applications whose governance is unsolved
  - An explicitly consequentialist movement will take power by any means necessary, and people are afraid of that.
  - AI labs have incentives to safetywash, making people wary of safety messaging.
- The preexisting AI ethics and open-source movements think their cause is more important and x-risk is stealing attention.
- AI safety people are bad at diplomacy and communication, leading to perceptions that they’re the same as the AI labs or have some other sinister motivation.
That said, I basically agree with section 3. Legitimacy and competence are very important. But we should not confuse power-seeking—something the world has no opinion on—with what actually causes backlash.
- Richard_Ngo Nov 4, 2024, 1:50 AM
  4 points
  0
  I think there’s something importantly true about your comment, but let me start with the ways I disagree. Firstly, the more ways in which you’re power-seeking, the more defense mechanisms will apply to you. Conversely, if you’re credibly trying to do a pretty narrow and widely-accepted thing, then there will be less backlash. So Jane Street is power-seeking in the sense of trying to earn money, but they don’t have much of a cultural or political agenda, they’re not trying to mobilize a wider movement, and earning money is a very normal thing for companies to do; it makes them one of thousands of comparably-sized companies. (Though note that there is a lot of backlash against companies in general, which are perceived to have too much power. This leads a wide swathe of people, especially on the left, and especially in Europe, to want to greatly disempower companies because they don’t trust them.)
  Meanwhile the Gates Foundation has a philanthropic agenda, but like most foundations tries to steer clear of wider political issues, and also IIRC tries to focus on pretty object-level and widely-agreed-to-be-good interventions. Even so, it’s widely distrusted and feared, and Gates has become a symbol of hated global elites, to the extent that there are all sorts of conspiracy theories about him. That’d be even worse if the foundation were more political.
  Lastly, it seems a bit facile to say that everyone hates Goldman due to “perceived greed rather than power-seeking per se”. A key problem is that people think of the greed as manifesting through political capture, evading regulatory oversight, deception, etc. That’s part of why it’s harder to tar entrepreneurs as greedy: it’s just much clearer that their wealth was made in legitimate ways.
  Now the sense in which I agree: I think that “gaining power triggers defense mechanisms” is a good first pass, but also we definitely want a more mechanistic explanation of what the defense mechanisms are, what triggers them, etc. (in particular so we don’t just end up throwing our hands in the air and concluding that doing anything is hopeless and scary). And I also agree that your list is a good start. So maybe I’d just want to add to it stuff like:
  - having a broad-ranging political agenda (that isn’t near-universally agreed to be good)
  - having non-transparent interactions with many other powerful actors
  - having open-ended scope to expand
  And maybe a few others (open to more suggestions).