Deals with other companies. Magma might be able to reduce some of the pressure to “race” by making explicit deals with other companies doing similar AI development work, up to and including mergers and acquisitions (but also including more limited collaboration and information-sharing agreements).
Benefits of such deals might include (a) enabling freer information sharing and collaboration; (b) being able to prioritize alignment with less worry that other companies are incautiously racing ahead; (c) creating incentives to cooperate rather than compete (e.g., other labs holding equity in Magma); and thus (d) helping Magma get more done (more alignment work, and more robustly staying ahead of other key actors in terms of the state of its AI systems).
It’s often said that there is “pressure to race.” We can break this pressure down, e.g., into:
1. The extent to which Magma’s preferences are better satisfied if it has lots of power, relative to a competitor having lots of power (or: the extent to which the universe would look better-according-to-Magma if Magma were in charge, relative to a competitor being in charge).
2. Factors like wanting to make progress quickly, beat others, or be first, independent of how being first would make the universe better-according-to-Magma.
For 1, it’s not clear that there is any real force here: I don’t know what (e.g.) DeepMind or OpenAI would do long-term with lots of power, and I don’t think they do either, much less that each believes it would do something better than the other. (Perhaps prompting them to think more carefully about what they’d do with power would lead them to realize that there are attractors like “do CEV,” such that each would very likely do something very similar to the other, and thus see no reason to race. But existing race-y-ness doesn’t seem to be due to either specific beliefs or uncertainty about what others would do with lots of power.)

For 2, these factors feel closer to mere-psychological than deep-strategic, and so can plausibly be overcome with other psychological factors or incentives that are tiny relative to the cosmic endowment.
We should think carefully about what actually causes racing (and how to negate or counterbalance those factors).
(Of course, if a particular lab doesn’t believe that safety is a problem, it won’t slow down for safety. But we can try to solve for the futures where all leading labs actually care about safety, handle the futures where they don’t in other ways, and try to nudge the latter toward the former.)