AI is pretty safe: unaligned AGI has a mere 7% chance of causing doom, plus a further 7% chance of causing a short-term lock-in of something mediocre
Your opponent risks bad lock-in: If there’s a ‘lock-in’ of something mediocre, your opponent has a 5% chance of locking in something actively terrible, whereas you’ll always pick the good mediocre lock-in world (and mediocre lock-ins are either 5% as good as utopia or −5% as good)
Your opponent risks messing up utopia: In the event of aligned AGI, you will reliably achieve the best outcome, whereas your opponent has a 5% chance of ending up in a ‘mediocre bad’ scenario even then.
Safety investment obliterates your chance of getting to AGI first: moving from no safety at all to full safety means you go from a 50% chance of being first to a 0% chance
Your opponent is racing: Your opponent is investing everything in capabilities and nothing in safety
Safety work helps others at a steep discount: your safety work contributes 50% to the other player’s safety
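To make the conditional structure of those numbers concrete, here is a minimal Python sketch of one way they might compose into outcome probabilities. The independence assumptions, and the reading that the 5% “terrible” chance applies inside the 7% mediocre-lock-in branch and only when the opponent is the one in control, are my own interpretation of the list above rather than anything stated in it.

```python
# Minimal sketch, not from the list above: one way the listed percentages might
# compose into outcome probabilities, assuming the conditionals are independent
# and that the 5% "terrible" chance applies inside the 7% mediocre-lock-in branch
# (and only when the opponent is the one in control).

P_DOOM_IF_UNALIGNED = 0.07             # "7% chance of causing doom"
P_LOCKIN_IF_UNALIGNED = 0.07           # "7% chance of ... short-term lock-in"
P_TERRIBLE_LOCKIN_OPPONENT = 0.05      # opponent locks in something actively terrible
P_MEDIOCRE_IF_ALIGNED_OPPONENT = 0.05  # opponent fumbles even an aligned AGI

def outcome_probs(opponent_wins: bool, aligned: bool) -> dict:
    """Terminal-outcome probabilities for one (who won, aligned or not) branch."""
    if not aligned:
        p_terrible = P_LOCKIN_IF_UNALIGNED * (P_TERRIBLE_LOCKIN_OPPONENT if opponent_wins else 0.0)
        return {
            "doom": P_DOOM_IF_UNALIGNED,
            "terrible lock-in": p_terrible,
            "mediocre lock-in": P_LOCKIN_IF_UNALIGNED - p_terrible,
            "neither": 1 - P_DOOM_IF_UNALIGNED - P_LOCKIN_IF_UNALIGNED,
        }
    p_mediocre = P_MEDIOCRE_IF_ALIGNED_OPPONENT if opponent_wins else 0.0
    return {"mediocre outcome": p_mediocre, "best outcome": 1 - p_mediocre}

print(outcome_probs(opponent_wins=True, aligned=False))
# roughly: doom 0.07, terrible lock-in 0.0035, mediocre lock-in 0.0665, neither 0.86
print(outcome_probs(opponent_wins=True, aligned=True))
# roughly: mediocre outcome 0.05, best outcome 0.95
```

With those numbers, roughly 86% of unaligned-AGI outcomes fall outside both the doom and lock-in branches, which is what makes the “AI is pretty safe” parameter carry so much weight.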
This is more a personal note / call for somebody to examine my thinking processes, but I’ve been thinking really hard about putting hardware security methods to work. Specifically, spreading knowledge far and wide about how to:
allow hardware designers / manufacturers to have easy, total control over who uses their product, for what, and for how much, throughout the supply chain
make AI-related data (including e.g. model weights and architecture) easy to secure and difficult to steal (a toy sketch of what this could look like follows this list).
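To make that second bullet slightly more concrete, here is a toy, purely illustrative sketch (my own construction, not anything proposed above) of “hardware-gated” access to model weights: the vendor issues a signed usage license bound to a specific device, and the runtime refuses to release the weights unless that license verifies. A real scheme would rest on hardware roots of trust, remote attestation and encrypted weights rather than a shared software secret; this only shows the shape of the policy check.

```python
# Toy illustration only: a vendor-issued, device-bound "usage license" that must
# verify before model weights are released. All names here (VENDOR_KEY, device
# ids, the license format) are hypothetical; real designs would use hardware
# roots of trust and remote attestation instead of a shared software secret.
import hashlib
import hmac
import json

VENDOR_KEY = b"secret-provisioned-into-the-accelerator"  # hypothetical shared secret

def issue_license(device_id: str, allowed_use: str) -> dict:
    """Vendor side: sign (device_id, allowed_use) so the device can verify it later."""
    payload = json.dumps({"device_id": device_id, "allowed_use": allowed_use})
    tag = hmac.new(VENDOR_KEY, payload.encode(), hashlib.sha256).hexdigest()
    return {"payload": payload, "tag": tag}

def load_weights(weights_path: str, device_id: str, lic: dict) -> bytes:
    """Device side: hand out the weights only if the license is valid for this device."""
    expected = hmac.new(VENDOR_KEY, lic["payload"].encode(), hashlib.sha256).hexdigest()
    claims = json.loads(lic["payload"])
    if not hmac.compare_digest(expected, lic["tag"]) or claims["device_id"] != device_id:
        raise PermissionError("license check failed: weights stay sealed")
    with open(weights_path, "rb") as f:
        return f.read()

lic = issue_license("accelerator-0042", "inference-only")
# load_weights("model.bin", "accelerator-0042", lic)   # would succeed
# load_weights("model.bin", "some-other-device", lic)  # raises PermissionError
```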
This sounds like it would improve every aspect of the racey-environment conditions, except:
Your opponent is racing: Your opponent is investing everything in capabilities and nothing in safety
The exact effect of this is unclear. On the one hand, if racey, zero-sum thinking actors learn that you’re trying to “restrict” or “control” AI hardware supply, they’ll totally amp up their efforts. On the other hand, you’ve also given them one more thing to worry about (their hardware supply).
I would love to get some frames on how to think about this.