This is a low-effort comment in the sense that I don’t quite know whether, or how, you should do anything different along the following lines, and I have substantial uncertainty.
That said:
I wonder whether Anthropic is partially responsible for intensifying the international race through things like Dario advocating for an entente strategy and talking positively about Leopold Aschenbrenner’s “Situational Awareness”.
I would have liked to see more of an effort to engage with Chinese AI leaders to push for cooperation/coordination. Maybe it’s still possible to course-correct.
Alternatively, if there’s a way for Anthropic/Dario to communicate why you think an entente strategy is inevitable/desirable, in a way that is honest and lets others engage with your models of reality, that could also be very helpful for the epistemic health of the whole safety community. I understand there may be no politically feasible way to communicate honestly about this, but consider this my attempt to nudge you toward openness.
More specifically:
(a) It would help to learn more about your models of how winning the AGI race leads to long-term security. I assume that might require building up a robust military advantage, but given the physical hurdles that Dario himself expects AGI to face in acting effectively in the world, it’s unclear to me how you expect to secure that military advantage quickly enough after AGI is achieved.
(b) I also wonder whether future developments in AI safety and control might give us evidence that the transition period is genuinely unsafe; e.g., what if you race ahead and then learn that you can’t safely scale further because of loss-of-control risks? At that point, coordinating with China seems harder than doing it now. I’d like to see a legible justification of your strategy that takes such serious possibilities into account.