I don’t understand why it helps that much if instrumental convergence isn’t expected. All it takes is one actor deliberately making a bad agentic AI and you have all the same problems, but without the “free energy” that would have been taken out beforehand by slightly bad, less powerful AIs if instrumental convergence held. Slow takeoff seems to me to make much more of a difference.
I actually don’t think the distinction between slow and fast takeoff matters too much here, at least compared to what the lack of instrumental convergence offers us. The important point is that AI misuse is a real problem, but a much more solvable one, because misuse isn’t convergent the way hypothesized instrumental convergence is. It still matters, and it calls for drastically different methods, but it meaningfully reduces the danger we should expect from AI.