Hmm… I guess I’m skeptical that we can train very specialized “planning” systems? Making superhuman plans of the sort that could counter those of an agentic superintelligence seems to require both a very accurate, domain-general model of the world and a search algorithm that figures out which plans actually accomplish a given goal under that model. This seems extremely close in design space to a more general agent. While I think we could have narrow systems which outperform the misaligned superintelligence in other domains such as coding or social manipulation, general long-term planning seems likely to me to be the most important skill involved in taking over the world or countering an attempt to do so.
Well, simulator-type systems like GPT-3 do not become agents if amplified to superhuman cognition.
Simulators could be used to generate/evaluate superhuman plans without being agents with independent objectives of their own.
In the intervening period, I’ve updated towards your position, though I still think it is risky to build systems with capabilities that open-ended and that close to agents in design space.