Should we even be looking for “arguments for” or “arguments against” a particular component of AI risk? I’d much rather examine evidence that distinguishes models, and then look at what those models predict in terms of AI risk. As far as I can tell we do have some weak evidence, but the prospects for stronger evidence in either direction any time soon are poor, since our best AIs are still rather incapable in many respects.
That lack of evidence is concerning, because pretty much any variant of instrumental convergence implies some rather serious risks to humanity, though even without it there may still be major risks from AGI. I’m not convinced that any version of instrumental convergence is actually true, but far too many people seem to simply assume it’s false, also without evidence.