Yep, that makes sense to wonder about here. I’ve got a couple main responses.
The first is that I’m taking a clear stand on separating optimization and agency. Optimization is the weaker thing, and includes balls rolling into basins, and so solar flares or whatever are probably interpretable as something like that. The really dangerous AIs will be agents. Agents are a type of optimizer, and so I’m trying to understand the base class (optimization) before moving on to the more complicated subclass (agents). A lot of people are working directly on understanding agents first, and I think that’s great, but I suspect things would be a lot clearer if we all agreed on a framework for optimization first.
The second thing is that the framework in the above draft suffers from a problem, similar to the idea of VNM rationality or whatever, where you can just take the trajectory that the system ended up taking, declare that to be the state ordering you’re measuring optimization with respect to, and then declare that the system is a strong optimizer. I think this is important to address, but it is also a solved problem. I talk about it in this draft.
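To make that failure mode concrete, here is a minimal Python sketch. It is not the draft's actual framework: the measure used (negative log of the fraction of states ranked at least as high as the final state, under a uniform prior) and every function name are assumptions picked for illustration. The point is only that an ordering read off the trajectory after the fact makes an aimless random walk score as a maximal optimizer.

```python
import math
import random

def post_hoc_ordering(trajectory, num_states):
    """Rank each state by the last time step at which the trajectory visited it.
    Unvisited states get rank -1, so they sit below every visited state."""
    rank = [-1] * num_states
    for t, state in enumerate(trajectory):
        rank[state] = t
    return rank

def optimization_bits(final_state, rank):
    """Assumed measure: -log2 of the fraction of states ranked at least
    as high as the final state, with all states weighted equally."""
    at_least_as_good = sum(1 for r in rank if r >= rank[final_state])
    return -math.log2(at_least_as_good / len(rank))

def random_walk(num_states, steps, rng):
    """An aimless walk on a ring of states: step left or right at random."""
    state = rng.randrange(num_states)
    trajectory = [state]
    for _ in range(steps):
        state = (state + rng.choice([-1, 1])) % num_states
        trajectory.append(state)
    return trajectory

NUM_STATES = 16
rng = random.Random(0)
trajectory = random_walk(NUM_STATES, 50, rng)

# Build the ordering from the trajectory itself, then score the same trajectory.
rank = post_hoc_ordering(trajectory, NUM_STATES)
bits = optimization_bits(trajectory[-1], rank)
print(bits)  # 4.0, i.e. log2(16), the maximum possible score
```

The final state is always the unique top-ranked state under an ordering built this way, so the score is log2 of the number of states regardless of what the system actually did.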