At first I particularly liked the idea of identifying systems with “an optimizer” as those which are robust to changes in the object of optimization, but brittle with respect to changes in the engine of optimization.
On reflection, it seems like a useful heuristic but not a reliable definition. A counterexample: suppose we do manage to build a robust AI which maximizes some utility function. One desirable property of such an AI is that it’s robust to, e.g., one of its servers going down or data on a hard drive getting corrupted; the AI itself should be robust to as many interventions as possible. Ideally it would even be robust to minor bugs in its own source code. Yet it still seems like the AI is the “engine”, and it optimizes the rest of the world.
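As a toy illustration of that robust-to-the-object / brittle-in-the-engine heuristic (my own sketch, not anything proposed in this thread, with a made-up hill-climber standing in for “an optimizer”): a mid-run kick to the optimized variable gets corrected, while a one-line change to the engine derails the whole run.

```python
import random

def optimize(start, steps=2000, broken_engine=False):
    """Greedy hill-climber pushing its state toward 0 (the object of optimization)."""
    state = start
    for t in range(steps):
        if t == steps // 2:
            state += 50.0            # perturb the *object* of optimization mid-run
        candidate = state + random.uniform(-1, 1)
        improved = abs(candidate) < abs(state)
        if broken_engine:
            improved = not improved  # a tiny perturbation to the *engine* itself
        if improved:
            state = candidate
    return state

random.seed(0)
print(optimize(100.0))                      # recovers from the kick, ends near 0
print(optimize(100.0, broken_engine=True))  # the perturbed engine wanders away from the target
```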
Yeah, I agree that duality is not a good measure of whether a system contains something like an AI. There is one kind of AI we can build that is highly dualistic; indeed, most present-day AI systems are quite dualistic, because they are predicated on having some robust compute infrastructure that is separate from and mostly unperturbed by the world around it. But there is every reason to go beyond these dualistic designs, for precisely the reason you point to: such systems do tend to be somewhat brittle.
I think it’s quite feasible to build highly robust AI systems, although doing so will likely require more than just hardening (making it really unlikely for the system to be perturbed). What we really want is an AI system where the core AI itself tends to evolve back to a stable configuration despite perturbations to its core infrastructure. My sense is that this will actually require a significant shift in how we think about AI—specifically moving from the agent model to something that captures what is good and helpful in the agent model but discards the dualistic view of things.
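To make the hardening-versus-stability contrast concrete, here is a small toy analogy (again my own sketch, not a design anyone here is proposing): a “hardened” core is a single copy of some configuration that simply hopes not to be perturbed, while a “self-stabilizing” core keeps redundant copies and continually repairs itself by majority vote, so it evolves back toward its stable configuration after each perturbation.

```python
import random

TARGET = [1, 0, 1, 1, 0, 0, 1, 0] * 4        # the core's "stable configuration"

def perturb(bits, p=0.01):
    """Flip each bit with small probability p (a perturbation to core infrastructure)."""
    return [b ^ 1 if random.random() < p else b for b in bits]

def majority_repair(copies):
    """Each bit position evolves back to the majority value across the redundant copies."""
    consensus = [int(sum(col) > len(copies) / 2) for col in zip(*copies)]
    return [consensus[:] for _ in copies]

random.seed(0)
hardened = TARGET[:]                          # one copy, no repair loop
stabilizing = [TARGET[:] for _ in range(5)]   # redundant copies plus a repair loop

for _ in range(1000):
    hardened = perturb(hardened)
    stabilizing = majority_repair([perturb(c) for c in stabilizing])

drift = lambda bits: sum(a != b for a, b in zip(bits, TARGET))
print("hardened core drift:        ", drift(hardened))        # damage accumulates
print("self-stabilizing core drift:", drift(stabilizing[0]))  # stays at or near the attractor
```

This is only an analogy at the level of stored bits, of course; the hard part is getting the same kind of attractor dynamics for the AI's actual cognition rather than just its data.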