But now every system is an optimizing system, because we can always come up with some preference ordering that explains a system as an optimizing system.
Hmmm, I’m a little uncertain about whether this is the case. E.g. suppose you have a box with a rock in it, in an otherwise empty universe. Nothing happens. You perturb the system by moving the rock outside the box. Nothing else happens in response. How would you describe this as an optimising system? (I’m assuming that we’re ruling out the trivial case of a constant utility function; if not, we should analogously include the trivial case of all states being target states).
As a more general comment: I suspect that what starts to happen after you start digging into what “perturbation” means, and what counts as a small or big perturbation, is that you run into the problem that a *tiny* perturbation can transform a highly optimising system to a non-optimising system (e.g. flicking the switch to turn off the AGI). In order to quantify size of perturbations in an interesting way, you need the pre-existing concept of which subsystems are doing the optimisation.
My preferred solution to this is just to stop trying to define optimisation in terms of *outcomes*, and start defining it in terms of *computation* done by systems. E.g. a first attempt might be: an agent is an optimiser if it does planning via abstraction towards some goal. Then we can zoom in on what all these words mean, or what else we might need to include/exclude (in this case, we’ve ruled out evolution, so we probably need to broaden it). The broad philosophy here is that it’s better to be vaguely right than precisely wrong. Unfortunately I haven’t written much about this approach publicly—I briefly defend it in a comment thread on this post though.
FYI: I think something got messed up with this link. The text of the link is a valid url, but it links to a mangled one (s.t. if you click it you get a 404 error).
suppose you have a box with a rock in it, in an otherwise empty universe [...]
Yes you’re right, this system would be described by a constant utility function, and yes this is analogous to the case where the target configuration set contains all configurations, and yes this should not be considered optimization. In the target set formulation, we can measure the degree of optimization by the size of the target set relative to the size of the basin of attraction. In your rock example, the sets have the same size, so it would make sense to say that the degree of optimization is zero.
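To make the counting concrete, here is a minimal sketch of that measure (my own toy formalization, with made-up set sizes): treat the degree of optimization as the number of bits by which the target set narrows the basin of attraction, i.e. the log of the ratio of the two sizes, so that equal sizes give exactly zero.

```python
import math

def degree_of_optimization(basin_size: int, target_size: int) -> float:
    """Bits of optimization: how strongly the basin of attraction is
    narrowed down into the target set (a toy formalization)."""
    assert 0 < target_size <= basin_size
    return math.log2(basin_size / target_size)

# Rock-in-a-box: the target set and the basin coincide, so the ratio is 1
# and the degree of optimization is 0 bits (the sizes here are made up).
print(degree_of_optimization(basin_size=1_000_000, target_size=1_000_000))  # 0.0

# A system that reliably steers a million-configuration basin into a single
# target configuration would score roughly 20 bits.
print(degree_of_optimization(basin_size=1_000_000, target_size=1))  # ~19.93
```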
This discussion is updating me in the direction that a preference ordering formulation is possible, but that we need some analogue of “degree of optimization” that captures how “tight” or “constrained” the system’s evolution is relative to the size of the basin of attraction. We need a way to say that a constant utility function corresponds to a degree of optimization equal to zero. We also need a way to handle the case where our utility function assigns utility proportional to entropy, in which case we could again describe every physical system as an optimizing system, and thermodynamics would ensure that we are correct. This utility function would be extremely flat and wide, with most configurations receiving near-identical utility (since the high-entropy configurations constitute the vast majority of all possible configurations). I’m sure there is some way to quantify this—do you know of any appropriate measure?
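To gesture at the kind of measure I have in mind (a toy sketch over a made-up finite configuration space, not a settled proposal): rank configurations by utility and take minus the log of the fraction of configurations that score at least as well as the outcome the system actually reaches. A constant utility function then gives zero by construction, and a nearly flat, entropy-proportional utility function gives close to zero for typical high-entropy outcomes.

```python
import math
from typing import Callable, Sequence

def optimization_bits(configs: Sequence, utility: Callable, outcome) -> float:
    """-log2 of the fraction of configurations scoring at least as well as
    the realized outcome (a toy measure over a finite configuration space)."""
    as_good_or_better = sum(1 for c in configs if utility(c) >= utility(outcome))
    return -math.log2(as_good_or_better / len(configs))

configs = list(range(1000))  # stand-in configuration space

# Constant utility: every configuration ties with the outcome, so the
# fraction is 1 and the measure is 0 bits, as required.
print(optimization_bits(configs, lambda c: 0.0, 42))             # 0.0

# Sharply peaked utility: reaching the unique best of 1000 configurations
# counts for about 10 bits.
print(optimization_bits(configs, lambda c: -abs(c - 700), 700))  # ~9.97
```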
The challenge here is that in order to actually deal with the case you mentioned originally—the goal of moving as fast as possible—we need a measure that is not based on the size or curvature of some local maximum of the utility function. If we are working with local maxima then we are really still working with systems that evolve towards a specific destination (although there may still be advantages to thinking this way rather than in terms of a binary target set).
My preferred solution to this is just to stop trying to define optimisation in terms of outcomes, and start defining it in terms of computation done by systems
Nice—I’d love to hear more about this
FYI: I think something got messed up with this link. The text of the link is a valid url, but it links to a mangled one (s.t. if you click it you get a 404 error).
That’s weird; thanks for the catch. Fixed.