Yudkowsky’s measure still feels weird to me in ways that don’t seem to apply to length, in the sense that length feels much more to me like a measure of territory-shaped things, and Yudkowsky’s measure of optimization power seems much more map-shaped (which I think Garrett did a good job of explicating). Here’s how I would phrase it:
Yudkowsky wants to measure optimization power relative to a utility function: look at the state you ended up in, count all the possible states that rank at least as high, divide that count by the total number of possible states, and take the negative log of the resulting fraction to get a number of bits. There are two weird things about this measure, in my opinion. The first is that it’s behaviorist (what I think Garrett was getting at about distinguishing between atom and non-atom worlds). The second is that it seems like a tricky problem to coherently talk about “all possible states.”
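In symbols, my reading of the measure is something like the following, with $S$ the set of possible states and $\succeq$ the ranking:

$$\mathrm{OP}(s) \;=\; -\log_2 \frac{\bigl|\{\, t \in S : t \succeq s \,\}\bigr|}{|S|}$$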
So, like, let’s say that we have two buttons next to each other. Press one button and get the world that maxes out your utility function. Press the other and, I don’t know, you get a taco. According to Yudkowsky’s measure, pressing one of these buttons is evidence of vastly more optimization power than the other, even though, intuitively, they seem about “equally hard” from the agent’s perspective.
This is what I mean about it being “behaviorist”—with this measure you only care about which world state obtains (and how well that state ranks), but not how you got to that state. It seems clear to me that both of these are relevant in measuring optimization power. Like, conditioned on certain environments, some things become vastly easier or harder. Getting a taco is easy in Berkeley; getting a taco is hard in a desert. And if your valuation of taco utility doesn’t change, then your optimization power can end up being largely a function of your environment, and that feels… a bit weird?
On the flip side, it’s also weird that it can vary so much based on the utility function. If someone is maximally happy watching TV at home all of the time, I feel hesitant to say that they have a ton of optimization power?
The thing that feels lacking in both of these cases, to me, is the ability to talk about how hard these goals are to achieve in reality (as a function of agent and environment). Because the difficulty of achieving the same world state can vary dramatically based on the environment and the agent. Grabbing a water bottle is trivial if there is one next to me; grabbing one when I first have to assemble it out of a soup at thermodynamic equilibrium is vastly harder. And importantly, the difference here isn’t in my utility function, but in how the environment shapes the difficulty of my goals, and in my ability as an agent to do these different things. I would like to say that the former uses less optimization power than the latter, and that this is in part a function of the territory.
You can perhaps rescue this by using a non-uniform prior over “all possible states,” and talking about how many bits it takes to move from that distribution to the distribution we want. So, like, when I’m in the desert, the state “have a taco” is less likely than when I’m in Berkeley, and therefore it takes more optimization power to get there. But then we run into some other problems.
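Concretely, the rescued version I have in mind would be something like the following, with $p_{\text{default}}(\cdot \mid \text{env})$ the environment-dependent prior over outcomes (which, per the second problem below, is the part doing all the work):

$$\mathrm{OP}(s \mid \text{env}) \;=\; -\log_2 \, p_{\text{default}}\bigl(\{\, t : t \succeq s \,\} \mid \text{env}\bigr)$$

A taco is much less likely under the desert’s default distribution than under Berkeley’s, so the same taco counts for more bits in the desert.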
The first is what Garrett points out, that probabilities are map things, and it’s a bit… weird for our measure of a (presumably) territory thing to be dependent on them. It’s the same sort of trickiness that I don’t feel we’ve properly sorted out in thermodynamics—namely, that if we take the existence of macrostates to be reflections of our uncertainty (as Jaynes does), then it seems we are stuck saying something to the effect of “ice cubes melt because we become more uncertain of their state,” which seems… wrong.
The second is that I claim that figuring out the “default” distribution is basically the entire problem. Like, how do I know that a taco appearing in the desert is less likely than a taco appearing in Berkeley? How do I know that grabbing a bottle is more likely when there is a bottle next to me rather than an equilibrium soup? Constructing the “correct” distribution over default outcomes, to the extent that even makes sense, seems to me to be the entire problem of figuring out what makes some tasks easier or harder, which is close to what we were trying to measure in the first place.
I do expect there is a way to talk about the correct default distribution, but that it’s tricky, and that part of why it’s so tricky is that it’s a function of both map-shaped and territory-shaped things. In any case, I don’t think you get a sensible measure of optimization or other agency terms if you can’t talk about them as things-in-the-territory (which neither of these measures really does); I’d really like to be able to. I also agree that an explanation (or measure) of atoms as Garrett laid out is unsatisfying; I feel unsatisfied here too, for similar reasons.
Small note: Yudkowsky’s definition is about a preference ordering, not a utility function. Indeed, this was half the reason we did the project in the first place!
The first is what Garrett points out, that probabilities are map things, and it’s a bit… weird for our measure of a (presumably) territory thing to be dependent on them. It’s the same sort of trickiness that I don’t feel we’ve properly sorted out in thermodynamics—namely, that if we take the existence of macrostates to be reflections of our uncertainty (as Jaynes does), then it seems we are stuck saying something to the effect of “ice cubes melt because we become more uncertain of their state,” which seems… wrong.
For this part, my answer is Kolmogorov complexity. An ice cube has lower K-complexity than the same amount of liquid water, which is a fact about the territory and not our maps. (And if a state has lower K-complexity, it’s more knowable; you can observe fewer bits, and predict more of the state.)
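As a crude, computable stand-in for that claim (K-complexity itself is uncomputable, so compressed length is only an upper-bound proxy, and the “ice” and “water” here are toy stand-ins: a regular lattice of positions versus uniformly random ones):

```python
import random
import zlib

random.seed(0)

N = 10_000  # number of "molecule positions" in the toy model

# "Ice": positions laid out on a regular lattice -- highly structured,
# so it admits a short description.
ice = bytes((i * 7) % 256 for i in range(N))

# "Water": positions scattered uniformly at random -- little structure
# for a compressor (or a describer) to exploit.
water = bytes(random.randrange(256) for _ in range(N))

# Compressed length is a computable upper bound on Kolmogorov complexity.
print("ice   K-proxy (bytes):", len(zlib.compress(ice, 9)))
print("water K-proxy (bytes):", len(zlib.compress(water, 9)))
```

The ordered “ice” compresses to a small fraction of the size of the random “water,” which is the compressed-length analogue of the claim above.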
One of my ongoing threads is trying to extend this to optimization. I think a system is being objectively optimized if the state’s K-complexity is being reduced. But I’m still working through the math.
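The kind of signal I have in mind, again with compression standing in for K-complexity (an illustration of the idea, not the math I’m still working through): watch the proxy shrink as some process pushes the state toward a more structured configuration.

```python
import random
import zlib

random.seed(1)

# A disordered initial state: 4096 values drawn uniformly at random.
state = [random.randrange(256) for _ in range(4096)]

def k_proxy(values):
    """Compressed length: a computable upper bound on K-complexity."""
    return len(zlib.compress(bytes(values), 9))

# Toy "optimization process": each step imposes order on a larger prefix
# of the state. The proxy should drop as structure accumulates.
for frac in (0.0, 0.25, 0.5, 0.75, 1.0):
    cut = int(len(state) * frac)
    state[:cut] = sorted(state[:cut])
    print(f"ordered fraction {frac:4.2f}  K-proxy = {k_proxy(state)}")
```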
Yeah… so these are reasonable thoughts of the kind that I thought through a bunch when working on this project, and I do think they’re resolvable, but to do so I’d basically be writing out my optimization sequence.
I agree with Alexander below, though: a key part of optimization is that it is not about utility functions, only about a preference ordering. Utility functions are about choosing between lotteries, which is a thing that agents do, whereas optimization is just about going up an ordering. Optimization is a thing that a whole system does, which is why there’s no agent/environment distinction. Sometimes only a part of the system is responsible for the optimization, and in that case you can start to talk about separating them, and then you can ask questions about what that part would do if it were placed in other environments.
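One way to see the gap (a standard fact, not anything specific to this project): take outcomes $\{0, 1, 2\}$ and the utility functions $u_1(x) = x$ and $u_2(x) = x^2$. They induce the same ordering over outcomes, so they describe the same optimization target, but they disagree about lotteries: for a 50/50 gamble between 0 and 2 versus a certain 1, $u_1$ is indifferent ($1$ vs. $1$ in expectation) while $u_2$ prefers the gamble ($2$ vs. $1$). The extra structure a utility function carries beyond the ordering only matters once you start choosing between lotteries.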