Oops, looks like I was wrong about what you meant (ignore the edit). But yes, if you give a stupid thing lots of power you should expect bad outcomes. A car directed with zero intelligence is not a car sitting still, but precisely what you said was dangerous: a car having its controls blindly fiddled with. But if you just run a stupid program on a computer, it will never acquire power in the first place. Most decisions are neutral, unless they just happen to be plugged into something that has already been optimized to have large physical effects (like a bulldozer). Of those decisions that do have large effects, most will be destructive, but that’s exactly what we should expect from a stupid optimization process acting on something that has already been finely honed by a smart optimization process.
what does “could” mean?
Good question. I think it has something to do with simply defining some set of actions to be your “options”, and temporarily putting all your options on an equal footing, so that you end up with the one with the best consequences, rather than the one you seemed most likely to choose. I don’t think it even has much to do with probabilities, because then you run into self-fulfilling prophecies—doing what you predicted you’d do, thereby justifying the prediction.
In this case, we want to measure how well an agent did, relative to how it could have done. That is, how good the consequences of the option it chose were, relative to its other options. I don’t see any reason to weight those options according to a probability distribution, unless you know what “half an option” means. And choosing a distribution poses huge problems: after all, we know the agent chose one of the options with probability 1.0, and all the others with probability 0.0.
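To make that concrete, here’s a minimal sketch of the kind of measure I have in mind. All the names here are made up for illustration: the agent’s choice is scored by where it falls among its options, with every option on an equal footing and no probability weighting over counterfactuals.

```python
# Hypothetical sketch: score an agent's choice by ranking it among
# its options, each option weighted equally (no probability
# distribution over what the agent "might" have done).

def optimization_score(options, utility, chosen):
    """Fraction of options whose consequences are no better than the
    chosen option's; 1.0 means the agent picked the best option."""
    chosen_utility = utility(chosen)
    no_better = sum(1 for o in options if utility(o) <= chosen_utility)
    return no_better / len(options)

# Toy example: options are numbers, utility is the number itself.
options = [1, 5, 3, 2]
print(optimization_score(options, lambda x: x, chosen=5))  # -> 1.0
print(optimization_score(options, lambda x: x, chosen=3))  # -> 0.75
```

The point of the sketch is just that the score depends only on which set you call the “options” and on the utility of each one, not on any guess about which option the agent was likely to pick.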
Forest fires are definitely OPs under my intuitive concept. They consistently select a subset of possible futures (burnt forests).
Well, you could just compare the rate of oxidation under a flame to the average rate of oxidation of all surfaces (including those that happen to be on fire) within whichever reference class you prefer. (I think choosing a reference class (set of options) is just part of how you define the OP. And you just define the OP whichever way helps you understand the world best.)
Thanks for all your comments!
Is this actually helpful? I try to read up on the background for this stuff, but I never know if I’m just rehashing what’s already been discussed, and if so, whether reviewing that here would be useful to anyone.