Of course people don’t necessarily shoot themselves merely because they consider doing so. I don’t see what that has to do with the issue, though. I expect some kind of misunderstanding has arisen somewhere.
What it has to do with the issue is: the procedure Eliezer_Yudkowsky was criticizing would permit self-justifying actions such as shooting yourself merely because you considered doing so, which was why he pointed out the correct constraint an agent should follow.
Er, no it wouldn’t.

What I said was that if agents already knew what they were going to do, they would get on and do it—rather than doing more calculations about the issue.
That simply doesn’t imply they will commit suicide if someone tells them they will do that—because they don’t have to believe what people tell them.
I think people here are using a very weak definition of “know”. If I know that I will X, then I will X. If it later turns out that I didn’t X, then I did not actually know that I will X. That I will X is logically implied by anyone knowing that I will X.
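One way to put that formally: it is just the factivity axiom of epistemic logic (knowledge implies truth). Writing K for "it is known that" and p for "I will X":

    K(p) → p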
I’m not sure how anyone could achieve actual knowledge of one’s future actions, but I agree that there’s little reason to deliberate over them if one does.
Often you know what your actions are going to be a split-second before you take them: after you have decided what to do, but before the motor signals go out. That’s the type of advance knowledge I was considering.
The information isn’t certain—but then NO information is certain.
Maybe.

It doesn’t imply that agents will kill themselves when you tell them they are going to; it implies that they can, if your telling them is the last scrap of Bayesian evidence needed to move them to act that way. EY’s point is that agents have to figure out what maximizes utility, not predict what they will do, because the self-reference causes problems.
E.g., we don’t want a calculator that outputs “whatever I output for 2+2”; we want a calculator that outputs the answer to 2+2. The former is true no matter what the calculator outputs; the latter has a single answer. Similarly, there is only one action which maximizes utility (or at least a subset of all possible actions). But if an agent takes the action that it predicts it will take, its predictions are true by definition, so any action suffices.
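For concreteness, here is a minimal Python sketch of that contrast; the action names and utility numbers are invented for illustration. The maximizing procedure has exactly one correct output, while the self-predicting procedure counts any predicted action as correct.

    # Toy illustration: a made-up action set and utility function.
    ACTIONS = ["shoot_self", "eat_lunch", "go_to_work"]
    UTILITY = {"shoot_self": -1000, "eat_lunch": 5, "go_to_work": 10}

    def maximizing_agent():
        # "Figure out what maximizes utility": a unique answer here.
        return max(ACTIONS, key=UTILITY.get)

    def self_predicting_agent(predicted_action):
        # "Take whatever action I predict I will take": the prediction is
        # self-fulfilling, so every action is an equally "correct" answer.
        return predicted_action

    print(maximizing_agent())                   # go_to_work
    print(self_predicting_agent("shoot_self"))  # shoot_self, equally self-consistent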
I think real agents act as though they believe they have free will.
That means they treat their own decisions as determining their actions, and treat other people’s claims about how they are going to act as attempts to manipulate them. Another agent encouraging you to behave in a particular way isn’t usually evidence you should update on; it’s a manipulation attempt, and agents are smart enough to know the difference.
Are there circumstances under which you should update on such evidence? Yes, if the agent is judged to be both knowledgeable and trustworthy—but that is equally true if you employ practically any sensible decision process.
Re: if an agent takes the action that it predicts it will take, its predictions are true by definition, so any action suffices.
Agents do take the actions they predict they will take—it seems like a matter of fact to me. However, that’s not the criterion they use as the basis for making their predictions in the first place. I didn’t ever claim it was—such a claim would be very silly.
Indeed. You originally wrote:
The agent doesn’t know what action it is going to take. If it did, it would just take the action—not spend time calculating the consequences of its various possible actions.
Your language is somewhat vague here, which is why EY clarified.