Yes, the causality runs from the decision process to the reward. The decision process may or may not be known to the agent, but its preferences are known to it (data can be read, whereas code can only be read if introspection is available).
You can, and should, self-modify to prefer acting in ways such that you benefit from others predicting you would act that way. This yields one-boxing behavior in Newcomb's problem, and it is still CDT/EDT (which, as shown, are really equivalent).
Yes, you could implement this behavior in the decision algorithm itself, and yes, the two are very much isomorphic. Evolution's way of implementing better cooperation, though, has been to instill moral preferences; that feels like the more natural design.
I suggest that what was ‘shown’ was that you do not understand the difference between CDT and EDT.
That’s certainly possible, it’s also possible that you do not understand the argument.
To make things absolutely clear, I'm relying on the following definition of EDT:
A policy that picks the action a* = argmax over i of Σ_j P( Wj | W, ai ) · U( Wj ), where {ai} are the possible actions, W is the current state of the world, P( W' | W, a ) is the probability of moving to state of the world W' after doing a, and U is the utility function.
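The definition above can be sketched directly in code: enumerate the actions, score each by its expected utility under the conditional world model, and take the argmax. This is a minimal illustration, not anyone's canonical implementation; the states, actions, and probability table below are all invented for the example.

```python
# Minimal sketch of the EDT policy: pick argmax_a of sum_j P(Wj | W, a) * U(Wj).
# All names and numbers here are hypothetical, chosen only to make it runnable.

def edt_action(actions, states, p, u, w):
    """Return the action maximizing expected utility over successor states."""
    def expected_utility(a):
        return sum(p(wj, w, a) * u(wj) for wj in states)
    return max(actions, key=expected_utility)

# Toy world: two actions, two successor states.
states = ["good", "bad"]
actions = ["safe", "risky"]

def p(wj, w, a):
    # P(next state | current state, action); hypothetical numbers,
    # independent of the current state w for simplicity.
    table = {
        "safe":  {"good": 0.9, "bad": 0.1},
        "risky": {"good": 0.5, "bad": 0.5},
    }
    return table[a][wj]

def u(wj):
    # Utility of each successor state.
    return {"good": 10.0, "bad": 0.0}[wj]

print(edt_action(actions, states, p, u, w="start"))  # prints "safe": 0.9*10 > 0.5*10
```

Note that nothing in this loop inspects causal structure; the action is chosen purely from the conditional probabilities P( Wj | W, ai ), which is the point of contention in the surrounding discussion.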
I believe the argument I made in the case of Solomon's problem is the clearest and strongest; would you care to rebut it?
I've challenged you to clarify the mechanism by which someone with a cancer gene would decide to chew gum, and you haven't answered this properly.
If your decision algorithm is EDT, the only free variables that determine your decisions are your preferences and your sensory input.
The only way the gene can cause you to chew gum in any meaningful sense is by making you prefer to chew gum.
An EDT agent has knowledge of its own preferences. Therefore, an EDT agent already knows whether it falls into the "likely to get cancer" population.
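The screening-off argument above can be checked numerically in a toy model: let a gene raise both cancer risk and the preference for gum, and let chewing depend only on that preference. Conditioned on the preference, chewing then carries no further evidence about cancer. All the probabilities below are invented purely for illustration.

```python
# Toy numeric model of Solomon's problem: gene -> preference -> chewing,
# and gene -> cancer (not via chewing). Probabilities are hypothetical.
from itertools import product

P_G = 0.5                                      # prior on having the gene
P_PREF_GIVEN_G = {True: 0.9, False: 0.1}       # gene raises gum preference
P_CHEW_GIVEN_PREF = {True: 0.8, False: 0.2}    # chewing depends only on preference
P_CANCER_GIVEN_G = {True: 0.9, False: 0.1}     # cancer depends only on the gene

def joint(g, pref, chew, cancer):
    """Joint probability of one full world, following the causal chain above."""
    p = P_G if g else 1 - P_G
    p *= P_PREF_GIVEN_G[g] if pref else 1 - P_PREF_GIVEN_G[g]
    p *= P_CHEW_GIVEN_PREF[pref] if chew else 1 - P_CHEW_GIVEN_PREF[pref]
    p *= P_CANCER_GIVEN_G[g] if cancer else 1 - P_CANCER_GIVEN_G[g]
    return p

def p_cancer(**evidence):
    """P(cancer | evidence) by brute-force enumeration of all worlds."""
    num = den = 0.0
    for g, pref, chew, cancer in product([True, False], repeat=4):
        world = {"g": g, "pref": pref, "chew": chew, "cancer": cancer}
        if all(world[k] == v for k, v in evidence.items()):
            w = joint(g, pref, chew, cancer)
            den += w
            if cancer:
                num += w
    return num / den

# Ignoring its own preference, an agent would read chewing as bad news:
print(p_cancer(chew=True), p_cancer(chew=False))
# But once the preference is conditioned on, chewing adds no evidence at all:
print(p_cancer(pref=True, chew=True), p_cancer(pref=True, chew=False))
```

In this model the naive conditional P( cancer | chew ) exceeds P( cancer | not chew ), but P( cancer | pref, chew ) equals P( cancer | pref ) exactly, which is the sense in which an EDT agent that knows its own preferences already knows which population it is in.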
The combination:
Uncontroversial understanding by academic orthodoxy
The general position of those on LessWrong
My parsing of your post
Observation of your attempts to back up your argument when it was not found persuasive by me or others
… is sufficient to give rather high confidence levels. It really is a huge claim you are making, to dismiss the understanding of basically the rest of the world regarding how CDT and EDT apply to the trivial toy problems that were designed to test them.
There is altogether too much deduction of causal mechanisms involved in your “EDT” reasoning. And the deductions involved rely on a premise (the second dot point) that just isn’t a part of either the problem or ‘genes’.
I'm making a simple, logical argument; if it's wrong, it should be trivial to debunk. You're relying on an outside view to judge, which is pretty weak evidence.
As I've clearly said, I'm entirely aware that I'm making a rather controversial claim. I never bother to post on LessWrong, so I'm clearly not whoring for attention or anything like that. Look at it this way: in order to present my point despite it being so unorthodox, I have to be pretty damn sure it's solid.
The second dot point is part of the problem description. You're saying it's irrelevant, but you can't just parachute in a payoff matrix in which causality goes backward in time.
Find any example you like: as long as it's physically possible, the payoff will be tied either to your decision algorithm (Newcomb's) or to your preference set (Solomon's).