I know this isn’t quite rigorous, but if I can calculate the counterfactual “what would the other player’s strategy be if ze did not model me as an agent capable of responding to incentives,” blackmail seems easy to identify by comparison to this.
Perhaps this can be what we mean by ‘default’?
I think this ties into Larks’ point—if Larks didn’t think I responded to incentives, I think ze’d just help the child, so asking me $1,000 would be blackmail. Clippy would not help the child either way, and so asking me $1,000 to save it is trade.
To first order, this means that folks playing decision-theoretic games against me actually have an incentive to self-modify to be all-else-equal sadistic, so that their threats can look like offers. But then I can assume that they would not have so modified in the first place if they hadn’t modelled me as responding to incentives, etc. etc.
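To make the comparison concrete, here is a minimal sketch of the counterfactual test described above, assuming we can somehow estimate the other agent's "default" behavior in the counterfactual where it does not model me as responding to incentives. The function name, outcome values, and the Larks/Clippy numbers are hypothetical illustrations, not anything established in this thread.

```python
def classify_demand(outcome_if_i_refuse, counterfactual_default_outcome):
    """Compare the outcome of refusing the demand with the 'default' outcome:
    what the other agent would bring about if it did not model me as an agent
    capable of responding to incentives."""
    if outcome_if_i_refuse < counterfactual_default_outcome:
        # Refusing leaves me worse off than the default, so the demand
        # manufactures a harm that only exists because I respond to incentives.
        return "blackmail"
    # Refusing leaves me no worse off than the default, so the demand merely
    # offers an improvement over it.
    return "trade"

# Larks would help the child anyway (default outcome: child saved = 1),
# so "pay me $1,000 or the child drowns" (refusal outcome: 0) is blackmail.
print(classify_demand(outcome_if_i_refuse=0, counterfactual_default_outcome=1))

# Clippy would never help the child (default outcome: 0),
# so "pay me $1,000 and I'll save the child" (refusal outcome: 0) is trade.
print(classify_demand(outcome_if_i_refuse=0, counterfactual_default_outcome=0))
```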
“an agent incapable of responding to incentives” is not a well-defined agent. What do you respond to? A random number generator? Subliminal messages? Pie?
I respond to pie. Are you offering pie?
Should you find yourself in the greater Boston area, drop me a line and I will give you some pie.
(I suspect that there is a context to this comment, and I might even find it interesting if I were to look it up, but I’m sort of enjoying the comment in isolation. Hopefully it isn’t profoundly embarrassing or anything.)
Can I take you up on that as well? You can never have too much pie.
Well, you’re certainly free to drop me a line if you’re in the area, but I’m far less likely to respond, let alone respond with pie.
Which option counts for you as “not responding”, the “default”? Maybe you give away $1,000 by default, and since that leads to the better-valued outcome of children not drowning, it looks more like “least effort”. How do you measure effort?