From what I understand, Yudkowsky therefore doesn’t classify the original request (“give me half your wheat or die”) as a threat.
This seems like it weakens the “don’t give in to threats” policy substantially, because it makes it much harder to tell what’s a threat-in-the-technical-sense, and the incentives push toward exaggeration and dishonesty about what is or isn’t a threat-in-the-technical-sense.
The bandits should always act as if they’re willing to kill the farmers and take their stuff, even if they’re bluffing about their willingness to do violence. The farmers need to estimate whether the bandits are bluffing, and either call the bluff, or submit to the demand-which-is-not-technically-a-threat.
That policy has notably more complexity than just “don’t give in to threats.”
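For concreteness, here’s a minimal sketch of the farmer’s side of that estimate as a plain expected-value calculation. All the names and payoff numbers (`WHEAT`, `DEATH_COST`, the probabilities) are made up for illustration, nothing from the original exchange:

```python
# Hypothetical payoffs for the farmer facing the
# demand-which-is-not-technically-a-threat. The farmer estimates p_bluff,
# the probability the bandits won't actually do violence, then compares
# the expected value of submitting vs. calling the bluff.

WHEAT = 100          # value of the full harvest, in arbitrary units
DEATH_COST = 10_000  # cost the farmer assigns to being killed

def expected_value(action: str, p_bluff: float) -> float:
    """Expected payoff to the farmer for a given action."""
    if action == "submit":
        return WHEAT / 2  # hand over half the wheat, no violence either way
    # "call": keep everything if the bandits are bluffing; otherwise lose
    # both life and wheat.
    return p_bluff * WHEAT + (1 - p_bluff) * -(DEATH_COST + WHEAT)

def farmer_policy(p_bluff: float) -> str:
    """Pick whichever action has the higher expected value."""
    return max(("submit", "call"), key=lambda a: expected_value(a, p_bluff))

# With payoffs this lopsided, the farmer only calls near-certain bluffs:
for p in (0.5, 0.99, 0.999):
    print(f"p_bluff={p}: {farmer_policy(p)}")
```

With these numbers the farmer submits unless nearly certain of a bluff, which is exactly the estimation problem the simple policy was supposed to spare them.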
What is the “don’t give in to threats” policy that this is more complex than? In particular, what are ‘threats’?
“Anytime someone credibly demands that you do X, otherwise they’ll do Y to you, you should not do X.” This is a simple reading of the “don’t give in to threats” policy.
What are the semantics of “otherwise”? Are they more like:
X otherwise Y ↦ X → ¬Y, or
X otherwise Y ↦ X ↔ ¬Y?
Presumably you also want the policy to include that you don’t want “Y” and weren’t going to do “X” anyway?
Yes to the first part; probably yes to the second part.
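Spelling those two readings out, a sketch in Lean with hypothetical names (the first reading being the one endorsed above):

```lean
-- Two candidate readings of "do X, otherwise Y" from the question above.

-- Reading 1 (endorsed above): complying with X rules out Y; refusing X
-- promises nothing either way.
def otherwiseImpl (X Y : Prop) : Prop := X → ¬Y

-- Reading 2: Y happens exactly when X is refused.
def otherwiseIff (X Y : Prop) : Prop := X ↔ ¬Y

-- Reading 2 is strictly stronger: it implies reading 1, but not conversely.
theorem iff_implies_impl (X Y : Prop) :
    otherwiseIff X Y → otherwiseImpl X Y :=
  fun h => h.mp
```

Under the weaker implication reading, the demander promises nothing to those who refuse, which is part of what makes bluffing viable in the bandit example.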
With a grain of salt,
There’s a sort of quiet assumption about the dath ilan fiction that should be louder: it’s set in a world where a bunch of theorems like “as systems of agents get sufficiently intelligent, they gain the ability to coordinate in prisoner’s-dilemma-like problems” have proofs. You could similarly write fiction set in a world where P=NP has a proof and all of cryptography collapses. I’m not sure whether EY would guess that sufficiently intelligent agents actually coordinate, just as I could write the P=NP fiction while being pretty sure that P≠NP.