From what I understand, Yudkowsky therefore doesn’t classify the original request (“give me half your wheat or die”) as a threat.
This seems like it weakens the “don’t give in to threats” policy substantially, because it makes it much harder to tell what’s a threat-in-the-technical-sense, and the incentives push toward exaggeration and dishonesty about what is or isn’t a threat-in-the-technical-sense.
The bandits should always act as if they’re willing to kill the farmers and take their stuff, even if they’re bluffing about their willingness to do violence. The farmers need to estimate whether the bandits are bluffing, and either call the bluff, or submit to the demand-which-is-not-technically-a-threat.
That policy has notably more complexity than just “don’t give in to threats.”
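For concreteness, here’s a minimal sketch of the farmer’s side of that estimate as a plain expected-value calculation. All the names and payoff numbers (`WHEAT`, `DEATH_COST`, the probabilities) are made up for illustration, nothing from the original exchange:

```python
# Hypothetical payoffs for the farmer facing the
# demand-which-is-not-technically-a-threat. The farmer estimates p_bluff,
# the probability the bandits won't actually do violence, then compares
# the expected value of submitting vs. calling the bluff.

WHEAT = 100          # value of the full harvest, in arbitrary units
DEATH_COST = 10_000  # cost the farmer assigns to being killed

def expected_value(action: str, p_bluff: float) -> float:
    """Expected payoff to the farmer for a given action."""
    if action == "submit":
        return WHEAT / 2  # hand over half the wheat, no violence either way
    # "call": keep everything if the bandits are bluffing; otherwise lose
    # both life and wheat.
    return p_bluff * WHEAT + (1 - p_bluff) * -(DEATH_COST + WHEAT)

def farmer_policy(p_bluff: float) -> str:
    """Pick whichever action has the higher expected value."""
    return max(("submit", "call"), key=lambda a: expected_value(a, p_bluff))

# With payoffs this lopsided, the farmer only calls near-certain bluffs:
for p in (0.5, 0.99, 0.999):
    print(f"p_bluff={p}: {farmer_policy(p)}")
```

With these numbers the farmer submits unless nearly certain of a bluff, which is exactly the estimation problem the simple policy was supposed to spare them.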
What is the “don’t give in to threats” policy that this is more complex than? In particular, what are ‘threats’?
“Anytime someone credibly demands that you do X, otherwise they’ll do Y to you, you should not do X.” This is a simple reading of the “don’t give in to threats” policy.
What are the semantics of “otherwise”? Are they more like:
X otherwise Y ↦ X → ¬Y, or
X otherwise Y ↦ X ↔ ¬Y?
Presumably you also want the policy to include that you don’t want “Y” and weren’t going to do “X” anyway?
Yes to the first part; probably yes to the second part.
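Spelling those two readings out, a sketch in Lean with hypothetical names (the first reading being the one endorsed above):

```lean
-- Two candidate readings of "do X, otherwise Y" from the question above.

-- Reading 1 (endorsed above): complying with X rules out Y; refusing X
-- promises nothing either way.
def otherwiseImpl (X Y : Prop) : Prop := X → ¬Y

-- Reading 2: Y happens exactly when X is refused.
def otherwiseIff (X Y : Prop) : Prop := X ↔ ¬Y

-- Reading 2 is strictly stronger: it implies reading 1, but not conversely.
theorem iff_implies_impl (X Y : Prop) :
    otherwiseIff X Y → otherwiseImpl X Y :=
  fun h => h.mp
```

Under the weaker implication reading, the demander promises nothing to those who refuse, which is part of what makes bluffing viable in the bandit example.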
With a grain of salt,
There’s a sort of quiet assumption about the dath ilan fiction that should be louder: it’s set in a world where a bunch of theorems like “as systems of agents get sufficiently intelligent, they gain the ability to coordinate in prisoner’s-dilemma-like problems” have proofs. You could similarly write fiction set in a world where P=NP has a proof and all of cryptography collapses. I’m not sure whether EY would guess that sufficiently intelligent agents actually coordinate, just as I could write the P=NP fiction while being pretty sure that P≠NP.