mako yass comments on MakoYass’s Shortform

mako yass 17 Aug 2023 18:51 UTC
−8 points
> Can you expand on what you mean by “demonic”?
Since acausal trade issues are basically spiritual, when the trade is bad I seek a word that means “spiritually bad.” You can read it as just “bad” if you want.
So, actual torture is the strongest signal of willingness and ability to torture. Building a torturizer shows capability, but only hints at willingness. Having materials that could build a torturizer or an orgasmatron is a pretty weak signal, but not zero.
Probable crux: Cognitive transparency is actually easy for advanced agencies. It’s difficult for a human to prove to a distant human that they have the means to build and deploy a torturizer without actually doing it. It wouldn’t be difficult for brains that were designed to be capable of proving the state of their beliefs, and AGI participating in a community with other AGI would want to be capable of that. (The contemporary analog is trusted computing. The number of coordination problems it could solve for us, today, if it were fully applied, is actually depressing.)
There would still be uncertainties as a result of mutual comprehensibility issues, but they could turn out to be of negligible importance, especially once nobody’s lying any more.
- lc 17 Aug 2023 19:18 UTC
  4 points
  Parent
  
  > Since acausal trade issues are basically spiritual
  
  Obviously it is not clear why “acausal trade issues” would be “spiritual” or what you mean by those terms.
  - mako yass 17 Aug 2023 19:24 UTC
    −4 points
    Parent
    So what?
- Dagon 18 Aug 2023 0:29 UTC
  2 points
  Parent
  Ah, sorry, I missed the acausal assumption in the post. I generally ignore such explorations, as I don’t think “decision” is the right word without causality and conditional probability.
  I think you’re right that cognitive transparency is a crux. I strongly doubt it’s possible to be mutual, or possible between agents near each other in cognitive power. It may be possible for a hyperintelligence to understand/predict a human-level intelligence, but in that case the human is so outclassed that “trade” is the wrong word, and “manipulation” or “slavery” (or maybe “absorption”) is a better model.
  - mako yass 18 Aug 2023 4:18 UTC
    2 points
    Parent
    You don’t have to be able to simulate something to trust it for this or that. E.g., the specification of AlphaZero is much simpler than the final weights, and knowing its training process, without knowing its weights, you can still trust that it will never, say, take a bribe to throw a match. Even if it comprehended bribery, we know from its spec info that it’s solely interested in winning whatever match it’s currently playing, and no sum would be enough.
    To generalize, if we know something’s utility function, and if we know it had a robust design, even if we know nothing else about its history, we know what it’ll do.
    A promise-keeping capacity is a property utility functions can have.
    - Dagon 18 Aug 2023 5:24 UTC
      2 points
      Parent
      > A promise-keeping capacity is a property utility functions can have.
      Yeah, definitely cruxy. It may be a property that utility functions could have, but it’s not a property that any necessarily do have. Moreover, we have zero examples of robustly designed agents with known utility functions, so it’s extremely unclear whether that will become the norm, let alone the universal assumption.
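
The AlphaZero point in the thread above (knowing an agent's utility function can be enough to predict a choice without simulating the agent) can be illustrated with a toy sketch. Everything here is hypothetical and invented for illustration: `Outcome`, `win_only_utility`, `bribable_utility`, and `accepts_bribe` are not taken from AlphaZero or any real system, and the sketch simply assumes what Dagon disputes, namely that the utility function is known and the agent reliably maximizes it. It also simplifies by treating "play to win" as a guaranteed win.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Outcome:
    wins_match: bool
    payment: float  # side payment (bribe) attached to this outcome


def win_only_utility(outcome: Outcome) -> float:
    # Hypothetical "spec": utility depends only on the match result.
    # The payment field is ignored entirely.
    return 1.0 if outcome.wins_match else 0.0


def bribable_utility(outcome: Outcome) -> float:
    # Contrast case (also hypothetical): payments carry a small weight.
    return (1.0 if outcome.wins_match else 0.0) + 1e-6 * outcome.payment


def accepts_bribe(bribe: float, utility=win_only_utility) -> bool:
    # The agent picks whichever outcome its utility function ranks highest.
    play_to_win = Outcome(wins_match=True, payment=0.0)
    throw_match = Outcome(wins_match=False, payment=bribe)
    return max([play_to_win, throw_match], key=utility) == throw_match


if __name__ == "__main__":
    for bribe in (10.0, 1e6, 1e12):
        print(bribe,
              accepts_bribe(bribe),                    # win-only agent
              accepts_bribe(bribe, bribable_utility))  # contrast agent
```

Under these assumptions the win-only agent refuses every bribe because the payment never enters its utility, while the contrast agent flips once the bribe is large enough. The sketch says nothing about whether real agents will have known utility functions or robust designs, which is the open crux in the thread.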