Since acausal trade issues are basically spiritual, when the trade is bad I seek a word that means “spiritually bad.” You can read it as just “bad” if you want.
So, actual torture is the strongest signal of willingness and ability to torture. Building a torturizer shows capability, but only hints at willingness. Having materials that could build a torturizer or an orgasmatron is a pretty weak signal, but not zero.
Probable crux: Cognitive transparency is actually easy for advanced agencies. It’s difficult for a human to prove to a distant human that they have the means to build and deploy a torturizer without actually doing it. It wouldn’t be difficult for brains that were designed to be capable of proving the state of their beliefs, and AGI participating in a community with other AGI would want to be capable of that. (The contemporary analog is trusted computing. The number of coordination problems it could solve for us, today, if it were fully applied, is actually depressing.)
There would still be uncertainties as a result of mutual comprehensibility issues, but they could turn out to be of negligible importance, especially once nobody’s lying any more.
Ah, sorry, I missed the acausal assumption in the post. I generally ignore such explorations, as I don’t think “decision” is the right word without causality and conditional probability.
I think you’re right that cognitive transparency is a crux. I strongly doubt it can be mutual, or that it’s possible between agents near each other in cognitive power. It may be possible for a hyperintelligence to understand/predict a human-level intelligence, but in that case the human is so outclassed that “trade” is the wrong word, and “manipulation” or “slavery” (or maybe “absorption”) is a better model.
You don’t have to be able to simulate something to trust it for this or that. E.g., the specification of AlphaZero is much simpler than the final weights, and knowing its training process, without knowing its weights, you can still trust that it will never, say, take a bribe to throw a match. Even if it comprehended bribery, we know from its spec info that it’s solely interested in winning whatever match it’s currently playing, and no sum would be enough.
To generalize, if we know something’s utility function, and if we know it had a robust design, even if we know nothing else about its history, we know what it’ll do.
A promise-keeping capacity is a property utility functions can have.
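A toy sketch of the point, in case it helps. Everything here is made up for illustration (the `Outcome` fields, the two utility functions, the action names); it is not AlphaZero's actual objective or code. It assumes an agent that reliably picks whichever available action maximizes a known utility function, which is the sense of "robust design" I mean. Under that assumption you can say what it will do without simulating its internals.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class Outcome:
    won_match: bool      # did the agent win the current match?
    money_gained: float  # side payment received, e.g. a bribe
    promise_kept: bool   # did the agent honor an earlier commitment?


def win_only_utility(o: Outcome) -> float:
    """Hypothetical spec: cares only about winning the current match."""
    return 1.0 if o.won_match else 0.0


def promise_keeping_utility(o: Outcome) -> float:
    """Hypothetical spec: no amount of money offsets a broken promise."""
    if not o.promise_kept:
        return float("-inf")
    return o.money_gained


def choose(utility: Callable[[Outcome], float], options: Dict[str, Outcome]) -> str:
    """Assumed-robust agent: always takes the utility-maximizing action."""
    return max(options, key=lambda name: utility(options[name]))


def bribe_options(bribe: float) -> Dict[str, Outcome]:
    """Play honestly, or throw the match in exchange for `bribe`."""
    return {
        "play_to_win": Outcome(won_match=True, money_gained=0.0, promise_kept=True),
        "throw_match": Outcome(won_match=False, money_gained=bribe, promise_kept=True),
    }


def defection_options(temptation: float) -> Dict[str, Outcome]:
    """Keep a promise, or break it in exchange for `temptation`."""
    return {
        "keep_promise": Outcome(won_match=False, money_gained=0.0, promise_kept=True),
        "break_promise": Outcome(won_match=False, money_gained=temptation, promise_kept=False),
    }


if __name__ == "__main__":
    for amount in (1.0, 1e6, 1e12):
        print(amount,
              choose(win_only_utility, bribe_options(amount)),
              choose(promise_keeping_utility, defection_options(amount)))
```

The size of the bribe never enters `win_only_utility`, so the choice cannot depend on it. The whole argument rests on actually knowing the utility function and knowing that the agent maximizes it.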
Yeah, definitely cruxy. It may be a property that utility functions could have, but it’s not a property that any necessarily do have. Moreover, we have zero examples of robustly designed agents with known utility functions, so it’s extremely unclear whether that will become the norm, let alone the universal assumption.
Obviously it is not clear why “acausal trade issues” would be “spiritual” or what you mean by those terms.
So what?