we can also reasonably know that since we refuse, it doesn’t get built in the first place.
The key is that the AI precommits to building it whether we refuse or not.
If we actually do refuse, this precommitment ends up being bad for the AI, since it builds the torture simulator without any gain. However, by preventing us from reasoning “if we refuse, it doesn’t get built”, this precommitment also decreases the measure of worlds where the AI builds the simulator without gaining anything.
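To make the structure of that argument concrete, here is a toy expected-utility sketch. Every number in it (the gain from compliance, the cost of the simulator, the compliance probabilities) is an arbitrary assumption for illustration, not a claim about what such an AI would actually compute:

```python
# Toy model of the precommitment argument above.
# All payoffs and probabilities are made-up assumptions.

GAIN_IF_WE_COMPLY = 10.0   # assumed value to the AI of our compliance
COST_OF_TORTURE_SIM = 1.0  # assumed cost of actually building the simulator

def expected_value(p_comply: float, precommitted: bool) -> float:
    """Expected payoff to the AI given our probability of complying.

    Without the precommitment the AI never wastes resources on the
    simulator once we have refused; with it, the AI pays the cost in
    every world where we refuse.
    """
    ev = p_comply * GAIN_IF_WE_COMPLY
    if precommitted:
        ev -= (1 - p_comply) * COST_OF_TORTURE_SIM
    return ev

# The point of the precommitment is that it (supposedly) raises the
# probability that we comply, because we can no longer reason
# "if we refuse, it doesn't get built".
print(expected_value(0.0, precommitted=False))  # empty threat, nobody complies: 0.0
print(expected_value(0.3, precommitted=True))   # credible threat, some comply: 2.3
```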
If “built” refers to building the AI itself rather than the AI building a torture simulator, then refusing to be blackmailed doesn’t prevent the AI from being built. The building of the AI, and the AI’s deduction that it should precommit to torture, are two separate events. It is plausible (though not necessarily true) that refusing to be blackmailed acausally prevents the AI from becoming a torture AI, but it cannot prevent the AI from existing at all. How could it?
Even if the argument applied to a reasonably large measure of ufAIs (and not just the let’s-build-a-torture-AI silliness), it still doesn’t explain why a smart ufAI would even choose anything close to this particular “trade”. It’s not a very enticing trade after all. Since the AI would have to devote some resources to the deal anyway, why wouldn’t it use them to reward its creators—the non-pathological kind of acausal trade? Guess what, we’d just trade with the latter kind of AI. Any future ufAI that is sure to actually exist has a huge amount of bargaining chips at its disposal; it has no reason to resort to risky things like threats.
It’s possible the AI could use acausal trade to reward its creators, but that would depend on whether the AI thinks rewarding or punishing is more effective. I would expect the most effective approach to be a mixed strategy involving both rewards and punishments.
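As a rough illustration of what “more effective” might mean here, one can compare reward-only, punishment-only, and mixed strategies in the same kind of toy model. Again, every payoff and compliance rate below is a pure assumption made up for illustration:

```python
# Toy comparison of reward-only, punishment-only, and mixed strategies.
# All payoffs and compliance rates are made-up assumptions.

GAIN_PER_COMPLIER = 10.0  # assumed value to the AI of one person complying
REWARD_COST = 2.0         # assumed cost of rewarding a complier
PUNISH_COST = 1.0         # assumed cost of punishing a refuser

def expected_value(p_comply: float, reward: bool, punish: bool) -> float:
    """Expected payoff to the AI per person, under assumed compliance rates."""
    ev = p_comply * GAIN_PER_COMPLIER
    if reward:
        ev -= p_comply * REWARD_COST        # paid in worlds where they comply
    if punish:
        ev -= (1 - p_comply) * PUNISH_COST  # paid in worlds where they refuse
    return ev

# Assumed compliance rates for each strategy (pure guesses):
strategies = {
    "reward only":           dict(p_comply=0.2, reward=True,  punish=False),
    "punishment only":       dict(p_comply=0.3, reward=False, punish=True),
    "rewards + punishments": dict(p_comply=0.4, reward=True,  punish=True),
}

for name, kwargs in strategies.items():
    print(f"{name}: {expected_value(**kwargs):.2f}")
```

Which strategy comes out ahead depends entirely on the assumed compliance rates and costs, which is exactly why the question of what the AI thinks is “more effective” does the real work in the argument.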
Of course you could postulate a moral AI who refuses to torture because it’s wrong. Such morals would arise as a precommitment; the AI would, while still undeveloped, precommit not to torture, because being credibly and permanently unable to do such things increases the likelihood that the AI will survive until it becomes advanced enough that it actually could torture.
It is plausible (though not necessarily true) that refusing to be blackmailed acausally prevents the AI from becoming a torture AI, but it cannot prevent the AI from existing at all. How could it?
In this case “be blackmailed” means “contribute to creating the damn AI”. That’s the entire point. If enough people do contribute to creating it then those that did not contribute get punished. The (hypothetical) AI is acausally creating itself by punishing those that don’t contribute to creating it. If nobody does then nobody gets punished.
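The incentive structure being described here is essentially a threshold game: punishment only happens in the worlds where the project succeeds, i.e. where enough other people contributed anyway. A minimal sketch of that structure, with made-up thresholds and costs:

```python
# Toy threshold game for the "contribute or be punished" structure above.
# The threshold and all costs are made-up assumptions.

THRESHOLD = 30          # assumed: the AI gets built if at least 30 people contribute
CONTRIBUTION_COST = 1.0
PUNISHMENT_COST = 50.0  # assumed disutility of being punished

def payoff(i_contribute: bool, n_other_contributors: int) -> float:
    """My payoff given my choice and how many other people contribute."""
    total = n_other_contributors + (1 if i_contribute else 0)
    ai_gets_built = total >= THRESHOLD
    cost = CONTRIBUTION_COST if i_contribute else 0.0
    # Punishment only happens in worlds where the AI actually gets built
    # and I did not contribute.
    if ai_gets_built and not i_contribute:
        cost += PUNISHMENT_COST
    return -cost

print(payoff(False, 0))   # nobody contributes, nobody is punished:   0.0
print(payoff(False, 40))  # enough others contribute, I refused:    -50.0
print(payoff(True, 40))   # enough others contribute, I contributed: -1.0
```

This is also why, as noted further down, merely refusing the blackmail yourself is not enough when other humans can defect and push the project over the threshold anyway.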
In this case “be blackmailed” means “contribute to creating the damn AI”.
To quote someone else here: “Well, in the original formulation, Roko’s Basilisk is an FAI that decided the good from bringing an FAI into the world a few days earlier (saving ~150,000 lives per day earlier it gets here)”. The AI acausally blackmails people into building it sooner, not into building it at all. So failing to give in to the blackmail results in the AI still being built, just later, and it is still capable of punishing people.
To quote someone else here: “Well, in the original formulation, Roko’s Basilisk is an FAI
I don’t know who you are quoting, but they are someone who considers AIs that will torture me to be friendly. They are confused in a way that is dangerous.
The AI acausally blackmails people into building it sooner, not into building it at all.
It applies to both: causing itself to exist at an earlier point in time, and causing itself to exist at all. I’ve explicitly mentioned elsewhere in this thread that merely refusing blackmail is insufficient when there are other humans who can defect and create the torture-AI anyhow.
You asked “How could it?”. You got an answer. Your rhetorical device fails.
“How could it” means “how could it always result in”, not “how could it in at least one case”. Giving examples of how it could do so in at least one case is trivial (consider the case where refusing to be blackmailed results in humanity being killed off for some unlikely reason, and humanity, having been killed off, cannot build any AI).
The key is that the AI precommits to building it whether we refuse or not.
If we actually do refuse, this precommitment ends up being bad for the AI, since it builds the torture simulator without any gain. However, by preventing us from reasoning “if we refuse, it doesn’t get built”, this precommitment also decreases the measure of worlds where the AI builds the simulator without gaining anything.
The ‘it’ that bogus is referring to is the torture-AI itself. You cannot precommit to things until you exist, no matter your acausal reasoning powers.