The basilisk gets more compliance from the believers when he puts the innocents into heaven than when he puts them into hell.
Also, the debate is not about a UFAI but about an FAI that optimizes the utility function of general welfare with TDT.
This is also the point where you might think about how Eliezer’s censorship had an effect. His censoring led you and Viliam_Bur to an understanding of the issue where you think it’s about a UFAI.
The basilisk gets more compliance from the believers when he puts the innocents into heaven than when he puts them into hell.
This is at best not clear. It depends on the specific nature of the insanity in the compliant. Note that brutally disincentivizing evangelism has… instrumental downsides.
the believers
Don’t be misled by the loose relationship with Pascal’s Wager. This isn’t about belief, it is about decisions (and counterfactual decisions).
Also, the debate is not about a UFAI but
The use of the term uFAI is deliberate, and correct. We don’t need to define a torture-terrorist as Friendly just because of some sloppy utilitarian reasoning. Moreover, any actual risk from the scenario comes from AGI creators (or influencers) that make this assumption. That’s the only thing that can cause the torture to happen.
His censoring led you and Viliam_Bur to an understanding of the issue where you think it’s about a UFAI.
You are overconfident in your mind-reading skills. I was one of the few people familiar enough with the subject matter at the time Roko was writing his (typically fascinating) posts that I immediately categorised the agent as a plausible not-Friendly AGI and the scenario as an interesting twist on acausal extortion, and then went straight to thinking about the actual content of the post, which was about a new means of cooperation.
Roko’s post explicitly mentioned trading with unfriendly AIs.
Yeah, the horror lies in the idea that it might be morally CORRECT for an FAI to engage in eternal torture of some people.
There is this problem with human psychology: threatening someone with torture doesn’t contribute to their better judgement.
If threatening someone with eternal torture magically raised their intelligence over 9000, gave them the ability to develop a correct theory of Friendliness, and reliably made them build a Friendly AI in five years… then yes, under these assumptions, threatening people with eternal torture could be the morally correct thing to do.
But human psychology doesn’t work this way. If you start threatening people with torture, they are more likely to make mistakes in their reasoning. See: motivated reasoning, “ugh” fields, etc.
Therefore, the hypothetical AI threatening people with torture for… well, pretty much for not being perfectly epistemically and instrumentally rational… would decrease the probability of Friendly AI being built correctly. Therefore, I don’t consider this hypothetical AI to be Friendly.
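Viliam_Bur’s argument has a simple expected-value shape, and a minimal sketch may make it concrete. All probabilities and payoffs below are purely hypothetical placeholders (nothing in the thread supplies real numbers): if threats of torture worsen the builders’ reasoning, the chance of the FAI being built correctly drops, and the torture itself carries disvalue, so the threatening policy can come out strictly worse for general welfare.

```python
# Toy expected-value comparison for the argument above.
# All numbers are hypothetical placeholders, not claims from the thread.

V_FRIENDLY = 1.0     # normalised value of a correctly built Friendly AI
V_FAILURE = 0.0      # value if the project fails
TORTURE_COST = 0.2   # disvalue of carrying out (or credibly threatening) the torture

# Hypothetical chances that the builders get Friendliness right:
P_CORRECT_NO_THREAT = 0.10  # baseline
P_CORRECT_THREAT = 0.06     # lower, because threats induce motivated reasoning / "ugh" fields


def expected_value(p_correct: float, torture_cost: float = 0.0) -> float:
    """Expected welfare of a policy, given the chance the FAI is built correctly."""
    return p_correct * V_FRIENDLY + (1.0 - p_correct) * V_FAILURE - torture_cost


print(f"no threat: {expected_value(P_CORRECT_NO_THREAT):+.3f}")             # +0.100
print(f"threat:    {expected_value(P_CORRECT_THREAT, TORTURE_COST):+.3f}")  # -0.140
```

The numbers only illustrate the structure of the comparison: any policy that both lowers the probability of a correct Friendly AI and adds suffering is not what a welfare-optimizing agent would choose.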
[removed]