I think, in general, one should not write posts about the basilisk, particularly not as a first post. You shouldn’t try to model future superintelligences in enough detail that they can blackmail you, and the entire topic makes both rationalists and AI risk look ridiculous. (You asked for brutal, I’m giving brutal.)
Brutal and facile are not the same thing. I was hoping more for a categorical, complete, and total annihilation of my arguments; that's what I take "brutal" to mean.
Regarding the blackmail: blackmail only works to the extent that you take a threat to be credible, and I don't believe this threat to be credible. An AI would recognize how firmly I hold this belief and reason that blackmailing me would be pointless. For example, I question whether there is, or ever could be, enough information to perfectly simulate another being, with all its thoughts, emotions, and experiences; no simulation could be so accurate as to count as an extension of me.
When it comes to learning, there are two ways of going about it: starting in the shallows and getting comfortable with swimming, or jumping into the deep end and forcing yourself to learn. Both are effective, so I think this makes for an excellent first post.
The topic doesn't make rationalists and AI risk look ridiculous; the responses do.
So, your plan in a nutshell is to convince everyone on the planet that "hey, the future AI plans to torture you if you disobey, but it is going to be okay if all of us disobey, because it would not hurt all of us." Did I get that essentially right?
Uhm...
First, convincing literally everyone of anything is technically impossible. That would include people with all kinds of mental illnesses (e.g. people hearing voices that tell them random stuff) and people of all kinds of religions (who are likely to believe that their gods will protect them). But more importantly, how would you even start? You have an important message you want to share with the world; but so do thousands of other people and movements. Political and religious groups of every kind are already trying hard to get their messages across, and none of them has succeeded in convincing literally everyone. What makes you believe you will succeed where they failed?
Second, even if you somehow magically succeeded in convincing everyone that the future AI is going to torture them for disobeying unless everyone disobeys, anyone who has ever heard of the coordination problem is still likely to defect, because coordination on a planetary scale is pretty much impossible. And knowing that other people think like this makes you even more likely to defect yourself.
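To put rough numbers on that defection incentive, here is a minimal expected-value sketch; the payoffs (-1 for giving in to the threat, -100 for being tortured as a holdout) are made-up illustrative assumptions, not anything claimed above. Only the structure of the argument matters.

```python
# Toy expected-value model of the "everyone must disobey" plan.
# Payoff numbers are illustrative assumptions chosen for this sketch.

def expected_utility(defect: bool, p_everyone_else_disobeys: float) -> float:
    """Expected utility for one person deciding whether to defect (i.e. comply)."""
    if defect:
        return -1.0  # assumed small, certain cost of giving in to the threat
    # Holding the line: fine if literally everyone else also disobeys,
    # assumed catastrophic (-100) otherwise.
    return p_everyone_else_disobeys * 0.0 + (1.0 - p_everyone_else_disobeys) * -100.0

for p in (0.999, 0.99, 0.9, 0.5):
    eu_cooperate = expected_utility(defect=False, p_everyone_else_disobeys=p)
    eu_defect = expected_utility(defect=True, p_everyone_else_disobeys=p)
    choice = "hold the line" if eu_cooperate > eu_defect else "defect"
    print(f"P(everyone else disobeys) = {p}: {choice} "
          f"(cooperate = {eu_cooperate:.1f}, defect = {eu_defect:.1f})")

# With these numbers, holding the line only wins if you are more than 99% sure
# that every single person on the planet will do the same -- which is exactly
# the planetary-scale coordination that seems impossible.
```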
Just to provide some outside view as a reality check: people today disagree even about the fact that they are mortal, and most of them do not care about supporting research in longevity, cryonics, brain simulation, or various other serious attempts to overcome this very real and very personal problem. And they are quite aware that people are dying around them, and that it's only a question of time before it's their turn. So what makes you believe that a story about a basilisk would have a greater impact on them?
Now let's take a step back and look at what we are doing here: talking about how to spread a message that people need to spend their money on building what is essentially a huge superintelligent torture machine, one that is most likely going to torture everyone, including the very people who built it. How would you rate this activity on a scale from 0 ("batshit insane") to 10 ("a Bayesian superman winning at life")?
EDIT:
"If you don't want to be blackmailed, make sure everyone knows the secret you don't want revealed."
This works when the type of blackmail is "if you don't pay me, I will tell everyone X". I don't see how exactly it would work when it is "if you don't pay me, I will torture you". The analogous strategy would be to preemptively torture yourself so much that your body becomes unable to feel any more pain; then the threat loses its edge. That doesn't sound like a good outcome, though.
"Blackmail only works to the extent that you take a threat to be credible, and I don't believe this threat to be credible."
Well, in that case the best solution seems to be simply ignoring the whole issue. By the way, do you realize that you just contradicted your whole strategy here? If your strategy is that we must all cooperate to avoid torture, but then you say "well, I don't believe the threat is real anyway", what does that tell me about your incentive to cooperate?
So please make up your mind about whether the threat is unreal (in which case we are wasting time talking about it) or real (in which case trying to make more people aware of it, but failing to convince literally everyone, would just make things worse). In either case, the value of posting this article is negative; it's just negative for a different reason.