What about Hail Mary strategies that were previously discarded due to being too risky? I can think of a couple off the top of my head. A cornered rat should always fight.
Do they perchance have significant downsides if they fail? Just wildly guessing, here. I’m a lot more cheerful about Hail Mary strategies that don’t explode when the passes fail, and take out the timelines that still had hope in them after all.
As a Hail Mary strategy, how about making a 100% effort to get elected in a small democratic voting district?
And, if that works, make a 100% effort to get elected in bigger and bigger districts—until all democratic countries support the [a stronger humanity can be reached by systematic investigation of our surroundings and cooperation in the production of private and public goods, which includes not creating powerful aliens] party?
Yes, yes, politics is horrible. BUT. What if you could do this within 8 years? AND, you test it by only trying one or two districts... one or two months each? So, in total it would cost at most four months.
Downsides? Political corruption is the biggest one. But, I believe your approach to politics would be a continuation of what you do now, so if you succeeded it would only be by strengthening the existing EA/Humanitarian/Skeptical/Transhumanist/Libertarian-movements.
There may be a huge downside for you personally, as you may have to engage in some appropriate signalling to make people vote for your party. But, maybe it isn’t necessary. And if the whole thing doesn’t work it would only be for four months, tops.
Yeah, most of them do. I have some hope for the strategy-cluster that uses widespread propaganda[1] as a coordination mechanism.
Given the whole “brilliant elites” thing, and the fecundity of rationalist memes among such people, I think it’s possible to shift the world to a better Nash equilibrium.
Making more rationalists is all well and good, but let’s not shy away from no holds barred memetic warfare.
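The “better Nash equilibrium” framing can be made concrete with a toy stag hunt (all payoffs hypothetical): both the all-cooperate and all-defect outcomes are equilibria, and “shifting the world” means moving play from the worse equilibrium to the better one. A minimal sketch:

```python
# Stag-hunt payoffs (row player, column player); 0 = Stag, 1 = Hare.
# Both (Stag, Stag) and (Hare, Hare) are Nash equilibria, but the
# former Pareto-dominates the latter — shifting to a "better Nash
# equilibrium" means moving play from one to the other.
payoffs = {
    (0, 0): (4, 4),  # both hunt stag: best for everyone
    (0, 1): (0, 3),  # lone stag hunter gets nothing
    (1, 0): (3, 0),
    (1, 1): (3, 3),  # both play it safe
}

def is_nash(row, col):
    """Neither player gains by unilaterally deviating."""
    r, c = payoffs[(row, col)]
    return all(payoffs[(dr, col)][0] <= r for dr in (0, 1)) and \
           all(payoffs[(row, dc)][1] <= c for dc in (0, 1))

equilibria = [(r, c) for r in (0, 1) for c in (0, 1) if is_nash(r, c)]
print(equilibria)  # [(0, 0), (1, 1)]
```

The point of the sketch: in the bad equilibrium no single actor benefits from deviating alone, which is why moving between equilibria is a coordination problem rather than a persuasion-at-the-margin problem.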
Is it not obvious to you that this constitutes dying with less dignity, or is it obvious but you disagree that death with dignity is the correct way to go?
Dignity exists within human minds. If human-descended minds go extinct, dignity doesn’t matter. Nature grades us upon what happens, not how hard we try. There is no goal I hold greater than the preservation of humanity.
Did you read the OP? The post identifies dignity with reductions in existential risk, and it talks a bunch about the ‘let’s violate ethical injunctions willy-nilly’ strategy.
The post assumes that there are no ethics-violating strategies that will work. I understand that people can just-world-fallacy their way into thinking that they will be saved if only they sacrifice their deontology. What I’m saying is that deontology-violating strategies should be adopted if they offer, say, +1e-5 odds of success.
One of Eliezer’s points is that most people’s judgements about adding 1e-5 odds (I assume you mean log odds and not additive probability?) are wrong, and even systematically have the wrong sign.
The post talks about how most people are unable to evaluate these odds accurately, and how thinking you’ve found a loophole is actually an indicator that you are one of those people.
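The log-odds versus additive-probability distinction matters numerically. A quick sketch (the 0.1% baseline is hypothetical) of how little +1e-5 in log odds buys compared with +1e-5 in raw probability:

```python
import math

def prob_to_logodds(p):
    """Convert a probability to natural log-odds."""
    return math.log(p / (1 - p))

def logodds_to_prob(l):
    """Convert natural log-odds back to a probability."""
    return 1 / (1 + math.exp(-l))

# Start from a hypothetical 0.1% chance of success.
p0 = 0.001
l0 = prob_to_logodds(p0)

# An intervention credited with "+1e-5" log-odds barely moves the needle...
p_logodds = logodds_to_prob(l0 + 1e-5)

# ...whereas "+1e-5" additive probability is a 1% relative gain here.
p_additive = p0 + 1e-5

print(p_logodds - p0)   # ~1e-8: a tiny shift
print(p_additive - p0)  # 1e-5
```

Near small probabilities, a log-odds nudge of ε moves the probability by only about p(1−p)ε, which is why the two readings of “+1e-5” differ by three orders of magnitude in this example.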
Pretty telling IMHO to see such massive herding on the downvotes here, for such an obviously-correct point. Disappointing!
Coordination on what, exactly?
Coordination (cartelization) so that AI capabilities are not a race to the bottom
Coordination to indefinitely halt semiconductor supply chains
Coordination to shun and sanction those who research AI capabilities (compare: coordination against embryonic human gene editing)
Coordination to deliberately turn Moore’s Law back a few years (yes, I’m serious)
And do you think if you try that, you’ll succeed, and that the world will then be saved?
These are all strategies to buy time, so that alignment efforts may have more exposure to miracle-risk.
And what do you think are the chances that those strategies work, or that the world lives after you hypothetically buy three or six more years that way?
I’m not well calibrated on sub 1% probabilities. Yeah, the odds are low.
There are other classes of Hail Mary. Picture a pair of researchers, one of whom controls an electrode wired to the pleasure centers of the other. Imagine they have free access to methamphetamine and LSD. I don’t think research output is anywhere near where it could be.
So—just to be very clear here—the plan is that you do the bad thing, and then almost certainly everybody dies anyways even if that works?
I think at that level you want to exhale, step back, and not injure the reputations of the people who are gathering resources, finding out what they can, and watching closely for the first signs of a positive miracle. The surviving worlds aren’t the ones with unethical plans that seem like they couldn’t possibly work even on the open face of things; the real surviving worlds are only injured by people who imagine that throwing away their ethics surely means they must be buying something positive.
Fine. What do you think about the human-augmentation cluster of strategies? I recall you thought along very similar lines circa ~2001.
I don’t think we’ll have time, but I’d favor getting started anyways. Seems a bit more dignified.
Great! If I recall correctly, you wanted genetically optimized kids to be gestated and trained.
I suspect that akrasia is a much bigger problem than most people think, and to be truly effective, one must outsource part of their reward function. There could be massive gains.
What do you think about the setup I outlined, where a pair of researchers exist such that one controls an electrode embedded in the other’s reward center? Think Focus from Vinge’s A Deepness In The Sky.
(I predict that would help with AI safety, in that it would swiftly provide useful examples of reward hacking and misaligned incentives)
I think memetically ‘optimized’ kids (and adults?) might be an interesting alternative to explore. That is, more scalable and better education for the ‘consequentialists’ (I have no clue how to teach people who are not ‘consequentialist’; hopefully someone else can teach those) may get human thought-enhancement results earlier, and make them available to more people. There has been some work in this space and some successes, but I think that in general, the “memetics experts” and the “education experts” haven’t been cooperating as much as they should. It would seem dignified to me to try bridging this gap. If this is indeed dignified, then that would be good, because I’m currently in the early stages of a project trying to bridge it.
A better version of reward hacking I can think of is inducing a state of jhana (basically a pleasure button) in alignment researchers. For example, use a Neuralink-style interface to record the brain processes of ~1000 people going through the jhanas at multiple time-steps, average those recordings in a meaningful way, and induce the resulting brainwaves in other people.
The effect would be people who are satiated with the feeling of happiness (like being satiated with food/water), and who are more effective as a result.
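Purely as an illustration of the “record many people, then average” step above — not a claim about neuroscience or about what any brain interface can actually do — time-locked averaging across subjects suppresses uncorrelated noise by roughly 1/sqrt(N):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in: each "subject" contributes a noisy copy of the
# same underlying time-locked signal (here, a slow sine wave).
t = np.linspace(0, 1, 500)
true_signal = np.sin(2 * np.pi * 2 * t)
subjects = true_signal + rng.normal(0, 1.0, size=(1000, t.size))

# The "average them in a meaningful way" step, at its most naive:
# time-locked averaging cancels subject-specific noise ~1/sqrt(N).
grand_average = subjects.mean(axis=0)

noise_before = np.abs(subjects[0] - true_signal).mean()
noise_after = np.abs(grand_average - true_signal).mean()
print(noise_after < noise_before)  # True
```

Real neural recordings would of course need alignment across subjects before any averaging is meaningful; this only shows why averaging helps at all.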
The “electrode in the reward center” setup has been proven to work in humans, whereas jhanas may not transfer over Neuralink.
Deep brain stimulation is FDA-approved in humans, meaning less (though nonzero) regulatory fuckery will be required.
Happiness is not pleasure; wanting is not liking. We are after reinforcement.
Could you link the proven part?
Jhanas seem much healthier, though I’m pretty confused imagining your setup so I don’t have much confidence. Say it works, gets past the problems of generalizing reward (e.g. the brain only rewarding specific parts of research and not others), and avoids the downward-spiral effects of people hacking themselves; then we hopefully have people who look forward to doing certain parts of research.
If you model humans as multi-agents, it’s making a certain type of agent (the “do research” one) have a stronger say in what actions get done. This is not as robust as getting all the agents to agree and not fight each other. I believe jhana gets part of that done because some sub-agents are pursuing the feeling of happiness and you can get that any time.
https://en.wikipedia.org/wiki/Brain_stimulation_reward
https://doi.org/10.1126/science.140.3565.394
https://sci-hub.hkvisa.net/10.1126/science.140.3565.394
From the linked paper:
In our earliest work with a single lever it was noted that while the subject would lever-press at a steady rate for stimulation to various brain sites, the current could be turned off entirely and he would continue lever-pressing at the same rate (for as many as 2000 responses) until told to stop.
It is of interest that the introduction of an attractive tray of food produced no break in responding, although the subject had been without food for 7 hours, was noted to glance repeatedly at the tray, and later indicated that he knew he could have stopped to eat if he wished. Even under these conditions he continued to respond without change in rate after the current was turned off, until finally instructed to stop, at which point he ate heartily.
Is the average human life experientially negative, such that buying three more years of existence for the planet is ethically net-negative?
People’s revealed choice in tenaciously staying alive and keeping others alive suggests otherwise. This everyday observation trumps all philosophical argument that fire does not burn, water is not wet, and bears do not shit in the woods.
I’m not immediately convinced (I think you need another ingredient).
Imagine a kind of orthogonality thesis but with experiential valence on one axis and ‘staying aliveness’ on the other. I think it goes through (one existence proof for the experientially-horrible-but-high-staying-aliveness quadrant might be the complex of torturer+torturee).
Another ingredient you need to posit for this argument to go through is that, as humans are constituted, experiential valence is causally correlated with behaviour in a way such that negative experiential valence reliably causes not-staying-aliveness. I think we do probably have this ingredient, but it’s not entirely clear cut to me.
Unlike jayterwahl, I don’t consider experiential valence, which I take to mean mental sensations of pleasure and pain in the immediate moment, as of great importance in itself. It may be a sign that I am doing well or badly at life, but like the score on a test, it is only a proxy for what matters. People also have promises to keep, and miles to go before they sleep.
I think many of the things that you might want to do in order to slow down tech development are things that will dramatically worsen human experiences, or reduce the number of them. Making a trade like that in order to purchase the whole future seems like it’s worth considering; making a trade like that in order to purchase three more years seems much more obviously not worth it.
I will note that I’m still a little confused about Butlerian Jihad style approaches (where you smash all the computers, or restrict them to the capability available in 1999 or w/e); if I remember correctly Eliezer has called that a ‘straightforward loss’, which seems correct from a ‘cosmic endowment’ perspective but not from a ‘counting up from ~10 remaining years’ perspective.
My guess is that the main response is “look, if you can coordinate to smash all of the computers, you can probably coordinate on the less destructive-to-potential task of just not building AGI, and the difficulty is primarily in coordinating at all instead of the coordination target.”
Suppose they don’t? I have at least one that AFAICT doesn’t do anything worse than take researchers/resources away from AI alignment in most bad-ends and even in the worst case scenario “just” generates a paperclipper anyway. Which, to be clear, is bad, but not any worse than the current timeline.
(Namely, actual literal time travel and outcome pumps. There is some reason to believe that an outcome pump with a sufficiently short time horizon is easier to safely get hypercompute out of than an AGI, and that a “time machine” that moves an electron back a microsecond is at least energetically within bounds of near-term technology.
You are welcome to complain that time travel is completely incoherent if you like; I’m not exactly convinced myself. But so far, the laws of physics have avoided actually banning CTCs outright.)
Do you have a pointer for this? Traversable wormholes tend to require massive amounts of energy[1] (as in, amounts of energy that are easier to state in c^2 units).
Note: this isn’t strictly hypercompute. Finite speed of light means that you can only address a finite number of bits within a fixed time, and your critical path is limited by the timescale of the CTC.
That being said, figuring out the final state of a 1TB-state-vector[2] FSM would itself be very useful. Just not strictly hypercomputation.
Or negative energy density. Or massive amounts of negative energy density.
Ballpark. Roundtrip to 1TB of RAM in 1us is doable.
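For intuition on the outcome-pump framing: under a Deutsch-style CTC, the machine’s final state is fed back as its initial state, so the only consistent histories are fixed points of the transition function — which an ordinary computer must find by brute-force search, but which the CTC would deliver “for free.” A toy sketch, with an 8-state machine standing in for the 1TB state vector:

```python
def run_fsm(state, step, n_steps):
    """Advance a finite state machine n_steps times."""
    for _ in range(n_steps):
        state = step(state)
    return state

def ctc_consistent_states(states, step, n_steps):
    """States that map to themselves after a full run: the only
    self-consistent histories if the final state loops back to the
    start. Brute force here; the point of the CTC is skipping this."""
    return [s for s in states if run_fsm(s, step, n_steps) == s]

# Toy 8-state machine (a hypothetical stand-in for a 1TB state vector).
step = lambda s: (s * s) % 8

print(ctc_consistent_states(range(8), step, n_steps=1))  # [0, 1]
```

The brute-force search above is exponential in the number of state bits, which is what makes the claimed shortcut interesting even though, as noted, it is not strictly hypercomputation.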
Never even THINK ABOUT trying a hail mary if it also comes with an increased chance of s-risk. I’d much rather just die.
Speaking of which, one thing we should be doing is keeping a lookout for opportunities to reduce s-risk (with dignity) … I haven’t yet been convinced that s-risk reduction is intractable.
The most obvious way to reduce s-risk would be to increase x-risk, but somehow that doesn’t sound very appealing...
This is an example of what EY is talking about, I think—as far as I can tell, all the obvious things one would do to reduce s-risk via increasing x-risk are the sort of supervillain schemes that are more likely to increase s-risk than decrease it once secondary effects, unintended consequences, etc. are taken into account. This is partly why I put the “with dignity” qualifier in. (The other reason is that I’m not a utilitarian and don’t think our decision about whether to do supervillain schemes should come down to whether we think the astronomical long-term consequences are slightly more likely to be positive than negative.)
Suppose, for example, that you’re going to try to build an AGI anyway. You could just not try to train it to care about human values, hoping that it would destroy the world, rather than creating some kind of crazy mind-control dystopia.
I submit that, if your model of the universe is that AGI will, by default, be a huge x-risk and/or a huge s-risk, then the “supervillain” step in that process would be deciding to build it in the first place, and not necessarily not trying to “align” it. You lost your dignity at the first step, and won’t lose any more at the second.
Also, I kind of hate to say it, but sometimes the stuff about “secondary effects and unintended consequences” sounds more like “I’m looking for reasons not to break widely-loved deontological rules, regardless of my professed ethical system, because I am uncomfortable with breaking those rules” than like actual caution. It’s very easy to stop looking for more effects in either direction when you reach the conclusion you want.
I mean, yes, those deontological rules are useful time-tested heuristics. Yes, a lot of the time the likely consequences of violating them will be bad in clearly foreseeable ways. Yes, you are imperfect and should also be very, very nervous about consequences you do not foresee. But all of that can also act as convenient cover for switching from being an actual utilitarian to being an actual deontologist, without ever saying as much.
Personally, I’m neither. And I also don’t believe that intelligence, in any actually achievable quantity, is a magic wand that automatically lets you either destroy the world or take over and torture everybody. And I very much doubt that ML-as-presently-practiced, without serious structural innovations and running on physically realizable computers, will get all that smart anyway. So I don’t really have an incentive to get all supervillainy to begin with. And I wouldn’t be good at it anyhow.
… but if faced with a choice between a certainty of destroying the world, and a certainty of every living being being tortured for eternity, even I would go with the “destroy” option.
I think we are on the same page here. I would recommend not creating AGI at all in that situation, but I agree that creating a completely unaligned one is better than creating an s-risky one. https://arbital.com/p/hyperexistential_separation/
I can imagine a plausible scenario in which WW3 is a great thing, because both sides brick each other’s datacenters and bomb each other’s semiconductor fabs. Also, all the tech talent will be spent trying to hack the other side and will not be spent training bigger and bigger language models.
I imagine that WW3 would be an incredibly strong pressure, akin to WW2, which causes governments to finally sit up and take notice of AI.
And then spend several trillion dollars running Manhattan Project Two: Manhattan Harder, racing each other to be the first to get AI.
And then we die even faster, and instead of being converted into paperclips, we’re converted into tiny American/Chinese flags
Missed opportunity to call it Manhattan Project Two: The Bronx.
That only gives you a brief delay on a timeline which could, depending on the horizons you adopt, be billions of years long. If you really wanted to reduce s-risk in an absolute sense, you’d have to try to sterilize the planet, not set back semiconductor manufacturing by a decade. This, I think, is a project which should give one pause.
The downvotes on my comment reflect a threat we all need to be extremely mindful of: people who are so terrified of death that they’d rather flip the coin on condemning us all to hell than die. They’ll only grow ever more desperate & willing to resort to ever more hideously reckless Hail Marys as we draw closer.
Upvoting you because I think this is an important point to be made, even if I’m unsure how much I agree with it. We need people pushing back against potentially deeply unethical schemes, even if said schemes also have the potential to be extremely valuable (not that I’ve seen very many of those at all; most proposed supervillain schemes would pretty obviously be a Bad Idea™). Having the dialogue is valuable, and it’s disappointing to see unpopular thoughts downvoted here.