joraine comments on MIRI announces new “Death With Dignity” strategy

joraine 18 May 2022 16:29 UTC
1 point
I and two of my friends are on the precipice of our careers right now. We are senior CS majors at MIT, and next year we’re all doing our Master’s here and have been going back and forth on what to pick.
All of us have heavily considered AI of course. I’m torn between that and Distributed Systems/Cryptography things to do Earn to Give. I’ve been mostly on the AI side of things until today.
This post has singlehandedly convinced two of us (myself included) to not work on AI or help with AI alignment, as if Eliezer, an expert in that field. is correct, then it’s hopeless for us to try. So our only hope is that he’s wrong and the field is somewhat ridiculous or easily solved.
We’re just two people, so it might not mean the world of difference. But I’m sure we’re not the only ones who this /type/ of message will reach.
I made an account just to post this, so I’m not sure if it will abide by the rules (I didn’t mean for this to be an arguing against people thing) or be deleted. But thought I should post the actual real world impact this piece had on two young AI-prospectives.
- paulfchristiano 18 May 2022 16:50 UTC
  21 points
  Parent
  I don’t think “Eliezer is wrong about something” implies that the field is ridiculous or easily solved. Many people in the field (including the plurality with meaningful AI experience) disagree with Eliezer’s basic perspective and think he is wildly overconfident.
  I basically agree with Eliezer that if arguments for doom look convincing then you should focus on something more like improving the log odds of survival, and preparing to do the hard work to take advantage of the actual kind of miracles that might actually occur, rather than starting to live in fantasy-land where things are not as they seem. But “let’s just assume this whole alignment thing isn’t real” is a particularly extreme instance of the behavior that Eliezer is criticizing.
  So to get there, I think you’d have to both reject Eliezer’s basic perspective, and take Eliezer as representing the entire field so strongly that rejecting his perspective means rejecting the whole field. I don’t think that’s reasonable, unless your interest in AI safety was primarily driven by deference to Eliezer. If so, I do think it makes sense to stop deferring and just figure out how you want to help the world, and it’s reasonable if that’s not AI safety.
  As an aside, if your goal is earning to give, I think it currently looks unwise to go into distributed systems or cryptography rather than AI (though I don’t know your situation, this is just based on similarly-skilled people making much more money right now in AI and that looking like it’s going to hold up or get even more extreme over the next 20 years).
  - Rob Bensinger 28 May 2022 22:33 UTC
    9 points
    Parent
    But “let’s just assume this whole alignment thing isn’t real” is a particularly extreme instance of the behavior that Eliezer is criticizing.
    I endorse this response, and want to highlight it. Specifically, there are two separate errors in “if Eliezer, an expert in that field. is correct, then it’s hopeless for us to try. So our only hope is that he’s wrong and the field is somewhat ridiculous or easily solved.”:
    1. The post doesn’t say it’s hopeless to try (‘low probability of success’ is not equivalent to ‘negligible probablity of success’).
    2. Per my comment in another thread, the exact mental move the OP is criticizing is ‘if you’re right then we’re screwed regardless, so we might as well adopt a bunch of more-optimistic assumptions’.
    What I’d recommend is just to start building your own mental models of AI risk, especially if your initial reason for concern was mainly based on deference to others.
    If you get enough evidence to decide that alignment is difficult and necessary for an awesome future (based on a mix of your more-fleshed-out models of the object level and your more-fleshed-out models of which people tend to be right about which things), then work on it insofar as this is higher-EV than whatever you’d be doing instead, even if you end up agreeing with Eliezer that the absolute odds of success are low.
    In the process, be wary of flailing and of rushing to epistemic closure; and see also https://www.lesswrong.com/posts/st7DiQP23YQSxumCt/on-doing-the-improbable. That post was about people preemptively giving up even in the face of moderate difficulties, where it’s especially obvious that the psychological processes at work are maladaptive; but the same processes are also in play when there are larger difficulties, even if they’re harder to spot because the problem genuinely is harder and less obviously tractable.
    I think you’d have to both reject Eliezer’s basic perspective, and take Eliezer as representing the entire field so strongly that rejecting his perspective means rejecting the whole field.
    I think this is mostly false, in that it’s not sufficient for ‘I shouldn’t work on AI alignment’ (Eliezer can be wrong about a lot of core things and it still be the case that AI alignment is important and difficult), and in that it doesn’t connect up to joraine’s ‘I infer from this post that I shouldn’t work on AI alignment’.
    I don’t think that’s reasonable, unless your interest in AI safety was primarily driven by deference to Eliezer.
    Likewise, this seems false to me because it suggests that past Eliezer-deference is sufficient to justify ‘alignment isn’t important if Eliezer is wrong’ (which it’s not), and/or it suggests that Eliezer-deferential people should conclude from the OP ‘ah, Eliezer is wrong about AI’ (without appeal to your own personal model of AI that disagrees with Eliezer).
    ‘Our odds of success were pretty low and have gotten lower’ is just a possible state of the world; it isn’t a ridiculous-on-its-face view. So while there are reasonable paths to arriving at the conclusion ‘I shouldn’t work on alignment’, I don’t think any of them have much to do with this post.
    I guess the most reasonable would be ‘I don’t care that much about longevity or the long-term future, so I’ll work on alignment if I think it’s easy but not if I think it’s hard; and the OP updated me toward thinking it’s hard; so now I’m going to go play video games and chill instead of worrying about the long-term future.’
    If so, I do think it makes sense to stop deferring
    I don’t know what specific things ‘If so’ is referring back to, and I also recommended trying to build one’s own models of the space. But it again seems to me like this can be misinterpreted as saying ‘the OP contains an obvious reasoning error such that if you came to the post thinking EY was right about p(doom), you should leave it thinking he’s wrong about p(doom)’. That seems obviously wrong to me, whether or not EY is actually correct about p(doom).
    The reason you should build your own models of this space is because there’s high expected information value (both for you and for others) in figuring out what’s up here. If you’re a smart technical person who hasn’t done much of their own model-building about AI, then IMO that’s very likely to be true regardless of what your current p(doom) is.
  - habryka 29 May 2022 7:15 UTC
    6 points
    Parent
    As an aside, if your goal is earning to give, I think it currently looks unwise to go into distributed systems or cryptography rather than AI
    True, though the other two are much less likely to cause an increase in existential risk. I would pretty strongly prefer we don’t randomly send people who are interested in helping the world to work on the single most dangerous technology that humanity is developing, without a plan for how to make it go better. My current guess is it’s worth it to just buy out marginal people, especially people with somewhat of an interest in AGI, from working in AI at all, so going into AI seems pretty net-negative to me, even if you make a good amount of money.
    (I think one could make an argument that general non-AGI related AI work isn’t that bad for the world, but my sense is that a lot of the highest-paying AI jobs have some pretty decent impact on AI capabilities, either by directly involving research, or by substantially increasing the amount of commercialization and therefore funding and talent that goes into AI. I bet there are some AI jobs you can find that won’t make any substantial difference on AGI, so if you keep that in mind and are pretty conservative with regards to improving commercialization of AI technologies or attracting more talent to the field, you might be fine.)
  - joraine 19 May 2022 0:55 UTC
    3 points
    Parent
    Apologize for long wall of text, at the bottom I dived into your aside more as that’s highly relevant to deciding the course of my next 10 years and would appreciate your weighing-in.
    Pre-Lesswrong/my entire life I’ve been really interested in longevity, and I would do anything to help people have more time with their loved ones (and as a child I thought solving this was the only worthy kind of fame I’d ever want.)
    I didn’t know how to get there, but it was probably somewhere in math and science so I decided I had to do anything to get into MIT.
    My hobbies ended up being CS-y instead of biology-y, and I realized that not only was CS profitable for earn to give, but it also might be the best shot for longevity since AI was just infinitely better at problem solving.
    So that’s where my AI interest comes from. Not in being afraid of it but in using it to solve mortal problems. But the AI safety thing is something that I of course just hear smart people like Eliezer mention and then I think to myself “hmm well they know more about AI than me and I can’t use it to cure aging without the AI also maybe destroying us so I should look into that.”
    
    Your crypto comment is surprising though and I’d like to go further on that. I should be more clear, I’m pretty interested in cryptocurrency not just cryptography and so far trading it has been really profitable, and this summer I’m essentially trying to decide if I’ll stop my schooling to do a crypto startup or if I’ll do my Masters in AI (or potentially also a crypto thing).
    Startups seem like the best thing to do for profit and people are falling over themselves to fund them nowadays so I assumed given how many people have offered me funding to do so, that the crypto startup thing would be far easier to profit from than an ML startup (with ML maybe overtaking it in 7 years or so)
    If this isn’t the case, or we’re a year away from the flip to ML being the easier startup, I’d love to know, because I’m right on the precipice between pursuing as much ML knowledge as I can and trying to get a pHd (probably eventually do an ML spin-off), versus trying to crypto startup earn-to-give ala FTX Sam.
    - paulfchristiano 19 May 2022 1:11 UTC
      5 points
      Parent
      My claim about AI vs crypto was just a misunderstanding. I still think of “cryptography” and “distributed systems” with their historical meaning rather than “cryptocurrency startup” or “cryptocurrency trading,” but in the context of earning to give I think that should have been clear to me :)
      I’d still guess an AI career is generally the better way to make money, but I don’t have a strong take / think it depends on the person and situation / am no longer confused by your position.
      - joraine 19 May 2022 1:28 UTC
        1 point
        Parent
        Yeah I saw this post:
        https://www.lesswrong.com/posts/MR6cJKy2LE6kF24B7/why-hasn-t-deep-learning-generated-significant-economic
        So I’m somewhat confused on how profitable AI is, but maybe I can just start asking random experts and researching AI startups
    - jacob_cannell 19 May 2022 19:48 UTC
      2 points
      Parent
      Crypto generally has been unusually profitable (as a career or investment), but much of this simply stems from the rising tide raising all boats. Given that crypto market caps can only grow about another 10x or so before approaching major currency status and financial system parity at which point growth will slow—it seems likely that much of the glory days of crypto are behind us.
      
      AI/ML is pretty clearly the big thing happening on Earth.
      
      Admittedly in the short term it may be easier to cash in on a quick crypto career, but since most startups fail, consider whether you’d rather try but fail to have an impact on AI vs trying but failing at crypto wealth.
- Not Relevant 18 May 2022 16:59 UTC
  6 points
  Parent
  This is a real shame—there are lots of alignment research directions that could really use productive smart people.
  I think you might be trapped in a false dichotomy of “impossible” or “easy”. For example, Anthropic/Redwood Research’s safety directions will succeed or fail in large part based on how much good interpretability/adversarial auditing/RLHF-and-its-limitations/etc. work smart people do. Yudkowsky isn’t the only expert, and if he’s miscalibrated then your actions have extremely high value.
  - Rob Bensinger 28 May 2022 21:54 UTC
    7 points
    Parent
    This comment is also falling for a version of the ‘impossible’ vs. ‘easy’ false dichotomy. In particular:
    For example, Anthropic/Redwood Research’s safety directions will succeed or fail in large part based on how much good interpretability/adversarial auditing/RLHF-and-its-limitations/etc. work smart people do.
    Eliezer has come out loudly and repeatedly in favor of Redwood Research’s work as worth supporting and helping with. Your implied ‘it’s only worth working at Redwood if Eliezer is wrong’ is just false, and suggests a misunderstanding of Eliezer’s view.
    Yudkowsky isn’t the only expert, and if he’s miscalibrated then your actions have extremely high value.
    The relevant kind of value for decision-making is ‘expected value of this option compared to the expected value of your alternative values’, not ‘guaranteed value’. The relative expected value of alignment research, if you’re relatively good at it, is almost always extremely high. Adding ‘but only if Eliezer is wrong’ is wrong.
    Specifically, the false dichotomy here is ‘everything is either impossible or not-highly-difficult’. Eliezer thinks alignment is highly difficult, but not impossible (nor negligibly-likely-to-be-achieved). Conflating ‘highly difficult’ with ‘impossible’ is qualitatively the same kind of error as conflating ‘not easy’ with ‘impossible’.
    - Not Relevant 29 May 2022 3:21 UTC
      2 points
      Parent
      You’re right, and my above comment was written in haste. I didn’t mean to imply Eliezer thought those directions were pointless, he clearly doesn’t. I do think he’s stated, when asked on here by incoming college students what they should do, something to the effect of “I don’t know, I’m sorry”. But I think I did mischaracterize him in my phrasing, and that’s my bad, I’m sorry.
      
      My only note is that, when addressing newcomers to the AI safety world, the log-odds perspective of the benefit of working on safety requires several prerequisites that many of those folks don’t share. In particular, for those not bought into longtermism/pure utilitarianism, “dying with dignity” by increasing humanity’s odds of survival from 0.1% to 0.2% at substantial professional and emotional cost to yourself during the ~10 years you believe you still have, is not prima facie a sufficiently compelling reason to work on AI safety. In that case, arguing that from an outside view the number might not actually be so low seems an important thing to highlight to people, even if they happen to eventually update down that far upon forming an inside view.