don’t think of themselves as participating in ‘optimist’ or ‘pessimist’ communities, and would not use the term to describe their community. So my sense is that this is a false description of the world
I think you’re playing dumb. Descriptively, the existing “EA”/”rationalist” so-called communities are pessimistic. That’s what the “AI optimists” brand is a reaction to! We shouldn’t reify pessimism as an identity (because it’s supposed to be a reflection of reality that responds to the evidence), but we also shouldn’t imagine that declining to reify a description as a tribal identity makes it “a false description of the world”.
I think the words “optimism” and “pessimism” are really confusing, because they conflate the probability, utility, and steam of things:
You can be “optimistic” because you believe a good event is likely (or a bad one unlikely), because you believe a future event (maybe even an unlikely one) would be good, or because you have a plan, idea, or stance for which you have a robust, reflectively stable prediction that you will keep engaging with it (i.e. you have “steam” for it).
So you could be “pessimistic” even while believing extinction due to AI is unlikely (say, <1%), because you find it super bad and you currently don’t have anything concrete that you can latch onto to decrease it.
Or (as in the case of e.g. MIRI) you might have (“indefinitely optimistic”?) steam for reducing AI risk, find extinction moderately to extremely likely, and think it would be super bad.
Or you might think that extinction would be super bad, believe it’s unlikely (as Belrose and Pope do), and have steam for both AI and AI alignment.
But the terms are apparently confusing to many people, and I think using this terminology can “leak” optimism or pessimism from one category into another, which can lead to worse decisions and incorrect beliefs.
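To make the three-way decomposition above concrete, here is a minimal, purely illustrative sketch (the Stance structure, the field names, and all the numbers are assumptions of this write-up, not anything the commenter specified) showing how probability, utility, and steam can vary independently across the three stances just described:

```python
from dataclasses import dataclass

# Illustrative only: the fields and numbers below are made up for this sketch,
# not taken from the original comment.
@dataclass
class Stance:
    p_doom: float   # probability assigned to AI-caused extinction
    badness: float  # how bad the outcome would be (0 = neutral, 1 = maximally bad)
    steam: float    # motivation/energy to act on the problem (0 = none, 1 = lots)

# The three example stances from the comment above, with made-up numbers:
resigned       = Stance(p_doom=0.01, badness=1.0, steam=0.0)  # unlikely, super bad, nothing to latch onto
miri_style     = Stance(p_doom=0.70, badness=1.0, steam=0.9)  # likely, super bad, lots of steam to reduce it
optimist_style = Stance(p_doom=0.01, badness=1.0, steam=0.9)  # unlikely, super bad, steam for AI and alignment

# A single "optimist"/"pessimist" label collapses these three axes into one word,
# which is the conflation the comment is pointing at.
for name, s in [("resigned", resigned), ("miri_style", miri_style), ("optimist_style", optimist_style)]:
    print(f"{name}: p_doom={s.p_doom}, badness={s.badness}, steam={s.steam}")
```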
It’s correct that there’s a distinction between whether people identify as pessimistic and whether they are pessimistic in their outlook. I think the first claim (that people identify as pessimists) is false, and I actually also think the second claim (that they are pessimistic in outlook) is false, though I am less confident in that.
Interview with Rohin Shah in Dec ’19:
Rohin reported an unusually large (90%) chance that AI systems will be safe without additional intervention. His optimism was largely based on his belief that AI development will be relatively gradual and AI researchers will correct safety issues that come up.
Paul Christiano in Dec ’22:
...without AI alignment, AI systems are reasonably likely to cause an irreversible catastrophe like human extinction. I think most people can agree that this would be bad, though there’s a lot of reasonable debate about whether it’s likely. I believe the total risk is around 10–20%, which is high enough to obsess over.
Scott Alexander, in Why I Am Not (As Much Of) A Doomer (As Some People) in March ’23:
I go back and forth more than I can really justify, but if you force me to give an estimate it’s probably around 33%; I think it’s very plausible that we die, but more likely that we survive (at least for a little while).
John Wentworth in Dec ’21 (also see his to-me-inspiring stump speech from a month later):
Step 1: sort out our fundamental confusions about agency
Step 2: ambitious value learning (i.e. build an AI which correctly learns human values and optimizes for them)
Step 3: …
Step 4: profit!
… and do all that before AGI kills us all.
That sounds… awfully optimistic. Do you actually think that’s viable?
Better than a 50⁄50 chance of working in time.
Davidad also feels to me like an optimist about the world: someone who is excited about solving the problems and finding ways to win, who is excited about other people, and who is ready to back major projects to set things on a good course. I don’t know his probability of an AI takeover, but I stand by the claim that he doesn’t seem pessimistic in personality.
On occasion, when talking to researchers, I talk to someone who is optimistic that their research path will actually work. I won’t name who, but I recently spoke with a long-time researcher who believes they have a major breakthrough and will be able to solve alignment. I think researchers can trick themselves into thinking they have a breakthrough when they don’t, and this field is unusually lacking in feedback, so I’m not saying I straightforwardly buy their claims; but I think it’s inaccurate to describe them all as pessimistic.
A few related thoughts:
One story we could tell is that the thing these people have in common is that they take alignment seriously, not that they are generally pessimists.
I think alignment is unsolved in the general case, which makes it harder to argue strongly that it will get solved for future systems. But I don’t buy that people would fail to update on seeing a solution or strong arguments for that conclusion, and I think that some of Quintin’s and Nora’s arguments have caused people I know to rethink their positions and update some in that direction.
I think the rationalist and EA spaces have been healthy enough for people to express quite extreme positions, such as confidently expecting an AI takeover or extinction. I think it would be a strongly negative sign for everyone in these spaces to have identical views, or for everyone to give up all hope on civilization’s prospects; but in the absence of that, I think it’s a sign of health that people are able to be open about having very strong views. I also think the people who most confidently anticipate an AI takeover sometimes feel and express hope.
I don’t think everyone is starting with pessimism as their bottom line, and I think it’s inaccurate to describe the majority of people in these ecosystems as temperamentally pessimistic or epistemically pessimistic.
I agree that it would be terrible for people to form tribal identities around “optimism” or “pessimism” (and have criticized Belrose and Pope’s “AI optimism” brand name on those grounds). However, when you say
I think you’re playing dumb. Descriptively, the existing “EA”/”rationalist” so-called communities are pessimistic. That’s what the “AI optimists” brand is a reaction to! We shouldn’t reify pessimism as an identity (because it’s supposed to be a reflection of reality that responds to the evidence), but we also shouldn’t imagine that declining to reify a description as a tribal identity makes it “a false description of the world”.
I think there are at least two definitions of optimistic/pessimistic that are often conflated:
Epistemic: an optimist is someone who thinks doom is unlikely, a pessimist someone who thinks doom is likely
Dispositional: an optimist is someone who is hopeful and glass-half-full, a pessimist is someone who is despondent and fatalistic
Certainly these are correlated to some extent: if you believe there’s a high chance of everyone dying, that’s probably not great for your mental health, and people who are depressed are probably more likely to have negatively distorted epistemics. This would explain why it’s tempting to use the same term to refer to both.
However, I think using the same term to refer to both leads to some problems:
Being cheerful and hopeful is generally a good trait to have. However, this often bleeds into thinking it is also desirable to believe that doom is unlikely, rather than trying to figure out whether doom is actually likely.
Because “optimism” feels morally superior to “pessimism” (due to the dispositional definition), using the terms as tribal affiliations, even under the epistemic definition, inevitably causes tension.
I personally strive to have an optimistic disposition while also trying my best to have my beliefs track the truth. I also try my best to notice and avoid these tribal pressures.