Any regulation that reduces OpenAI/DeepMind/Anthropic’s ability to train big models will also affect Meta or Mistral’s ability to train big models. So for most purposes we can just ignore the open-source part and focus on the “are you training big models at all” part.
What does focusing on open-source in particular get us? Mostly the perception that alignment people (historically associated with big tech) are “punching down” against small scrappy open-source communities. Those communities will seem less like underdogs once the models they produce have actually been used to physically harm people, as they inevitably will. The question is whether open source is going to be so important that laying the groundwork right now is worth paying the cost of “we punch underdogs” signaling, or whether people are mostly focusing on open source because it allows them to draw clearer tribal battle lines (look, Meta/Mistral/etc. are obviously our enemies, and it feels so good to dunk on Yann!). The latter seems more likely to me.
The debate is largely tribal because (it seems to me) the open source advocates are (mostly) highly tribal and ideological, completely closed to compromise or nuance, never willing to admit a downside, and quick to attack the motives of everyone involved as their baseline move. I don’t know what to do about that. Also, they punch far above their weight in us-adjacent circles.
This sure seems like a reason to wait until they come around, rather than drawing battle lines now. And I predict they will come around, because they’re not actually that unreasonable; they’re just holding on to a very strong norm which has worked very well for basically every technology so far, and which is (even now) leading to a bunch of valuable alignment research.
(In general, when your highly competent opponents seem crazy, you’re very likely failing their Ideological Turing Test.)
I agree that any regulation that hits OpenAI/DeepMind/Anthropic (closed source, henceforth CS) also hits open source (OS). If we could actually pass restrictions on CS harsh enough that we’d be fine with OS on the same level, that would be great, but I don’t see how that happens? Or how the restrictions we put on CS in that scenario don’t amount to a de facto OS ban?
That seems much harder, not easier, than getting OS dealt with alone? And OS needs stricter limits than CS does; if we try to hit CS with the levels OS requires, we make things so much harder. Yes, OS people are tribal opposition, but they’ve got nothing on all of CS. And getting incremental restrictions passed (e.g. on OS) helps you down the line in my model, rather than hurting you. Also, OS will be used as justification for why we can’t restrict CS, and it helps fuel competition that will do the same, and I do think there’s a decent chance it matters in the end. Certainly the OS people think so.
Meanwhile, do we think that if we agree to treat OS the same as CS, OS advocates would moderate their position at all? I think no. Their position is to oppose all restrictions on principle. They might be slightly less mad if they’re not singled out, but I doubt it, so long as the rules still have the necessary teeth. I’ve never seen an OS advocate call for restrictions on CS, or even fail to oppose them, unless the restriction was a requirement to be OS. Nor should they, given their other beliefs.
On the part after the quote, I notice I am confused. Why do you expect these highly tribal people standing on a principle to come around? What would make them come around? I see them treating every release that does not cause catastrophe as more evidence that OS is great, and hardening their position. I am curious what you think would be evidence that would bring the bulk of OS advocates to turn against OS AI scaling enough to support laws against it. I can’t think of an example of a big OS advocate who has said ‘if X then I would change my mind on that’ where X is something that leaves most of us alive.
Taking AGI more seriously; seeing warning shots; etc. Like I said, I think these people are reasonable, but even the most reasonable people have a strong instinct to rally around their group’s flag when it’s being attacked. I don’t think most OS people are hardcore libertarians; I just think they don’t take the risks seriously enough right now to abandon the thing that has historically worked really well (especially when they’re probably disproportionately seeing the most unreasonable arguments from alignment people, because that’s how twitter works).
In general there’s a strong tendency amongst rationalists to assume that if people haven’t come around on AI risk yet, they’ll never come around. But this is just bad modeling of how other people work. You should model most people in these groups as, say, 5x less open to abstract arguments than you, and 5x more responsive to social consensus. Once the arguments start seeming less abstract (and they start “feeling the AGI”, as Ilya puts it), and the social consensus builds, there’s plenty of scope for people to start caring much more about alignment.