I often think about this as “it’s hard to compete with future AI researchers on moving beyond this early regime”. (That said, we should of course have some research bets for what to do if control doesn’t work for the weakest AIs that are very useful.)
I see this kind of argument a lot, but to my thinking, the next iteration of AI researchers will only have the tools today’s researchers build for them. You’re not trying to compete with them. You’re trying to empower them. The industrial revolution wouldn’t have involved much faster growth rates if James Watt (and his peers) had been born a century later. They would have just gotten a later start at figuring out how to build steam engines that worked well. (Or at least, growth rates may have been faster for various reasons, but at no single point would the state of steam engines in that counterfactual world be farther along than it was historically in ours).
(I hesitate to even write this next bit for fear of hijacking in a direction I don’t want to go, and I’d put it in a spoiler tag if I knew how. But I think it’s the same form of disagreement I see in discussions of whether we can ‘have free will’ in a deterministic world, which, in my view, hinges on whether the future state can be predicted without going through the process itself.)
Who are these future AI researchers, and how did they get here and get better if not by the efforts of today’s AI researchers? And in a world where Sam Altman is asking for $7 trillion and not being immediately and universally ridiculed, are we so resource constrained that putting more effort into whatever alignment research we can try today is actually net-negative?
I see this kind of argument a lot, but to my thinking, the next iteration of AI researchers will only have the tools today’s researchers build for them. You’re not trying to compete with them. You’re trying to empower them.
Sure, this sounds like what I was saying. I was trying to say something like “We should mostly focus on ensuring that future AI safety researchers can safely and productively use these early transformative AIs, and on ensuring that these early transformative AIs don’t pose other direct risks; then safety researchers in that period can worry about safety for the next generation of more powerful models.”
Separately, it’s worth noting that many general purpose tools for productively using AIs (for research) will be built with non-safety motivations, so safety researchers don’t necessarily need to invest in building general purpose tools.
are we so resource constrained that putting more effort into whatever alignment research we can try today is actually net-negative
I’m confused about what you’re responding to here.
To the latter: my point is that, except to the extent we’re resource constrained, I’m not sure why anyone (and I’m not necessarily saying you are) would argue against any safe line of research, even if they thought it was unlikely to work.
To the former: I think one of the things we can usefully bestow on future researchers (in any field) is a pile of lines of inquiry, including ones that failed, ones we realized we couldn’t properly investigate yet, and ones where we made even a tiny bit of headway.
my point is that, except to the extent we’re resource constrained, I’m not sure why anyone (and I’m not necessarily saying you are) would argue against any safe line of research, even if they thought it was unlikely to work.
I mean, all claims that research X is good are claims that X is good relative to the existing alternatives Y. That doesn’t mean you should only do X; you probably should diversify in many cases.
We absolutely do have resource constraints: many good directions aren’t currently being explored because there are even better directions.