This feels like it’s ignoring a ton of context and being weirdly dismissive or something in the process.
Buck doesn’t say we should “treat all AI like it is a single unique thing with a unique answer.” Also, of the various alignment researchers on LessWrong, Buck is maybe the one who has most appreciated and argued that “actually building fences might totally be a practically useful approach in some cases.”
But, like, that paradigm will probably only work for a few years. In some ways AI will likely be like hypersmart children, who may indeed (for some window of time) be gently herded around or have better fences built around them. But we’re likely putting thousands of instances of those children in important positions in critical infrastructure. We only have a narrow window of time to figure out how to handle those children when they are suddenly adults.
It sounds like you agree that we need a more nuanced approach.
I guess I’m just frustrated because many of the policy proposals coming out of the EA/rationalist memeplex (six-month pause, arbitrary limits on compute, AGI island, shut it all down) don’t sound nuanced to me.
Maybe Buck has better ideas than “persuade people to freak out”, and if so I’d love to hear about those.
I mean, you just… do need some kind of policy proposal that will actually stop superintelligence. All the examples you gave of keeping things under control involved things that weren’t overwhelmingly smarter than you.
Pause/Compute-Control/Shut-it-all-down aren’t meant to be complete solutions; we’re just pretty desperate for anything that could buy the time we need to come up with a solution that’d actually work well long-term. There seems to be a decent chance we’re on a tight deadline before someone in the world builds something overwhelmingly smarter than us without a plan for controlling or aligning it.
It sounds like this is more of an ongoing frustration than something unique to this post, and I’m not sure if this is the place for it, but I’m interested in knowing what the shapes of your cruxes are here. Do you basically not believe “superintelligence is going to be substantially different than other complex systems we’ve contended with”?
Quite the opposite: I think superintelligence is going to be substantially different than other complex systems, so regulating GPT-5 like it’s superintelligence is a non-starter for me.
What sort of things do you expect to work better?
What we really need at the moment are smart people deeply engaging with the details of how current models work. In terms of large labs, Anthropic has probably done the best (e.g. Golden Gate Claude). But I also think there’s a ton of value coming from people like Janus who are genuinely curious about how these models behave.
If I had a magic policy wand, I would probably wish for something like Anthropic’s RSPs as an early warning system, combined with tons of micro-grants to anyone willing to work with current SOTA models in an empirically guided way. Given that the Transformer architecture seems inherently myopic/harmless, I also think we should open-source much more than we have (certainly up to and including GPT-5).
The fact that we don’t know how to solve alignment means that we don’t know where a solution will come from, so we should be making as many bets as possible (especially while the technology is still passively safe).
I’m much happier that someone is building e.g. ChaosGPT now rather than in 3-5 years, when we will have wide-scale deployment of potentially lethal robots in every home/street in America.
If I had a magic policy wand, I would probably wish for something like Anthropic’s RSPs as an early warning system, combined with tons of micro-grants to anyone willing to work with current SOTA models in an empirically guided way.
Do you have a way to do that that doesn’t route through compute governance?
I don’t necessarily disagree with these things (I don’t have a super strong opinion), but the thing that seems very likely to me is that we need more time to make lots of bets and see research play out. The point of pauses, and compute governance, is to get time for those bets to play out. (I think it’s a plausibly reasonable position that “shut it all down” would be counterproductive, but the other things you listed frustration with seem completely compatible with everything you said.)
The PauseAI people have been trying to pause since GPT-2. It’s not “buying time” if you freeze research at some state where it’s impossible to make progress. It’s also not “buying time” if you ban open-sourcing models (like llama-4) that are obviously not existentially dangerous and have been a huge boon for research.
Obviously, once we have genuinely dangerous models (e.g. ones capable of building nuclear weapons undetected), they will need to be restricted, but the actual limits being proposed are arbitrary and way too low.
Limits need to be based on contact with reality, which means engineers making informed decisions, not politicians making arbitrary ones.