Another perspective: if you believe, as I do, that there is a >90% chance the current LLM approach is plateauing, then your cost/benefit calculation for pausing large training runs is different. I believe current AI lacks something like the generalization power of the human brain; this can be seen in Tesla Autopilot, which has needed more than 10,000× the training data of a human driver and is still not at human level. This could potentially be overcome by a better architecture, or it could require different hardware as well because of the von Neumann bottleneck. If so, a pause on large training runs can hardly help. I believe that if LLMs are not an X-risk, then their capabilities should be fully explored and integrated into society quickly, to provide a defense against more dangerous AI. It is a radically improved architecture or hardware that you should be worried about.
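A rough back-of-the-envelope version of that ">10,000×" claim, using assumed order-of-magnitude figures (a human driving on the order of 10^4 miles before becoming a competent driver, a fleet contributing on the order of 10^9 training miles); both numbers are illustrative assumptions, not measured values:

```python
# Back-of-the-envelope comparison of machine vs. human driving data.
# Both figures are assumed order-of-magnitude values, for illustration only.

human_miles_to_competence = 1e4  # assumption: ~10,000 miles for a human to become a capable driver
fleet_training_miles = 1e9       # assumption: ~billions of fleet miles available for training

ratio = fleet_training_miles / human_miles_to_competence
print(f"Training-data ratio: roughly {ratio:,.0f}x")  # ~100,000x under these assumptions
```

Under these assumptions the ratio comes out around 10^5, comfortably above 10,000×; the exact number matters much less than the order of magnitude.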
Three potential sources of danger:
1. A greatly improved architecture
2. A large training run with the current architecture
3. Greatly improved hardware
We are paying more attention to (2), when to me it is the least impactful of the three, and pausing it could even hurt. There are some obvious ways this can hurt the cause:
1. If such training runs are not dangerous, then the AI safety group loses credibility.
2. It could give a false sense of security when a different architecture requiring much less training appears and is much more dangerous than the largest LLM.
3. It removes the chance to learn alignment and safety details from such large LLMs.
A clear path to such a better architecture is studying neurons. Whether through Dishbrain, progress in neural interfaces, brain scanning, or something else, I believe it is very likely that by 2030 we will have understood the brain's neural algorithm, characterized it pretty well, and of course gained the ability to attempt to implement it in our hardware.
So in terms of pauses, I think one targeted at chip factories is better. It is achievable, and it is clear to me that if you delay the opening of a large factory by 5 years, you cannot make up the lost time in anything like the way you can with software.
Stopping (1) seems impossible; "don't study the human brain" seems likely to backfire. We would of course like some agreement that if a much better architecture is discovered, it is not immediately implemented.
This seems to be arguing that the big labs are doing some obviously-inefficient R&D in terms of advancing capabilities, and that government intervention risks accidentally redirecting them towards much more effective R&D directions. I am skeptical.
On the three specific ways a pause could hurt:
1. "If such training runs are not dangerous, then the AI safety group loses credibility."
2. "It could give a false sense of security when a different architecture requiring much less training appears and is much more dangerous than the largest LLM."
3. "It removes the chance to learn alignment and safety details from such large LLMs."
In response:
1. I'm not here for credibility. (Also, this seems like it only happens, if it happens, after the pause ends. Seems fine.)
2. I'm generally unconvinced by arguments of the form "don't do [otherwise good thing x]; it might cause people to let their guard down and get hurt by [bad thing y]" that don't explain why they aren't a fully-general counterargument.
3. If you think LLMs are hitting a wall and aren't likely to ever lead to dangerous capabilities, then I don't know why you expect to learn anything particularly useful from the much larger LLMs that we don't have yet but not from those we do have now.
In terms of the big labs being inefficient: with hindsight, perhaps. Anyway, I have said that I can't understand why they aren't putting much more effort into Dishbrain and the like. If I had ~$1B and wanted to get ahead on a 5-year timescale, I would give that direction more weight in expectation.
Taking your three points in turn:
1. I am here for credibility. I am sufficiently confident that these runs are not an X-risk that I don't want to recommend stopping them, and I want the field to have credibility for later.
2. Yes, but I don't think stopping the training runs is much of an "otherwise good thing", if it is one at all. To me it seems more like inviting a fire-safety expert who then recommends a smoke alarm in your toilet but not your kitchen. If we can learn alignment lessons from such training runs, then stopping them is an otherwise bad thing.
3. OK, I'm not up on the details, but some experts certainly think we learned a lot from GPT-3.5/4.0. There is also my belief that it is often a good idea to deploy the most advanced non-X-risk AI as a defense. (This is somewhat unclear: usually what doesn't kill you makes you stronger, but I am concerned about AI companions/romantic partners and the like, which could weaken society in a way that makes bad decisions more likely later. But that seems to have already happened, and very large models, being centralized, could be secured against more capable/damaging versions.)