Some things like that have already happened: bigger models are better at utilizing tools such as in-context learning and chain-of-thought reasoning. But again, whenever people plot a graph of such reasoning capabilities as a function of model compute or size (e.g., the BIG-Bench paper), the X axis is always logarithmic. For specific tasks, the dependence on log compute is often sigmoid-like (flat for a long time, then rising more sharply as a function of log compute), but as mentioned above, when you average over many tasks you get this type of linear dependence.
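As a minimal simulation sketch of that averaging effect (my own illustration, not from the original discussion; the thresholds and widths below are made-up placeholders), assume each task’s performance is a sigmoid in log compute with a task-specific threshold; averaging over many such tasks gives a roughly linear aggregate curve in log compute.

```python
import numpy as np

# Toy illustration: each task's performance is a sigmoid in log-compute with its
# own (task-specific) threshold. Averaging over many tasks whose thresholds are
# spread across the log-compute range yields a roughly linear aggregate curve.
rng = np.random.default_rng(0)

log_compute = np.linspace(0, 10, 200)        # hypothetical log10(FLOPs) axis
num_tasks = 1000
thresholds = rng.uniform(0, 10, num_tasks)   # where each task "turns on"
widths = rng.uniform(0.3, 1.0, num_tasks)    # how sharp each task's sigmoid is

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Per-task curves: shape (num_tasks, len(log_compute)); then average over tasks.
per_task = sigmoid((log_compute[None, :] - thresholds[:, None]) / widths[:, None])
average = per_task.mean(axis=0)

# The average is close to linear in log-compute: compare it to the best linear fit.
slope, intercept = np.polyfit(log_compute, average, 1)
print("max deviation from linear fit:",
      np.abs(average - (slope * log_compute + intercept)).max())
```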
Reflections on “Making the Atomic Bomb”
Ok drew it on the back now :)
One can make all sorts of guesses, but based on the evidence so far, AIs have a different skill profile than humans. This means that if we think of any job that requires a large set of skills, then for a long period of time, even if AIs beat the human average on some of those skills, they will perform worse than humans on others.
I always thought the front was the other side, but looking at Google images you are right… I don’t have time now to redraw this, but you’ll just have to take it on faith that I could have drawn it on the other side 😀
>On the other hand, if one starts creating LLM-based “artificial AI researchers”, one would probably create diverse teams of collaborating “artificial AI researchers” in the spirit of multi-agent LLM-based architectures… So, one would try to reproduce the whole teams of engineers and researchers, with diverse participants.
I think this can be an approach to create a diversity of styles, but not necessarily of capabilities. A bit of prompt engineering telling the model to pretend to be some expert X can help on some benchmarks, but the returns diminish very quickly. So you can have a model pretending to be this type of person or that, but it will still suck at Tic-Tac-Toe. (For example, GPT-4 does not recognize a winning move even when I tell it to play like Terence Tao.)
Regarding the existence of compact ML programs, I agree that it is not known. I would say, however, that the main benefit of architectures like transformers has not been so much to save on the total number of FLOPs as to organize these FLOPs so they are best suited for modern GPUs, that is, to ensure that the majority of the FLOPs are spent multiplying dense matrices.
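As a rough back-of-envelope illustration of that point (my own accounting sketch, not from the comment above; the dimensions and the allowance for non-matmul work are made-up placeholders), one can count where the FLOPs in a standard GPT-style transformer layer go: under the usual approximations, dense matrix multiplications account for nearly all of them.

```python
def transformer_layer_flops(d_model: int, seq_len: int, ffn_mult: int = 4):
    """Very rough per-token FLOP count for one GPT-style transformer layer.

    Counts each multiply-add as 2 FLOPs and ignores small constant factors.
    """
    # Dense matmuls: Q, K, V, output projections (4 x d*d) plus two FFN matmuls (d x 4d each).
    proj_flops = 4 * 2 * d_model * d_model
    ffn_flops = 2 * 2 * d_model * (ffn_mult * d_model)
    # Attention score and value-mixing matmuls (QK^T and attn @ V), per token.
    attn_matmul_flops = 2 * 2 * seq_len * d_model
    matmul_flops = proj_flops + ffn_flops + attn_matmul_flops

    # Non-matmul work: softmax over the attention scores, layernorms, activations.
    # A rough allowance of a few ops per score/feature; exact constants don't matter.
    non_matmul_flops = 32 * seq_len + 20 * d_model

    return matmul_flops, non_matmul_flops

matmul, other = transformer_layer_flops(d_model=4096, seq_len=2048)
print(f"fraction of FLOPs in dense matmuls: {matmul / (matmul + other):.4f}")
```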
I agree that self-improvement is an assumption that probably deserves its own blog post. If you believe exponential self-improvement will kick in at some point, then you can consider this discussion as pertaining to the period up until the point where that happens.
My own sense is that:
While we might not be super close to them, there are probably fundamental limits to how much intelligence you can pack per FLOP. I don’t believe there is a small C program that is human-level intelligent. In fact, since both AI and evolution seem to have arrived at roughly similar orders of magnitude, maybe we are not that far off? If there are such limits, then no matter how smart the “AI AI-researchers” are, they still won’t be able to get more intelligence per FLOP than these limits allow.
I do think that AI AI-researchers will be incomparable to human AI-researchers, in a similar manner to other professions. The simplistic view of AI research, or any form of research, as one-dimensional, where people can be sorted on an Elo-like scale, is dead wrong based on my 25 years of experience. Yes, some aspects of AI research might be easier to automate, and we will certainly use AI to automate them and make AI researchers more productive. But, like the vast majority of human professions (with all due respect to elevator operators :) ), I don’t think human AI researchers will be obsolete any time soon.
p.s. I also noticed this “2 comments”—not sure what’s going on. Maybe my footnotes count as comments?
The shape of AGI: Cartoons and back of envelope
I agree that there is much to do to improve AI reliability, and there are a lot of good reasons (in particular to make AI more useful for us) to do so. So I agree reliability will improve. In fact, I very much hope this happens! I believe faster progress on reliability would go a long way toward enabling positive applications of AI.
I also agree that a likely path to do so is by adjusting the effort based on estimates of reliability and the stakes involved. At the moment, systems such as ChatGPT spend the same computational effort whether someone asks them to tell a joke or asks them for medical advice. I suspect this will change, and variable inference-time computation will become more standard. (Things like “chain of thought” already spend more inference compute to get better performance, but they don’t really have a “knob” we can turn to control the computation/reliability tradeoff.)
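Here is a purely hypothetical sketch of what such a “knob” could look like; the `generate_answer` stub and the stakes-to-samples mapping are my own placeholders, not any existing system’s API.

```python
import random
from collections import Counter

def generate_answer(prompt: str) -> str:
    # Placeholder for a single (stochastic) model call, e.g. one sampled
    # chain-of-thought completion. Stubbed with a biased coin so the sketch
    # is self-contained and runnable.
    return "A" if random.random() < 0.7 else "B"

def answer_with_stakes(prompt: str, stakes: str) -> str:
    """Spend more inference-time compute when the stakes are higher.

    The stakes-to-samples mapping is an arbitrary illustration: low-stakes
    queries get a single sample, high-stakes ones get many samples plus a
    majority vote (self-consistency style) to reduce the error probability.
    """
    num_samples = {"low": 1, "medium": 5, "high": 25}[stakes]
    votes = Counter(generate_answer(prompt) for _ in range(num_samples))
    return votes.most_common(1)[0][0]

print(answer_with_stakes("Tell me a joke", stakes="low"))
print(answer_with_stakes("Which medication dose is safe?", stakes="high"))
```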
Regarding the deception issue, it might still be the case that such extra effort is observable, and it could also come at the cost of solving the original task. (So your performance per unit of compute is worse if you are not just trying to solve the task but also to deceive in a highly reliable way.)
In particular, even if we apply inference-time compute, unfortunately I don’t think we know of a path to get an $O(\log(1/\delta))$ overhead in inference time to achieve a failure probability of $\delta$. It seems that we are still stuck in the $\mathrm{poly}(1/\delta)$ regime. So if you wanted to get a 99.9% probability of not getting caught, you would incur a very noticeable effort.
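To make the gap concrete, here is a toy calculation (my own illustration; the power-law exponent 0.3 is a made-up stand-in for the kind of exponents seen in scaling laws): in a $\log(1/\delta)$ regime the overhead for 99.9% reliability is a small constant factor, while in a $k^{-\alpha}$ regime it is enormous.

```python
import math

delta = 1e-3          # target failure probability (99.9% reliability)
base_error = 1 / 3    # starting failure probability
alpha = 0.3           # hypothetical power-law exponent (illustrative only)

# Log regime (e.g. majority voting over independent runs):
# error ~ 2^(-c*k), so k ~ log2(base_error/delta) up to a constant.
k_log = math.log2(base_error / delta)

# Power-law regime (error shrinks like k^(-alpha) with compute multiplier k):
# base_error * k^(-alpha) = delta  =>  k = (base_error/delta)^(1/alpha)
k_poly = (base_error / delta) ** (1 / alpha)

print(f"compute multiplier, log regime:       ~{k_log:.0f}x")
print(f"compute multiplier, power-law regime: ~{k_poly:.1e}x")
```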
Not all capabilities/tasks correspond to trying to maximize a subjective human response. If you are talking about finding software vulnerabilities or designing some system, there may well be objective measures of success. In such a case, you can fine-tune a system to maximize these measures, and so extract capabilities without the issue of deception/manipulation.
Regarding “escapes”, the traditional fear was that because AI is essentially code, it can spread and escape more easily. But I think that in some sense modern AI has a physical footprint that is more significant than a human’s. Think of trying to get superhuman scientific capabilities by doing something like simulating a collection of 1000 scientists using a 100T-or-so parameter model. Even if you already have the pre-trained weights, just running the model requires highly non-trivial computing infrastructure (which may be possible to track and detect). So it might be easier for a human to escape a prison and live undetected than for a superhuman AI to “escape”.
We can of course define “intelligence” in a way that presumes agency and coherence. But I don’t want to quibble about definition.
Generally, when you have uncertainty, this corresponds to a potential “distribution shift” between your beliefs/knowledge and reality. When you have such a shift, you want to regularize, which means not optimizing to the maximum.
This is not about the definition of intelligence. It’s more about usefulness. Like a gun without a safety, an optimizer without constraints or regularization is not very useful.
Maybe it will be possible to build it, just like today it’s possible to hook up our nukes to an automatic launching device. But it doesn’t follow that people will actually do something so stupid.
The notion of a piece of code that maximizes a utility without any constraints doesn’t strike me as very “intelligent”.
If people really wanted to, they may be able to build such programs, but my guess is that they would not be very useful even before they become dangerous, as overfitting optimizers usually are.
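As a toy numerical illustration of the “regularize under uncertainty” point (my own sketch, not from the thread; the Gaussian setup and scales are assumptions): when the objective is only known through a noisy estimate, going all-out for the estimated maximum does worse in expectation than a regularized (shrunk) choice.

```python
import numpy as np

# True objective: u(x) = -||x - theta||^2, but we only see a noisy estimate of theta.
# An unconstrained optimizer goes all-out for the estimated optimum (x = theta_hat);
# a "regularized" optimizer shrinks toward a safe default (x = c * theta_hat, c < 1).
rng = np.random.default_rng(0)
dim, trials = 50, 10_000
tau, sigma = 1.0, 1.0          # prior scale of theta, noise scale of the estimate

theta = rng.normal(0, tau, (trials, dim))
theta_hat = theta + rng.normal(0, sigma, (trials, dim))

def true_utility(x):
    return -np.sum((x - theta) ** 2, axis=1)

c = tau**2 / (tau**2 + sigma**2)   # classic shrinkage factor under these assumptions
print("go all-out (x = theta_hat):   ", true_utility(theta_hat).mean())
print("regularized (x = c*theta_hat):", true_utility(c * theta_hat).mean())
```

On average the shrunk choice achieves roughly half the loss of the all-out choice in this setup, which is the sense in which “not optimizing to the maximum” is the better policy under uncertainty.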
>at least some humans (e.g. most transhumanists) are “fanatical maximizers”: we want to fill the lightcone with flourishing sentience, without wasting a single solar system to burn in waste.
I agree that humans have a variety of objectives, which I think is actually more evidence for the hot mess theory?
>the goals of an AI don’t have to be simple to not be best fulfilled by keeping humans around.
The point is not about having simple goals, but rather about optimizing goals to the extreme.
I think there is another point of disagreement. As I’ve written before, I believe the future is inherently chaotic, so even a super-intelligent entity would still be limited in predicting it. (Indeed, you seem to concede this by acknowledging that even super-intelligent entities don’t have exponential-time computation and hence need to use “sophisticated heuristics” to do tree search.)
What it means is that there is inherent uncertainty in the world, and whenever there is uncertainty, you want to “regularize” and not go all out exhausting a resource that you might need later on.
Just to be clear, I think a “hot mess super-intelligent AI” could still result in an existential risk for humans. But that would probably be the case only if humans were an actual threat to it and there was more of a conflict. (E.g., I don’t see it as a good use of energy for us to hunt down every ant and kill it, even if ants are nutritious.)
Metaphors for AI, and why I don’t like them
I actually agree! As I wrote in my post, “GPT is not an agent, [but] it can “play one on TV” if asked to do so in its prompt.” So yes, you wouldn’t need a lot of scaffolding to adapt a goal-less pretrained model (what I call an “intelligence forklift”) into an agent that does very sophisticated things.
However, this separation into two components, the super-intelligent but goal-less “brain” and the simple “will” that turns it into an agent, can have safety implications. For starters, as long as you haven’t added any scaffolding, you are still OK. So during most of the time you spend training, you don’t need to worry about the system itself developing goals. (Though you could still worry about hackers.) Once you start adapting it, then you need to start worrying about this.
The other thing is that, as I wrote there, it does change some of the safety picture. The traditional view of a super-intelligent AI is of the “brains and agency” tightly coupled together, just like they are in a human. For example, a human who is super-good at finding vulnerabilities and breaking into systems also has the capability to help fix systems, but I can’t just take their brain and fine-tune it on this task; I have to convince them to do it. However, things change if we don’t think of the agent’s “brain” as belonging to them, but rather as some resource that they are using (just like if I use a forklift to lift something heavy). In particular, it means that capabilities and intentions might not be tightly coupled: there could be agents using capabilities to do very bad things, but the same capabilities could be used by other agents to do good things.
At the moment at least, progress on reliability is very slow compared to what we would want. To get a sense of what I mean, consider the case of randomized algorithms. If you have an algorithm $A$ that for every input $x$ computes some function $f(x)$ with probability at least 2/3 (i.e. $\Pr[A(x)=f(x)] \geq 2/3$), then if we spend $k$ times more computation, we can do majority voting, and using standard bounds show that the probability of error drops exponentially with $k$ (i.e. $\Pr[A_k(x) \neq f(x)] \leq 2^{-ck}$ or something like that, where $A_k$ is the algorithm obtained by scaling up $A$ to compute it $k$ times and output the plurality value).
This is not something special to randomized algorithms. It also holds in the context of noisy communication and error-correcting codes, and many other settings. Often we can get to $1-\delta$ success at a price of $O(\log(1/\delta))$ overhead, which is why we can get things like “five nines” reliability in several engineering fields.
In contrast, so far all our scaling laws show that when we scale our neural networks by spending a factor of $k$ more computation, we only get a reduction in the error that looks like $k^{-\alpha}$, so it’s polynomial rather than exponential, and even the exponent $\alpha$ of the polynomial is not that great (and in particular smaller than one).
So while I agree that scaling up will yield progress on reliability as well, at least with our current methods it seems that we would do things that are 10 or 100 times more impressive than what we do now before we get to the type of 99.9%-or-better reliability on the things that we currently do. Getting to do something that is both super-human in capability and has such a tiny probability of failure that it would not be detected seems much further off.
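A small simulation sketch of this contrast (my own illustration; the power-law exponent is again a made-up placeholder): repeating a 2/3-correct procedure $k$ times and taking a majority vote drives the error down exponentially in $k$, whereas a $k^{-\alpha}$ scaling-law improvement from the same baseline barely moves it.

```python
import numpy as np

rng = np.random.default_rng(0)
p_correct = 2 / 3          # base algorithm: correct with probability 2/3
trials = 200_000
alpha = 0.3                # hypothetical scaling-law exponent (illustrative only)

for k in [1, 11, 31, 101]:
    # Majority voting over k independent runs of the base algorithm.
    votes = rng.random((trials, k)) < p_correct
    maj_error = np.mean(votes.sum(axis=1) <= k / 2)
    # Power-law regime: error shrinks only like k^(-alpha) from the same baseline.
    powerlaw_error = (1 - p_correct) * k ** (-alpha)
    print(f"k={k:4d}  majority-vote error ~ {maj_error:.2e}   "
          f"power-law error ~ {powerlaw_error:.2e}")
```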
I agree that there is a difference between strong AI that has goals and one that is not an agent. This is the point I made here https://www.lesswrong.com/posts/wDL6wiqg3c6WFisHq/gpt-as-an-intelligence-forklift
But this has less to do with the particular lab (e.g., DeepMind trained Chinchilla) and more with the underlying technology. If the path to stronger models goes through scaling up LLMs, then it does seem that they will be 99.9% non-agentic (measured in FLOPs: https://www.lesswrong.com/posts/f8joCrfQemEc3aCk8/the-local-unit-of-intelligence-is-flops )
There is a general phenomenon in tech, expressed many times, of people over-estimating the short-term consequences and under-estimating the longer-term ones (e.g., “Amara’s law”).
I think that often it is possible to see that current technology is on track to achieve X, where X is widely perceived as the main obstacle for the real-world application Y. But once you solve X, you discover that there is a myriad of other “smaller” problems Z_1, Z_2, Z_3 that you need to resolve before you can actually deploy it for Y.
And of course, there is always a huge gap between demonstrating you solved X on some clean academic benchmark and needing to do so “in the wild”. This is particularly an issue in self-driving, where errors can be literally deadly, but it arises in many other applications as well.
I do think that one lesson we can draw from self-driving is that there is a huge gap between full autonomy and “assistance” with human supervision. So I would expect we will see AI deployed as (increasingly sophisticated) “assistants” way before AI systems are actually able to function as “drop-in” replacements for current human jobs. This is part of the point I was making here.