I discussed Gwern’s article in another comment. My point (which also applies to Gwern’s essay on GPT-3 and the scaling hypothesis) is the following:
I don’t dispute that you can build agent AIs, and that they can be useful.
I don’t claim that it is possible to get the same economic benefits by restricting ourselves to tool AIs. Indeed, in my previous post with Edelman, we explicitly said that we do consider AIs that are agentic in the sense that they can take actions, including self-driving, writing code, executing trades, etc.
I don’t dispute that one way to build those is to take a next-token predictor such as pretrained GPT-3 and then use fine-tuning, RLHF, prompt engineering, or other methods to turn it into an agent AI. (Indeed, I explicitly say so in the current post.)
My claims are that (1) it is a useful abstraction to separate intelligence from agency, and (2) intelligence in AI is a monotone function of the computational resources (FLOPs, data, model size, etc.) invested in building the model.
Now, if you want to take 3.6 trillion gradient steps in a model, then you simply cannot do it by having it take actions and wait for some reward after each one. So I do claim that, if we buy the scaling hypothesis that intelligence scales with compute, the bulk of the intelligence of models such as GPT-n, PaLM-n, etc. comes from the non-agentic next-token predictor.
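As a rough illustration of the scale asymmetry this argument rests on, here is a back-of-envelope comparison of pretraining versus RLHF/SFT compute. It is a minimal sketch: the parameter and token counts are placeholder assumptions, not figures from this post or from any published model card, and it uses the standard ~6·N·D approximation for dense-transformer training FLOPs.

```python
# Back-of-envelope comparison of pretraining vs. agentic-adaptation compute.
# All numbers below are illustrative assumptions, not published figures.

def training_flops(n_params: float, n_tokens: float) -> float:
    """Standard ~6 * N * D approximation for dense-transformer training FLOPs."""
    return 6.0 * n_params * n_tokens

n_params = 100e9          # assume a 100B-parameter model
pretrain_tokens = 2e12    # assume ~2 trillion pretraining tokens
adapt_tokens = 1e9        # assume ~1 billion tokens seen during SFT/RLHF

pretrain = training_flops(n_params, pretrain_tokens)
adapt = training_flops(n_params, adapt_tokens)

print(f"pretraining FLOPs : {pretrain:.2e}")
print(f"adaptation FLOPs  : {adapt:.2e}")
print(f"adaptation share  : {adapt / (pretrain + adapt):.4%}")
# Under these assumptions, the agentic adaptation stage is ~0.05% of total
# training compute, i.e. the pretrained predictor accounts for ~99.95% of it.
```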
So I believe it is useful, and more accurate, to think of (for example) a stock-trading agent built on top of GPT-4 as consisting of an “intelligence forklift”, which accounts for 99.9% of the computational resources, plus various layers of adaptation, including supervised fine-tuning, RL from human feedback, and prompt engineering, that turn it into an agent.
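To make the abstraction concrete, here is a minimal sketch of that split: a single frozen next-token predictor supplies the capabilities, and a thin task-specific wrapper (prompt plus action parsing) turns it into an agent. The class and names are hypothetical, and a trivial stand-in plays the role of the predictor; swapping the prompt and the parser would yield a different agent while the underlying “forklift” stays the same.

```python
# Sketch of the "intelligence forklift" framing: the predictor is frozen,
# and agency lives in a thin wrapper around it. Names are hypothetical.

from typing import Callable

# Stand-in type for a pretrained next-token predictor (the "forklift").
# In practice this would be a large frozen language model.
Predictor = Callable[[str], str]

class TradingAgent:
    """Agency lives in this thin wrapper, not in the predictor itself."""

    def __init__(self, predictor: Predictor):
        self.predictor = predictor  # frozen; the wrapper adds no new "intelligence"

    def act(self, market_state: str) -> str:
        prompt = (
            "You are a cautious stock-trading assistant.\n"
            f"Market state: {market_state}\n"
            "Proposed action (BUY/SELL/HOLD):"
        )
        completion = self.predictor(prompt)
        return self._parse_action(completion)

    @staticmethod
    def _parse_action(completion: str) -> str:
        for action in ("BUY", "SELL", "HOLD"):
            if action in completion.upper():
                return action
        return "HOLD"  # default to the safe action if the output is unclear

# Usage with a trivial stand-in predictor:
if __name__ == "__main__":
    dummy_predictor: Predictor = lambda prompt: "HOLD, the market looks volatile."
    agent = TradingAgent(dummy_predictor)
    print(agent.act("AAPL down 2%, high volume"))  # -> HOLD
```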
The above perspective does not mean that the problem of AI safety or alignment is solved. But I do think it is useful to think of intelligence as belonging to a system rather than to an individual agent, and (as discussed briefly above) that viewing it in this way somewhat changes the landscape of both problems and solutions.
Ah. Well, if that’s what you are saying, then you are preaching to the choir. :) See, e.g., the “pretrained LLMs are simulators / predictors / oracles” discourse on LW.
I feel like there is probably still some disagreement between us, though. For example, I think the “bulk of the intelligence comes from the non-agentic next-token predictor” claim you make is probably either less interesting or less true than you think it is, depending on what kinds of conclusions you take to follow from it. If you are interested in discussing more sometime, I’d be happy to have a video call!
Agreed that we still disagree, and (in my biased opinion) that claim is either more interesting or more true than you realize. :) I’m not free for a call soon, but I hope there will eventually be an opportunity to discuss more.