Scaffolding generative models to create agents resembles the long tradition of neuro-symbolic AI, which has never really worked; I think it is based on the fantasy that if we don’t know how to build capability X into system M, we can fake it by adding capability X on top of system M. This is consistently a kludge, and I don’t know of any significant progress arising from it.
I mostly agree with you here, but I think in this case the details do matter a bit. I know that the main topic of your post is how to improve human rationality, but I’m going to go on a side tangent to talk about Recursive Self Improvement.
In the case where we have a generative model which can write code, generate simple hypotheses from data, use an API to initiate tests based on those hypotheses, analyze the resulting data, and repeat… we have a sort of dumb brute-force scientist. If we have a lot of those, and throw them at the problem of improving AI (which is a plausible thing which could come to pass), then we might see sudden-seeming progress on developing an improved algorithm which incorporates more of the thought process into the model itself.
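To make the loop I have in mind concrete, here is a minimal sketch; all of the function names are hypothetical placeholders for whatever LLM API and experiment harness one actually has, not any existing library:

```python
# Minimal sketch of the "dumb brute-force scientist" loop described above.
# `call_llm` and `run_experiment` are hypothetical placeholders: in practice
# they would wrap a real LLM API call and a sandboxed execution environment.

def call_llm(prompt: str) -> str:
    raise NotImplementedError("placeholder for a real LLM API call")

def run_experiment(code: str) -> str:
    raise NotImplementedError("placeholder for sandboxed execution of generated code")

def brute_force_scientist(problem: str, iterations: int = 10) -> list[dict]:
    findings: list[dict] = []
    for _ in range(iterations):
        # 1. Generate a simple, testable hypothesis given what has been learned so far.
        hypothesis = call_llm(
            f"Problem: {problem}\nPrior findings: {findings}\nPropose one testable hypothesis."
        )
        # 2. Write code that tests the hypothesis.
        code = call_llm(f"Write an experiment that tests: {hypothesis}")
        # 3. Run the experiment via whatever compute/API is available.
        result = run_experiment(code)
        # 4. Analyze the result and record it for the next iteration.
        analysis = call_llm(
            f"Hypothesis: {hypothesis}\nResult: {result}\nWhat does this suggest?"
        )
        findings.append({"hypothesis": hypothesis, "result": result, "analysis": analysis})
    return findings
```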
I believe we are just shy of the point where frontier LLMs are sufficient to act even as dumb versions of this brute-force scientist. I think we are close enough that a temporary, kludgey hack might get us over the edge. Specifically, I think that scaffolding like that described in this recent paper from DeepMind, combined with a stronger successor to GPT-4 (e.g. GPT-5), would probably be enough.
I think that the kind of algorithmic progress which incorporates more of the meta-level reasoning into the model and training process itself will also bring efficiency gains in training.
Thus, I have made this prediction market to express my expectation that GPT-5, used in this brute-force-scientist way with adequate scaffolding, will find advances sufficient to train a next generation of superior models with only the same amount of compute used to train GPT-5.
Yes, there is also an earlier Microsoft Research paper which is, in some sense, more modest, but which points more directly towards recursive self-improvement: the scaffolding+LLM system generates better scaffolding for the same LLM, and this operation is then repeated several times with better and better scaffolding.
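Roughly, the loop in question looks like the sketch below; the names are hypothetical placeholders rather than the paper’s actual interface:

```python
# Sketch of scaffolding-level self-improvement: the underlying LLM is fixed,
# and only the scaffolding (the prompting/control code wrapped around it) is
# rewritten on each iteration. All names here are hypothetical placeholders.

def evaluate(scaffold, llm, benchmark) -> float:
    """Placeholder: run scaffold+LLM on a benchmark and return a score."""
    raise NotImplementedError

def rewrite_scaffold(scaffold, llm):
    """Placeholder: use the current scaffold+LLM to generate improved scaffolding code."""
    raise NotImplementedError

def recursive_scaffold_improvement(initial_scaffold, llm, benchmark, rounds: int = 5):
    scaffold = initial_scaffold
    history = [evaluate(scaffold, llm, benchmark)]
    for _ in range(rounds):
        scaffold = rewrite_scaffold(scaffold, llm)  # same LLM, new scaffolding
        history.append(evaluate(scaffold, llm, benchmark))
    # Whether `history` trends upward (self-improvement), saturates, or degrades
    # depends heavily on the quality of the underlying LLM.
    return scaffold, history
```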
One particularly interesting piece of data there is Figure 4 on page 6, which shows the dependence on the quality of the underlying LLM: the process does not actually work and leads to degradation with GPT-3.5, while the same process successfully self-improves for a few iterations (but then saturates) with GPT-4.
So one might ask how big this self-improvement might be with a better underlying LLM.
Orthogonally to the quality of the underlying LLM, I think it is not too difficult to improve methods for scaffolding generation quite a bit (there are various ways to make them much better than in these papers, even with current LLMs). So one does wonder how soon this becomes a major contributor to take-off speed...
It is probably possible to make some form of scaffolding work; I’m just skeptical that it will be as effective as training an agent directly. Depending on timelines, scaffolding might still feed progress towards superintelligence.