It’s unclear to me that general-purpose search works “out of the box”. To be clear – you could certainly apply it to anything, but I can imagine it being computationally expensive to the point where it’s not what you use in most situations.
With respect to the second point: I think there exists something sufficiently like search that’s just short of general-purpose search (whatever the exact definition is here) that a language model could carry out and still function approximately the same.
Agreed with the first part, but not sure I agree with the second. Could you give an example of something that’s “just short” of general-purpose search which, if an LLM possessed it, would not result in a clear increase in capabilities? I’m thinking you mean something like: GPT-3, upon being fine-tuned on chess, gains an abstract model of the game which it searches over using some simple heuristics to find a good move when fed a board state. That seems like it would function approximately the same, but I’m not sure I would call it “just short” of general-purpose search. It shares some properties with general-purpose search, but the ones it is missing seem pretty darn important.
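To be concrete about the kind of thing I’m picturing, here’s a toy sketch: a depth-limited search over an abstract game model, guided by a simple heuristic evaluation. Everything in it (including the toy game standing in for chess) is made up purely for illustration; I’m not claiming GPT-3 represents anything like this internally.

```python
# Toy sketch only: depth-limited search over an abstract game model,
# guided by a simple heuristic. The tiny game below is a stand-in for chess.

def best_move(model, state, depth):
    """Pick the legal move (for the maximizing player) whose resulting
    position looks best under a cheap heuristic, searching `depth` plies."""
    def value(s, d, maximizing):
        if d == 0 or model.is_terminal(s):
            return model.heuristic(s)            # crude evaluation, not exact play
        scores = [value(model.apply(s, m), d - 1, not maximizing)
                  for m in model.legal_moves(s)]
        return max(scores) if maximizing else min(scores)

    return max(model.legal_moves(state),
               key=lambda m: value(model.apply(state, m), depth - 1, False))


class Nim:
    """Tiny abstract game: take 1-3 stones; whoever takes the last stone wins."""
    def legal_moves(self, state):
        stones, _player = state
        return [n for n in (1, 2, 3) if n <= stones]

    def apply(self, state, move):
        stones, player = state
        return (stones - move, 1 - player)

    def is_terminal(self, state):
        return state[0] == 0

    def heuristic(self, state):
        stones, player = state
        if stones == 0:                      # previous player took the last stone
            return 1 if player == 1 else -1  # scored from player 0's perspective
        return 0                             # no opinion on non-terminal positions


print(best_move(Nim(), (7, 0), depth=4))  # -> 3 (leaves 4 stones, a losing count)
```

The search here only works because the model and the heuristic are hard-coded for this one game, and that missing generality is exactly the property that seems pretty darn important to me.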
So my second point is mostly in response to this part of the OP:
I would be quite impressed if you showed it could do general purpose search.
I guess the argument is something like: we don’t know what general-purpose search would look like as implemented by an LM; it’s possible that an LM does something functionally similar to search that we don’t recognise as search; and it’s possible to get pretty far capability-wise with just bags of heuristics. I’m least confident in the last point, because I think that with more and more varied data the pressure is to move from memorisation to generalisation. I’m not sure where the cutoff is, or if there even is one.
It seems more likely that with more powerful models you get a spectrum from pure heuristics to general-purpose search, with “searchy” things in the middle. As a model moves along this spectrum it gets less use out of its heuristics (they just don’t apply as well) and more out of search, so it expands what it uses search for, and in what ways. At some point it might converge to using search for everything. It’s this latter configuration that I imagine you mean by general-purpose search, and I’m basically gesturing at the searchy things that come before it (which don’t exclusively use search to perform inference).
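As a cartoon of a point in the middle of that spectrum (the names below are invented for illustration, not a claim about any actual model’s internals): answer from a cheap memorised heuristic where it seems to apply, and fall back to explicit, expensive search where it doesn’t.

```python
# Cartoon of a "searchy" midpoint between pure heuristics and
# general-purpose search. Purely illustrative; not a claim about LM internals.

def searchy_policy(state, heuristic_guess, search, threshold=0.9):
    """heuristic_guess(state) -> (answer, confidence); search(state) -> answer."""
    answer, confidence = heuristic_guess(state)
    if confidence >= threshold:
        return answer          # heuristics still apply well here
    return search(state)       # off-distribution: pay the cost of search
```

As the model is pushed further off the heuristics’ home turf, the fallback branch gets taken more and more often, which is the drift towards the general-purpose end that I’m gesturing at.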