So my second point is mostly in response to this part of the OP:
“I would be quite impressed if you showed it could do general purpose search.”
I guess the argument is something like: we don’t know what general purpose search would look like as implemented by an LM + it’s possible that an LM does something functionally similar to search that we don’t recognise as search + it’s possible to get pretty far capability-wise with just bags of heuristics. I think I’m least confident in the last point, because I think that with more, and more varied, data the pressure is to move from memorisation to generalisation. I’m not sure where the cutoff is, or if there even is one.
It seems more likely that with more powerful models you get a spectrum from pure heuristics to general-purpose search, where there are “searchy” things in the middle. As a model moves along this spectrum it gets less use out of its heuristics – they just don’t apply as well – and more and more out of using search, so it expands what it uses search for, and in what ways. At some point, it might converge to just using search for everything. It’s this last configuration that I imagine you mean by general-purpose search, and I’m basically gesturing that there are searchy things that come before it (which do not exclusively use search to perform inference).