I was asked about this on Twitter. Gwern’s essay deserves a fuller response than a comment, but I’m not arguing for the position Gwern argues against.
I don’t argue that agent AIs are not useful or won’t be built. I am not arguing that humans must always be in the loop.
My argument is that tool vs. agent AI is not so much a competition as a specialization. Agent AIs have their uses, but if we consider the “deep learning equation” of turning FLOPs into intelligence, then it’s hard to beat training for predictions on static data. So I do think that while RL can be used for AI agents, the intelligence “heavy lifting” (pun intended) would be done by non-agentic but very large static “tool” models.
Even “hybrid models” like GPT-3.5 can best be understood as consisting of an “intelligence forklift”—the pretrained next-token predictor on which 99.9% of the FLOPs were spent—and an additional light “adapter” that turns this forklift into a useful chatbot, etc.
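The forklift/adapter split can be made concrete with a back-of-the-envelope parameter count. Everything below is an illustrative sketch: the layer sizes and the LoRA-style adapter are my own assumptions for the example, not GPT-3.5’s actual architecture.

```python
# Toy illustration of the "intelligence forklift + adapter" split.
# All sizes are made-up assumptions, not any real model's dimensions.
D_MODEL, VOCAB, N_LAYERS = 1024, 50_000, 24

# Rough parameter count for the frozen pretrained transformer "forklift"
# (attention + MLP weights per layer, plus a token-embedding matrix).
base_params = N_LAYERS * (12 * D_MODEL * D_MODEL) + VOCAB * D_MODEL

# A light LoRA-style adapter: two small trainable matrices per layer.
ADAPTER_RANK = 8
adapter_params = N_LAYERS * 2 * D_MODEL * ADAPTER_RANK

frac = adapter_params / (base_params + adapter_params)
print(f"adapter share of total parameters: {frac:.4%}")
```

Even in this toy accounting, the trainable adapter is roughly a tenth of a percent of the total: essentially all of the capacity (and, at training time, the FLOPs) lives in the frozen predictor.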
https://gwern.net/tool-ai