Never Fight The Last War
Lieutenant Colonel J. L. Schley of the Corps of Engineers wrote:
“There is a tendency in many armies to spend the peace time studying how to fight the last war”
After the Chinchilla paper, everyone focused on the idea that large language models become more effective with scale.
The GPT-3.5 we saw in ChatGPT was scary and got people thinking about how to prevent bigger models from being trained. GPT-4 brought a large capability gain, but if we can trust Sam Altman, most of it came not from an increase in model size but from many separate improvements.
From that perspective, many people ask whether GPT-5 or GPT-6 will be scary and dangerous because they improve along the same axis as the GPT-3.5 -> GPT-4 jump.
Right now, it seems that most of the danger is not in model improvement but in model utilization.
Palantir argues that models are commodities and released an AI product without providing any models that go beyond the existing open-source models. The value Palantir does provide lies in interfaces through which a model can interact with a company’s existing data, and in tools for building multi-step chains of data transformation. It provides a system to manage agents that use models to make decisions.
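To make "multi-step chains of data transformation" concrete, here is a minimal Python sketch of the general pattern. It is not Palantir's API: `stub_model`, `INVOICES`, `Step`, and `run_chain` are all hypothetical names invented for illustration, and the "model" is a trivial stand-in for a real language model call.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Model = Callable[[str], str]
State = Dict[str, object]

def stub_model(prompt: str) -> str:
    # Hypothetical stand-in: a real system would call a language model here.
    return "escalate" if "overdue" in prompt else "ignore"

# Toy "company data" the chain is allowed to read (assumed for illustration).
INVOICES = {
    "inv-1": {"customer": "Acme", "status": "overdue"},
    "inv-2": {"customer": "Globex", "status": "paid"},
}

@dataclass
class Step:
    name: str
    run: Callable[[State, Model], State]

def fetch(state: State, model: Model) -> State:
    # Step 1: pull the record from existing company data.
    return {**state, "invoice": INVOICES[state["invoice_id"]]}

def summarize(state: State, model: Model) -> State:
    # Step 2: turn the structured record into text the model can read.
    inv = state["invoice"]
    return {**state, "prompt": f"Invoice for {inv['customer']} is {inv['status']}."}

def decide(state: State, model: Model) -> State:
    # Step 3: let the model make the decision.
    return {**state, "decision": model(state["prompt"])}

CHAIN = [Step("fetch", fetch), Step("summarize", summarize), Step("decide", decide)]

def run_chain(steps: List[Step], state: State, model: Model) -> Tuple[State, List[str]]:
    # Apply each step in order and record which steps ran.
    trace: List[str] = []
    for step in steps:
        state = step.run(state, model)
        trace.append(step.name)
    return state, trace
```

Calling `run_chain(CHAIN, {"invoice_id": "inv-1"}, stub_model)` walks the record through all three steps. The point of the sketch is that the model is a swappable argument, while the data access and the chaining are where the product-specific work sits.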
I have read people argue that Palantir’s product is worse than GPT-4 and thus should not be seen as an advance in AI, but that’s flawed thinking. Autonomous agents are more dangerous than anything you can do in ChatGPT.
I agree with the principle. I don’t know anything about Palantir’s recent product, but going by your description:
What Palantir seems to have done is get the value from A to B. Naturally, this is much more valuable/dangerous than generically raising the possibility of value. Extending the war metaphor, this isn’t an appeal to plan for the next war; it is an appeal to things that are true in all wars.
I completely agree. This is why a large part of alignment research should be, and sometimes is, predicting new capabilities work.
Aligning AGI successfully requires having a practical alignment approach that applies to the type of AGI that people actually develop. Coming up with new types of AGI that are easier to align is pointless unless you can convince people to actually develop that type of AGI fast enough to make it relevant, or convince everyone to not develop AGI we don’t know how to align. So far, people haven’t succeeded at doing that sort of convincing, and I haven’t even really seen alignment people try to do that practical work.
This is why I’ve switched my focus to language model agents like AutoGPT. They seem like the most likely path to AGI at this point.
AutoGPT is an interesting bare-bones version, but I think the strongest agents will likely be more complex than that.
Palantir developed techniques for letting a model interact with the rest of an enterprise’s data. Understanding how models interact with what Palantir calls an ontology is likely valuable.
That post I linked includes a bunch of thinking about how language model agents will be made into more complex cognitive architectures and thereby more capable.