Some quotes from the latest episode of my podcast, The Inside View. You can access the audio, video and transcript here. The key insight is that we are only seeing the tip of the iceberg with respect to large language model scaling, and that alignment can be seen as an inverse scaling problem.
Alignment as an Inverse Scaling Problem
“All alignment is inverse scaling problems. It’s all downstream inverse scaling problems. All of alignment is stuff that doesn’t improve monotonically as compute, data and parameters increase [...] because sometimes there’s certain things where it improves for a while, but then at a certain point, it gets worse. So interpretability and controllability are the two kind of thought experiment things where you could imagine they get more interpretable and more controllable for a long time until they get superintelligent. At that point, they’re less interpretable and less controllable.”
“Then the hard problem though is measurement and finding out what are the downstream evaluations because say you got some fancy deceptive AI that wants to do a treacherous turn or whatever. How do you even find the downstream evaluations to know whether it’s gonna try to deceive you? Because when I say, it’s all a downstream scaling problem, that assumes you have the downstream test, the downstream thing that you’re evaluating it on. But if it’s some weird deceptive thing, it’s hard to even find what’s the downstream thing to evaluate it on to know whether it’s trying to deceive.”
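To make "stuff that doesn't improve monotonically" concrete, here is a minimal sketch of how one might label a downstream metric measured at increasing scale. It assumes you already have the downstream evaluation, which is exactly the hard part described above. The function name `classify_scaling_trend`, its labels, and the example scores are all hypothetical illustrations, not something from the episode.

```python
from typing import Sequence


def classify_scaling_trend(scores: Sequence[float], tolerance: float = 0.0) -> str:
    """Label a downstream metric measured at increasing scale.

    `scores` are ordered from the smallest to the largest model, and a
    higher score is assumed to be better. `tolerance` is the smallest
    change between neighbouring scales that counts as a real change.
    """
    if len(scores) < 2:
        raise ValueError("need at least two measurements")

    # Change in the metric between each pair of neighbouring scales.
    deltas = [later - earlier for earlier, later in zip(scores, scores[1:])]
    worse = [d < -tolerance for d in deltas]
    better = [d > tolerance for d in deltas]

    if not any(worse):
        return "no_degradation"       # never gets worse with scale
    if not any(better):
        return "inverse_scaling"      # only ever gets worse with scale
    first_worse = worse.index(True)
    if not any(better[first_worse:]):
        return "improves_then_degrades"  # gets better for a while, then worse
    return "mixed"                    # no clean pattern


if __name__ == "__main__":
    # Made-up scores for illustration only.
    print(classify_scaling_trend([0.2, 0.5, 0.7, 0.6, 0.3]))
```

The "improves_then_degrades" label corresponds to the interpretability and controllability thought experiment in the quote: the metric looks fine over the range you measured, and the trouble only shows up past some scale.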
On Private Research at Google and DeepMind
“I know a bunch of people at Google said, yeah, we have language models that are way bigger than GPT-3, but we just don’t put them in papers.”
“The DeepMind language models papers, they were a year old when they finally put them out on arXiv, Gopher and Chinchilla. They had the language model finished training a year before the paper came out.”
On Thinking about the Fastest Path
“You have to be thinking in terms of the fastest path, because there is extremely huge economic and military incentives that are selecting for the fastest path, whether you want it to be that way or not. So, you got to be thinking in terms of, what is the fastest path and then how do you minimize the alignment tax on that fastest path. Because the fastest path is the way it’s probably gonna happen no matter what.”
“The person who wins AGI is whoever has the best funding model for supercomputers. Whoever has the best funding model for supercomputers wins. You have to assume all entities have the nerve, ‘we’re gonna do the biggest training run ever’, but then given that’s your pre-filter, then it’s just whoever has the best funding models for supercomputers.”
On the funding of Large Language Models
“A zillion Googlers have left Google to start large language model startups. There’s literally three large language model startups by ex-Googlers now [1]. OpenAI is a small actor in this now because there’s multiple large language model startups founded by ex-Googlers that all were founded in the last six months. There’s a zillion VCs throwing money at large language model startups right now. The funniest thing, Leo Gao, he’s like: ‘we need more large language model startups because the more startups we have, then it splits up all the funding so no organization can have all the funding to get the really big supercomputer’ [...] they were famous people like the founder of the DeepMind scaling team. Another one is the inventor of the Transformer. Another one was founded by a different person on the Transformer paper. In some ways, they have more clout than like OpenAI had.”
Ethan Caballero on Private Scaling Progress
[1] adept.ai, character.ai, and inflection.ai.