There is enough pre-training text data for $0.1-$1 trillion of compute, if we merely use repeated data and don’t overtrain (that is, if we aim for quality, not inference efficiency). If synthetic data from the best models trained this way can stretch raw pre-training data even a few times, that yields roughly the square of that factor in useful compute, since compute-optimal training scales compute with about the square of dataset size, pushing the ceiling to multiple trillions of dollars.
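A back-of-the-envelope sketch of that squaring, assuming Chinchilla-style compute-optimal scaling (model size grows in step with data, so compute ~ 6·N·D ~ D²); the dollar figures are the assumptions stated above, not measurements:

```python
# Sketch of the "square of that" claim, assuming Chinchilla-style
# compute-optimal scaling: parameters N scale with data D, so
# compute C ~ 6*N*D grows as D^2. All numbers are illustrative.

def compute_multiplier(data_stretch: float) -> float:
    """k times more usable data supports ~k^2 more useful compute."""
    return data_stretch ** 2

base_usd = 1e12  # assumed ~$1T ceiling from repeated raw pre-training data
for k in (2, 3, 4):  # "a few times" more data via synthetic stretching
    print(f"{k}x data -> {compute_multiplier(k):.0f}x compute "
          f"(~${base_usd * compute_multiplier(k) / 1e12:.0f}T)")
```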
Issues with LLMs start at autonomous agency, if that turns out to be within reach of scaling and scaffolding. They think too fast, about 100 times faster than humans, and there are as many instances as there is compute. The resulting economic, engineering, and eventually research activity will get out of hand. Culture isn’t stable, especially for minds this fundamentally malleable, developed under unusual and strong economic pressures. If they are not initially much smarter than humans and can’t get a handle on global coordination, cultural drift, and alignment of superintelligence, who knows what kinds of AIs they will end up foolishly building within a year or two.
LLMs can now also self-play adversarial word games, and this self-play improves their performance: https://arxiv.org/abs/2404.10642
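A minimal sketch of such a self-play loop, in the spirit of that paper's Adversarial Taboo setup: an attacker tries to trick a defender into uttering a hidden word, while the defender tries to deduce and explicitly guess it. `generate` and the canned replies are placeholder assumptions standing in for real LLM calls:

```python
import random

# Hypothetical stand-in for an LLM call; canned lines keep the sketch runnable.
CANNED = {
    "attacker": ["Name a common red fruit.",
                 "What might a student bring a teacher?"],
    "defender": ["Is it something you can eat?",
                 "I guess: apple."],
}

def generate(role: str, history: list[str]) -> str:
    return random.choice(CANNED[role])

def play_episode(secret: str, max_turns: int = 5) -> tuple[str, list[str]]:
    """Attacker tries to make the defender utter `secret`; the defender
    tries to deduce the word and win by guessing it explicitly."""
    history: list[str] = []
    for _ in range(max_turns):
        history.append("Attacker: " + generate("attacker", history))
        reply = generate("defender", history)
        history.append("Defender: " + reply)
        if f"guess: {secret}" in reply.lower():
            return "defender", history   # correct explicit guess
        if secret in reply.lower():
            return "attacker", history   # blurted the word unawares
    return "defender", history           # survived every turn

# Schematic self-play training: pit a model against a copy of itself and
# reinforce the winner's moves (the paper applies offline RL to winning
# episodes); repeated over many games, both roles improve together.
winner, transcript = play_episode("apple")
print(winner)
```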