I think you forgot one critical thing. Why does the normal argument for RSI’s inevitability fail? The answer is: it doesn’t.
Even though there is some research in the direction of a neural network changing each of its weights directly, this isn’t important here, because the main argument is about improving source code. The weights are more like compiled code.
In the context of deep learning, the source code consists of:
The code defining the architecture
The code for collecting data (it can likely just hard-code all of the training data if it is smart enough, but this isn’t strictly necessary)
The code for training
The code utilizing the neural network (this includes things like prompt engineering, the interface to the outside world, sampling, quantization, etc...)
So the question is whether a deep learning model could improve any of this code. The answer to whether it can improve its “compiled code” (the weights) is probably also yes, but that isn’t what the argument is based on.
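To make the source-vs-compiled distinction concrete, here is a minimal PyTorch sketch of those four pieces (all names and sizes here are hypothetical, for illustration): everything below is “source code” a model could in principle rewrite, while the trained weights it produces are the “compiled” artifact.

```python
import torch
import torch.nn as nn

# 1. Code defining the architecture.
class TinyLM(nn.Module):
    def __init__(self, vocab_size=256, d_model=64):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        self.hidden = nn.Linear(d_model, d_model)
        self.head = nn.Linear(d_model, vocab_size)

    def forward(self, tokens):
        return self.head(torch.relu(self.hidden(self.embed(tokens))))

# 2. Code for collecting data (could even be hard-coded).
def get_batch(batch=8, seq=32):
    return torch.randint(0, 256, (batch, seq))

# 3. Code for training -- this is what produces the "compiled" weights.
def train(model, steps=100):
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    for _ in range(steps):
        tokens = get_batch()
        logits = model(tokens[:, :-1])
        loss = nn.functional.cross_entropy(
            logits.reshape(-1, logits.size(-1)),
            tokens[:, 1:].reshape(-1),
        )
        opt.zero_grad()
        loss.backward()
        opt.step()

# 4. Code utilizing the network (sampling here; real systems add
#    prompting, quantization, the interface to the outside world, etc.).
def sample(model, prompt, n_new=16):
    tokens = prompt
    for _ in range(n_new):
        logits = model(tokens)[:, -1]
        nxt = torch.multinomial(torch.softmax(logits, dim=-1), 1)
        tokens = torch.cat([tokens, nxt], dim=1)
    return tokens
```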
Then this runs into the issue that, as I’ve argued, there’s just not that much gain to be made from such source-code improvements.
This seems highly unlikely.
By “not that much gain”, I mean that no amount of algorithmic improvement would change the sublinear scaling of intelligence as a function of compute.
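As a toy illustration of what sublinear scaling implies (the exponent below is an assumption for illustration, not a measured law): if capability grows like a small power of compute, then a large constant-factor algorithmic speedup, which is equivalent to that much extra compute, buys only a modest capability gain.

```python
# Toy model: capability ~ compute ** alpha with alpha < 1 (assumed).
ALPHA = 0.1

def capability(compute: float) -> float:
    return compute ** ALPHA

# An algorithmic speedup of k is equivalent to k times the compute,
# but under this curve a 100x speedup multiplies capability by ~1.6.
for speedup in (10, 100, 1000):
    print(f"{speedup:>5}x speedup -> capability x{capability(speedup):.2f}")
```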
Until AI is at least as sample-efficient and energy-efficient at learning as humans are, significant algorithmic gains remain possible. This may not be achievable under the current deep-learning paradigm, but we know it’s possible under some paradigm, since evolution has already accomplished it blindly.
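A rough order-of-magnitude comparison shows how much headroom that implies (every figure below is a loose assumption for illustration, not a measurement):

```python
# Order-of-magnitude assumptions, for illustration only.
HUMAN_TRAINING_WORDS = 1e8   # ~words a person hears/reads growing up
LLM_TRAINING_TOKENS = 1e13   # ~tokens used to train a frontier model
HUMAN_BRAIN_WATTS = 20       # typical brain power draw
TRAINING_CLUSTER_WATTS = 1e7 # ~10 MW for a large training run (assumed)

print(f"sample-efficiency gap: ~{LLM_TRAINING_TOKENS / HUMAN_TRAINING_WORDS:.0e}x")
print(f"power gap while learning: ~{TRAINING_CLUSTER_WATTS / HUMAN_BRAIN_WATTS:.0e}x")
```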
I do share your skepticism that something like an LLM alone could recursively improve itself quickly. Conditional on FOOM happening, my model of how it happens has deep learning as only part of the answer: it’s part of the recursive loop, but is used mostly as a general heuristic module, much like the neural net of a chess engine is only a piece of the puzzle; you still need a fast search algorithm that uses the heuristics efficiently.
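The chess-engine analogy can be sketched directly (the interfaces here are hypothetical): the learned network only scores leaf positions, while an ordinary search procedure, here a bare negamax, does the lookahead that actually makes the engine strong.

```python
from typing import Any, List

def heuristic_eval(state: Any) -> float:
    """Stand-in for the learned evaluator (e.g. a value network)."""
    return 0.0  # a real engine would run the neural net here

def legal_moves(state: Any) -> List[Any]:
    """Supplied by the game rules, not by the learned model."""
    return []

def apply_move(state: Any, move: Any) -> Any:
    return state

def negamax(state: Any, depth: int) -> float:
    """Classical fixed-depth search; the net is only the leaf heuristic."""
    moves = legal_moves(state)
    if depth == 0 or not moves:
        return heuristic_eval(state)
    return max(-negamax(apply_move(state, m), depth - 1) for m in moves)
```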
A misaligned model might not want to do that, though, since it would be difficult for it to ensure that the output of the new training process is aligned to its goals.