The steelman of this argument would be something like “experimentally, we find that investigators who take experimental approaches tend to do better than those who take theoretical approaches”. But first, this isn’t obviously true… mathematicians, for instance, have found theoretical approaches to be more powerful. (I’d guess that the developer of Bitcoin took a theoretical rather than an empirical approach to creating a secure cryptocurrency, for instance.)
This example actually proves the opposite. Bitcoin was described in a white paper that wasn’t very impressive by academic crypto standards—few, if any, became interested in Bitcoin from first reading the paper in the early days. Its success was proven by experimentation, not pure theoretical investigation.
My impression is that MIRI thinks most possible AGI architectures wouldn’t meet its standards for safety, so given that their ideal architecture is so safety-constrained, they’re focused on developing the safety stuff first before working on constructing thought models etc. This seems like a pretty reasonable approach for an organization with limited resources, if it is in fact MIRI’s approach. But I could believe that value could be added by looking at lots of budding AGI architectures and trying to figure out how one might make them safer on the margin.
It’s hard to investigate safety if one doesn’t know the general shape that AGI will finally take. MIRI has focused on a narrow subset of AGI space—namely transparent math/logic-based AGI. Unfortunately, it is becoming increasingly clear that the Connectionists were more or less absolutely right in just about every respect. AGI will likely take the form of massive brain-like general-purpose ANNs. Most of MIRI’s research thus doesn’t even apply to the most likely AGI candidate architecture.
if intelligence is a complicated, heterogeneous process where computation is spread relatively evenly among many modules, then improving the performance of an AGI gets tougher, because upgrading an individual module does little to improve the performance of the system as a whole.
I’m guessing this is likely to be true of general-purpose ANNs, meaning recursive self-improvement would be more difficult for a brain-like ANN than it might be for some other sort of AI? (This would be somewhat reassuring if it were true.)
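To put a rough number on the quoted intuition, here is a minimal back-of-the-envelope version of it (my own illustration, essentially Amdahl’s law; the module count and speedup factor below are arbitrary assumptions, not figures from the essay). If computation is spread evenly across $N$ modules and a single module is sped up by a factor $k$, the whole-system speedup is

$$S \;=\; \frac{1}{\left(1-\tfrac{1}{N}\right)+\tfrac{1}{Nk}}.$$

With $N=20$ modules, even an unboundedly large improvement to one module ($k\to\infty$) caps the overall speedup at $20/19 \approx 1.05$, i.e. about 5%, which is the sense in which per-module upgrades do little for the system as a whole.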
meaning recursive self-improvement would be more difficult for a brain-like ANN than it might be for some other sort of AI?
It’s not clear that there is any other route to AGI—all routes lead to “brain-like ANNs”, regardless of what linguistic label we use (graphical models, etc).
General purpose RL—in ideal/optimal theoretical form—already implements recursive self-improvement in the ideal way. If you have an ideal/optimal general RL system running, then there are no remaining insights you could possibly have which could further improve its own learning ability.
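For readers who want the formalism behind this claim, here is a minimal sketch of what “ideal/optimal general RL” standardly refers to (Bayes-optimal agents in the AIXI/Bayes-adaptive tradition; the notation below is my own shorthand, not something from this thread). The agent keeps a Bayesian mixture over a class of candidate environments $\mathcal{M}$ and acts to maximize expected return under that mixture:

$$\xi(\cdot) \;=\; \sum_{\nu\in\mathcal{M}} w_\nu\,\nu(\cdot), \qquad a_t \;=\; \arg\max_{\pi}\; \mathbb{E}_{\xi}\!\left[\sum_{k=t}^{t+m} r_k \,\Big|\, \pi,\, h_{<t}\right].$$

Because the expectation is taken under the full posterior mixture, the value of exploring, gathering information, and sharpening its own predictions is already priced into every action choice; in that idealized sense there is no separate “learning algorithm” left over to improve, which is the point being made here.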
The evidence is accumulating that general Bayesian RL can be efficiently approximated, that real brains implement something like this, and that very powerful general purpose AI/AGI can be built on the same principles.
Now, I do realize that by “recursive self-improvement” you probably mean a human-level AGI consciously improving its own ‘software design’, using slow rule-based/logic thinking of the type suitable for linguistic communication. But there is no reason to suspect that the optimal computational form of self-improvement should actually be subject to those constraints.
The other, perhaps more charitable view of “recursive self-improvement” is the more general idea of the point in time when AGI engineers/researchers take over most of the future AGI engineering/research work. Coming up with new learning algorithms will probably be only a small part of the improvement work at that point. Implementations, however, can always be improved, and there is essentially an infinite space of better hardware designs. Coming up with new model architectures and training environments will also have scope for improvement.
Also, it doesn’t really appear to matter much how many modules the AGI has, because improvement doesn’t rely much on human insights into how each module works. Even with zero new ‘theoretical’ insights, you can just run the AGI on better hardware and it will be able to think faster or split into more copies. Either way, it will be able to speed up the rate at which it soaks up knowledge and automatically rewires itself (self-improves).
This example actually proves the opposite. Bitcoin was described in a white paper that wasn’t very impressive by academic crypto standards—few, if any, became interested in Bitcoin from first reading the paper in the early days. Its success was proven by experimentation, not pure theoretical investigation.
By experimentation, do you mean people running randomized controlled trials on Bitcoin or otherwise empirically testing hypotheses on the software? Just because your approach is collaborative and incremental doesn’t mean that it’s empirical.
By experimentation, do you mean people running randomized controlled trials on Bitcoin or otherwise empirically testing hypotheses on the software?
Not really—by experimentation I meant proving a concept by implementing it and then observing whether the implementation works or not, as contrasted to the pure math/theory approach where you attempt to prove something abstractly on paper.
For context, I was responding to your statement:
But first, this isn’t obviously true… mathematicians, for instance, have found theoretical approaches to be more powerful. (I’d guess that the developer of Bitcoin took a theoretical rather than an empirical approach to creating a secure cryptocurrency, for instance.)
Bitcoin is an example of typical technological development, which is driven largely by experimentation/engineering rather than math/theory. Theory is important mainly as a means to generate ideas for experimentation.