It is the very same rationale that stands behind assumptions like “why Stockfish won’t execute a losing set of moves”: it is just that good at chess. Or better: it is just that smart when it comes to chess.
In this thought experiment the way to go is not to say “I see that the AGI could likely fail at this step, therefore it will fail” but to keep thinking and inventing better moves for the AGI to execute, moves which won’t be countered as easily. It is an important part of the “security mindset” and probably a major reason why Eliezer speaks about a lack of pessimism in the field.
There are diminishing returns to thinking about moves versus performing the moves and seeing the results that the physics of the universe imposes on them as a consequence.
Think of it like AlphaGo—if it only ever could train itself by playing Go against actual humans, it would never have become superintelligent at Go. Manufacturing is like that—you have to play with the actual world to understand bottlenecks and challenges, not a hypothetical artificially created simulation of the world. That imposes rate-of-scaling limits that are currently being discounted.
Think of it like AlphaGo—if it only ever could train itself by playing Go against actual humans, it would never have become superintelligent at Go.
This is obviously untrue in both the model-free and model-based RL senses. There are something like 30 million human Go players who can play a game in two hours. AlphaGo was trained by policy gradients on, as it happens, on the order of 30m games; so it could accumulate a similar order of games in under a day; the subset of pro games can be upweighted to provide most of the signal, and when they stop providing signal, well then, it must have reached superhuman… (For perspective, the 0.5m or whatever professional games used to imitation-train AG came from a single Go server, which was not the most popular, and that’s why AlphaGo Master ran its pro matches on a different, larger server.) Do this for a few days or weeks, and you will likely have exactly that, in a good deal less time than ‘never’, which is a rather long time.
More relevantly, because you’re not making a claim about the AG architecture specifically but about all learning agents in general: with no exploration, MuZero can bootstrap its model-based self-play from somewhere in the neighborhood of hundreds or thousands of ‘real’ games (which should not be a surprise, as the rules of Go are simple), and achieves superhuman gameplay easily by self-play inside the learned model, with little need for any good human opponents at all; even if that estimate is 3 orders of magnitude off, it’s still within a day’s worth of human gameplay. Or consider meta-learning sim2real systems like Dactyl, which are trained exclusively in silico on unrealistic simulations and adapt within seconds to reality.
So it fails either way. The sample-inefficiency of DL robotics, DL, or R&D is more a fact about our compute-poverty than about the inherent necessity of interacting with the real world (an interaction which is highly parallelizable, learnable offline, and far smaller than what existing methods use).
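As a rough sanity check on the game-count arithmetic in this comment, here is a minimal back-of-envelope sketch. The player count, game length, and 30m-game target are taken from the comment; the `participation` knob, the two-humans-per-game assumption, and the pick of 1,000 real games for the MuZero case are my own illustrative assumptions, not anything the comment states.

```python
# Back-of-envelope check of the game-count arithmetic in the comment above.
# Figures marked "from the comment" are quoted; everything else is an assumption.

PLAYERS = 30_000_000       # from the comment: "something like 30 million human Go players"
HOURS_PER_GAME = 2.0       # from the comment: "a game in two hours"
TARGET_GAMES = 30_000_000  # from the comment: "on the order of 30m games"


def hours_to_collect(target_games, participation=1.0):
    """Wall-clock hours for the player pool to finish `target_games` games.

    `participation` is a hypothetical knob (not in the comment): the fraction
    of the pool that is playing at any given moment. Each game is assumed to
    occupy two humans, which is the conservative case.
    """
    concurrent_games = PLAYERS * participation / 2
    games_per_hour = concurrent_games / HOURS_PER_GAME
    return target_games / games_per_hour


everyone = hours_to_collect(TARGET_GAMES)       # whole pool playing nonstop
quarter = hours_to_collect(TARGET_GAMES, 0.25)  # a quarter of the pool playing

# MuZero-style case: assume 1,000 real games (a pick from "hundreds/thousands"),
# then inflate by the comment's "3 orders of magnitude" margin.
muzero_games = 1_000 * 1_000
muzero_hours = hours_to_collect(muzero_games)

print(f"30m games, full pool:    {everyone:.1f} h")
print(f"30m games, quarter pool: {quarter:.1f} h")
print(f"1m games, full pool:     {muzero_hours:.2f} h")
```

Under these assumptions the 30m-game target takes about 4 hours with the whole pool playing and 16 hours with a quarter of it, and the inflated MuZero figure takes well under an hour, so the “under a day” claims hold with room to spare. Real participation would of course be far lower, which is why the comment allows days or weeks.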
A lack of clarity when I think about these limits makes it hard for me to see how the end result would change if we could somehow “stop discounting” them.
It seems to me that we will have to be much more elaborate in describing the parameters of this thought experiment. In particular, we will have to agree on the deeds and real-world achievements that the hypothetical AI has, so that we both agree to call it AGI (say it writes an interesting story and makes the illustrations, so that this particular research team now has a new revenue stream from selling it online: would this make the AI an AGI?). And on the security conditions (an air-gapped server room?). This will get us closer to understanding “the rationale”.
But then your question is not about AGI but about “superintelligent AI”, so we will have to do the elaborate describing again with new parameters. And that is what I expect Eliezer (alone and with other people) has done a lot. And look what it did to him (this is a joke, but at the same time it is not). So I will not be an active participant further.
It is not even about a single SAI in some box: competing teams, people running copies (legal and not) and changing code, corporate espionage, dirty code...