What counts as self-improvement in the scenario governed by data?
You can grab the whole internet, including Sci-Hub and Library Genesis, and then maybe hack all “smart” appliances worldwide… and after that I guess you need to construct some machines that will perform experiments for you.
But none of this improves the machine’s “self”. With algorithms, the idea is that the machine would replace its own algorithms with better ones, once it gets the ability to invent and evaluate algorithms. With hardware, the idea is that the machine would replace its own hardware with faster hardware, once it gets the ability to design and produce hardware. But replacing your data with better data, that… we usually don’t call self-improvement.
Also, what kind of data are we talking about? Data about the real world has to come from the outside, by definition. (Unless it is data about physics that you can obtain by observing the physical properties of your own circuits, or something like that.) But there is also data in the sense of precomputed cached results, like playing zillions of games of chess against yourself and remembering which strategies were most successful. If this were the limiting factor… I guess it would be something like a bounded AIXI which hypothetically already has enough hardware to simulate a universe, and only needs to make zillions of computations to find the one that is consistent with the observed data.
In the scenario governed by data, the part that counts as self-improvement is where the AI puts itself through a process of optimisation by stochastic gradient descent with respect to that data.
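To make "optimisation by stochastic gradient descent with respect to that data" concrete, here is a minimal sketch on a toy problem. The function name `sgd_fit`, the linear model, the synthetic data and the hyperparameters are all assumptions chosen for illustration; nothing here is meant as how a real system would do it.

```python
import random

def sgd_fit(data, steps=2000, lr=0.05, seed=0):
    """Fit y ~ w*x + b by stochastic gradient descent on squared error.

    Hypothetical toy example: the 'self' being improved is just (w, b).
    """
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    for _ in range(steps):
        x, y = rng.choice(data)   # sample one example per step
        err = (w * x + b) - y     # prediction error on that example
        w -= lr * err * x         # gradient of 0.5 * err**2 w.r.t. w
        b -= lr * err             # gradient of 0.5 * err**2 w.r.t. b
    return w, b

# Synthetic, noise-free data from y = 3x + 1 (an assumption for the demo).
data = [(k / 10.0, 3.0 * (k / 10.0) + 1.0) for k in range(-10, 11)]
w, b = sgd_fit(data)
```

The only thing that changes during the loop is the parameters, and the only thing driving the change is the data, which is the sense in which the data is what the system improves itself on.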
You don’t need that much hardware for data to be a bottleneck. For example, I think that there are plenty of economically valuable tasks that are easier to learn than StarCraft. But we get StarCraft AIs instead because games are the only kind of task where we can generate arbitrarily large amounts of data.
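As a rough illustration of why games are special here, the sketch below generates labelled training data from nothing but the rules of a game. It uses random self-play at tic-tac-toe as a stand-in (obviously not StarCraft); the function names and the choice of game are assumptions for the example.

```python
import random

WIN_LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
             (0, 3, 6), (1, 4, 7), (2, 5, 8),
             (0, 4, 8), (2, 4, 6)]

def winner(board):
    """Return 'X' or 'O' if that player has three in a row, else None."""
    for a, b, c in WIN_LINES:
        if board[a] != "." and board[a] == board[b] == board[c]:
            return board[a]
    return None

def self_play_game(rng):
    """Play one game with uniformly random moves; return (positions, result)."""
    board = ["."] * 9
    positions = []
    player = "X"
    while True:
        moves = [i for i, cell in enumerate(board) if cell == "."]
        if not moves:
            return positions, "draw"
        board[rng.choice(moves)] = player
        positions.append("".join(board))
        won = winner(board)
        if won is not None:
            return positions, won
        player = "O" if player == "X" else "X"

def generate_dataset(n_games, seed=0):
    """Label every position reached with the final result of its game."""
    rng = random.Random(seed)
    dataset = []
    for _ in range(n_games):
        positions, result = self_play_game(rng)
        dataset.extend((p, result) for p in positions)
    return dataset

dataset = generate_dataset(100)
```

Raising `n_games` costs only compute, with no need to observe the outside world. For most economically valuable tasks there is no equivalent of `self_play_game`.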