So it can then “trust” the 12kg agent, hand its 10kg of empirical data over, and shut itself down, then “come back online” as the new 12kg agent and learn from the remaining 8kg of data (the other 2kg having been absorbed into the 12kg design), ending up a smarter, self-improved agent.
I think this process is pointless: if the 10kg+10kg agent can faithfully reason about, and prove things about, the 12kg agent, then it is already more powerful than the 12kg agent, and so there's no reason to upgrade to heavier agents. Indeed, if the 10kg agent already believes its empirical knowledge strongly enough to trust derivations made with it, then it has in effect already upgraded to a 20kg agent; the upgrade phase was just hidden behind the phrase ‘through learning’.
You can already see this in Bayesian updating: if you have prior knowledge constraining a parameter to an interval 1⁄10 long, and you acquire empirical data that ends up constraining it to an interval 1⁄20 long, you are in exactly the same spot you would have been in had you started from prior knowledge constraining the parameter to that 1⁄20 interval.
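To make that concrete, here is a toy discretized check (the specific endpoints are made up purely for illustration, and the data is idealized as an interval constraint): updating a uniform prior on an interval of length 1⁄10 with evidence that pins the parameter into a sub-interval of length 1⁄20 gives exactly the posterior you would have had by starting from a uniform prior on that sub-interval.

```python
# Toy sketch: the endpoints and grid are arbitrary choices for illustration.
import numpy as np

theta = np.linspace(0.0, 1.0, 10_001)                 # discretized parameter grid

def uniform_on(lo, hi):
    """Normalized uniform distribution over grid points in [lo, hi]."""
    p = ((theta >= lo) & (theta <= hi)).astype(float)
    return p / p.sum()

prior_wide = uniform_on(0.30, 0.40)                   # prior: interval of length 1/10
likelihood = ((theta >= 0.32) & (theta <= 0.37)).astype(float)  # data: length 1/20

posterior = prior_wide * likelihood                   # Bayes' rule, then normalize
posterior /= posterior.sum()

prior_narrow = uniform_on(0.32, 0.37)                 # start directly from the 1/20 interval

print(np.allclose(posterior, prior_narrow))           # True: same distribution either way
```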
I think this process is pointless: if the 10kg+10kg agent can faithfully reason about, and prove things about, the 12kg agent, then it is already more powerful than the 12kg agent, and so there's no reason to upgrade to heavier agents.
Within the metaphor, that is true, but the metaphor of mass is very leaky here. The original 10kg agent, no matter how much empirical data it takes in, can only learn at a certain rate (at least if it works like current AI designs), because its compression algorithm is only so good. It is limited in how fast it can learn from data, in how well it can generalize, and in its ability to prove things as logical theorems rather than merely believe them as empirically supported.
That is why it wants to build the 12kg agent in the first place: the “larger” agent can compress data more efficiently, and so needs a lower sample complexity to reach further correct conclusions (on top of the existing 8kg of empirical data).
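A toy way to see the compression/sample-complexity link (a made-up setup, not anything in the scenario above): a hypothesis class with a strong, correct inductive bias, the analogue of a better compression algorithm, reaches low generalization error from far fewer samples than an over-flexible one.

```python
# Toy illustration: a well-matched (strongly compressed) model class vs. an
# over-flexible one, compared on how much data each needs to generalize well.
import numpy as np

rng = np.random.default_rng(0)

def true_fn(x):
    return 2.0 * x - 1.0                       # the "world" happens to be linear

def avg_test_mse(degree, n_samples, trials=100):
    """Average test error of a degree-`degree` polynomial fit to noisy samples."""
    x_test = np.linspace(-1, 1, 200)
    errs = []
    for _ in range(trials):
        x_tr = rng.uniform(-1, 1, n_samples)
        y_tr = true_fn(x_tr) + rng.normal(0, 0.1, n_samples)
        coeffs = np.polyfit(x_tr, y_tr, degree)
        errs.append(np.mean((np.polyval(coeffs, x_test) - true_fn(x_test)) ** 2))
    return float(np.mean(errs))

for n in (10, 30, 300):
    print(f"n={n:4d}  degree-1 MSE={avg_test_mse(1, n):.4f}  "
          f"degree-7 MSE={avg_test_mse(7, n):.4f}")
```

The constrained class should pin the truth down from a handful of samples, while the flexible class needs considerably more data before it stops fitting noise.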
I can think of other ways for AI to work (and am pursuing those, which include building increasingly sophisticated little Prisoners’ Dilemma bots that achieve reliable cooperation in the one-shot game). But even those, if we wanted them to self-improve, would eventually have to reason about smaller versions of themselves, and that leaves us right back at algorithmic information theory, trying to avoid the paradox theorems that arise when you reason about something about as complex as yourself.
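For flavour, here is the simplest member of that family (a minimal sketch, not the actual bots under construction): a program-equilibrium “clique bot” that cooperates in the one-shot game exactly when the opponent’s source code matches its own.

```python
# Minimal sketch of a source-matching cooperator; not the author's actual bots.
import inspect

def clique_bot(opponent):
    """Cooperate iff the opponent's source text is identical to this bot's."""
    if inspect.getsource(opponent) == inspect.getsource(clique_bot):
        return "C"
    return "D"

def defect_bot(opponent):
    return "D"

def play(bot_a, bot_b):
    """One-shot Prisoners' Dilemma: each bot gets to inspect the other."""
    return bot_a(bot_b), bot_b(bot_a)

if __name__ == "__main__":
    print(play(clique_bot, clique_bot))  # ('C', 'C'): mutual cooperation
    print(play(clique_bot, defect_bot))  # ('D', 'D'): no exploitation
```

This construction is brittle (any textual difference breaks the match), which is why the more sophisticated versions reason about the opponent’s behaviour, for instance via provability, rather than its exact source text.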