Turing machines and Carnot engines are abstractions that manifestly can apply to real systems.
But just because something is an abstraction doesn’t mean its properties apply to anything.
Consider a new abstraction called a “Quring Machine”. It is like a Turing machine, but for any starting tape that it gets, it sends that tape off to a planet where there is lots of primordial soup, then waits for the planet to evolve lifeforms which discover the tape and then invent a Macintosh-Plus-Equivalent computer and then write a version of the original tape that runs on that computer, and then the Quring Machine outputs the symbols that come from that alien Mac Plus. If the first planet fails to evolve appropriately, it tries another one, and keeps trying until the right response comes back.
Now, is that worth studying?
Reinforcement Learning, when assumed as the control mechanism for a macroscopic intelligent system, contains exactly the same sort of ridiculous mechanism as the Quring Machine. (Global RL requires staggering amounts of computation and long wait times to accumulate enough experience for the competing policies to gather enough data for meaningful calculations about their relationship to rewards.)
> Turing machines and Carnot engines are abstractions that manifestly can apply to real systems.
> But just because something is an abstraction doesn’t mean its properties apply to anything.
Agreed with both points, but I’m unclear on whether or not you still endorse the claim that we can’t get transfer from formalisms that do not actually work.
> Global RL requires staggering amounts of computation and long wait times, to accumulate enough experience for the competing policies to develop enough data for meaningful calculations about their relationship to rewards
I agree with this, and I also agree with your earlier point that most of the work in modern ML systems that have an “RL core” is done by the non-core parts, which handle the interesting pieces.
But it’s still not clear to me why this makes you think that RL, because it’s not a complete solution, won’t be a part of whatever the complete solution ends up being. I don’t think you could run a human on just reinforcement learning, as it seems likely that some other things are going on (like brain regions that seem hardwired to learn a particular thing), but I would also be surprised by a claim that no reinforcement learning is going on in humans.
Or maybe to put this a different way, I think there are probably problems inherent in all motivation systems, which you see with utility maximization and reinforcement learning and others. If we figure out a way to get around one of those problems in one system (say, by finding a correction to a utility function that makes it corrigible), I also suspect that the solution will suggest equivalent solutions for other motivation mechanisms. (That is, given a utility function correction, it’s probably easy to come up with a reinforcement learning update correction.)
This makes me mostly uninterested in the reasons why particular motivation mechanisms are intractable or unlikely to be what we actually use, unless those reasons are also reasons to expect that any solutions designed for that mechanism will be difficult to transfer to other mechanisms.
I have been struggling to find a way to respond, here.
When discussing this, we have to be really careful not to slip back and forth between “global RL”, in the sense that the whole system learns through RL, and “micro-RL”, where bits of the system use something like RL. I do keep trying to emphasize that I have no problem with the latter, if it proves feasible. I would never “claim that no reinforcement learning is going on in humans” because, quite the contrary, I believe it really IS going on there.
So where does that leave my essay, and this discussion? Well, a few things are important.
1 -- The incredible minuteness of the feasible types of RL must be kept in mind. In pure form, it explodes or becomes infeasible if the micro-domain gets above the reflex (or insect) level.
2 -- We need to remember that plain old “adaptation” is not RL. So, is there an adaptation mechanism that builds (e.g.) low-level feature detectors in the visual system? I bet there is. Does it work by trying to optimize a single parameter? Maybe. Should we call that parameter a “reward” signal? Well, I guess we could. But it is equally possible that such mechanisms are simultaneously optimizing a few parameters, not just one. And it is also just as likely that such mechanisms are following rules that cannot be shoehorned into the architecture of RL (there being many other kinds of adaptation). Where am I going with this? Well, why would we care to distinguish the “RL style of adaptation mechanism” from other kinds of adaptation, down at that level? Why make a special distinction? When you think about it, those micro-RL mechanisms are boring and unremarkable ... RL only becomes worth remarking on IF it is the explanation for intelligence as a whole. The behaviorists thought they were the Isaac Newtons of psychology, because they thought that something like RL could explain everything. And it is only when RL is proposed at that global level that it has dramatic significance, because then you could imagine an RL-controlled AI building and amplifying its own intelligence without programmer intervention.
3 -- Most importantly, if there do exist some “micro-RL” mechanisms somewhere in an intelligence, at very low levels where RL is feasible, those instances do not cause any of their properties to bleed upward to higher levels. This is the same as a really old saw: just because computers do all their basic computation in binary, that does not mean that the highest levels of the computer must use binary numbers. Sometimes you say things that sort of imply that because RL could exist somewhere, we could therefore learn “maybe something” from those mechanisms when it comes to other, higher aspects of the system. That really, really does not follow, and it is a dangerous mistake to make.
So, at the end of the day, my essay was targeting the use of the RL idea ONLY in those cases where it was assumed to be global. All other appearances of something RL-like just do not have any impact on arguments about AI motivation and goals.
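To put a rough number on point 1 above, here is a minimal sketch. It assumes a purely tabular learner with no generalization across states, which is a simplification and not a description of any particular system. The function names and the settings (4 values per feature, 3 actions, 10 visits per entry) are hypothetical and chosen only for illustration. It counts how many (state, action) entries such a learner would have to try as the number of features in its micro-domain grows.

```python
# Illustrative only: counts entries in a hypothetical tabular RL value table.
# Assumes no generalization across states, which real systems may relax.

def table_entries(num_features: int, values_per_feature: int, num_actions: int) -> int:
    """Number of (state, action) pairs when every feature combination is a distinct state."""
    num_states = values_per_feature ** num_features
    return num_states * num_actions


def min_experience(num_features: int, values_per_feature: int,
                   num_actions: int, visits_per_entry: int) -> int:
    """Lower bound on trials if each pair must be tried visits_per_entry times."""
    return table_entries(num_features, values_per_feature, num_actions) * visits_per_entry


if __name__ == "__main__":
    # Hypothetical settings: 4 values per feature, 3 actions, 10 visits per entry.
    for n in (2, 5, 10, 20, 40):
        print(f"{n:>2} features -> at least {min_experience(n, 4, 3, 10):,} trials")
```

With these made-up settings the bound grows by a factor of 4 for every added feature, which is one way to read the claim that pure RL stays feasible only in very small domains. Function approximation changes the arithmetic, so this says nothing about RL combined with other machinery.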