Wait wait wait. You didn’t head to the dinner, drink some fine wine, and start raucously debating the same issue over again?
Bah, humbug!
Also, how do I get invited to these conferences again? ;-)
It is a scandalously unjustified assumption, made very hard to attack by the fact that it is repeated so frequently that everyone believes it to be true just because everyone else believes it.
Very true, at least regarding AI. Personally, my theory is that the brain does do reinforcement learning, but the “reward function” isn’t a VNM-rational utility function; it’s just something the body signals to the brain to say, “Hey, that world-state was great!” I can’t imagine that Nature used something “mathematically coherent”, but I can imagine it used something flagrantly incoherent but really dead simple to implement. Like, for instance, the amount of some chemical or another coming in from the body, to indicate satiety, or relaxation after physical exertion, or orgasm, or something like that.
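If it helps to make that concrete, here is a deliberately crude sketch (the signal names and weights are pure invention on my part) of the kind of “flagrantly incoherent but dead simple” reward the body might hand the brain:

```python
# Purely illustrative: reward as a raw readout of current body chemistry,
# summed with arbitrary gains. Nothing here enforces consistency,
# transitivity, or any other VNM-style coherence across contexts.

def reward_signal(satiety: float, post_exertion_relaxation: float,
                  endorphin_spike: float) -> float:
    # Whatever chemical signals happen to arrive from the body, weighted
    # by whatever gains evolution happened to settle on.
    return 0.5 * satiety + 0.3 * post_exertion_relaxation + 1.0 * endorphin_spike

print(reward_signal(satiety=0.8, post_exertion_relaxation=0.2, endorphin_spike=0.0))
```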
Hey, ya pays yer money and ya walks in the front door :-) AGI conferences run about $400 a ticket, I think. Plus the airfare to Berlin (there’s one happening in a couple of weeks, so get your skates on).
Re the possibility that the human system does do reinforcement learning … fact is, if one frames the meaning of RL in a sufficiently loose way, the human cogsys absolutely DOES do RL, no doubt about it. Just as you described above.
But if you sit down and analyze what it means to claim that a system uses RL, it turns out that there is a world of difference between the following two positions:
1. The system CAN BE DESCRIBED in such a way that there is reinforcement of actions/internal constructs that lead to positive outcomes in some way;
and
2. The system is controlled by a mechanism that explicitly represents (A) actions/internal constructs, (B) outcomes or expected outcomes, and (C) scalar linkages between the A and B entities … and behavior is completely dominated by a mechanism that browses the A, B, and C entries in such a way as to modify one of the C linkages according to the co-occurrence of a B with an A.
The difference is that the second case turns the descriptive mechanism into an explicit, causally operative mechanism; the sketch below shows what that stronger claim looks like in practice.
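For concreteness, here is a minimal sketch (the action names and outcome function are invented for illustration; this is nobody’s actual model) of what the second position commits you to, with (A), (B), and (C) as literal data structures and the linkage update as the loop that dominates behavior:

```python
import random

# Hypothetical toy: the *explicit* version of the RL claim. The system
# literally stores (A) actions, receives (B) an outcome signal, and keeps
# (C) scalar linkages, and behavior is driven by browsing C.

actions = ["press_lever", "groom", "explore"]   # (A) explicit action repertoire
linkage = {a: 0.0 for a in actions}             # (C) scalar action-outcome linkages

LEARNING_RATE = 0.1
EXPLORE_PROB = 0.1

def outcome(action: str) -> float:
    """(B) outcome signal from a made-up environment that favors one action."""
    return 1.0 if action == "press_lever" else random.uniform(-0.2, 0.2)

def choose_action() -> str:
    """Behavior 'completely dominated' by the C linkages, plus a little noise."""
    if random.random() < EXPLORE_PROB:
        return random.choice(actions)
    return max(actions, key=linkage.get)

for _ in range(500):
    a = choose_action()
    b = outcome(a)
    # The defining move: modify a C linkage on the co-occurrence of a B with an A.
    linkage[a] += LEARNING_RATE * (b - linkage[a])

print(linkage)  # 'press_lever' ends up with by far the strongest linkage
```

The first position only requires that some post-hoc description along these lines fits the system’s behavior; the second requires that a loop like this is literally what runs.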
It’s like Ptolemy’s epicycle model of the solar system. Was Ptolemy’s fancy little wheels-within-wheels model a good descriptive model of planetary motion? You betcha! Would it have been appropriate to elevate that model and say that the planets actually DID move according to some epicycle-like mechanism? Heck no! As a functional model it was garbage, and it held back a scientific understanding of what was really going on for over a thousand years.
Same deal with RL. Our difficulty right now is that so many people slip back and forth between arguing for RL as a descriptive model (which is fine) and arguing for it as a functional model (which is disastrous, because that was tried in psychology for 30 years, and it never worked).