Steven Byrnes comments on [Intro to brain-like-AGI safety] 5. The “long-term predictor”, and TD learning

Steven Byrnes 11 Jun 2024 14:19 UTC
2 points
I mean, that particular discussion has some issues—see for example “UPDATE JAN 2023” (and I’ve learned more since then too).
But there is an actual old oversimplified hypothesis from the 1990s where there’s an RL system and a reward signal and dopamine is the TD learning signal, and that hypothesis would seem to imply that there should be one dopamine signal doing one clear thing, as opposed to what’s actually observed, which is a giant mess of dozens of signals that correlate with all kinds of stuff.
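For concreteness, here is a minimal sketch of the textbook TD(0) setup that this kind of hypothesis maps onto. The function names, the toy "cue, then reward" chain, and the values of `alpha` and `gamma` are all illustrative assumptions, not taken from the post or from any specific published model. The thing to notice is that each step produces exactly one scalar error, `delta`, which is the role the old hypothesis assigns to dopamine.

```python
def td_error(reward, value_now, value_next, gamma=0.9):
    """Scalar TD(0) error: delta = r + gamma * V(s') - V(s)."""
    return reward + gamma * value_next - value_now


def td0_update(values, state, next_state, reward, alpha=0.1, gamma=0.9):
    """Apply one TD(0) update in place and return the single scalar error."""
    delta = td_error(reward, values[state], values[next_state], gamma)
    values[state] += alpha * delta
    return delta


# Hypothetical toy chain: "cue" leads to "reward_time", which pays 1.0 on exit.
values = {"cue": 0.0, "reward_time": 0.0, "end": 0.0}
for _ in range(200):
    td0_update(values, "cue", "reward_time", reward=0.0)
    td0_update(values, "reward_time", "end", reward=1.0)
```

In this sketch the learned values settle so that the error at the time of a fully predicted reward shrinks toward zero, and there is only ever one error signal in the whole system.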
Yes, it’s easy to come up with hypotheses where there are dozens of signals that correlate with all kinds of stuff, and indeed there are many, many such hypotheses in the literature, and nobody serious actually believes that old theory from the 1990s anymore. But still, I thought it was worth mentioning how I would have explained the fact that there are dozens of signals that correlate with all kinds of stuff, and indeed that subsection wound up leading to some fruitful discussions and feedback in the months after I published it.
I’ve read a ton of experimental reports and theoretical models of the basal ganglia and dopamine neurons, and have lots of idiosyncratic opinions, very few of which are spelled out in this old post. (Or anywhere else. If I have something important to say about AGI safety, and I could only say it by explaining something about my opinions on the basal ganglia & dopamine as background information, then I would probably do so. But in the absence of that, writing down all my opinions about the basal ganglia & dopamine would seem to me to be possibly-helpful for AGI capabilities but unhelpful for AGI safety.)