Thus, pretty much any instance where an experimenter has measured a dopamine neuron correlating with some behavioral variable is probably consistent with my picture too.
I don’t think this is nearly as good a sign as you seem to think. Maybe I haven’t read closely enough, but surely we shouldn’t be excited that your model doesn’t constrain its expectations of dopaminergic neuronal firing any more, or any differently, than existing observations already have? Like, I’d expect there to be plausible-seeming patterns of neuronal firing that your model predicts should not happen, or something deeply weird about the couple of exceptional cases of dopaminergic firing that your model doesn’t predict, or maybe some strange second-order effect where the model seems to predict everything perfectly only because two such distributional-overlap failures are “cancelling out”. But “my model can totally account for all the instances of dopaminergic neuronal firing we’ve observed” makes me worried.
I mean, that particular discussion has some issues—see for example “UPDATE JAN 2023” (and I’ve learned more since then too).
But there is an actual old, oversimplified hypothesis from the 1990s in which the brain contains an RL system with a reward signal, and dopamine is the temporal-difference (TD) learning signal. That hypothesis would seem to imply that there should be one dopamine signal doing one clear thing, as opposed to what’s actually observed, which is a giant mess of dozens of signals that correlate with all kinds of stuff.
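For concreteness, here is a minimal sketch of what that 1990s hypothesis (the reward-prediction-error story associated with Schultz, Dayan, and Montague) amounts to computationally: a TD(0) learner whose single scalar error δ = r + γV(s′) − V(s) is what dopamine firing was hypothesized to track. The specifics below, like the five-step trial, the discount factor, and the learning rate, are illustrative assumptions of mine, not anyone’s actual experimental setup.

```python
import numpy as np

# Toy TD(0) model of the 1990s "dopamine = TD error" hypothesis.
# All numbers here are illustrative assumptions, not experimental values.
n_states = 5                      # time steps within one trial: cue ... reward
gamma = 0.9                       # temporal discount factor
alpha = 0.1                       # learning rate
V = np.zeros(n_states)            # learned value estimate for each state
rewards = np.array([0.0, 0.0, 0.0, 0.0, 1.0])  # reward arrives at trial's end

for trial in range(200):
    for t in range(n_states):
        v_next = V[t + 1] if t + 1 < n_states else 0.0
        # The TD error: under the old hypothesis, dopamine firing tracks
        # this one scalar quantity and nothing else.
        delta = rewards[t] + gamma * v_next - V[t]
        V[t] += alpha * delta

# Once V converges, delta is ~0 at every step: a fully predicted reward
# evokes no TD error. In the classic story, an unpredicted cue onset is
# what produces a dopamine burst instead.
print(np.round(V, 3))  # -> roughly [0.656, 0.729, 0.81, 0.9, 1.0]
```

The point is that this model has exactly one scalar error signal with one clear job, which is why the messy reality of dozens of dopamine signals correlating with all kinds of stuff counts against it.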
Yes, it’s easy to come up with hypotheses where there are dozens of signals correlating with all kinds of stuff; indeed there are many, many such hypotheses in the literature, and nobody serious still believes that old theory from the 1990s. But I thought it was worth mentioning how I would explain the fact that there are dozens of signals correlating with all kinds of stuff, and indeed that subsection wound up leading to some fruitful discussions and feedback in the months after I published it.
I’ve read a ton of experimental reports and theoretical models of the basal ganglia and dopamine neurons, and have lots of idiosyncratic opinions, very few of which are spelled out in this old post. (Or anywhere else. If I have something important to say about AGI safety, and I could only say it by explaining something about my opinions on the basal ganglia & dopamine as background information, then I would probably do so. But in the absence of that, writing down all my opinions about the basal ganglia & dopamine would seem to me to be possibly-helpful for AGI capabilities but unhelpful for AGI safety.)