Indeed, this is exactly the kind of thing I am gesturing at. Certainly, all our repertoires of sexual behaviour are significantly shaped by RL. My point is that evolution has somehow in this case mostly solved some pointers-like problem to get the reward model to suddenly include rewards for sexual behaviour, can do so robustly, and can do so a long time after birth after a decade or so of unsupervised learning and RL has already occurred. Moreover, this reward model leads to people robustly pursuing this goal even fairly off-distribution from the ancestral environment.
Indeed, this is exactly the kind of thing I am gesturing at. Certainly, all our repertoires of sexual behaviour are significantly shaped by RL. My point is that evolution has somehow in this case mostly solved some pointers-like problem to get the reward model to suddenly include rewards for sexual behaviour, can do so robustly, and can do so a long time after birth after a decade or so of unsupervised learning and RL has already occurred. Moreover, this reward model leads to people robustly pursuing this goal even fairly off-distribution from the ancestral environment.