Model-based RL, at least to me, is … using that model to do some kind of explicit rollouts for planning
Seems like it’s just a terminology difference, then. I’m using the term “model-based RL” more broadly than you.
I agree with you that (1) explicit one-timestep-at-a-time rollouts are very common (maybe even universal) in self-described “model-based RL” papers that you find on arxiv/cs today, and that (2) these kinds of rollouts are not part of the brain “source code” (although they might show up sometimes as a learned metacognitive strategy).
I think you’re taking (1) to be evidence that “the term ‘model-based RL’ implies one-timestep-at-a-time rollouts”, whereas I’m taking (1) to be evidence that “AI/CS people have some groupthink about how to construct effective model-based RL algorithms”.
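To make (1) concrete, here’s a minimal sketch of the kind of explicit one-timestep-at-a-time rollout loop you’d find in those arxiv-style papers (random-shooting MPC as one common instantiation; `dynamics_model`, `reward_model`, and `plan` here are hypothetical placeholders, not anyone’s actual implementation):

```python
import numpy as np

# Placeholder "learned" dynamics model: predicts the next state.
def dynamics_model(state, action):
    return state + action

# Placeholder "learned" reward model: prefers staying near the origin.
def reward_model(state, action):
    return -float(np.sum((state + action) ** 2))

def plan(state, horizon=10, n_candidates=100, action_dim=2, rng=None):
    """Score random action sequences by unrolling the model one
    timestep at a time; return the first action of the best sequence."""
    rng = rng or np.random.default_rng()
    best_return, best_first_action = -np.inf, None
    for _ in range(n_candidates):
        actions = rng.uniform(-1, 1, size=(horizon, action_dim))
        s, total = state, 0.0
        for a in actions:  # <-- the explicit one-step-at-a-time rollout
            total += reward_model(s, a)
            s = dynamics_model(s, a)
        if total > best_return:
            best_return, best_first_action = total, actions[0]
    return best_first_action

print(plan(np.zeros(2)))
```

The point of the sketch is just that the step-by-step unrolling is a design choice of this family of algorithms, not something built into the definition of “model-based RL”.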
I don’t think there is a huge amount of difference between meta-RL and ‘learning highly abstract actions/habits’
Hmm, I think the former is a strict subset of the latter. E.g. I think “learning through experience that I should suck up to vain powerful people” is the latter but not the former.
I don’t completely agree with Robin Hanson that almost all human behaviour can be explained by this drive directly though.
Yeah, I agree with the “directly” part. For example, I think some combination of social drives plus the particular situations I’ve been in led me to conclude that it’s good to act with integrity. But now that desire / value is installed inside me as an end in itself, not just a means to an end: I feel some nonzero motivation to “act with integrity” even when I know for sure that I won’t get caught, etc. Not that it’s always a sufficient motivation …