This is a neat distillation of Steve’s piece, and also a helpful and persuasive extension. I particularly appreciated arguments 2, 3, and 4 (‘2. We have more total evidence from human outcomes’, ‘3. Human learning trajectories represent a broader sampling of the space of possible learning processes’, ‘4. Evidence from humans are more accessible than evidence from evolution’).
I wanted to raise two counterpoints, without a strong opinion on how much weight they deserve.
You’re contrasting human learning → human values with evolution → human values. Without strongly justifying it here, I think it makes sense to look more broadly at evolution → animal values (and potentially even more broadly than that). After all, the field of RL had its beginnings as a descriptive theory of animal and human behaviour more broadly.
A counter-counterpoint is that humans are the most obvious (maybe the only) general-ish and reflective intelligence we know about.
There might be an explanatory gap here: why do people care quite so much about things like lineage, legacy, and having children and descendants?
Maybe this is explained by within-lifetime reinforcement due to social norms, which are themselves subject to selection? (But this is just another kind of evolution, right?) I don’t know if this is enough.
So I’m not sure it’s entirely true to say ‘5. Evolution could not have succeeded anyways’,
unless you think the original optimization daemon point is completely inevitable.