hairyfigment comments on Optimistic Assumptions, Longterm Planning, and “Cope”

hairyfigment 18 Jul 2024 8:29 UTC
2 points
https://arxiv.org/abs/1712.05812
It’s directly about inverse reinforcement learning, but that should be strictly stronger than RLHF. It seems incumbent on those who disagree to explain why throwing away information here would be enough of a normative assumption (contrary to every story about wishes).