I would disagree that it is an assumption. That same draft talks about the outsized role of self-supervised learning on determining particular ordering and kinds of concepts that humans desires latch onto. Learning from reinforcement is a core component in value formation (under shard theory), but not the only one.
I would disagree that it is an assumption. That same draft talks about the outsized role of self-supervised learning on determining particular ordering and kinds of concepts that humans desires latch onto. Learning from reinforcement is a core component in value formation (under shard theory), but not the only one.