As I understand Vivek’s framework, human value shards explain away the need to posit alignment to an idealized utility function. A person is not a bunch of crude-sounding subshards (e.g. “If food nearby and hunger>15, then be more likely to go to food”) and then also a sophisticated utility function (e.g. something like CEV). It’s shards all the way down, and all the way up.[10]
This read to me like you were saying “In Vivek’s framework, value shards explain away …” and I was confused. I now think you mean “My take on Vivek’s framework is that value shards explain away …”. Maybe reword for clarity?
(Might have a substantive reply later)
Reworded, thanks.