at this juncture i interpret the shard theory folk as arguing something like “well the shards that humans build their values up around are very proximal to minds
In the spirit of pointing out subtle things that seem wrong: My understanding of the ST position is that shards are values. There's no "building values up around" shards; the claim is that shards are what implement values, i.e. values just are implemented as shards.
At least, I'm pretty sure that's what the position was a ~year ago, and I've seen no indication that the ST folk have moved from that view.
most humans (with fully-functioning brains) have in some sense absorbed sufficiently similar values and reflective machinery that they converge to roughly the same place
The way I would put it is: "it's plausible that there is a utility function whose maximizing world-state ranks very high by the standards of most humans' preferences, and that we could get that utility function by agglomerating and abstracting over individual humans' values".
Like, if Person A loves seafood and hates pizza, and Person B loves pizza and hates seafood, then no, agglomerating these individual people's preferences into Utility Function A and Utility Function B won't result in the same utility function (and all the more so for weightier political/philosophical stuff). But if we abstract up from there, we get "people like to eat tasty-according-to-them food", and a world in which both A and B are allowed to do that ranks high by both of their preferences.
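To make the seafood/pizza example concrete, here's a minimal toy sketch (my own illustration; the world-states, scoring, and names are all made up for the example): an "abstracted" utility that just counts how many people get food they like picks out a world that also scores highest under each person's own, mutually-conflicting utility function.

```python
# Toy illustration only (not from the original argument): two agents with
# conflicting object-level food preferences, and an abstracted utility
# ("everyone eats food they personally like").

# A world-state is a dict mapping each person to the food they eat.
worlds = [
    {"A": "seafood", "B": "seafood"},
    {"A": "pizza",   "B": "pizza"},
    {"A": "seafood", "B": "pizza"},    # each eats what they like
    {"A": "pizza",   "B": "seafood"},  # each eats what they dislike
]

# Object-level preferences: A loves seafood, B loves pizza.
likes = {"A": "seafood", "B": "pizza"}

def personal_utility(person, world):
    """+1 if this person eats food they like, -1 otherwise."""
    return 1 if world[person] == likes[person] else -1

def abstracted_utility(world):
    """Abstraction over both people's values: count how many people
    get tasty-according-to-them food."""
    return sum(personal_utility(p, world) for p in likes)

best = max(worlds, key=abstracted_utility)
print(best)  # {'A': 'seafood', 'B': 'pizza'}
# That same world also maximizes each person's *own* utility, even though
# A's and B's object-level utility functions disagree everywhere else.
print([personal_utility(p, best) for p in likes])  # [1, 1]
```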
Similarly, it seems plausible that somewhere up there at the highest abstraction levels, most humans’ preferences (stripped of individual nuance on their way up) converge towards the same “maximize eudaimonia” utility function, whose satisfaction would make ~all of us happy. (And since it’s highly abstract, its maximal state would be defined over an enormous equivalence class of world-states. So it won’t be a universe frozen in a single moment of time, or tiled with people with specific preferences, or anything like that.)