I came here from the pedophile discussion. This comment interests me more, so I’m replying to it.
To preface, here is what I currently think: Preferences are in a hierarchy. You make a list of possible universes (branching out as a result of your actions) and choose the one you prefer the most—so I’m basically coming from VNM. The terminal value lies in which universe you choose. The instrumental stuff lies in which actions you take to get there.
So I’m reading your line of thought...
But just because values exist in a mutually referential network doesn’t mean they exist in a hierarchy with certain values at the root. Maybe I have (V3) wanting to marry my boyfriend and (V4) wanting to make my boyfriend happy. Here, too, these are different values, and failing to distinguish between them is a problem, and there’s a causal link that matters. But it’s not strictly hierarchical: if the causal link is severed (e.g., marrying my boyfriend isn’t a way to make him happy) I still have both goals. Worse, if the causal link is reversed (e.g., marrying my boyfriend makes him less happy, because he has V5: don’t get married), I still have both goals. Now what?
I’m not sure how this line of thought suggests that terminal values do not exist. It simply suggests that some values are terminal, while others are instrumental. To simplify, you can compress all these terminal goals into a single goal called “Fulfill my preferences”, and do utilitarian game theory from there. This need not involve arranging the preferences in any hierarchy—it only involves balancing them against each other. Speaking of multiple terminal values just decomposes whatever function you use to pick your favorite universe into multiple functions.
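To make that compression concrete, here is a minimal sketch in Python (the universes and value functions are entirely hypothetical, chosen to echo the V3/V4 example above) of folding several terminal values into one “fulfill my preferences” function and letting that function pick the favorite universe:

```python
# Each terminal value is just a function from a candidate universe to a score.
# Nothing here is hierarchical; the values are balanced inside one aggregate.

candidate_universes = [
    {"married": True,  "partner_happiness": 0.4},
    {"married": False, "partner_happiness": 0.9},
    {"married": True,  "partner_happiness": 0.8},
]

def value_marriage(universe):
    return 1.0 if universe["married"] else 0.0

def value_partner_happiness(universe):
    return universe["partner_happiness"]

terminal_values = [value_marriage, value_partner_happiness]

def fulfill_my_preferences(universe):
    # The single compressed goal: balance the terminal values
    # (here with an unweighted sum) rather than rank them in a hierarchy.
    return sum(v(universe) for v in terminal_values)

chosen = max(candidate_universes, key=fulfill_my_preferences)
print(chosen)  # -> {'married': True, 'partner_happiness': 0.8}
```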
maybe humanity’s values simply aren’t coherent; maybe some of our post-Singularity descendants will be varelse to one another.
This seems unrelated to the surrounding points. Of course two agents can diverge—no one said that humans intrinsically shared the same preferences.
(Of course, platonic agents don’t exist, living things don’t actually have VNM preferences, etc., etc.)
You might enjoy Arrow’s impossibility theorem, though; it seems to relate to your concerns. (It’s relevant for questions like: Can we compromise between multiple agents? What happens if we conceptualize one human as multiple agents?)
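For what it’s worth, here is a minimal sketch (hypothetical agents and options) of the kind of aggregation trouble Arrow’s theorem formalizes: a Condorcet cycle, in which each agent, or each sub-agent of one human, has a coherent ranking but the pairwise majority does not.

```python
from itertools import combinations

# Three agents (or three sub-agents of one person), each with a perfectly
# coherent ranking over three candidate outcomes.
rankings = {
    "agent_1": ["A", "B", "C"],
    "agent_2": ["B", "C", "A"],
    "agent_3": ["C", "A", "B"],
}

def prefers(ranking, x, y):
    """True if this ranking places x above y."""
    return ranking.index(x) < ranking.index(y)

for x, y in combinations("ABC", 2):
    votes_for_x = sum(prefers(r, x, y) for r in rankings.values())
    winner = x if votes_for_x >= 2 else y
    print(f"{x} vs {y}: majority prefers {winner}")

# Pairwise majorities: A over B, C over A, B over C. That is a cycle, so
# there is no coherent "group" ordering even though every individual
# ordering is coherent.
```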
I’m on board with:
...treating preferences as identifying a sort order for universes.
...treating “values” and “preferences” and “goals” as more or less interchangeable terms.
...aggregating multiple goals into a single complex “fulfill my preferences (insofar as they are not mutually exclusive)” goal, at least in principle. (To the extent that we can actually do this, the fact that preferences might have hierarchical dependencies where satisfying preference A also partially satisfies preference B becomes irrelevant; all of that is factored into the complex goal. Of course, actually doing this might prove too complicated for any given computationally bounded mind, so such dependencies might still be important in practice.)
...balancing preferences against one another to create some kind of weighted aggregate in cases where they are mutually exclusive, in principle. (As above, that’s not to say that all minds can actually do that in practice; different strategies may be appropriate for less capable minds. A short sketch of such a weighted aggregate follows this list.)
...drawing a distinction between which universe(s) I choose, on the one hand, and what steps I take to get there, on the other. (And if we want to refer to steps as “instrumental values” and universes as “terminal values”, that’s OK with me. That said, what I see people doing a lot is mis-identifying steps as universes, simply because we haven’t thought enough about the internal structure and intended results of those steps, so in practice I am skeptical of claims about “terminal values.” In practice, I treat the term as referring to instrumental values I haven’t yet thought enough about to understand in detail.)
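As referenced above, here is a minimal sketch (the weights and universes are hypothetical) of a weighted aggregate over partly mutually exclusive preferences, and of that aggregate inducing a single sort order over candidate universes:

```python
# Candidate universes scored on two features that pull in different
# directions, so the underlying preferences are partly mutually exclusive.
universes = {
    "marry_now":       {"married": 1.0, "partner_happy": 0.4},
    "stay_unmarried":  {"married": 0.0, "partner_happy": 0.9},
    "long_engagement": {"married": 0.5, "partner_happy": 0.7},
}

# The balancing is done with weights, not with a hierarchy.
weights = {"married": 0.3, "partner_happy": 0.7}

def weighted_aggregate(features):
    return sum(weights[k] * features[k] for k in weights)

# The aggregate turns "my preferences" into a sort order over universes.
ranked = sorted(universes, key=lambda name: weighted_aggregate(universes[name]),
                reverse=True)
print(ranked)  # -> ['long_engagement', 'stay_unmarried', 'marry_now']
```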
no one said that humans intrinsically shared the same preferences.
I’m not sure that’s true. IIRC, a lot of the Fun Theory Sequence and the stuff around CEV sounded an awful lot like precisely this claim. That said, it’s been three years, and I don’t remember details. In any case, if we agree that humans don’t necessarily share the same preferences, that’s cool with me, regardless of what someone else might or might not have said.
And, yes, Arrow’s impossibility theorem is relevant.