Sure, every time you go more abstract there are fewer degrees of freedom. But there’s no free lunch—there are degrees of freedom in how the more-abstract variables are connected to less-abstract ones.
People who want different things might make different abstractions. E.g. if you’re calling some high level abstraction “eat good food,” it’s not that this is mathematically the same abstraction made by someone who thinks good food is pizza and someone else who thinks good food is fish. Not even if those people independently keep going higher in the abstraction hierarchy—they’ll never converge to the same object, because there’s always that inequivalence in how they’re translated back to the low level description.
Yes, at high levels of abstraction, humans can all recommend the same abstract action. But I don’t care about abstract actions, I care about real-world actions.
E.g. suppose we abstract the world to an ontology where there are two states, “good” and “bad,” and two actions—stay or swap. Lo and behold, ~everyone who abstracts the world to this ontology will converge to the same policy in terms of abstract actions: make the world good rather than bad. But if two people disagree utterly about which low-level states get mapped onto the “good” state, they’ll disagree utterly about which low-level actions get mapped onto the “swap from bad to good” action, and this abstraction hasn’t really bought us anything.
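A minimal sketch of that point in code, with all state names and mappings invented for illustration: both agents share the abstract ontology and the abstract policy, but cash it out into contradictory real-world actions.

```python
# Hypothetical example: two agents share the abstract states {good, bad} and the abstract
# policy "swap to good if the world is bad", but map low-level states onto "good"
# differently, so the shared policy recommends different real-world actions.
LOW_LEVEL_STATES = ["pizza_world", "fish_world", "famine_world"]

abstraction_a = {"pizza_world": "good", "fish_world": "bad", "famine_world": "bad"}
abstraction_b = {"pizza_world": "bad", "fish_world": "good", "famine_world": "bad"}

def abstract_policy(abstraction, current_state):
    """Shared abstract policy: stay if the world is good, otherwise swap to a good one."""
    if abstraction[current_state] == "good":
        return ("stay", current_state)
    target = next(s for s in LOW_LEVEL_STATES if abstraction[s] == "good")
    return ("swap to", target)

print(abstract_policy(abstraction_a, "famine_world"))  # ('swap to', 'pizza_world')
print(abstract_policy(abstraction_b, "famine_world"))  # ('swap to', 'fish_world')
```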
People who want different things might make different abstractions
That’s a direct rejection of the natural abstractions hypothesis. And some form of it increasingly seems just common-sensically true.
It’s indeed the case that one’s choice of what system to model depends on what one cares about/where one’s values are housed (whether I care to model the publishing industry, say). But once the choice to model a given system is made, the abstractions are in the territory. They fall out of noticing to which simpler systems a given system can be reduced.
(Imagine you have a low-level description of a system defined in terms of individual gravitationally- and electromagnetically-interacting particles. Unbeknownst to you, the system describes two astronomical objects orbiting each other. Given some abstracting-up algorithm, we can notice that this system reduces to these two bodies orbiting each other (under some definition of approximation).
It’s not value-laden at all: it’s simply a true mathematical fact about the system’s dynamics.
The NAH is that this generalizes, very widely.)
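A rough sketch of what such an abstracting-up step could look like for this example; the synthetic particle data, the clustering rule, and its initialization below are all assumptions added for illustration, and a real treatment would also have to check that the reduced description approximately reproduces the dynamics.

```python
# Toy "abstracting-up" pass for the orbiting-bodies example: given only particle-level
# positions and masses, recover a two-body summary (total mass and centre of mass of
# each clump). All numbers here are synthetic.
import numpy as np

rng = np.random.default_rng(0)

# Low-level description: 1000 particles that, unbeknownst to the modeller, form two clumps.
true_clumps = np.array([[0.0, 0.0], [10.0, 0.0]])
positions = np.vstack([c + 0.1 * rng.standard_normal((500, 2)) for c in true_clumps])
masses = rng.uniform(0.5, 1.5, size=len(positions))

# Simple 2-means clustering to discover the clumps (initialized with two arbitrary particles).
centres = np.array([positions[0], positions[-1]])
for _ in range(20):
    dists = np.linalg.norm(positions[:, None, :] - centres[None, :, :], axis=2)
    labels = dists.argmin(axis=1)
    centres = np.array([positions[labels == k].mean(axis=0) for k in range(2)])

# High-level description: each clump summarized by its total mass and centre of mass.
for k in range(2):
    sel = labels == k
    total_mass = masses[sel].sum()
    com = (masses[sel, None] * positions[sel]).sum(axis=0) / total_mass
    print(f"body {k}: total mass {total_mass:.1f}, centre of mass {np.round(com, 2)}")
```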
Not even if those people independently keep going higher in the abstraction hierarchy—they’ll never converge to the same object, because there’s always that inequivalence in how they’re translated back to the low level description.
I mean, that’s clearly not how it works in practice? Take the example in the post literally: two people disagree on food preferences, but can agree on the “food” abstraction and on both of them having a preference for subjectively tasty ones.
suppose we abstract the world to an ontology where there are two states, “good” and “bad,”
If your model is assumed, i.e. that abstractions are inherently value-laden, then yes, this is possible. But that’s not how it’d work under the NAH and on my model, because “good” and “bad” are not objective high-level states a given system could be in.
It’d be something like State A and State B. And then the “human values converge” hypothesis is that all human values would converge to preferring one of these states.
Not even if those people independently keep going higher in the abstraction hierarchy—they’ll never converge to the same object, because there’s always that inequivalence in how they’re translated back to the low level description.
I mean, that’s clearly not how it works in practice? Take the example in the post literally: two people disagree on food preferences, but can agree on the “food” abstraction and on both of them having a preference for subjectively tasty ones.
I agree with the part of what you just said that’s the NAH, but disagree with your interpretation.
Both people can recognize that there’s a good abstraction here, where what they care about is subjectively tasty food. But this interpersonal abstraction is no longer an abstraction of their values; it simply happens to be about their values, sometimes. It can no longer be cashed out into specific recommendations of real-world actions in the way someone’s values can[1].

[1] For certain meanings of “values,” ofc.
Okay, let’s build a toy model.

We have some system with a low-level state l, which can take on one of six values: {a,b,c,d,e,f}.
We can abstract over this system’s state and get a high-level state h, which can take on one of two states: {x,y}.
We have an objective abstracting-up function f(l)=h.
We have the following mappings between states:
∀l∈{a,b,c}:f(l)=x
∀l∈{d,e,f}:f(l)=y
We have a utility function U_A(l), with a preference ordering of a>b>c≫d≈e≈f, and a utility function U_B(l), with a preference ordering of c>b>a≫d≈e≈f.
We translate both utility functions to h, and get the same utility function: U(h), whose preference ordering is x>y.
Thus, both U_A(l) and U_B(l) can agree on which high-level state they would greatly prefer. No low-level state would maximally satisfy both of them, but they both would be happy enough with any low-level state that gets mapped to the high-level state of x. (b is the obvious compromise.)

Which part of this do you disagree with?
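A sketch of the toy model in code: only the mapping f and the two preference orderings come from the comment above; the numeric utility values and the averaging used to translate a low-level utility function up to h are assumptions added for illustration.

```python
# Toy model: f maps the six low-level states onto two high-level states; the numeric
# utilities below are invented stand-ins for the orderings a>b>c>>d≈e≈f and c>b>a>>d≈e≈f.
LOW_STATES = ["a", "b", "c", "d", "e", "f"]
f_map = {"a": "x", "b": "x", "c": "x", "d": "y", "e": "y", "f": "y"}  # f(l) = h

U_A = {"a": 10, "b": 9, "c": 8, "d": 0, "e": 0, "f": 0}
U_B = {"c": 10, "b": 9, "a": 8, "d": 0, "e": 0, "f": 0}

def translated_preference(U):
    """Rank the high-level states by the average utility of the low-level states mapped to them."""
    high_states = set(f_map.values())
    value = {h: sum(U[l] for l in LOW_STATES if f_map[l] == h)
                / sum(1 for l in LOW_STATES if f_map[l] == h)
             for h in high_states}
    return sorted(high_states, key=value.get, reverse=True)

print(translated_preference(U_A), translated_preference(U_B))  # ['x', 'y'] ['x', 'y'] -- same U(h)
print(max(LOW_STATES, key=U_A.get), max(LOW_STATES, key=U_B.get))  # a c -- no shared low-level favourite
```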
I disagree that translating to x and y lets you “reduce the degrees of freedom” or otherwise get any sort of discount lunch. At the end you still had to talk about the low level states again to say they should compromise on b (or not compromise and fight it out over c vs. a, that’s always an option).
At the end you still had to talk about the low level states again to say they should compromise on b
“Compromising on b” is a more detailed implementation that can easily be omitted. The load-bearing part is “both would be happy enough with any low-level state that gets mapped to the high-level state of x”.
For example, the policy of randomly sampling any l such that f(l)=x is something both utility functions can agree on, and doesn’t require doing any additional comparisons of low-level preferences, once the high-level state has been agreed upon. A rising tide lifts all boats, etc.
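Continuing the illustrative sketch (definitions repeated from the toy-model code above, utility numbers still invented), the sampling policy can be written down and evaluated without comparing low-level preferences any further:

```python
# The "randomly sample any l such that f(l) = x" policy, evaluated for both agents.
import random

f_map = {"a": "x", "b": "x", "c": "x", "d": "y", "e": "y", "f": "y"}
U_A = {"a": 10, "b": 9, "c": 8, "d": 0, "e": 0, "f": 0}
U_B = {"c": 10, "b": 9, "a": 8, "d": 0, "e": 0, "f": 0}

x_states = [l for l, h in f_map.items() if h == "x"]

def sample_x_state():
    """Policy both agents can accept: pick uniformly among the low-level states mapping to x."""
    return random.choice(x_states)

# Expected utility of the policy for each agent; any state mapping to y scores 0 for both.
for name, U in [("U_A", U_A), ("U_B", U_B)]:
    expected = sum(U[l] for l in x_states) / len(x_states)
    print(f"{name}: expected utility under the sampling policy = {expected}")  # 9.0 for both
```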
Suppose the two agents are me and a flatworm.
a = ideal world according to me
b = status quo
c = ideal world according to the flatworm
d, e, f = various deliberately-bad-to-both worlds
I’m not going to stop trying to improve the world just because the flatworm prefers the status quo, and I wouldn’t be “happy enough” if we ended up in flatworm utopia.
What bargains I would agree to, and how I would feel about them, are not safe to abstract away.
I wouldn’t be “happy enough” if we ended up in flatworm utopia
You would, presumably, be quite happy compared to “various deliberately-bad-to-both worlds”.
I’m not going to stop trying to improve the world just because the flatworm prefers the status quo
Because you don’t care about the flatworm, and you don’t perceive the flatworm as having enough bargaining power to make you bend to its preferences.
In addition, your model rules out more fine-grained ideas like “the cubic mile of terrain around the flatworm remains unchanged while I get the rest of the universe”. Which is plausibly what CEV would result in: everyone gets their own safe garden, with the only concession the knowledge that everyone else’s safe gardens also exist.