Sure, that’s possible (and if so I agree it’d be importantly dystopic) - but do you see a reason to expect it?
It’s not something I’ve thought about a great deal, but my current guess is that you probably don’t get moral patients without aiming for them (or by using training incentives much closer to evolution than I’d expect).
I guess I expect there to be a reasonable amount of computation taking place, and it seems pretty plausible a lot of these computations will be structured like agents who are taking part in the Malthusian competition. I’m sufficiently uncertain about how consciousness works that I want to give some moral weight to ‘any computation at all’, and reasonable weight to ‘a computation structured like an agent’.
I think if you have Malthusian dynamics, you *do* have evolution-like dynamics.
I assume this isn’t a crux, but fwiw I think it’s pretty likely most vertebrates are moral patients.
I agree with most of this. Not sure about how much moral weight I’d put on “a computation structured like an agent”—some, but it’s mostly coming from [I might be wrong] rather than [I think agentness implies moral weight].
Agreed that Malthusian dynamics gives you an evolution-like situation—but I’d guess it’s too late for it to matter: once you’re already generally intelligent, can think your way to the convergent instrumental goal of self-preservation, and can self-modify, it’s not clear to me that consciousness/pleasure/pain buys you anything.
Heuristics are sure to be useful as shortcuts, but I’m not sure I’d want to analogise those to qualia (presumably the right kind would be—but I don’t expect the right kind by default).
The possibilities for signalling will also be nothing like that in a historical evolutionary setting—the utility of emotional affect doesn’t seem to be present (once the humans are gone).
[these are just my immediate thoughts; I could easily be wrong]
I agree that most vertebrates are likely moral patients.
Overall, I can’t rule out AIs becoming moral patients—and it’s clearly possible.
I just don’t yet see positive reasons to think it has significant probability (unless aimed for explicitly).
some relevant ideas here maybe: https://reducing-suffering.org/what-are-suffering-subroutines/
Thanks, that’s interesting, though mostly I’m not buying it (still unclear whether there’s a good case to be made; fairly clear that he’s not making a good case).
Thoughts:
Most of it seems to say “Being a subroutine doesn’t imply something doesn’t suffer”. That’s fine, but few positive arguments are made. Starting with the letter ‘h’ doesn’t imply something doesn’t suffer either—but it’d be strange to say “Humans obviously suffer, so why not houses, hills and hiccups?”.
We infer preference from experience of suffering/joy...:
[Joe Xs when he might not X] & [Joe experiences suffering and joy] → [Joe prefers Xing]
[this rock is Xing] → [this rock Xs]
Methinks someone is petitioning a principii.
(Joe is mechanistic too—but the suffering/joy being part of that mechanism is what gets us to call it “preference”)
Too much is conflated:
[Happening to x] ≢ [Aiming to x] ≢ [Preferring to x]
In particular, I can aim to x and not care whether I succeed. Not achieving an aim doesn’t imply frustration or suffering in general—we just happen to be wired that way (but it’s not universal, even for humans: we can try something whimsical-yet-goal-directed, and experience no suffering/frustration when it doesn’t work). [taboo/disambiguate ‘aim’ if necessary]
There’s no argument made for frustration/satisfaction. It’s just assumed that not achieving a goal is frustrating, and that achieving one is satisfying. A case can be made to ascribe intentionality to many systems—e.g. Dennett’s intentional stance. Ascribing welfare is a further step, and requires further arguments.
Non-achievement of an aim isn’t inherently frustrating (cf. Buddhists—and indeed current robots).
The only argument I saw on this was “we can sum over possible interpretations”—sure, but I can do that for hiccups too.