His claim is that we should expect any random evolved agent to mostly care about long-run power.
I meant that any system which mostly cares about long-run power won’t be selected out. I don’t really have a strong view about whether other systems that don’t care about long-run power will end up persisting, especially earlier on (e.g., during human evolution). I was just trying to argue against a claim about what gets selected out.
My language was a bit sloppy here.
(If evolutionary pressures continue forever, then ultimately you’d expect all systems to act very similarly to ones that only care about long-run power, though other motivations could produce that behavior. So, at least from a behavioral perspective, I do expect that you ultimately get systems which at least act like they are optimizing for long-run power. I wasn’t really trying to make an argument about this, though.)
Then shouldn’t such systems (which can surely recognize this argument) just take care of short-term survival instrumentally? Maybe you’re making a claim that irrationality is likely, or a claim that systems that care about long-run benefit act in apparently myopic ways.
(Note that historically it was much harder to keep value stability/lock-in than it will be for AIs.)
I’m not going to engage in detail, FYI.
Personally? I guess I would say that I mostly (98%?) care about long-run power for values similar to my own on reflection. And probably some humans are quite close to my values, and many are adjacent.
As in, I care about the long-run power of values-which-are-similar-to-my-values-on-reflection. Which includes me (on reflection) by definition, but I think probably also includes lots of other humans.
In the context of optimization, values are anything you want (whether moral in nature or otherwise).
Any time a decision is made based on some value, you can view that value as having exercised power by controlling the outcome of that decision.
Or, put more simply, the way that values have power is that values have people who have power.
You appear to be thinking of power only in extreme terms (possibly even as an on/off binary). Like, that your values “don’t have power” unless you set up a dictatorship or something.
But “power” is being used here in a very broad sense. The personal choices you make in your own life still give a non-zero amount of power to whatever you based those choices on. If you ever try to persuade someone else to make similar choices, then you are trying to increase the amount of power held by your values. If you support laws like “no stealing” or “no murder”, then you are trying to impose some of your values on other people through the use of force.
I mostly think of government as a strategy, not an end. I bet you would too, if push came to shove; e.g. you are probably stridently against murdering or enslaving a quarter of the population, even if the measure passes by a two-thirds vote. My model says almost everyone would endorse tearing down the government if it went sufficiently off the rails that keeping it around became obviously no longer a good instrumental strategy.
Like you, I endorse keeping the government around, even though I disagree with it sometimes. But I endorse that on the grounds that the government is net-positive, or at least no worse than [the best available alternative, including switching costs]. If that stopped being true, then I would no longer endorse keeping the current government. (And yes, it could become false due to a great alternative being newly-available, even if the current government didn’t get any worse in absolute terms. e.g. someone could wait until democracy is invented before they endorse replacing their monarchy.)
I’m not sure that “no one should have the power to enforce their own values” is even a coherent concept. Pick a possible future—say, disassembling the earth to build a Dyson sphere—and suppose that at least one person wants it to happen, and at least one person wants it not to happen. When the future actually arrives, it will either have happened, or not—which means at least one person “won” and at least one person “lost”. What exactly does it mean for “neither of those people had the power to enforce their value”, given that one of the values did, in fact, win? Don’t we have to say that one of them clearly had enough power to stymie the other?
You could say that society should have a bunch of people in it, and that no single person should be able to overpower everyone else combined. But that doesn’t prevent some value from being able to overpower all other values, because a value can be endorsed by multiple people!
I suppose someone could hypothetically say that they really only care about the process of government and not the result, such that they’ll accept any result as long as it is blessed by the proper process. Even if you’re willing to go to that extreme, though, that still seems like a case of wanting “your values” to have power, just where the thing you value is a particular system of government. I don’t think that having this particular value gives you any special moral high ground over people who value, say, life and happiness.
I also think that approximately no one actually has that as a terminal value.
I think you’re still thinking in terms of something like formalized political power, whereas other people are thinking in terms of “any ability to affect the world”.
Suppose a fantastically powerful alien called Superman comes to Earth and starts running around the city of Metropolis, rescuing people and arresting criminals. He has absurd amounts of speed, strength, and durability. You might think of Superman as just being a helpful guy who doesn’t rule anything, but as a matter of capability he could demand almost anything from the rest of the world, and the rest of the world couldn’t stop him. Superman is the de facto ruler of Earth; he just has a light touch.
If you consider that acceptable, then you aren’t objecting to “god-like status and control”, you just have opinions about how that control should be exercised.
If you consider that UNacceptable, then you aren’t asking for Superman to behave in certain ways, you are asking for Superman to not exist (or for some other force to exist that can check him).
Most humans (probably including you) are currently a “prisoner” of a coalition of humans who will use armed force to subdue and punish you if you take any actions that the coalition (in its sole discretion) deems worthy of such punishment. Many of these coalitions (though not all of them) are called “governments”. Most humans seem to consider the existence of such coalitions to be a good thing on balance (though many would like to get rid of certain particular coalitions).
I will grant that most commenters on LessWrong probably want Superman to take a substantially more interventionist approach than he does in DC Comics (because frankly his talents are wasted stopping petty crime in one city).
Most commenters here still seem to want Superman to avoid actions that most humans would disapprove of, though.
I’m definitely fine with not having Superman, but I’m willing to settle for him not intervening.
On a different note, I’d disagree that Superman, just by existing and being powerful, is a de facto ruler in any sense. He of course could be, but that would entail a tradeoff he may not like: giving up an unburdened life.
He might or might not, but if he doesn’t he’s less likely to end up controlling the solar system and/or lightcone.