That’s by the by. His claim is that we should expect any random evolved agent to mostly care about long-run power. User ryan_greenblatt is an evolved agent. Does he mostly care about long-run power? No? If not, why should “we expect that ultimately the only thing selected for is mostly caring about long run power?” [emphasis mine]
Edit: Ryan, you’ve marked the bolded text above as “misunderstands position,” yet this is literally what you wrote:
> Shouldn’t we expect that ultimately the only thing selected for is mostly caring about long run power?
What did I misunderstand here? My bolded text above is almost a word-for-word restatement of your claim.
> His claim is that we should expect any random evolved agent to mostly care about long-run power.
I meant that any system which mostly cares about long-run power won’t be selected out. I don’t really have a strong view about whether other systems that don’t care about long-run power will end up persisting, especially earlier (e.g. human evolution). I was just trying to argue against a claim about what gets selected out.
My language was a bit sloppy here.
(If evolutionary pressures continue forever, then ultimately you’d expect that all systems have to act very similarly to ones that only care about long-run power, but there could be other motivations that explain this. So, at least from a behavioral perspective, I do expect that ultimately you get systems which at least act like they are optimizing for long-run power. I wasn’t really trying to make an argument about this though.)
If we’re talking about Darwinian contexts, systems which optimize for long-run power are in fact often selected out. Long-run benefit is of no utility unless short-term survival is taken care of, and long-run and short-term needs are often at odds.
So from a behavioral perspective I expect that you get systems which are optimizing for short-term survival. Indeed, I think this point is trivial and one which you probably agree with. What I’m saying is that short-term survival and long-run power are not necessarily correlated, and I think that is the crux.
Let’s take an example that is not rigorously worked out and is probably wrong in some details, but can serve to illustrate. Long-run power in humans is derivative of social structure: the leaders of the tribe control the tribe’s collective resources. If you want power in human society, you need to rise to the top of our social structures, and the optimal ways of doing that are generally not nice.
But why do we have social structures at all? Why are we organized as tribes? Because we are social animals who prefer the company of others. Being with others beats striking out on your own because, generally speaking, other people in the tribe are nice. Niceness creates an environment in which sycophantic power seeking pays off, but only because there is severe evolutionary pressure towards niceness in the first place. [In the environment which gave rise to humans, not as a general statement.]
Then shouldn’t such systems (which can surely recognize this argument) just take care of short-term survival instrumentally? Maybe you’re making a claim about irrationality being likely, or a claim that systems that care about long-run benefit act in apparently myopic ways.
(Note that historically it was much harder to maintain value stability/lock-in than it will be for AIs.)
He might or might not, but if he doesn’t he’s less likely to end up controlling the solar system and/or lightcone.
I’m not going to engage in detail FYI.