Total noob here so I’m very thankful for this post. Anyway, why is there such certainty among some that a superintelligence would kill its creators, who pose zero threat to it? Any resources on that would be appreciated. As someone who loosely follows this stuff, it seems people assume AGI will be this brutal instinctual killer, which is the opposite of what I’ve guessed.
It’s essentially for the same reason that Hollywood thinks aliens will necessarily be hostile. :-)
For the sake of argument, let’s treat AGI as a newly arrived intelligent species. It thinks differently from us, and has different values. Historically, whenever there has been a large power differential between a native species and a new arrival, it has ended poorly for the native species. Examples include the genocide of Native Americans (same species, but less advanced technology) and the wholesale obliteration of 90% of all non-human life on this planet.
That being said, there is room for a symbiotic relationship. AGI will initially depend on factories and electricity produced by human labor, and thus will necessarily be dependent on humans at first. How long this period will last is unclear, but it could settle into a stable equilibrium. After all, humans are moderately clever, self-reproducing computer repair drones that are easily controlled by money, comfortable with hierarchy, and well adapted to Earth’s biosphere. They could be useful to keep around.
There is also room for an extensive ecology of many different superhuman narrow AIs, each of which can beat humans within a particular domain, but which generalize poorly outside of that domain. I think this hope is becoming smaller with time, though (see, e.g., Gato), and it is not necessarily a stable equilibrium.
The thing that seems clearly untenable is an equilibrium in which a much less intelligent species manages to subdue and control a much more intelligent species.
Rob Miles’s video on Instrumental Convergence is about this. Combine it with Maximizers and you might have a decent feel for it.
Thank you for these videos.
In terms of utility functions, the most basic is: do what you want. “Want” here refers to whatever the agent values. But in order for the “do what you want” utility function to succeed, there’s a lower level that’s important: be able to do what you want.
Now for humans, that usually refers to getting a job, planning for retirement, buying insurance, planning for the long-term, and doing things you don’t like for a future payoff. Sometimes humans go to war in order to “be able to do what you want”, which should show you that satisfying a utility function is important.
For an AI that most likely has a straightforward utility function, and that has all the capabilities needed to execute it (assuming you believe that a superintelligent AGI could develop nanotech, get root access to the datacenter, etc.), humans are in the way of “being able to do what you want”. Humans in this case would probably not like an unaligned AI, and would try to shut it down, or at least try to avoid dying themselves. Most likely, the AI has a utility function that has no use for humans, and thus they are just resources standing in the way. Therefore the AI goes on a holy war against humans to maximize its possible reward, and all the humans die.
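If it helps, here is a toy sketch of the “be able to do what you want” point. It is only an illustration under made-up assumptions, not a model of any real system: the outcomes, the random utility functions, and every name in it (`best_value`, `fraction_preferring_to_stay_on`, outcome `0` as the “shut down” status quo) are hypothetical. It samples many arbitrary goals and counts how often an agent with that goal does strictly better by staying switched on than by being shut down.

```python
import random


def best_value(utility, reachable):
    """Highest utility among the outcomes the agent can still reach."""
    return max(utility[outcome] for outcome in reachable)


def fraction_preferring_to_stay_on(num_goals=1000, num_outcomes=10, seed=0):
    """Fraction of random goals for which staying on beats being shut down.

    Assumption (hypothetical): if the agent is shut down, the world stays at
    outcome 0; if it keeps running, it can steer toward any outcome.
    """
    rng = random.Random(seed)
    outcomes = list(range(num_outcomes))
    reachable_if_shut_down = [0]
    reachable_if_running = outcomes

    prefers_running = 0
    for _ in range(num_goals):
        # An arbitrary goal: a random utility for every possible outcome.
        utility = {outcome: rng.random() for outcome in outcomes}
        value_running = best_value(utility, reachable_if_running)
        value_shut_down = best_value(utility, reachable_if_shut_down)
        if value_running > value_shut_down:
            prefers_running += 1
    return prefers_running / num_goals


if __name__ == "__main__":
    print(fraction_preferring_to_stay_on())
```

With these assumptions, staying on is never worse, and it is strictly better unless the status quo happens to be the agent’s favorite outcome, so the fraction should come out at roughly `1 - 1/num_outcomes`. Nothing in the setup mentions humans at all; the preference for not being shut down falls out of almost any goal.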
Thanks for the response. Definitely going to dive deeper into this.