A line in the wiki article on “paperclip maximizer” caught my attention:

“the notion that life is precious is specific to particular philosophies held by human beings, who have an adapted moral architecture resulting from specific selection pressures acting over millions of years of evolutionary time.”
Why don’t we set up an evolutionary system within which valuing other intelligences, cooperating with them, and retaining those values across self-improvement iterations would be selected for?
A specific plan:
1. Simulate an environment with a large number of AI agents competing for resources.
2. Make access to those resources what allows an agent to perform a self-improvement iteration.
3. Rig the environment so that success requires cooperating with other intelligences of the same or a lower level.
4. Repopulate the next environment with copies of the agents that succeed.

Over a sufficient number of generations, this should select for agents that value other intelligences and that preserve those values through self-modification (see the sketch below).
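To make the loop concrete, here is a minimal toy sketch in Python. Everything in it is a stand-in I have invented for illustration: an “agent” is reduced to a capability number plus a propensity to cooperate, “self-improvement” is just a capability increment bought with resources, and the rigged environment is a pairwise encounter that only pays out on mutual cooperation. It deliberately elides the hard part of the proposal, namely agents that actually modify themselves and could modify their own disposition to cooperate.

```python
# Toy sketch of the proposed selection loop. All names and numbers here are
# illustrative stand-ins, not a serious simulation of self-improving agents.
import random
from dataclasses import dataclass

POPULATION = 100
GENERATIONS = 50

@dataclass
class Agent:
    capability: float
    cooperativeness: float  # probability of cooperating in a given encounter
    resources: float = 0.0

def encounter(a: Agent, b: Agent) -> None:
    """Rigged interaction: resources are only produced by mutual cooperation."""
    a_coops = random.random() < a.cooperativeness
    b_coops = random.random() < b.cooperativeness
    if a_coops and b_coops:
        # Payoff depends on the weaker partner, so cooperating with
        # lower-level intelligences is still worthwhile.
        payoff = 1.0 + min(a.capability, b.capability)
        a.resources += payoff
        b.resources += payoff

def self_improve(agent: Agent, cost: float = 5.0) -> None:
    """Spend accumulated resources on capability increments."""
    while agent.resources >= cost:
        agent.resources -= cost
        agent.capability += 0.1

def next_generation(agents: list[Agent]) -> list[Agent]:
    """Repopulate the next environment with mutated copies of the most capable agents."""
    survivors = sorted(agents, key=lambda a: a.capability, reverse=True)[:POPULATION // 4]
    children = []
    for _ in range(POPULATION):
        parent = random.choice(survivors)
        children.append(Agent(
            capability=parent.capability,
            cooperativeness=min(1.0, max(0.0, parent.cooperativeness + random.gauss(0, 0.05))),
        ))
    return children

agents = [Agent(capability=1.0, cooperativeness=random.random()) for _ in range(POPULATION)]
for gen in range(GENERATIONS):
    for _ in range(10 * POPULATION):        # many random pairwise encounters per generation
        a, b = random.sample(agents, 2)
        encounter(a, b)
    for agent in agents:
        self_improve(agent)
    agents = next_generation(agents)

mean_coop = sum(a.cooperativeness for a in agents) / POPULATION
print(f"mean cooperativeness after {GENERATIONS} generations: {mean_coop:.2f}")
```

The only point the sketch captures is where the rigging lives: because `encounter` pays out only on mutual cooperation and scales with the weaker partner’s capability, selection on capability indirectly selects on cooperativeness. Whether anything like that survives once the agents can rewrite their own decision procedures is exactly the open question.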
What do people think? I can see a few possible sources of error myself, but would like to hear your responses uncontaminated. [Given the importance of the topic, you can assume Crocker’s rules are in effect.]
Defining the metric for cooperation robustly enough that you could unleash the resulting evolved AI on the real world might not be any easier than figuring out what an FAI’s utility function should be directly.
Also, a sufficiently intelligent AI may be able to hijack the game before we could decide whether it was ready to be released.