Eliezer, this particular point you made is of concern to me:
“* When an optimization process seems to have an inconsistent preference ranking—for example, it’s quite possible in evolutionary biology for allele A to beat out allele B, which beats allele C, which beats allele A—then you can’t interpret the system as performing optimization as it churns through its cycles. Intelligence is efficient optimization; churning through preference cycles is stupid, unless the interim states of churning have high terminal utility.”
You see, it seems quite likely to me that humans evaluate utility in just such a circular way under many circumstances, and therefore aren't performing any optimization. Ask middle-school girls to rank boyfriend preference and you find Billy beats Joey, who beats Micky, who beats Billy… Now, when you ask an AI to optimize human utility based on observing how people optimize their own utility as evidence, what do you suppose will happen?

Certainly humans optimize some things, sometimes, but optimizing some things is at odds with optimizing others. Think of how some people want both security and adventure. A man might have one (say security), be happy for a time, get bored, then move on to the other and repeat the cycle. Is optimization the flux between the two states? Or whichever state gives the most utility over the other? I suppose you could take an integral of utility over time and find which sequence of states maximizes utility over time. But how are we going to begin to define utility? "We like it! But it has to be real, no wire-heading."

Now throw in the complication of different people having utility functions at odds with each other. Not everyone can be king of the world, no matter how much utility they would derive from the position. Then ask the machine to be efficient, to do it as easily as possible, so that easier solutions are favored over more difficult, "expensive" ones.
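To make those two points concrete, here is a minimal Python sketch. All the names and numbers are invented for illustration: the Billy/Joey/Micky cycle admits no consistent ranking at all, and a crude "integral of utility over time" (with an assumed boredom penalty for repeating a state) can favor cycling between security and adventure over sticking with either one.

```python
from itertools import permutations

# Part 1: a cyclic pairwise preference, as in the Billy/Joey/Micky example.
# Any single utility scale would need u[Billy] > u[Joey] > u[Micky] > u[Billy],
# so no consistent ranking exists.
people = {"Billy", "Joey", "Micky"}
beats = {("Billy", "Joey"), ("Joey", "Micky"), ("Micky", "Billy")}

def consistent_ranking_exists(people, beats):
    """Return True if some strict ordering agrees with every 'beats' pair."""
    for order in permutations(people):
        rank = {p: i for i, p in enumerate(order)}  # lower index = more preferred
        if all(rank[winner] < rank[loser] for winner, loser in beats):
            return True
    return False

print(consistent_ranking_exists(people, beats))  # False: the cycle has no ranking

# Part 2: a crude "integral of utility over time", with a hypothetical boredom
# penalty that erodes the utility of a state the longer you stay in it.
def total_utility(states, base={"security": 1.0, "adventure": 1.0}, boredom=0.2):
    total, streak, prev = 0.0, 0, None
    for s in states:
        streak = streak + 1 if s == prev else 1
        total += max(base[s] - boredom * (streak - 1), 0.0)
        prev = s
    return total

T = 10
print(total_utility(["security"] * T))                      # 3.0: boredom erodes utility
print(total_utility(["security", "adventure"] * (T // 2)))  # 10.0: cycling scores higher here
```

On these made-up numbers the flux between the two states really is the "optimum", which is exactly the sort of behavior the quoted passage calls churning.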
Even if we avoid all the pitfalls of ‘misunderstanding’ the initial command to ‘optimize utility’, what gives you reason to assume you or I or any of the small, small subsegment of the population that reads this blog is going to like what the vector sum of all human preferences, utilities, etc. coughs up?
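As a toy illustration of that last worry, here is a minimal sketch, with entirely made-up numbers and with plain summation standing in as one crude reading of a "vector sum" of preferences: the aggregate optimum need not be anybody's first choice.

```python
# Hypothetical utilities for two people over a few candidate outcomes.
utilities = {
    "Alice": {"alice_rules": 10, "bob_rules": 0, "compromise": 7},
    "Bob":   {"alice_rules": 0, "bob_rules": 10, "compromise": 7},
}

outcomes = ["alice_rules", "bob_rules", "compromise"]
# Crude aggregation: sum each outcome's utility across everyone.
aggregate = {o: sum(person[o] for person in utilities.values()) for o in outcomes}

print(aggregate)                          # {'alice_rules': 10, 'bob_rules': 10, 'compromise': 14}
print(max(aggregate, key=aggregate.get))  # 'compromise' -- neither Alice's nor Bob's favorite
```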