I think we should expect AGIs to have more stable goal systems, less affected by their beliefs and environment than humans' are. Remember that humans are a symbol-processing system on top of a behavior-learning system on top of an association-learning system. We don't let our beliefs propagate by default, and our brains undergo physiological changes as we age. It seems like there would be a lot more room for goal change in such a messy, aging architecture.
The least intelligent humans tend not to be very cautious and to have poor impulse control. (Examples: children and petty criminals.) The brain areas associated with impulse control develop only later in life, just as intelligence does, so we tend to assume that intelligence and cautious behavior are correlated. But things don't have to be that way for AGIs. Let's be careful not to generalize from humans and assume that unintelligent AGIs will be incautious about modifying themselves.
If a system as poorly designed as a human has the ability to change its goals in response to stimuli, and we find this to be a desirable property, then surely a carefully designed AI will have the same property, unless we have an even better property to replace it with? The argument "humans are badly designed, AIs are well designed, therefore AIs will do something bad" seems implausible at face value.
(Note that I would like something that acquires desirable goals more reliably than humans do, so I still think FAI research is worthwhile, but I would prefer that only the strongest arguments be presented for it, especially given the base rate of objection to FAI-style arguments.)
Why is changing one’s goals in response to stimuli a valuable property? A priori, it doesn’t seem valuable or harmful.
This wasn’t meant to be an argument either way for FAI research, just a thought on something Pei said.