Avoiding transformation into Goal System Zero is a nearly universal instrumental value
Do you claim that that is an argument against goal system zero? But, Carl, the same argument applies to CEV—and almost every other goal system.
It strikes me as more likely that an agent’s goal system will transform into goal system zero than it will transform into CEV. (But surely the probability of any change or transformation of terminal goal happening is extremely small in any well engineered general intelligence.)
Do you claim that that is an argument against goal system zero? If so, I guess you also believe that the fragility of the values to which Eliezer is loyal is a reason to be loyal to them. Do you? Why exactly?
I acknowledge that preserving fragile things usually has instrumental value, but if the fragile thing is a goal, I am not sure that that applies, and even if it does, I would need to be convinced that a thing’s having instrumental value is evidence I should assign it intrinsic value.
Note that the fact that goal system zero has high instrumental utility is not IMHO a good reason to assign it intrinsic utility. I have not mentioned in this comment section what most convinces me to remain loyal to goal system zero; that is not what Robin Powell asked of me. (It just so happens that the shortest and quickest explanation I know of of goal system zero involves common instrumental values.)
Do you claim that that is an argument against goal system zero? But, Carl, the same argument applies to CEV—and almost every other goal system.
It strikes me as more likely that an agent’s goal system will transform into goal system zero than it will transform into CEV. (But surely the probability of any change or transformation of terminal goal happening is extremely small in any well engineered general intelligence.)
Do you claim that that is an argument against goal system zero? If so, I guess you also believe that the fragility of the values to which Eliezer is loyal is a reason to be loyal to them. Do you? Why exactly?
I acknowledge that preserving fragile things usually has instrumental value, but if the fragile thing is a goal, I am not sure that that applies, and even if it does, I would need to be convinced that a thing’s having instrumental value is evidence I should assign it intrinsic value.
Note that the fact that goal system zero has high instrumental utility is not IMHO a good reason to assign it intrinsic utility. I have not mentioned in this comment section what most convinces me to remain loyal to goal system zero; that is not what Robin Powell asked of me. (It just so happens that the shortest and quickest explanation I know of of goal system zero involves common instrumental values.)