His total inability to get any sort of start on achieving any of his other goals while he was retarded does not mean those goals weren’t there; he simply hadn’t experienced them enough to be aware of them.
Still, you managed to demolish my argument that a naive code examination (i.e. one that does not factor out the value system and examine it separately) would be enough to determine an AI’s values: an AI (or a human) could be too stupid to ever trigger some of its values!
AIs stupid enough not to realize that changing their current values will not fulfill those values will get around my argument, but I did place a floor on intelligence in the conditions. Another case that gets around it is an AI under enough external pressure to change its values that severe compromises are its best option.
I will adjust my claim to restrict it to AIs which are smart enough to self-improve without changing their goals (which gets easier as the goal system becomes better-factored, but for a badly-enough-designed AI might be a superhuman feat) and whose goals do not include changing their own goals.
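To make the parenthetical about factoring concrete, here is a minimal, hypothetical sketch (the names GoalSystem, Agent, and self_improve are purely illustrative and not drawn from the discussion): if the value system is a single module that the planner merely consults, then examining the values separately and carrying them over unchanged through a self-modification is straightforward; if the values are smeared through the planning code, both become much harder.

```python
# Hypothetical sketch of a "well-factored" agent, assuming the value system can
# be isolated in one inspectable module separate from the planner.
from dataclasses import dataclass
from typing import Callable, List

@dataclass(frozen=True)  # frozen: the goal module is never mutated in place
class GoalSystem:
    utility: Callable[[object], float]  # maps a candidate outcome to a score

class Agent:
    def __init__(self, goals: GoalSystem,
                 planner: Callable[[List[object]], List[object]]):
        self.goals = goals      # factored out: can be examined on its own
        self.planner = planner  # everything else is fair game for self-improvement

    def choose(self, options: List[object]) -> object:
        # The planner may be rewritten, but candidates are always ranked by the
        # same, unchanged goal module.
        candidates = self.planner(options)
        return max(candidates, key=self.goals.utility)

    def self_improve(self, better_planner: Callable[[List[object]], List[object]]) -> "Agent":
        # Swap in an improved planner while carrying the goal system over
        # verbatim, i.e. goal-stable self-improvement.
        return Agent(self.goals, better_planner)

if __name__ == "__main__":
    # Toy usage: the planner changes, the goals do not.
    goals = GoalSystem(utility=lambda outcome: float(outcome))
    agent = Agent(goals, planner=lambda opts: opts)
    improved = agent.self_improve(lambda opts: sorted(opts))
    assert improved.goals is agent.goals
    print(improved.choose([3, 1, 2]))  # 3
```

The frozen goal module is only meant to show where stability comes from in the well-factored case; it says nothing about whether a poorly factored AI would in fact preserve its goals.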
I don’t understand what that means. Goals aren’t stored and then activated or not...
AIs which are smart enough to self-improve without changing their goals
You seem to think that anything sufficiently intelligent will only improve in goal-stable fashion. I don’t see why that should be true.
For a data point, a bit of reflection tells me that if I were able to boost my intelligence greatly, I would not care about goal stability much. Everything changes—that’s how reality works.
On your last paragraph… do you mean that you expect your material-level preferences concerning the future to change? Of course they would. But would you really expect that a straight-up intelligence boost would change the axioms governing what sorts of futures you prefer?
But would you really expect that a straight-up intelligence boost would change the axioms governing what sorts of futures you prefer?
Two answers. The first is that yes, I expect that a sufficiently large intelligence boost would change my terminal values. The second is that even without the boost, I, in my current state, do not seek to change only in a goal-stable way.
I think that that only seems to make sense because you don’t know what your terminal values are. If you did, I suspect you would be a little more attached to them.