Thanks, this all makes sense and I agree. Asking what it “really” values was intentionally anthropomorphic: I wanted to know what the claim that “it will want to work around constraints”, which I believe others have made, actually means in practical terms.
I’m totally on board with “we can’t express our actual desires with a finite list of constraints”; I just wasn’t on board with “an AI will circumvent constraints for kicks”.
I guess there’s a subtlety to it. If you assign “you get 1 utilon per paperclip that exists, and you are permitted to manufacture 10 paperclips per day”, then we’ll get problematic side effects as described elsewhere, because the limit sits outside the utility function. If you assign “you get 1 utilon per paperclip that you manufacture, up to a maximum of 10 paperclips/utilons per day” or something along those lines, I’m not convinced that any sort of “circumvention” behavior would occur (though the AI would probably wipe out all life to ensure that nothing could adversely affect its future paperclip production capabilities, so the distinction is somewhat academic).
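To make the two assignments concrete, here is a minimal Python sketch. The function names and the `daily_cap` parameter are purely illustrative assumptions of mine, not anyone's actual proposal or a real system's API.

```python
def utility_existing(paperclips_in_world: int) -> int:
    """First assignment: 1 utilon per paperclip that exists.

    The "10 per day" limit is NOT part of this function; it is an
    external rule, so more paperclips always means more utility.
    """
    return paperclips_in_world


def utility_capped(made_by_agent_today: int, daily_cap: int = 10) -> int:
    """Second assignment: 1 utilon per paperclip the agent itself makes,
    saturating at daily_cap (hypothetical parameter) utilons per day.
    """
    return min(made_by_agent_today, daily_cap)


# Under the first assignment, extra paperclips (from any source) still pay off.
print(utility_existing(1_000_000))

# Under the second, anything past the cap earns nothing for that day.
print(utility_capped(1_000_000))
print(utility_capped(7))
```

In the first function the limit is invisible to the thing being maximized, so getting around it is rewarded; in the second the cap is part of what is being maximized, so exceeding it gains nothing that day.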
In any case, thanks for the detailed reply :)