No one has yet solved “and then stop” for AGI, even though this should be easier than a generic stop button, which in turn should be easier than full corrigibility. (Also, I don’t think we know how to refer to things in the world in a way that gets an AI to care about the things themselves, rather than its observations of them or its representations of them.)