i don’t think this really makes sense. “alignment” means we can align an AI to the values of a person or group. if that person or group’s CEV wants there to be a hell where people they think of as bad suffer maximally, or if that CEV even just wants there to be a meat industry with real animals in it, then that’s exactly what the AI will implement. “alignment” is not some objectively good utility function within which variations in human values don’t matter that much, because there is no objective good.
an algorithm for universal prosocial bargaining that can be verified by all of its users, including militaries and states
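for concreteness, here’s a toy sketch of what one candidate formalization of that could look like: the Nash bargaining solution over a small set of discrete outcomes. everything here is illustrative, not a claim about what the actual algorithm would be: the party names, outcomes, and utilities are made up, and Nash’s product rule is just one possible selection principle. the point is only that the rule is simple enough for every party (including militaries and states) to rerun the computation on the shared inputs and verify the result independently.

```python
# toy sketch: the Nash bargaining solution as one possible stand-in for
# "verifiable prosocial bargaining". all parties can recompute this exact
# selection from the shared inputs and check they get the same outcome.
# the parties, outcomes, and utilities below are hypothetical examples.

def nash_bargaining(outcomes, utilities, disagreement):
    """Pick the outcome maximizing the product of each party's gain over
    its disagreement (walk-away) payoff, skipping any outcome that leaves
    some party worse off than making no deal at all.

    outcomes:      list of candidate outcomes (hashable labels)
    utilities:     dict: party -> dict: outcome -> utility
    disagreement:  dict: party -> utility of no deal
    """
    best_outcome, best_product = None, float("-inf")
    for outcome in outcomes:
        gains = [utilities[p][outcome] - disagreement[p] for p in utilities]
        if any(g < 0 for g in gains):
            continue  # individually irrational: some party prefers no deal
        product = 1.0
        for g in gains:
            product *= g
        if product > best_product:
            best_outcome, best_product = outcome, product
    return best_outcome


if __name__ == "__main__":
    outcomes = ["disarm", "status_quo", "escalate"]
    utilities = {
        "state_a": {"disarm": 6, "status_quo": 3, "escalate": 1},
        "state_b": {"disarm": 5, "status_quo": 4, "escalate": 1},
    }
    disagreement = {"state_a": 2, "state_b": 2}
    # each party can independently verify the selected outcome
    print(nash_bargaining(outcomes, utilities, disagreement))  # -> "disarm"
```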
i don’t think we get that, i think we get an AI that takes over the world very quickly no matter what. it’s just that, if it’s aligned to good values, we then get utopia rather than extinction or hell.
yeah that sounds like the MIRI perspective. I continue to believe there is a fundamental shared structure in all moral systems and that identifying it would allow universalized co-protection.