This reminds me of what I like best about what had been Paul’s approach (now it’s shared by more people): a better acknowledgement of the limitations of humans and of the difficulty of building models of them that are any less complex than the original human itself. I realize there are many reasons people would prefer not to worry about these things, the main one being that additional, stronger constraints make the problem easier to solve, but I think actually solving AI alignment will require facing these more general challenges under weaker assumptions. I think Paul’s existing writing probably doesn’t go far enough, but he does call this the “easy” problem, so we can address the epistemological issues surrounding one agent learning about what exists in another agent’s experience in the “hard” version.