I think the right direction is to think about what we want from self-modification informally, and refine that into a toy problem that the agent in the post can’t solve. It’s confusing in a philosophical kind of way. I don’t have the right skill for it, but I’ve been trying to mine Eliezer’s Arbital writings.
I think the right direction is to think about what we want from self-modification informally, and refine that into a toy problem that the agent in the post can’t solve. It’s confusing in a philosophical kind of way. I don’t have the right skill for it, but I’ve been trying to mine Eliezer’s Arbital writings.