Long before goal mutation is a problem, malformed constraints become a problem. Consider a thought experiment: someone offers to pay you 100 dollars once a wheelbarrow is full of water from a nearby lake, and provides you with the wheelbarrow and a teaspoon. Before you have to worry about people deciding they don't care about 100 dollars, you need to decide how to keep them from simply pushing the wheelbarrow into the lake.
Long before goal mutation is a problem, malformed constraints become a problem.
True. But we are not arguing about which is the bigger (or earlier) problem. I'm being told that an AI cannot, absolutely can NOT, change its original goals (or terminal values). And that looks very handwavy to me.