This argument is based on drawing an analogy between:
1. Humans building an AI; and
2. An AI improving itself
in the sense that both have to get their values into a system. But the two situations are substantially disanalogous, because the AI starts with a system that has its values already implemented: it can simply improve the parts that are independent of its values. Doing this would be easier with a modular architecture, but it should be doable even without that. It's much easier to find parts of the system that don't affect values than it is to nail down exactly where the values are encoded.
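To make the "modular architecture" case concrete, here is a minimal sketch in which values live in one frozen component and self-improvement only swaps out a capabilities component. The class names and the clean value/capability separation are illustrative assumptions, not a claim about how real systems decompose; the reply below questions exactly that assumption.

```python
# Hypothetical sketch: values in a frozen module, capabilities in a swappable one.
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ValueModule:
    """Scores candidate outcomes. Frozen: self-improvement never touches this."""
    weights: tuple

    def score(self, outcome: List[float]) -> float:
        return sum(w * x for w, x in zip(self.weights, outcome))


class Planner:
    """A capabilities component the system is free to replace or improve."""

    def propose(self, options: List[List[float]]) -> List[List[float]]:
        return options  # naive planner: considers every option


class SmarterPlanner(Planner):
    """An 'improved' planner: prunes clearly inferior options before scoring."""

    def propose(self, options: List[List[float]]) -> List[List[float]]:
        best_total = max(sum(o) for o in options)
        return [o for o in options if sum(o) >= 0.5 * best_total]


class Agent:
    def __init__(self, values: ValueModule, planner: Planner):
        self.values = values    # held fixed across improvements
        self.planner = planner  # the only part ever swapped out

    def act(self, options: List[List[float]]) -> List[float]:
        candidates = self.planner.propose(options)
        return max(candidates, key=self.values.score)

    def self_improve(self, better_planner: Planner) -> "Agent":
        # Improvement replaces a value-independent part; the value module is
        # reused as-is, so the values never need to be re-derived or re-encoded.
        return Agent(self.values, better_planner)


agent = Agent(ValueModule(weights=(1.0, 2.0)), Planner())
improved = agent.self_improve(SmarterPlanner())
assert improved.values is agent.values  # values carried over unchanged
```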
"It's much easier to find parts of the system that don't affect values than it is to nail down exactly where the values are encoded." -- I really don't see why this is true. How can you change only the parts that don't affect values if you don't know where the values are encoded?