In the example of a human overcoming the “win at chess” frame, I don’t see how that reduces orthogonality. The example given is that “the point is to have a good time,” but I could comparably plausibly see a parent thinking “we need to teach this kid that the world is a hard place” and going all out. Both feature the relevant kind of frame-shifting away from simply winning, yet there is no objectively right “better goal”; the two don’t converge on what the greater point might be.
Applied to humans: just because people do ethics doesn’t mean they agree on it. I can also see that there can be multiple “fronts” of progress; different political systems will call for different kinds of ethical progress. The logic seems to be that because humans are capable of modest general intelligence, a human who had a silly goal would reflect their way out of it. This would seem to suggest that a country waging a war of aggression would just see the error of its ways and correct course toward peace. While we often do think our enemies are doing ethics wrong, I don’t think that goal non-sharing is well explained by the other party being unable to sustain ethical thinking.
Therefore I think there is a hidden assumption that goal transcendence happens in the same direction in all agents, and this is needed for goal transcendence to wipe out orthogonality. Worse, we might start with the same goal and reinterpret the situation to mean different things: chess alone isn’t sufficient to nail down whether it is more important for children to learn to be sociable or to be effective in the world. One could even imagine worlds where one of those answers is heavily favoured yet which still contain identical games of chess (living in Sparta vs. in the internet age). Insofar as agreement among human opinions rests on trying to solve the same “human condition,” that could be in jeopardy if the “AI condition” is genuinely different.