I don’t see how that makes any difference. You could convince “agents that value rationality terminally, for its own sake” that killing people is evil, but you couldn’t necessarily convince them not to kill people, just as Pebblesorters could convince such agents that 15 is composite but couldn’t necessarily convince them not to heap 15 pebbles together.
You can’t necessarily convince them, and I didn’t claim you necessarily could. Whether you can depends on the probability that morality can be figured out and/or turned into a persuasive argument. Those probabilities need to be estimated in order to estimate the likelihood that the MIRI solution is optimal, since the higher the probability of the alternatives, the lower the probability of the MIRI scenario.
Probabilities make a difference.