Yes. You can convince a sufficiently rational paperclip maximizer that killing people is Yudkowsky::evil, but you can’t convince it not to take Yudkowsky::evil actions, no matter how rational it is. AKA the orthogonality thesis (when talking about other minds) and “the utility function is not up for grabs” (when talking about ourselves).
You are using “rational” to mean instrumentally rational. You can’t disprove the existence of agents that value rationality terminally, for its own sake… indeed, the orthogonality thesis implies they must exist. And when people say rationally persuadable agents exist, that is what they mean by “rational”… they are not using your language.
I don’t see how that makes any difference. You could convince “agents that value rationality terminally, for its own sake” that killing people is evil, but you couldn’t necessarily convince them not to kill people, much like Pebblesorters could convince them that 15 is composite but they couldn’t necessarily convince them not to heap 15 pebbles together.
You can’t necessarily convince them, and I didn’t claim you necessarily could. That depends on the probability that morality can be figured out and/or turned into a persuasive argument. These probabilities need to be estimated in order to estimate the likelihood of the MIRI solution being optimal, since the higher the probability of the alternatives, the lower the probability of the MIRI scenario.
Probabilities make a difference.