I understand that bad news makes one sad but does that lead to rejecting bad news?
For standard Bayesian agents, no. But these value-updating agents behave differently. Imagine a human said to the AI: “If I say good, your action was good, and that will be your values. If I say bad, it will be the reverse.” Wouldn’t you want to motivate it to say “good”?
I have trouble seeing the difference, as I think you can turn the variable value statements into empirical facts that map to a constant value. Say that cake->yummy->good, cake->icky->bad, death->icky->bad, death->yummy->good. Then the yummy->good connection could be questioned as a matter about the world and not about values. If a Bayesian accepts sad news in that kind of world, why does the value loader try to shun it?
I might be committing mind-projection here, but no. Data is data, evidence is evidence. Expected moral data is, in some sense, moral data: if the AI predicts with high confidence that I will say “bad”, this ought already to be evidence that it ought not to have done whatever I’m about to scold it for.
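To make the point concrete, here is a toy sketch (my own illustrative numbers, not anything from the thread) of conservation of expected evidence: averaged over the verdicts the agent anticipates, the posterior equals the prior, so confidently anticipating “bad” already counts against the action now.

```python
# Toy sketch of conservation of expected evidence.
# All numbers are made up for illustration.
p_good = 0.2                # prior P(action was good)
p_say_good_if_good = 0.9    # P(human says "good" | action good)
p_say_good_if_bad = 0.1     # P(human says "good" | action bad)

# Marginal probability of hearing "good"
p_say_good = (p_good * p_say_good_if_good
              + (1 - p_good) * p_say_good_if_bad)

# Posterior after each possible verdict (Bayes' rule)
post_if_good = p_good * p_say_good_if_good / p_say_good
post_if_bad = p_good * (1 - p_say_good_if_good) / (1 - p_say_good)

# The expectation of the posterior equals the prior:
# anticipated evidence is already priced in.
expected_post = (p_say_good * post_if_good
                 + (1 - p_say_good) * post_if_bad)
print(round(expected_post, 10))  # matches p_good
```

A standard Bayesian agent therefore cannot expect to be talked into anything in advance; the asymmetry the thread worries about only appears if the verdict rewrites the values themselves rather than updating beliefs about them.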
This may clarify: http://lesswrong.com/r/discussion/lw/kdx/conservation_of_expected_moral_evidence_clarified/