I’d expect a more promising approach to capability amplification to focus on actual human behavior, not on explicit human feedback. Humans are notoriously bad at explaining the real reasons why we do what we do, so accepting our words as quality feedback seems counterproductive. The feedback need not be ignored, but it should be treated as just another source of information, the same way lies and misguided ideas are a source of information about the person expressing them. The reward function would not be anything explicit, but rather a sort of Turing test (Pinocchio test?): fitting in and being implicitly recognized as a fellow human. That is how real humans learn, and it seems like a promising way to start, at least in some constrained environment with reasonably clear behavioral boundaries and expectations.
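(A minimal sketch of what such an implicit "pass as a fellow human" reward could look like, in the spirit of adversarial imitation learning such as GAIL. This is not the proposal above made precise; every name, dimension, and dataset below is an illustrative assumption. The agent is rewarded only to the extent that a discriminator, trained on actual human behavior, mistakes the agent's behavior for a human's.)

```python
# Toy "Pinocchio test" reward: a logistic-regression discriminator is trained
# to tell agent behavior apart from recorded human behavior, and the agent's
# reward is the log-probability of being classified as human. All names and
# data here are illustrative assumptions, not a reference implementation.

import numpy as np

rng = np.random.default_rng(0)

STATE_DIM = 4          # assumed toy state dimensionality
N_HUMAN_DEMOS = 200    # recorded human behavior, not explicit verbal feedback


def featurize(state, action):
    """Concatenate state and scalar action into one feature vector."""
    return np.concatenate([state, [action]])


# Discriminator parameters (shared by the functions below).
w = np.zeros(STATE_DIM + 1)


def p_human(x):
    """Discriminator's probability that the (state, action) came from a human."""
    return 1.0 / (1.0 + np.exp(-x @ w))


def reward(state, action):
    """Implicit reward: log-probability of passing as a fellow human."""
    return np.log(p_human(featurize(state, action)) + 1e-8)


def train_discriminator(human_xs, agent_xs, lr=0.1, steps=100):
    """One round of logistic-regression updates; human samples are the positive class."""
    global w
    xs = np.vstack([human_xs, agent_xs])
    ys = np.concatenate([np.ones(len(human_xs)), np.zeros(len(agent_xs))])
    for _ in range(steps):
        grad = ((p_human(xs) - ys)[:, None] * xs).mean(axis=0)
        w -= lr * grad


# Toy stand-ins for "actual human behavior" and current agent rollouts.
human_data = rng.normal(1.0, 1.0, size=(N_HUMAN_DEMOS, STATE_DIM + 1))
agent_data = rng.normal(0.0, 1.0, size=(N_HUMAN_DEMOS, STATE_DIM + 1))

train_discriminator(human_data, agent_data)
print(reward(agent_data[0, :STATE_DIM], agent_data[0, -1]))
```

(In a full training loop the agent would be optimized against this reward while the discriminator is periodically retrained, so "fitting in" stays a moving target rather than a fixed rule the agent can memorize.)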
Agreed, but the hard question seems to be how you interpret that feedback, given that you can’t interpret it literally.
FYI, this sounds like imitation learning.
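(For comparison, a minimal behavioral-cloning sketch of the most basic form of imitation learning: fit a policy directly to recorded human state-action pairs, with no explicit reward or verbal feedback in the loop. The data and linear policy class below are toy assumptions.)

```python
# Behavioral cloning: supervised regression from states to the actions a
# human actually took. Everything below is a toy illustration.

import numpy as np

rng = np.random.default_rng(1)

# Toy demonstrations: 4-dimensional states, one continuous action each.
states = rng.normal(size=(500, 4))
true_w = np.array([0.5, -1.0, 2.0, 0.3])              # hidden "human" policy
actions = states @ true_w + rng.normal(scale=0.1, size=500)

# Fit the imitating policy by least squares on the demonstrations.
w_bc, *_ = np.linalg.lstsq(states, actions, rcond=None)


def policy(state):
    """Imitating policy: predict the action a human would have taken."""
    return state @ w_bc


print(policy(states[0]), actions[0])
```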