At the risk of providing a non-answer, I’ll say: operant conditioning.
The test problem, the solving of it, and getting an answer correspond to a light coming on, pressing a lever, and getting food.
We’ve long since been trained that solving problems in that context builds up token points that will pay out later in praise and promises of money.
Presumably this training translates fairly well to real world problems.
Indeed, that’s the conclusion I came to. What I wonder now is how we operant-condition ourselves without just reinforcing reinforcement itself. Which, I suppose, is more or less precisely what the Friendly AI problem is.