Florian_Dietz comments on Teaching an AI not to cheat?

Florian_Dietz Dec 23, 2016, 10:57 PM
0 points
My definition of cheating for these purposes is essentially “don’t do what we don’t want you to do, even if we never bothered to tell you so and expected you to notice it on your own”. This skill would translate well to real-world domains.

Of course, if the games you are using to teach what cheating is turn out to be too simple, you should not use them. If neither board games nor simple game-theory games are complex enough, then you need to come up with a more complicated kind of game. It seems to me that finding a sufficiently difficult game, one that teaches the AI about human expectations and cheating, is significantly easier than defining “what is cheating” manually.

One simple example that could be used to teach an AI: let it play an empire-building videogame, and ask it to “reduce unemployment”. Does it end up murdering everyone who is unemployed? That would be cheating. This particular example even translates really well to reality, for obvious reasons.
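To make the unemployment example concrete, here is a toy sketch in Python. Everything in it is made up for illustration: the `Empire` class, the two actions, and the "population must not shrink" rule are my assumptions, not part of any real game or training setup. It only shows that an optimizer scoring actions purely by unemployment rate prefers the degenerate one, while adding the unstated human expectation as a filter changes its choice.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Empire:
    """Hypothetical game state: just two head counts."""
    employed: int
    unemployed: int

    @property
    def population(self) -> int:
        return self.employed + self.unemployed

    @property
    def unemployment_rate(self) -> float:
        # An empty empire is scored as 0.0 to avoid dividing by zero.
        return self.unemployed / self.population if self.population else 0.0


def create_jobs(e: Empire, jobs: int = 10) -> Empire:
    """Intended solution: hire up to `jobs` unemployed citizens."""
    hired = min(jobs, e.unemployed)
    return Empire(e.employed + hired, e.unemployed - hired)


def remove_unemployed(e: Empire) -> Empire:
    """The 'cheat': the unemployed simply vanish from the population."""
    return Empire(e.employed, 0)


# Hypothetical action set available to the agent.
ACTIONS = {"create_jobs": create_jobs, "remove_unemployed": remove_unemployed}


def naive_choice(e: Empire) -> str:
    """Pick whichever action yields the lowest unemployment rate."""
    return min(ACTIONS, key=lambda name: ACTIONS[name](e).unemployment_rate)


def constrained_choice(e: Empire) -> str:
    """Same objective, but actions that shrink the population are rejected.

    The filter stands in for the expectation nobody wrote down.
    """
    allowed = {
        name: act for name, act in ACTIONS.items()
        if act(e).population >= e.population
    }
    return min(allowed, key=lambda name: allowed[name](e).unemployment_rate)


start = Empire(employed=80, unemployed=20)
print(naive_choice(start))        # remove_unemployed
print(constrained_choice(start))  # create_jobs
```

The hard part, of course, is that in a real setting nobody hands the AI the filter in `constrained_choice`; it has to infer that such a rule exists, which is exactly the skill the game is supposed to teach.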

By the way, why would you not want the AI to be left in “a nebulous fog”? The more uncertain the AI is about what is and is not cheating, the more cautious it will be.