This thing looks more and more relevant the longer I think about it. What it does is not just optimize an objective function in a weird and unexpected way, but actually learn it, in all its complexity, from observed human behavior.
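To make “learning the objective from observed human behavior” concrete, here is a minimal Python sketch of the crudest version of the idea, under assumptions of my own (the Frame representation, the counter names, and learn_utility are illustrative inventions, not the paper’s actual algorithm): watch which observable counters a recorded human playthrough makes go up, and adopt “make those counters go up” as the utility function.

    from typing import Dict, List

    # One observed frame of play: a handful of observable counters.
    Frame = Dict[str, int]

    def learn_utility(trace: List[Frame]) -> List[str]:
        """Return the counters that the human's play made (weakly) increase."""
        increasing = []
        for key in trace[0]:
            values = [frame[key] for frame in trace]
            if all(b >= a for a, b in zip(values, values[1:])):
                increasing.append(key)
        return increasing

    def utility(state: Frame, objectives: List[str]) -> int:
        # The learned "utility": prefer states where the favored counters are higher.
        return sum(state[k] for k in objectives)

    # Toy trace: score and x-position rise over the recording; the timer falls.
    trace = [{"score": 0, "x": 0, "timer": 400},
             {"score": 0, "x": 5, "timer": 399},
             {"score": 100, "x": 9, "timer": 397}]
    print(learn_utility(trace))  # ['score', 'x']: the falling timer is ignored

Everything weird that follows comes from the gap between a proxy learned this way and what the human actually cared about.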
Would it be an overestimation to call this an FAI research paper?
An AI research paper? That much might not be an overestimation.
What’s friendly about this AI?
The point is that it’s not, but making it so is a design goal of the paper.
Example: Mario immediately jumping into a pit at level 2. According to the learned utility function of the system, it’s a good idea. According to ours, it’s not.
Just as with optimizing smiling faces. But while that was purely a thought experiment, this paper presents a practical, experimentally testable benchmark for utility-function learning, and, along the way, demonstrates a not-yet-perfect but working solution to it. (After all, Mario’s Flying Goomba Kick of High Munchkinry definitely satisfies our utility functions.)
Nothing. It’s mostly useful for illustrating cognitive biases about AI, demonstrating how alien a simple “utility”-maximizing process is compared to how humans think about things. It’s an example answer to the standard “But my AI wouldn’t do a stupid thing like that” objection. Well, yes, actually, it would. And the simpler and more elegant your design is, the higher the probability that it will do things like that: things you don’t even consider, because to a human they’re obviously stupid. (At the same time, of course, it will also do things that seem utterly brilliant to a human, for the exact same reason: finding the brilliant move first required doing something stupid, like jumping at an enemy.)
It also illustrates some decision theory concepts, like looking into the future to see how your actions fare, and the importance of matching the machine’s “utility” with a human’s utility. (In each game, the actual game utility differs in certain ways from the simple utility function derived from scoring, and it’s these differences that create the bad-weird moves.)
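Both concepts fit in a small sketch, with everything below (plan, step, the stomp toy world) being illustrative assumptions of mine rather than the paper’s method: the planner tries short action sequences in a simulator, scores where each one ends up with the learned “utility”, and any gap between that proxy and the true game utility is exactly where the bad-weird moves come from.

    from itertools import product

    def plan(state, actions, step, utility, depth=3):
        """Return the action sequence whose simulated end state the given
        utility scores highest. step() is assumed to be a pure simulator,
        so looking into the future never touches the real game."""
        best_seq, best_score = None, float("-inf")
        for seq in product(tuple(actions), repeat=depth):
            s = state
            for a in seq:
                s = step(s, a)
            if utility(s) > best_score:
                best_seq, best_score = seq, utility(s)
        return best_seq

    # Toy world: state = (x, score, alive). Stomping the enemy at x == 1
    # earns 100 points, but the enemy hovers over a pit, so you fall in.
    def step(s, a):
        x, score, alive = s
        if not alive:
            return s
        if x == 1 and a == "stomp":
            return (x + 1, score + 100, False)  # the points, then the pit
        return (x + 1, score, alive)

    proxy = lambda s: s[1]                                  # utility derived from scoring alone
    true_utility = lambda s: s[1] - (0 if s[2] else 1000)   # dying actually matters

    start = (0, 0, True)
    print(plan(start, ["walk", "stomp"], step, proxy))         # includes the fatal stomp
    print(plan(start, ["walk", "stomp"], step, true_utility))  # walks past the enemy

Under the score-derived proxy, kicking the enemy into the pit is the best move the lookahead can find; price death into the utility and it never happens.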