I spoke with Huw about this idea.
I was thinking along similar lines at some point, but only for “safe-shutdown”, e.g. if you had a self-driving car that anticipated encountering a dangerous situation and wanted to either:
- pull over immediately
- cede control to a human operator
It seems intuitive to give it a shutdown policy that triggers in such cases and aims to minimize a combined objective of time-to-shutdown and risk-of-shutdown.
(Of course, this doesn’t deal with interrupting the agent, à la Armstrong and Orseau’s “Safely Interruptible Agents”.)
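To make that objective concrete, here’s a minimal sketch of what the combined shutdown cost might look like; the `ShutdownPlan` interface, the time/risk estimates, and the `risk_weight` trade-off parameter are all illustrative assumptions on my part, not anything from the actual proposal:

```python
from dataclasses import dataclass

@dataclass
class ShutdownPlan:
    """A candidate way to reach a safe shut-down state (hypothetical interface)."""
    expected_time: float  # expected seconds until the agent is fully shut down
    expected_risk: float  # estimated probability of harm while shutting down

def shutdown_cost(plan: ShutdownPlan, risk_weight: float = 100.0) -> float:
    """Combined objective: trade time-to-shutdown off against risk-of-shutdown.

    risk_weight is an assumed tuning knob; raising it makes the policy
    prefer slower but safer shut-downs.
    """
    return plan.expected_time + risk_weight * plan.expected_risk

# E.g. the self-driving car picks between pulling over immediately
# (fast, somewhat risky) and ceding control to a human (slower, safer):
plans = {
    "pull over immediately": ShutdownPlan(expected_time=5.0, expected_risk=0.02),
    "cede control to human": ShutdownPlan(expected_time=30.0, expected_risk=0.001),
}
best = min(plans, key=lambda name: shutdown_cost(plans[name]))
print(best)  # -> "pull over immediately" under these made-up numbers
```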
Huw pointed out that a similar strategy can be used for any “genie”-style goal (i.e., you want the agent to do one thing as efficiently as possible and then shut down until you give it another command), which made me substantially more interested in it.
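In code terms, I read the pattern Huw described as something like the loop below; the `execute`/`shutdown` interface and the command source are my own illustrative assumptions:

```python
from typing import Callable, Optional

class ToyAgent:
    """Stand-in agent with an assumed execute/shutdown interface."""
    def execute(self, command: str) -> None:
        print(f"executing: {command}")

    def shutdown(self) -> None:
        print("shut down; awaiting next command")

def genie_loop(agent: ToyAgent, next_command: Callable[[], Optional[str]]) -> None:
    # Do one thing as efficiently as possible, then shut down by default
    # until the operator issues another command (None means stop entirely).
    while (command := next_command()) is not None:
        agent.execute(command)
        agent.shutdown()

commands = iter(["fetch the coffee", "file the report"])
genie_loop(ToyAgent(), lambda: next(commands, None))
```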
This approach seems similar in spirit to giving your agent a short horizon, but now you also get regular terminations by default, which brings some extra pros and cons.