Suppose the AI builds devices in the environment, especially computational devices designed to offload cognitive labor. What do you want to happen when the AI is “switched off”? Hence, magical category.
Interesting; I hadn’t thought of this situation. How do you define “lack of intelligence” or “removal of the effect of intelligence” in the environment, so that an AI can implement that state? And how is this state best achieved?
Once the system is established, the world will forever be determined by a specific goal system, even if the goal is for the world to appear, from a certain time onward, as if no AI is present. The best the AI can do is pretend not to be present, “pulling the planets along their elliptic orbits”.
D’oh. Yes, of course, that breaks it.
As an aside, “waiting for Eliezer to find a loophole” probably does not constitute a safe and effective means of testing AI utility functions. This is something we want provable from first principles, not “proven” by “well, I can’t think of a counterexample”.
Of course, hence ”...and probably complicated in some other ways that haven’t occurred to me in two minutes.”.
Right. I know you realize this, and the post was fine in the context of “random discussion on the internet”. However, if someone wants to actually, seriously specify a utility function for an AI, then any approach that starts with “here’s a high-level rule to avoid bad things” and works from there by looking for potential loopholes is deeply and fundamentally misguided, completely independently of the rule proposed.