As an aside, “waiting for Eliezer to find a loophole” probably does not constitute a safe and effective means of testing AI utility functions. This is something we want provable from first principles, not “proven” by “well, I can’t think of a counterexample”.
Of course, hence “...and probably complicated in some other ways that haven’t occurred to me in two minutes.”
Right. I know you realize this, and the post was fine in the context of “random discussion on the internet”. However, if someone wants to actually, seriously specify a utility function for an AI, any approach that starts with “here’s a high-level rule to avoid bad things” and then works from there by hunting for potential loopholes is deeply and fundamentally misguided, completely independently of the rule proposed.