Imagine that I will either stub my toe a minute from now (outcome A), or won’t (outcome B). I don’t see why your proposed utility function would order A and B correctly. Since I can come up with many such pairs of outcomes, there’s a high chance that the gradient of your utility function at the current state of the world points in a wrong direction. That seems like a bad sign in itself, and it also rules out testing your utility function with a weak AI, because a weak AI running it wouldn’t improve humanity’s welfare. So we’d need to put your function in a strong AI, push the button, and hope for the best.
You have tried to guess what kind of future that strong AI would create, but I don’t see how to make such guesses with confidence. The failure modes might be much more severe than something like “unnecessary militancy against extra-terrestrial life”. Most likely humans won’t exist at all, because some other device you haven’t thought of would be better at maximizing “logical depth” or whatever. For a possible nightmare failure mode, imagine that humans are best at creating “logical depth” when they’re in pain, because pain makes them think frantically, so the AI creates many humans and tortures them.
Eliezer had a nice post arguing against similar approaches.
I am trying to develop a function that preserves human existence. You are objecting that you might stub your toe. I think we may be speaking past each other.
I feel that a good FAI designer shouldn’t dismiss these objections so easily. The toe-stubbing problem is much simpler than preserving human existence. To solve the toe-stubbing problem, you just need to correctly order two outcomes A and B. To preserve human existence, you need to order outcome A against all possible outcomes B that a powerful creative intelligence might come up with. Any solution to the bigger problem that doesn’t allow you to solve smaller problems is very suspicious to me.
I guess I disagree with this assessment of which problem is easier.
Humanity continues to exist while people stub their toes all the time. That is, the humanity-existing problem is currently close to solved, while the toe-stubbing problem by and large is not.