There’s a difference between “don’t affect anything outside your box” and “don’t go outside your box.” My point is that we don’t necessarily have to make FAI before anyone makes a self-improving AI. There are goal systems that, while not reflecting human values and goals, would still prevent an AI from destroying humanity.
There’s a difference between “don’t affect anything outside your box” and “don’t go outside your box.”
Non-obviously, there is no difference in principle. The distinction holds for animals, which can act only by physically going where they act, but not for intelligences that can construct autonomous intelligent systems elsewhere. Whatever set of possible configurations the AI can reach outside its box, it is free to optimize according to its goals. You need only one leak of influence, however surprising, for the outcome to become determined by the AI's values, which, if not precisely tuned for humanity, are effectively fatal.
Notes on how such a goal system should be produced would make an excellent top-level post.