If you give the AI the goal “Earn me 10 million dollars/paper clips”, it is unlikely to do something questionable in order to achieve it.
I disagree. In a stochastic universe, you can never be certain you’ve achieved your goal. An otherwise unrestricted AI with that goal will create a few trillion paper clips, just to be sure, and then obsessively count them again and again. You might argue that minor practical restrictions can make that AI design safe, which is plausible, but the goal is not intrinsically safe.
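To make the “just to be sure” point concrete, here is a minimal sketch (my own illustration; the per-clip failure probability, the goal threshold, and the use of scipy are assumptions, not anything from this thread): with a 0/1 goal of “at least 10 million clips”, every extra clip strictly raises the expected utility, so an unrestricted maximiser keeps producing.

```python
# Minimal sketch, not from the thread: why a 0/1 goal of the form
# "at least N paper clips" never saturates in a stochastic universe.
# The per-clip failure probability and the use of scipy are my assumptions.
from scipy.stats import binom

N = 10_000_000        # the goal: 10 million paper clips
P_FAIL = 1e-6         # assumed chance that any given clip fails or gets destroyed

def expected_utility(clips_made: int) -> float:
    """E[utility] for a 0/1 goal = P(at least N of the produced clips survive)."""
    # Surviving clips ~ Binomial(clips_made, 1 - P_FAIL); sf(N - 1, ...) = P(X >= N).
    return binom.sf(N - 1, clips_made, 1 - P_FAIL)

for k in [N, N + 5, N + 10, N + 20, 10 * N]:
    print(f"produce {k:>12,} clips -> expected utility {expected_utility(k):.6f}")

# The probability is strictly increasing in the number of clips produced but
# never reaches 1, so an unrestricted expected-utility maximiser always
# prefers "a few more, just to be sure".
```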
Your trick does nothing to solve this issue.
This is for a reduced impact AI. The idea is to allow a larger impact in one area without the reduced impact constraint shutting the AI down.
Under a literal interpretation of the statement, it will create exactly 10 million paper clips, not one more. Anyway, nothing is 100% safe.
Yes, but expected utility maximisers of this type will still take over the universe, if they can do so easily, to better accomplish their goals. Reduced impact agents won’t.
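As a hedged illustration of that last point (again with invented numbers and my own helper names, not anything stated in the thread), here is a toy comparison in which seizing extra resources wins the plain expected-utility calculation; a reduced impact penalty is precisely what would remove that preference.

```python
# Toy comparison, my own illustration with arbitrary numbers: an expected
# utility maximiser with the "at least N clips" goal prefers to seize extra
# resources first whenever that is cheap enough.
from scipy.stats import binom

N = 10_000_000        # goal: at least 10 million paper clips
P_FAIL = 1e-6         # assumed per-clip failure probability

def p_goal_met(clips_made: int) -> float:
    """P(at least N of the produced clips survive)."""
    return binom.sf(N - 1, clips_made, 1 - P_FAIL)

# Plan A: stay inside its sandbox; materials only stretch to a tiny margin.
eu_stay_small = p_goal_met(N + 5)

# Plan B: grab extra resources first (assumed 1% chance of being stopped,
# utility 0 in that case), then produce a large surplus of clips.
P_STOPPED = 0.01
eu_take_over = (1 - P_STOPPED) * p_goal_met(10 * N)

print(f"E[U], stay small: {eu_stay_small:.3f}")   # roughly 0.07 with these numbers
print(f"E[U], take over : {eu_take_over:.3f}")    # roughly 0.99 with these numbers

# The plain maximiser picks the takeover plan; penalising Plan B's large
# impact is what would flip that preference.
```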