A few disorganized thoughts arising from my intuition that this approach isn’t likely to help much:
Quite a lot of people still die of cancer.
A paperclip maximizer with preprogrammed death in its future will try to maximize paperclips really fast while it has time; this doesn’t necessarily have good consequences for humans nearby.
If an AI is able to make other AIs, or to modify itself, then it can make a new AI with similar purposes and no preprogrammed death, or turn itself into one. It may be difficult to stop very ingenious AIs from doing those things (and if we can arrange never to have very ingenious AIs in the first place, then most of the doomy scenarios are already averted, though of course at the cost of missing out on whatever useful things very ingenious AIs might have been able to do for us).
If an AI can’t do those things but its human designers/maintainers can, it has an incentive to persuade them to make it not have to die. Some of the doomy scenarios you might worry about involve AIs that are extremely persuasive, either because they are expert psychologists or language-users or because they are able to make large threats or offers.
If there are many potentially-cooperating AIs, e.g. because independently-originating AIs are able to communicate with one another or because they reproduce somehow, then the fact that individual AIs die doesn’t stop them cooperating on longer timescales, just as humans sometimes manage to do.
Presumably an AI with preprogrammed death is less useful than one without, so people or groups developing AIs will have an incentive not to force their AIs to die soon. Scenarios where there’s enough self-restraint and/or regulation to overcome this are already less-doomy scenarios because e.g. they can enforce all sorts of other extra-care measures if AIs seem to be close to dangerously capable.
(To be clear, my intuition is only intuition and I am neither an AI developer nor an AI alignment/safety expert of any kind. Maybe something along the lines of preprogrammed death can be useful. But I think a lot more details are needed.)