Hmm… interesting ideas. I don’t intend to suggest that the AI would have human intentions at all; I think we might be modelling the idea of a failsafe in different ways.
I was assuming that the idea was an AI with a separate utility-maximising system, which is also made to follow laws as absolute, inviolable rules, thus stopping unintended consequences of the utility maximisation. In this system, the AI would ‘want’ to pursue its more general goal and the laws would be blocks. As such, it would find other ways to pursue its goals, including changing the laws themselves.
If the corpus of laws instead forms part of what the computer is trying to achieve/uphold, we face different problems. Firstly, laws are prohibitions, and it’s not clear how to ‘maximise’ them beyond simple obedience, unless it’s stopping other people breaking them in a Robocop way. Second, failsafes are needed because even ‘maximise human desire satisfaction’ can throw up lots of unintended results. An entire corpus of law would be far more unpredictable in its effects as a core programme, and thus require even more failsafes!
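To make the contrast concrete, here is a minimal toy sketch of the two designs as I’m picturing them. Everything in it (the Law type, the forbids predicate, the penalty weight) is my own illustrative invention, not anybody’s actual proposal:

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Law:
    name: str
    forbids: Callable[[str], bool]  # True if the action would break this law

def choose_constrained(actions: List[str], utility: Callable[[str], float],
                       laws: List[Law]) -> str:
    """Design 1: laws as blocks. Maximise utility over whatever actions the
    laws leave admissible. Note that 'lobby to amend an inconvenient law' is
    itself an admissible action, so a constrained maximiser can rank it highly."""
    admissible = [a for a in actions if not any(law.forbids(a) for law in laws)]
    return max(admissible, key=utility)

def choose_law_upholding(actions: List[str], utility: Callable[[str], float],
                         laws: List[Law], penalty: float = 1000.0) -> str:
    """Design 2: compliance folded into the objective. Obedience saturates at
    zero violations, so beyond that point the rest of the objective dominates
    (unless the AI starts policing everyone else's violations, Robocop-style)."""
    def score(action: str) -> float:
        violations = sum(law.forbids(action) for law in laws)
        return utility(action) - penalty * violations
    return max(actions, key=score)
```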
On a side point, my argument about cause, negligence, etc. was not that the computer would fail to understand them, but that, as regards a superintelligence, they could easily be either meaningless or over-effective.
For an example of the latter: if we allow someone to die, that’s criminal negligence. This is designed for cases like walking past drowning people and ignoring them. A law-abiding computer might calculate, say, that even with cryonics etc., every life will end in death due to the universe’s heat death. It might then sterilise the entire human population to avoid new births, as each birth would necessitate a death. And so on. Obviously this would clash with other laws, but that’s part of the problem: every action would involve culpability in some way, due to the AI’s greater knowledge of consequences.
The laws might appropriately be viewed primarily as blocks that keep the AI from taking actions deemed unacceptable by the collective. AIs could pursue whatever goals they see fit within the constraints of the law.
However, the laws wouldn’t all be prohibitions. The “general laws” would be more prescriptive, e.g., life, liberty, justice for all. The “specific laws” would tend to be more prohibition-oriented. Presumably the vast majority of them would be written to handle common situations and important edge cases. If someone suspects the citizenry may be in jeopardy from frequent runaway trolley incidents, the legislature can write statutes on what is legal to throw under the wheels to prevent deaths of (certain configurations of) innocent bystanders. We’d probably want to start with inanimate objects before considering sentient robots, terminally sick humans, fat men, puppies, babies, and whatever. (It might be nice to have some clarity on this! :-))
To explore your negligence example, I imagine some statute might require agents to rescue people in imminent danger of losing their lives if possible, subject to certain extenuating circumstances. The legislature and public can have a lively debate about whether this law still makes sense in a future where dead people can be easily reanimated, or if human life is really not valuable in the grand scheme of things. If humans have good representatives in the legislature and/or a few good AI advocates, mass human extermination shouldn’t be a problem, at least until the consensus shifts in such directions. Perhaps some day there may be a consensus on forced sterilizations to prevent greater harms; I’d argue such a system of laws should be able to handle it. The key seems to be to legislate prescriptions and prohibitions relevant to the current state of society and change them as the facts on the ground change. This would seem to get around the impossibility of defining eternal laws or algorithms that are ever-true in every possible future state.
I still don’t see how laws as barriers could be effective. People are arguing about whether it’s possible to write highly specific failsafe rules capable of acting as barriers, and the general feeling is that you wouldn’t be able to second-guess the AI well enough to do that effectively. I’m not sure what replacing these specific rules with a large corpus of laws achieves. On the plus side, you’ve got a large group of overlapping controls that might cover each other’s weaknesses. But they’re not specially written with AI in mind, and even if they were, small political shifts could open loopholes. And their sheer number also means that you can’t clearly see what’s permitted or not: it risks an illusion of safety simply because we find it harder to think of something bad an AI could do that doesn’t break any law.
Not to mention that a utility-maximising AI would seek to change laws to make them better for humans, so the rules controlling the AI would themselves be a target of its influence.
I guess here I’d reiterate this point from my latest reply to orthonormal:
Again, it’s not only about having lots of rules. More importantly, it’s about the checks and balances and enforcement the system provides.
It may not be helpful to think of some grand utility-maximising AI that constantly strives to maximize human happiness or some other similar goal, and could cause us to wake up in some alternate reality some day. It would be nice to have some AIs working on how to maximize some things humans value, e.g., health, happiness, attractive and sensible shoes. If any of those goals appear to be impeded by current law, the AI would lobby its legislators to amend the law. And in a better future, important amendments would go through rigorous analysis in a few days, better rooting out unintended consequences, and be enacted as quickly as prudent.