I’m certainly not advocating absolute, unqualified non-intervention. I wrote “a value of noninterference in certain matters”. The AI should certainly interfere to, e.g., offer help just before something happens which it thinks the person would not want to happen to them (the AI is qualified to make such judgments if it can calculate CEV). In such a situation the AI would explain matters and offer aid and advice, but the ultimate deciding power might still lie with the person, depending on the circumstances.
Nonintervention doesn’t just mean non-intervention by the AI; it also means nonintervention by one person with another. If someone asks the AI to prevent another person from doing something to them, then again in at least some (most? all?) circumstances the AI should interfere to do so; that is actually upholding the principle of noninterference.
I like both these caveats. The scenario becomes something far more similar to what a CEV could plausibly be without the artificial hack. Horror stories become much harder to construct.
Off the top of my head, one potential remaining weakness is the inability to prevent a rival, less crippled AGI from taking over without interfering pre-emptively with an individual who is not themselves interfering with anyone. Getting absolute power requires intervention (or universally compliant subjects); not getting absolute power means something else can get it, and then outcomes are undefined.
That’s a good point. The AI’s ability to not interfere is constrained by its need to monitor everything that’s going on. Not just to detect someone building a rival AI, but to detect simpler cases: someone torturing a simulated person, or even just a normal flesh-and-bone child who wasn’t there nine months earlier. Detecting people who get themselves into trouble without yet realizing it, or who are going to attack other people nonconsensually, and giving them help before something bad actually happens to them, all requires monitoring.
And while a technologically advanced AI might monitor using tools we humans couldn’t even detect today, to advanced posthumans every possible tool might be painfully obvious. E.g., you might have to expose everything your megaton-of-computronium brain calculates to the AI, because that much computronium is enough to simulate all the humans alive in 2012 in enough detail that they would count as persons to the AI. But for the asteroid-sized brain this means the AI is literally aware of all its thoughts: it has zero privacy.
It does appear that universal surveillance is the cost of universally binding promises (you won’t be tortured no matter where you go and what you do in AI-controlled space). To reduce costs and increase trust, the AI should itself be transparent to everyone, and should be publicly and verifiably committed to being a perfectly honest and neutral party that never reveals the secrets and private information it monitors to anyone.
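As a mundane, purely illustrative analogy for what “publicly and verifiably committed” could mean at the lowest level (my own sketch; the policy text and function names are invented, and committing to a document of course proves nothing about actual behavior): the AI publishes a policy once, everyone keeps its digest, and any later copy can be audited against that digest.

```python
import hashlib

def commit(policy_text: str) -> str:
    """Publish a digest that binds the committer to this exact policy text."""
    return hashlib.sha256(policy_text.encode("utf-8")).hexdigest()

def verify(policy_text: str, published_digest: str) -> bool:
    """Anyone holding the digest can check that a copy of the policy is unmodified."""
    return commit(policy_text) == published_digest

policy = "Monitoring data is never revealed to any third party."  # hypothetical policy text
digest = commit(policy)        # published once, mirrored widely
assert verify(policy, digest)  # any later copy can be audited against the digest
```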
I’d like to note that all of this also applies to any FAI singleton that implements some policies that we today consider morally required—like making sure no one is torturing simulated people or raising their baby wrong. If there’s no generally acceptable FAI behavior that doesn’t include surveillance, then on that count all else is equal, and I still prefer my AI to a pure CEV implementation.
It would seem that the FAI should only require you to expose the complete state of your brain at a point in time from which it can reliably predict or prove that you are ‘safe’, using the kind of reasoning we often assume as a matter of course when describing UDT decision problems. Such an FAI would have information about what you are thinking—and in particular a great big class of things it knows you are not thinking—but not necessarily detailed knowledge of what you are thinking specifically.
For improved privacy, the inspection could be done by a spawned robot AI programmed to self-destruct after analyzing you, returning nothing but a boolean safety indicator back to the FAI.
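A toy sketch of that information-flow constraint (entirely hypothetical: the “snapshot” and the safety test are placeholders, and a software process is only a weak stand-in for a self-destructing inspector): the analysis runs in its own throwaway process, and the only datum that crosses back to the parent is a single boolean.

```python
from multiprocessing import Process, Queue

def inspect_snapshot(snapshot: bytes, out) -> None:
    # Runs in a separate, short-lived process. Placeholder safety test:
    # the real analysis would be the FAI's safety prediction or proof.
    safe = b"rival AGI blueprint" not in snapshot
    out.put(safe)  # the boolean verdict is the only output

def boolean_only_inspection(snapshot: bytes) -> bool:
    out = Queue()
    worker = Process(target=inspect_snapshot, args=(snapshot, out))
    worker.start()
    verdict = out.get()  # only a single boolean crosses back
    worker.join()        # the inspector and all its working memory are gone
    return verdict

if __name__ == "__main__":
    print(boolean_only_inspection(b"asteroid-brain state dump ..."))  # True
```

The hard part, of course, is verifying that the inspector really discards everything else; the sketch only shows the intended information flow.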
Prediction has some disadvantages compared to constant observation:
Some physical systems are hard to model well with any simplification; even for the AI, simulating them might require amounts of matter proportional to the thing simulated. If about one half of all matter has to be given over to the AI, instead of being used to create more people and things, that is a significant loss of opportunity. (Maybe the AI should tax people in simulation-resources, and those who opt in to surveillance pay much lower taxes :-)
Simulations naturally have a rising risk of divergence over time. The AI is not literally Omega; it will have to come in and take periodic snapshots of everyone’s state to correct the simulations (a toy version of this resync loop is sketched below).
Simulations have a chance of being wrong. However small the chance, if the potential result is someone building a UFAI challenger, it might be unacceptable to take that chance.
OTOH, surveillance might be much cheaper (I don’t know for sure) and also allows destroying the evidence close to the site of observation once it is analyzed, preserving a measure of privacy.
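A minimal sketch of that predict-then-resync compromise (all names and numbers here are placeholders I’ve invented; a real system would have to bound divergence far more rigorously): the monitor runs its model forward and only takes a fresh, privacy-costly snapshot when the estimated drift exceeds what it can tolerate.

```python
from dataclasses import dataclass

@dataclass
class ModelState:
    belief: str         # the monitor's current model of the person
    uncertainty: float  # rough estimate of how far the model may have drifted

def predict_step(state: ModelState) -> ModelState:
    # Placeholder dynamics: each predicted step adds some divergence risk.
    return ModelState(state.belief, state.uncertainty + 0.07)

def take_snapshot() -> ModelState:
    # Stand-in for an actual (expensive, privacy-costly) observation.
    return ModelState("fresh snapshot", uncertainty=0.0)

def monitor(steps: int, max_drift: float = 0.2) -> int:
    """Predict forward, resyncing with a snapshot only when estimated drift
    exceeds max_drift. Returns how many snapshots were needed, i.e. the
    privacy cost of keeping the promise."""
    state = take_snapshot()
    snapshots = 1
    for _ in range(steps):
        state = predict_step(state)
        if state.uncertainty > max_drift:
            state = take_snapshot()
            snapshots += 1
    return snapshots

print(monitor(steps=20))  # fewer snapshots than steps: prediction buys some privacy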