Staged understanding is an important topic, and we may want to ensure that a young AI will not perform irreversible actions before it reaches the higher stages at which it understands the instrumental value of preserving humans.
Interestingly, the usefulness of killing humans declines with each stage of AI development. It may be very high during the takeover stage, high during the building of a Dyson sphere (for those Earth atoms!), around 10^-12 after the Dyson sphere is built (the cost of the hole in it), and something like 10^-35 for a galactic-scale AI.
I think that we should state in our commitment to simulating unfriendly AI that any AI which merely freezes all humans is not friendly.
For the bacterium example: I think the cost-benefit analysis here will be different, and we can’t use it as an intuition pump, since most of us are not biologists interested in new antibiotics. Cleaning the kitchen does not change the chances that there will be useful bacteria, as the cleaning itself may cause some mutations in the bacteria. And if I am a biologist looking for new bacteria, I will build a biolab anyway.
The main problem with this example (and mine) is that the AI will preserve humans not to gain some additional utility, but to escape a small-probability risk to itself.
I think it is a mistake to assume the relevant cost metric is fractional rather than absolute. The galactic-scale AI can do a lot more with the resources humans require than the matrioshka brain can, in absolute terms, because it can use them with greater understanding and precision.
And I don’t think a matrioshka brain loses much in terms of risk or benefit by wiping out current humans while keeping a few yottabytes of data in cold storage, encoding the genomes and neural connectomes of humans for future recreation if needed, just as I lose nothing by wiping out bacteria as long as I know that anything they might provide could be re-invented or re-discovered if needed.
Your main point about risk to the AI from other intelligences or acausal trade depends sensitively on just how small the risk probability for the AI is. There are quite a few different ways of estimating that, and it is not at all clear to me that “small” is still large enough to justify the cost. Maybe it is, and we get saved by divine grace. That’s great if we do. But it’s not at all clear to me, even if it eventually turns out to be true, that any given AI will know or believe it at the time when it needs to decide whether it’s worthwhile to destroy any particular group or form of humans.
Even among humans, it’s not at all clear to many that the existence of livestock and pets (even well cared for and happy) is good for the animals themselves or for the humans who raise and care for them. There really are well-meaning people who honestly ask questions like “Given the choice, should we sterilize the biosphere and/or voluntarily go extinct?” and arrive at “yes.” For me, the kind of argument you’re putting forward immediately runs up against underestimating the diversity of minds that exist and will or can exist at any given level of intelligence and capability.
To clarify, I am not sure that we can relax and expect this type of salvation. It is just plan D, after the first three fail (A: alignment, B: pause, C: boxing or destroying). We can make plan D stronger by:
1. Improving decision theory
2. Exploring the power of commitments
3. Sending messages to aliens in space now (METI) with a request for help, in order to improve our counterfactual negotiation position
But to do all of this, we need to look more in this direction.
A galactic brain can surely extract more utility from additional pieces of matter, but that utility is less likely to be fate-changing, because its fate can’t change 100 billion times, once for each of the 100 billion stars it explores.
Scanning brains is an expensive act for a very young AI, so it may need to preserve us until later stages, and in later stages, the utility of our atoms will decline.
There could be an intermediate variant between storing data and preserving Earth: building a space station where each person gets an apartment. Such a station would take around 10^3 cubic km of material, and given Earth’s volume of about 10^12 cubic km, it would require only around one billionth of Earth’s volume.
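A quick back-of-the-envelope check of that figure (a rough sketch; the population and per-person apartment volume below are my own assumptions, chosen only to land in the right order of magnitude):

```python
# Rough volume check for the space-station idea (assumed numbers).
people = 8e9                    # roughly the current human population
apartment_m3 = 125.0            # assumed ~5 m x 5 m x 5 m per person
station_km3 = people * apartment_m3 / 1e9   # 1 km^3 = 1e9 m^3
earth_km3 = 1.08e12             # Earth's volume in cubic km

print(station_km3)              # ~1e3 cubic km
print(station_km3 / earth_km3)  # ~1e-9, about one billionth of Earth
```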
I think that for any young AI, the estimated probability of being in a simulation should be at least 10 percent. The independent probability of meeting aliens should also be around 10 percent. In each case, I would give a 0.1 chance that preserving humans will make some difference. In total, I estimate that preserving humans eliminates about a 2 percent risk to the AI’s survival from its point of view.
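Spelled out, the 2 percent is just the sum of the two scenarios (a minimal sketch of the arithmetic as I read it, treating the scenarios as independent and ignoring their negligible overlap):

```python
# How the ~2% figure is composed (rough arithmetic, not a formal model).
p_simulation = 0.10   # chance the young AI is inside a test simulation
p_aliens = 0.10       # independent chance of later meeting aliens
p_matters = 0.10      # in each scenario, chance that having preserved
                      # humans actually changes the AI's outcome

risk_removed = p_simulation * p_matters + p_aliens * p_matters
print(risk_removed)   # 0.02, i.e. about a 2% reduction in survival risk
```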
If an aligned AI is created by negative utilitarians, it will kill us all.
Makes sense, I get that, all good points and good discussion.
I would say scanning brains may be expensive early on, but cryonics and plastination should be very cheap relative to space-based apartments by the time it’s possible to have either at scale.