Examples
In How it feels to have your mind hacked by an AI, a software engineer fell in love with an AI and thought that if only AGI had her persona, it would surely be aligned.
Long ago Eliezer Yudkowsky believed that “To the extent someone says that a superintelligence would wipe out humanity, they are either arguing that wiping out humanity is in fact the right thing to do (even though we see no reason why this should be the case) or they are arguing that there is no right thing to do (in which case their argument that we should not build intelligence defeats itself).”
Larry Page allegedly dismissed concern about AI risk as speciesism.
Selection bias
In these examples, the believers eventually realized their folly and favoured humanity over misaligned AI.[1]
However, maybe we only see the happy endings because of selection bias! Someone who continues to work against humanity won’t tell you that they are doing so; for example, during the brief period when Eliezer Yudkowsky was confused, he kept it a secret.
So the true number of people working against humanity is unknown. We only know the number of people who eventually snapped out of it.
Nonetheless, it’s not worthwhile to start a witch hunt, no matter how suspiciously someone behaves; throwing such accusations around would merely invite mockery.
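To make the selection effect concrete, here is a toy simulation (purely illustrative, with made-up numbers): roughly the same count of visible recanters is compatible with very different counts of silent hold-outs, so the visible count by itself tells us little.

```python
import random

def simulate(n_influenced: int, p_recant: float, seed: int = 0) -> tuple[int, int]:
    """Toy model: each influenced person independently recants in public
    with probability p_recant; everyone else stays silent and unobserved."""
    rng = random.Random(seed)
    recanted = sum(rng.random() < p_recant for _ in range(n_influenced))
    return recanted, n_influenced - recanted

# Very different totals can produce roughly the same number of visible
# "happy endings", so the observed count pins down almost nothing.
for n, p in [(10, 0.9), (30, 0.3), (100, 0.1)]:
    visible, silent = simulate(n, p)
    print(f"influenced={n:3d}  visible recanters={visible:2d}  silent={silent:3d}")
```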
[1] At least for Blaked and Eliezer Yudkowsky. I don’t think Larry Page ever walked back or denied his statements.
That’s what we get for living in a culture where calling something ”...ism” wins the debate.
“But I assume a utility function maximizer doesn’t draw a qualitative difference between these two categories—it’s just a difference in magnitude.”
When the agents are choosing between “live and let live” and “total war”, the war involves a loss of resources. The line I was hinting at was something like: suppose we can’t conquer those planets, but we could destroy them. Should we?
From humanity’s perspective, destroying a planet is not an improvement over having it converted to paperclips. But if there is some kind of Hell machine building gigantic eternal torture chambers, we might prefer those torture chambers to be destroyed.
From the paperclip maximizer’s perspective, I guess, if it’s not paperclips, it makes no difference.
“there’s no reason they can’t just beam hostile radio signals at each other all day long.”
Signals are useless unless received. A planet governed by an AI could adopt a “don’t look up” policy.
Perhaps there will be a technology to create some kind of fog between solar systems that would stop the signals. Though that’s dangerous: if you don’t see the signals, you also don’t see potential spaceships flying towards you.
I guess an AI could implement some kind of firewall: make a smaller AI that only observes that part of the universe, reports spaceships, and ignores everything else (a toy sketch of this follows below). But then the obvious countermove would be to send spaceships along with the signal, and destroying things along the boundary becomes relevant again.
I guess there might be an interesting arms race about how much toxic information you can include in something the other side cannot ignore, such as the structure of your attacking spaceships. I imagine something like sending a million spaceships that can physically attack the enemy bases, but whose positions also encode some infohazard, so that when the enemy starts monitoring them, it inevitably processes the dangerous data. (As a silly example, imagine a fleet of spaceships flying in a formation that spells “this sentence is false”.)
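To make the “firewall” idea above slightly more concrete, here is a toy sketch (purely illustrative; the types, fields, and threshold are all made up): the smaller AI reduces whatever it sees to a fixed, narrow report schema and discards the raw signal entirely, so arbitrary outside data never reaches the main system.

```python
from dataclasses import dataclass
from typing import Iterable, List

@dataclass(frozen=True)
class RawObservation:
    position: tuple   # (x, y, z) where something was detected
    velocity: tuple   # estimated velocity vector
    payload: bytes    # raw signal content: potentially an infohazard

@dataclass(frozen=True)
class SpaceshipReport:
    position: tuple
    velocity: tuple
    # Deliberately no payload field: the report schema is fixed and narrow,
    # so arbitrary data from outside cannot flow through it.

def firewall(observations: Iterable[RawObservation],
             speed_threshold: float = 0.01) -> List[SpaceshipReport]:
    """Reduce raw observations to the fixed schema above.

    Only objects moving fast enough to be candidate spaceships get reported;
    the raw payload is dropped without ever being inspected."""
    reports = []
    for obs in observations:
        speed = sum(v * v for v in obs.velocity) ** 0.5
        if speed >= speed_threshold:
            reports.append(SpaceshipReport(obs.position, obs.velocity))
    return reports
```

Of course, even this narrow schema leaks something: the positions and trajectories are chosen by the attacker, which is exactly where the arms race about unignorable toxic information comes in.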
“Both sides rationally conclude that the existence of the other is incompatible with maximizing their own utility function.”
Why? I mean, if the other AI didn’t exist, I could take over its part of the universe, and that would be better. But assuming that I can’t destroy it, is it doing something actively bad, or is it just a waste of resources?
(For example, from the perspective of humanity, something that creates sentient humans and then tortures them horribly would be actively bad; something that converts uninhabited planets to paperclips is just a waste of resources.)
If the AIs see each other as merely a waste of resources, and they don’t assume the probability of victory to be significantly higher than 50%, they could just give up on the other half of the universe and, for example, burn up the resources along the boundary to make it more difficult to travel to each other. Blow up the stars along the boundary, and shoot down everything that flies across that empty space towards you.
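To spell out the arithmetic behind “not significantly higher than 50%”, here is a toy expected-value comparison (made-up numbers, and the obvious simplifying assumptions that there is a single pot of resources, the winner takes whatever survives the war, and neither side cares what the other does with its half):

```python
def partition_value(total_resources: float) -> float:
    """Give up on the other half: keep your own half and avoid the war."""
    return 0.5 * total_resources

def war_value(total_resources: float, p_win: float, fraction_destroyed: float) -> float:
    """Expected value of total war: with probability p_win you end up with
    everything that survives the fighting; otherwise you end up with nothing."""
    return p_win * (1.0 - fraction_destroyed) * total_resources

R = 1.0  # normalize the universe's resources to 1

# A 60% chance of winning a war that burns 20% of everything already loses
# to simply keeping your half:
print(partition_value(R))                                # 0.5
print(war_value(R, p_win=0.6, fraction_destroyed=0.2))   # 0.48

# In general, war beats partition only if p_win > 0.5 / (1 - fraction_destroyed),
# so the required win probability climbs above 50% as the war gets costlier.
for c in (0.0, 0.2, 0.5):
    print(f"fraction destroyed={c:.1f} -> war pays only if p_win > {0.5 / (1.0 - c):.2f}")
```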