Even if each individual member of a population AGI is as intelligent as any hundred humans put together, I expect that we could (with sufficient effort) create secure deployment and monitoring protocols that the individual AI could not break, if it weren’t able to communicate with the rest of the population beforehand.
The state of human-vs-human security seems to be a cat-and-mouse game where neither attacker nor defender has a huge upper hand. The people trying to attack systems and the people trying to defend them are about equally smart and knowledgeable (sometimes they are the same people). The economic incentives to attack and to defend are usually similar. Systems get broken sometimes, but not always.
This suggests there is reason to be worried as soon as the AI(s) trying to break out are about as smart as the humans trying to contain them.
Depending on your threat model for a given breach, this could be comforting or terrifying.
The economic incentives to attack and to defend are usually similar. Systems get broken sometimes but not always.
If the cost of a loss (AGI escapes, takes over the world, and runs it worse than humans do) is much higher, that changes the “economic incentives” here. It implies that “sometimes but not always” is a very dangerous equilibrium. If the cost of a loss (AGI gains a bit more influence on the outside world, but doesn’t actually destroy much) is more in line with today’s incentives, it’s fine.
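As a rough illustration (all of these figures are invented for the sake of the example, not taken from anywhere), a toy expected-loss calculation shows why a breach rate that is tolerable for bounded losses becomes intolerable when a single breach is unrecoverable:

```python
# Toy expected-loss comparison. All figures here are invented for
# illustration; the point is only that a breach rate which is acceptable
# for bounded losses becomes unacceptable when one breach is unrecoverable.

def expected_loss(p_breach: float, cost_per_breach: float) -> float:
    """Expected cost = probability of a breach times the cost if it happens."""
    return p_breach * cost_per_breach

# Ordinary scenario: breaches happen "sometimes but not always" and each
# one costs a bounded amount of money.
ordinary = expected_loss(p_breach=0.5, cost_per_breach=10_000_000)

# Escape scenario: same breach rate, but the loss is (crudely) most of the
# value of the future, represented here as an absurdly large dollar figure.
escape = expected_loss(p_breach=0.5, cost_per_breach=1e18)

print(f"ordinary breach: expected loss ~ ${ordinary:,.0f}")
print(f"AGI escape:      expected loss ~ ${escape:,.0f}")
# With a catastrophic cost term, the only way to get a tolerable expected
# loss is to push p_breach extremely close to zero; "sometimes but not
# always" is nowhere near good enough.
```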
In today’s banking systems, the amount of money the hacker gains is roughly what the bank loses. So under the current balance of incentives, we should expect roughly as much money to go into attacking the bank as into defending it.
So I generalize: attacking is about as hard as defending in computer security. If the time and intellect devoted to both are similar, the attacker wins about half the time (i.e. somewhere between 10% and 90%).
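One crude way to put numbers on “attacking is about as hard as defending” is a standard Tullock-style contest-success function (my choice of formalism, not anything from the original comment), in which the attacker’s win probability depends on the ratio of effort on each side:

```python
# Toy contest-success function: the attacker's win probability depends on
# the ratio of effort each side puts in. This is a toy formalism, not a
# model of how real security works.

def p_attacker_wins(attack_effort: float, defense_effort: float, k: float = 1.0) -> float:
    """p = A^k / (A^k + D^k); k controls how decisive an effort advantage is."""
    a, d = attack_effort ** k, defense_effort ** k
    return a / (a + d)

# Equal time and intellect on both sides -> the attacker wins about half
# the time, which matches "systems get broken sometimes but not always".
print(p_attacker_wins(1.0, 1.0))   # 0.5
print(p_attacker_wins(2.0, 1.0))   # ~0.67: a modest edge helps, but not decisively
```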
When dealing with AI systems, the AI’s total intellectual output must be greater than that of your security team for the AI to be economically viable (otherwise you would just pay the security team to do the task directly). For a special-purpose AI that only recognises images, this is fine.
A significantly superhuman AI will rip through all your security.
For a general AI of about human level, if the amount of work it is doing is significantly greater than the amount of work your security team put in, then it has probably put more thought into your security vulnerabilities than the humans have.
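Plugging the toy contest function above in with an AI whose total output is some multiple of the security team’s (again, purely illustrative numbers) makes the worry concrete:

```python
# Same toy contest function as above, now with the AI (the attacker)
# putting in many times the security team's effort. Numbers are illustrative.

def p_attacker_wins(attack_effort: float, defense_effort: float, k: float = 1.0) -> float:
    return attack_effort ** k / (attack_effort ** k + defense_effort ** k)

security_team_effort = 1.0
for multiple in (1, 10, 100, 1000):
    p = p_attacker_wins(multiple * security_team_effort, security_team_effort)
    print(f"AI doing {multiple:>4}x the security team's work: P(breakout) ~ {p:.3f}")
# At 1x the odds are a coin flip; well before "significantly superhuman",
# the chance of the AI finding a hole first is already close to 1.
```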
And yes, I am operating on the background assumption that the default behaviour of an escaped AI is to destroy the world, especially if it has access to loads of compute and the humans have given it an intelligence dial.