Depending on your threat model for a given breach, this could be comforting or terrifying.
The economic incentives to attack and to defend are usually similar. Systems get broken sometimes but not always.
If the cost of a loss is much higher (the AGI escapes, takes over the world, and runs it worse than humans would), that changes the “economic incentives” entirely: it implies that “sometimes but not always” is a very dangerous equilibrium. If the cost of a loss is more in line with today’s incentives (the AGI gains a bit more influence on the outside world, but doesn’t actually destroy much), it’s a fine thing.
In today’s banking systems, the amount of money a hacker gains is about what the bank loses. So at the current equilibrium, roughly as much money should be going into attacking the bank as into defending it.
So I generalize: in computer security, attacking is about as hard as defending. If the time and intellect spent on both are similar, the attacker wins about half the time (i.e. somewhere between 10% and 90%).
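A rough way to see why the size of the loss matters, as a back-of-envelope sketch (the probability and dollar figures below are purely illustrative, not taken from anything above): take a breach probability you might tolerate at today’s “balanced effort” equilibrium and multiply it by the cost of a loss in each scenario.

```python
# Purely illustrative numbers: a breach probability you might tolerate at
# today's "attacker effort ~= defender effort" equilibrium, applied to two
# very different loss sizes.
p_breach = 0.5               # "sometimes but not always"

bank_loss = 10_000_000       # dollars: bounded, insurable, survivable
agi_escape_loss = 1e15       # stand-in for "AGI escapes and runs the world worse than humans"

# Expected bank loss is something a bank can rationally budget for;
# the same breach probability applied to an unbounded loss is not.
print(f"Expected bank loss: {p_breach * bank_loss:,.0f}")
print(f"Expected AGI loss:  {p_breach * agi_escape_loss:,.0f}")
```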
When dealing with AI systems, the AI’s total intellectual output must be greater than that of your security team for the AI to be economically viable (otherwise you would just pay the security team to do the task directly). For a special-purpose AI that only recognises images, this is fine.
A significantly superhuman AI will rip through all your security.
For a general AI of about human level, if the amount of work it is doing is significantly greater than the amount of work your security team did, then it has probably put more thought into your security vulnerabilities than the humans have.
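To make the last two paragraphs concrete (again with made-up figures of my own, not numbers from the argument above): the same inequality that makes the AI worth running also means it only has to divert a small slice of its thinking to match everything your security team ever did.

```python
# Hypothetical effort figures for illustration only.
security_team_person_years = 20    # total human effort that went into securing the box
ai_equivalent_person_years = 200   # cognitive work the AI does over its deployment

# Economic viability is roughly the claim that the AI out-produces the team
# you would otherwise have hired to do the task directly:
assert ai_equivalent_person_years > security_team_person_years

# But then the AI can spend a small fraction of its thinking on your security
# measures and still equal the humans' total effort on them.
fraction = security_team_person_years / ai_equivalent_person_years
print(f"The AI only needs to divert {fraction:.0%} of its work to match the security team.")
```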
And yes, I am operating on the background assumption that the default behaviour of an escaped AI is to destroy the world. Especially if it has access to loads of compute, and the humans have given it an intelligence dial.