This is a very strange approach, and nothing like it had occurred to me before. Cybersecurity has an overwhelmingly massive influence on the human side of AI safety, partly because all of today's AIs are built on computers that have to be protected from human hackers.
Regarding airgaps (and magnetic side channels, which can penetrate Faraday cages and communicate with an outside magnetometer by influencing current flows on an ordinary chip), EY once wrote that:

If you have an untrustworthy general superintelligence generating English strings meant to be “reasoning/arguments/proofs/explanations” about eg a nanosystem design, then I would not only expect the superintelligence to be able to fool humans in the sense of arguing for things that were not true in a way that fooled the humans, I’d expect the superintelligence to be able to covertly directly hack the humans in ways that I wouldn’t understand even after having been told what happened. So you must have some prior belief about the superintelligence being aligned before you dared to look at the arguments.
So it would only be helpful if exponential intelligence increase were slow, stunted, or interrupted, and that would already be during the catastrophe. Someone recently asked how dumb an AI would have to be to still kill a lot of people if it started behaving erratically and began to model its overseers; this might be relevant to that.
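As an aside on the magnetic side channel parenthetical above: my understanding is that this refers to research along the lines of the MAGNETO/ODINI papers, where code on an air-gapped machine modulates CPU load, which modulates the chip's current draw and hence its low-frequency magnetic field, which a magnetometer outside the Faraday cage can then pick up. A minimal sketch of what the transmit side could look like, assuming simple on-off keying (all names and timing values here are illustrative, not taken from any real exploit):

```python
# Illustrative sketch of the transmit side of a magnetic covert channel:
# malware on an air-gapped machine toggles CPU load to modulate current
# draw, and a magnetometer outside the Faraday cage reads the resulting
# low-frequency magnetic field as on-off-keyed bits.

import time

BIT_DURATION_S = 0.5  # how long each bit's load pattern is held


def _busy_wait(seconds: float) -> None:
    """Spin the CPU to raise current draw (and the emitted magnetic field)."""
    end = time.monotonic() + seconds
    while time.monotonic() < end:
        pass  # pure busy loop; a real attack would pin several cores for a stronger signal


def transmit_bits(bits: str) -> None:
    """Encode a bit string with on-off keying: 1 = high CPU load, 0 = idle."""
    for bit in bits:
        if bit == "1":
            _busy_wait(BIT_DURATION_S)
        else:
            time.sleep(BIT_DURATION_S)


if __name__ == "__main__":
    # The receiver would be an external magnetometer (e.g. in a phone)
    # sampling field strength and thresholding it back into bits.
    transmit_bits("1011001")
```

The receiving end just samples field strength at the same bit rate and thresholds it back into ones and zeros; the point is that a Faraday cage alone doesn't obviously close this channel.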