Yes, CPUs leak information: that is the output kind of side-channel, where an attacker can transfer information about the computation into the outside world. That is not the kind I am saying one can rule out with merely diligent pursuit of determinism.
Concurrency is a bigger concern. Concurrent algorithms can have deterministic dataflow, of course, but enforcing that naively does involve some performance penalty versus HOGWILD algorithms because, if executing a deterministic dataflow, some compute nodes will inevitably sit idle sometimes while waiting for others to catch up. However, it turns out that modern approaches to massive scale training, in order to obtain better opportunities to optimise communication complexity, forego nondeterminism anyway!
Yes, CPUs leak information: that is the output kind of side-channel, where an attacker can transfer information about the computation into the outside world. That is not the kind I am saying one can rule out with merely diligent pursuit of determinism.
I think you are misunderstanding this part, input side channels absolutely exist as well,
Spectre for instance:
On most processors, the speculative execution resulting from a branch misprediction may leave observable side effects that may reveal private data to attackers.
Note that the attacker in this case is the computation that is being sandboxed.
I understand that Spectre-type vulnerabilities allow “sandboxed” computations, such as JS in web pages, to exfiltrate information from the environment in which they are embedded. However, this is, necessarily, done via access to nondeterministic APIs, such as performance.now(), setTimeout(), SharedArrayBuffer, postMessage(), etc. If no nondeterministic primitives were provided to, let’s say, a WASM runtime with canonicalized NaNs and only Promise/Future-based (rather than shared-memory or message-passing) concurrency, I am confident that this direction of exfiltration would be robustly impossible.
The assumption here is that we can implement a system that deterministically computes mathematical functions with no side effects. We can get much closer for this than we can for leaking information outward, but we still fail to do this perfectly. There are real-world exploits that have broken this abstraction to cause undesired behaviour, and which could be used to gather information about some properties of the real world.
For example, cosmic rays cause bit errors and we can already write software that observes them with high probability of not crashing. We can harden our hardware such as by adding error correction, but this reduces the error rate without eliminating it. There are also CPU bugs, heat-related errors, and DRAM charge levels that have been exploited in real applications, and no doubt many other potential vectors that we haven’t discovered yet.
It would certainly be extremely difficult to discern much about the real world purely via such channels, but it only takes one exploitable bug that opens a completely unintended inward channel for the game to be over.
I agree that, even for side-channels exposing external information to a mathematical attacker, we cannot get this absolutely perfect. Error-correction in microelectronics is an engineering problem and engineering is never absolutely fault-free.
However, per this recent US government study, RAM error rates in high-performance compute clusters range from 0.2 to 20 faults per billion device-hours. For comparison, training GPT-3 (175B parameters) from scratch takes roughly 1-3 million device-hours. An attacker inside a deep learning training run probably gets zero bits of information via the RAM-error channel.
But suppose they get a few bits. Those bits are about as random as they come. Nor is there anything clever to do from within an algorithm to amplify the extent to which cosmic rays reflect useful information about life on Earth.
I disbelieve that your claims about real-world exploits, if cashed out, would break the abstraction of deterministic execution such as is implemented in practice for blockchain smart contracts.
I do think it’s prudent to use strong hardware and software error-correction techniques in high-stakes situations, such as advanced AI, but mostly because errors are generally bad, for reliability and ability to reason about systems and their behaviours. The absolute worst would be if the sign bit got flipped somewhere in a mesa-optimiser’s utility function. So I’m not saying we can just completely neglect concerns about cosmic rays in an AI safety context. But I am prepared to bet the farm on task-specific AIs being completely unable to learn any virology via side-channels if the AI lab training it musters a decent effort to be careful about deterministic execution (which, I stress again, is not something I think happens by default—I hope this post has some causal influence towards making it more likely).
It is worth noting that sandboxing with full determinism is unusual. By “elementary sandboxing” I do not mean to imply that it’s so easy as just applying your favourite off-the-shelf sandbox, like Docker or Xen or even an off-the-shelf WASM runtime. But full determinism is hardly unprecedented either. For example, WASM or EVM code that runs in blockchain smart contracts must be fully deterministic (for global-consensus reasons, completely unrelated to exfiltration), and it has been straightforward to modify a WASM runtime to meet this requirement.
Enforcing determinism for machine learning will require more effort because of the involvement of GPUs. One must ensure that code is executed deterministically on the GPU as well as on the CPU, and that GPU/CPU concurrency is appropriately synchronized (to enforce deterministic dataflow). But I claim this is still eminently doable, and with no noticeable performance penalty versus contemporary best practices for scalable ML, if an AI lab understood what determinism is and cared about it even a little bit.
This argument seems a bit circular, nondeterminism is indeed a necessary condition for exfiltrating outside information, so obviously if you prevent all nondeterminism you prevent exfiltration.
You are also completely right that removing access to obviously nondeterministic APIs would massively reduce the attack surface. (AFAIK most known CPU side-channel require timing information.)
But I am not confident that this kind of attack would be “robustly impossible”. All you need is finding some kind of nondeterminism that can be used as a janky timer and suddenly all Spectre-class vulnerabilities are accessible again.
For instance I am pretty sure that rowhammer depends on the frequency of the writes. If you insert some instruction between the writes to RAM, you can suddenly measure the execution time of said instruction by looking at how many cycles it took to flip a bit with rowhammer. (I am not saying that this particular attack would work, I am just saying that I am not confident you couldn’t construct something similar that would.)
I am confident that this direction of exfiltration would be robustly impossible.
If you have some deeper reason for believing this it would probably be worth its own post. I am not saying that its impossible to construct some clever sandbox environment that ensures determinism even on a buggy CPU with unknown classes of bugs, I am just saying that I don’t know of existing solutions.
(Also in my opinion it would be much easier to just make a non-buggy CPU instead of trying to prove correctness of something executing on a buggy one. (Though proving your RAM correct seems quite hard, e.g. deriving the lack of rowhammer-like attacks from Maxwell’s laws or something.))
Yes, CPUs leak information: that is the output kind of side-channel, where an attacker can transfer information about the computation into the outside world. That is not the kind I am saying one can rule out with merely diligent pursuit of determinism.
Concurrency is a bigger concern. Concurrent algorithms can have deterministic dataflow, of course, but enforcing that naively does involve some performance penalty versus HOGWILD algorithms because, if executing a deterministic dataflow, some compute nodes will inevitably sit idle sometimes while waiting for others to catch up. However, it turns out that modern approaches to massive scale training, in order to obtain better opportunities to optimise communication complexity, forego nondeterminism anyway!
I think you are misunderstanding this part, input side channels absolutely exist as well, Spectre for instance:
Note that the attacker in this case is the computation that is being sandboxed.
I understand that Spectre-type vulnerabilities allow “sandboxed” computations, such as JS in web pages, to exfiltrate information from the environment in which they are embedded. However, this is, necessarily, done via access to nondeterministic APIs, such as
performance.now()
,setTimeout()
,SharedArrayBuffer
,postMessage()
, etc. If no nondeterministic primitives were provided to, let’s say, a WASM runtime with canonicalized NaNs and only Promise/Future-based (rather than shared-memory or message-passing) concurrency, I am confident that this direction of exfiltration would be robustly impossible.The assumption here is that we can implement a system that deterministically computes mathematical functions with no side effects. We can get much closer for this than we can for leaking information outward, but we still fail to do this perfectly. There are real-world exploits that have broken this abstraction to cause undesired behaviour, and which could be used to gather information about some properties of the real world.
For example, cosmic rays cause bit errors and we can already write software that observes them with high probability of not crashing. We can harden our hardware such as by adding error correction, but this reduces the error rate without eliminating it. There are also CPU bugs, heat-related errors, and DRAM charge levels that have been exploited in real applications, and no doubt many other potential vectors that we haven’t discovered yet.
It would certainly be extremely difficult to discern much about the real world purely via such channels, but it only takes one exploitable bug that opens a completely unintended inward channel for the game to be over.
I agree that, even for side-channels exposing external information to a mathematical attacker, we cannot get this absolutely perfect. Error-correction in microelectronics is an engineering problem and engineering is never absolutely fault-free.
However, per this recent US government study, RAM error rates in high-performance compute clusters range from 0.2 to 20 faults per billion device-hours. For comparison, training GPT-3 (175B parameters) from scratch takes roughly 1-3 million device-hours. An attacker inside a deep learning training run probably gets zero bits of information via the RAM-error channel.
But suppose they get a few bits. Those bits are about as random as they come. Nor is there anything clever to do from within an algorithm to amplify the extent to which cosmic rays reflect useful information about life on Earth.
I disbelieve that your claims about real-world exploits, if cashed out, would break the abstraction of deterministic execution such as is implemented in practice for blockchain smart contracts.
I do think it’s prudent to use strong hardware and software error-correction techniques in high-stakes situations, such as advanced AI, but mostly because errors are generally bad, for reliability and ability to reason about systems and their behaviours. The absolute worst would be if the sign bit got flipped somewhere in a mesa-optimiser’s utility function. So I’m not saying we can just completely neglect concerns about cosmic rays in an AI safety context. But I am prepared to bet the farm on task-specific AIs being completely unable to learn any virology via side-channels if the AI lab training it musters a decent effort to be careful about deterministic execution (which, I stress again, is not something I think happens by default—I hope this post has some causal influence towards making it more likely).
Yeah, it reduces the probability from massive problem to Pascal’s mugging probabilities. (Or it reduces it below the noise floor.)
Arguably, they reduce the probability to literally 0, i.e it is impossible for AI to break out of the box.
It is worth noting that sandboxing with full determinism is unusual. By “elementary sandboxing” I do not mean to imply that it’s so easy as just applying your favourite off-the-shelf sandbox, like Docker or Xen or even an off-the-shelf WASM runtime. But full determinism is hardly unprecedented either. For example, WASM or EVM code that runs in blockchain smart contracts must be fully deterministic (for global-consensus reasons, completely unrelated to exfiltration), and it has been straightforward to modify a WASM runtime to meet this requirement.
Enforcing determinism for machine learning will require more effort because of the involvement of GPUs. One must ensure that code is executed deterministically on the GPU as well as on the CPU, and that GPU/CPU concurrency is appropriately synchronized (to enforce deterministic dataflow). But I claim this is still eminently doable, and with no noticeable performance penalty versus contemporary best practices for scalable ML, if an AI lab understood what determinism is and cared about it even a little bit.
This argument seems a bit circular, nondeterminism is indeed a necessary condition for exfiltrating outside information, so obviously if you prevent all nondeterminism you prevent exfiltration.
You are also completely right that removing access to obviously nondeterministic APIs would massively reduce the attack surface. (AFAIK most known CPU side-channel require timing information.)
But I am not confident that this kind of attack would be “robustly impossible”. All you need is finding some kind of nondeterminism that can be used as a janky timer and suddenly all Spectre-class vulnerabilities are accessible again.
For instance I am pretty sure that rowhammer depends on the frequency of the writes. If you insert some instruction between the writes to RAM, you can suddenly measure the execution time of said instruction by looking at how many cycles it took to flip a bit with rowhammer. (I am not saying that this particular attack would work, I am just saying that I am not confident you couldn’t construct something similar that would.)
If you have some deeper reason for believing this it would probably be worth its own post. I am not saying that its impossible to construct some clever sandbox environment that ensures determinism even on a buggy CPU with unknown classes of bugs, I am just saying that I don’t know of existing solutions.
(Also in my opinion it would be much easier to just make a non-buggy CPU instead of trying to prove correctness of something executing on a buggy one. (Though proving your RAM correct seems quite hard, e.g. deriving the lack of rowhammer-like attacks from Maxwell’s laws or something.))