It is worth noting that sandboxing with full determinism is unusual. By “elementary sandboxing” I do not mean to imply that it’s as easy as just applying your favourite off-the-shelf sandbox, like Docker or Xen or even an off-the-shelf WASM runtime. But full determinism is hardly unprecedented either. For example, WASM or EVM code that runs in blockchain smart contracts must be fully deterministic (for global-consensus reasons, completely unrelated to exfiltration), and it has been straightforward to modify a WASM runtime to meet this requirement.
Enforcing determinism for machine learning will require more effort because of the involvement of GPUs. One must ensure that code is executed deterministically on the GPU as well as on the CPU, and that GPU/CPU concurrency is appropriately synchronized (to enforce deterministic dataflow). But I claim this is still eminently doable, and with no noticeable performance penalty versus contemporary best practices for scalable ML, if an AI lab understood what determinism is and cared about it even a little bit.
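To make the GPU concern concrete, here is a minimal CPU-only sketch of one well-known source of nondeterminism in parallel numerical code: floating-point addition is not associative, so a reduction whose accumulation order depends on scheduling can produce different bits from the same inputs. This is an illustration under stated assumptions, not a description of how any particular GPU runtime or ML framework behaves; the helper names (`bits`, `sum_in_order`, `fixed_tree_sum`) are hypothetical and chosen only for this example.

```python
import itertools
import struct


def bits(x: float) -> str:
    """Hex of the exact IEEE-754 float64 bit pattern, for bitwise comparison."""
    return struct.pack(">d", x).hex()


def sum_in_order(values, order):
    """Accumulate values left to right in the given index order."""
    total = 0.0
    for i in order:
        total += values[i]
    return total


def fixed_tree_sum(values):
    """Pairwise reduction whose shape depends only on len(values)."""
    level = list(values)
    if not level:
        return 0.0
    while len(level) > 1:
        nxt = [level[i] + level[i + 1] for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:  # carry an unpaired last element up unchanged
            nxt.append(level[-1])
        level = nxt
    return level[0]


# Assumed toy input: large and small magnitudes that expose rounding.
values = [1e16, 1.0, -1e16, 1.0]

# Every accumulation order a scheduler could pick, and the bits each one yields.
outcomes = {
    bits(sum_in_order(values, p))
    for p in itertools.permutations(range(len(values)))
}

# The fixed-shape reduction, repeated: same input, same bits, every time.
fixed = {bits(fixed_tree_sum(values)) for _ in range(5)}

print("distinct results over all orders:", len(outcomes))
print("distinct results with fixed shape:", len(fixed))
```

The point of the sketch is that determinism here comes from pinning down the order of operations, not from avoiding parallelism: a reduction tree with a fixed shape can still be evaluated concurrently level by level. Whether that costs anything on real hardware depends on the kernel and the workload, which this toy example does not measure.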