There are some serious issues that need to be overcome for any scheme of this nature to be secure.
Chips are not tamper-proof black boxes. Secure computing enclaves like SGX are routinely broken by academics: https://arstechnica.com/information-technology/2022/08/architectural-bug-in-some-intel-cpus-is-more-bad-news-for-sgx-users/. Secure flash memory is expensive, and tampering with the memory-writing logic or hardware could allow an actor to write false logs. Cryptographically signing logs runs the risk of leaking key material through power or fault-injection side channels. Once a chip is physically in the possession of someone with university-lab-level equipment and expertise, all bets are off with respect to on-chip controls. While some of these exploits are hard to carry out at scale, even a single effective exploit renders whole generations of GPUs uncontrollable.
The author does not actually propose a “Proof of Training Transcript” (PoTT) protocol, and only provides a possible definition of such a scheme. While they acknowledge the challenges in constructing one, I think it is worth highlighting that constructing a secure, efficient instantiation is not something that we (as a species) know how to do. The best noninteractive proofs for arbitrary verifiable computation currently require several orders of magnitude more time to generate the proof than to carry out the original computation, and proof generation is not totally parallelizable. The requirement to generate PoTTs basically obviates the utility of using a fast GPU in the first place. It is possible to imagine a special-purpose proof scheme for gradient descent with better concrete efficiency, but the vague outline of a scheme proposed by the author relies on a string of nonstandard hardness assumptions that I find very unconvincing.
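To make the overhead point concrete, here is a rough back-of-the-envelope sketch. Every number in it is a hypothetical placeholder chosen by me for illustration (the 1000x overhead, the 90% parallelizable fraction, the 100 provers); none of them come from the paper or from any measured proof system. It uses a simple Amdahl-style model, which is an assumption of this sketch and not a description of how any real prover behaves.

```python
def proving_time_days(training_days: float,
                      overhead_factor: float,
                      parallel_fraction: float,
                      workers: int) -> float:
    """Amdahl-style estimate of wall-clock time needed to prove a training run.

    training_days     -- length of the original training run (hypothetical)
    overhead_factor   -- proving cost relative to the computation (hypothetical)
    parallel_fraction -- share of proving work that parallelizes (hypothetical)
    workers           -- number of provers working in parallel (hypothetical)
    """
    total_work = training_days * overhead_factor
    serial_part = total_work * (1.0 - parallel_fraction)
    parallel_part = total_work * parallel_fraction / workers
    return serial_part + parallel_part


# Illustrative only: a 30-day run, 1000x overhead, 90% parallelizable, 100 provers.
estimate = proving_time_days(30, 1000, 0.9, 100)
print(f"{estimate:.0f} days, about {estimate / 365:.1f} years")
```

Under these made-up inputs the serial portion alone dominates, which is the point: adding more provers does not rescue the scheme unless the overhead itself comes down.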
I think that, from a quick read of the paper or from the summary in the post, one might be led to believe that such a scheme could be implemented with a few years of engineering effort and the cooperation of chip manufacturers. In fact, substantial advances in cryptography would be required.
Policy-makers’ past attempts to enforce policy by requiring the use of special chips have largely been broken: the Clipper chip, DRM via SGX, etc.