I was surprised the paper didn’t mention photonics or optoelectronics even once.
If you're looking at 5-10+ year projections, and dedicating pages to the challenges of scaling compute and energy use, then the rate of progress in that area, running in parallel to progress in the models themselves, is potentially relevant.
Particularly because a dramatic hardware shift like that likely means a significant portion of the progress made up to that point in areas like interpretability and alignment goes out the window. Even if the initial shift is a 1:1 transition of capabilities and methodologies, it seems extremely unlikely that continued progress from that point onwards will be identical to what we'd expect to see in electronics.
We may well end up in a situation where fully exploiting the efficiencies of the new hardware means even more obscured (literally) operations, versus paying OOM-higher costs and accepting diminishing returns on performance in exchange for interpretability and control.
Currently, my best guess is that we’re heading towards a prisoner’s dilemma fueled leap of faith moment within around a decade or so where nation states afraid of the other side beating them to an inflection point pull the trigger on an advancement jump with uncertain outcomes. And while I’m not particularly inclined to the likelihood the outcome ends up being “kill everyone,” I’m pretty much 100% that it’s not going to be “let’s enable and support CCP leadership like a good party member” or “crony capitalism is going great, let’s keep that going for another century.”
Unless a fundamental wall is hit in progress, the status quo is almost certainly over, we just haven’t manifested it yet. The CCP stealing AGI secrets, while devastating for national security in the short term, is invariably a poison pill in the long term for party control. Just as it’s going to be an eventual end of the corporations funding oligarchy in the West. My all causes p(doom) is incredibly high even if AGI is out of the picture, so I’m not overly worried with what’s happening, but it sure is bizarre watching global forces double down on what I cannot see as anything but their own long term institutional demise in a race for short term gains over a competitor.
Why is this the case?
It's still early to tell, as the specific characteristics of a photonic or optoelectronic neural network are still taking shape in the developing literature.
For example, in my favorite work of the year so far, the researchers found they could use sound waves to reconfigure an optical neural network as the sound waves effectively preserved a memory of previous photon states as they propagated: https://www.nature.com/articles/s41467-024-47053-6
In particular, this approach is a big step forward for bidirectional ONN, which addresses what I think is the biggest current flaw in modern transformers—their unidirectionality. I discussed this more in a collection of thoughts on directionality impact on data here: https://www.lesswrong.com/posts/bmsmiYhTm7QJHa2oF/looking-beyond-everett-in-multiversal-views-of-llms
If you have bidirectionality where previously you didn’t, it’s not a reach to expect that the way in which data might encode in the network, as well as how the vector space is represented, might not be the same. And thus, that mechanistic interpretability gains may get a bit of a reset.
And this is just one of many possible ways it may change by the time the tech finalizes. The field of photonics, particularly for neural networks, is really coming along nicely. There may yet be future advances (I think this is very likely given the pace to date), and advantages the medium offers that electronics haven’t.
It’s hard to predict exactly what’s going to happen when two different fields which have each had unexpected and significant gains over the past 5 years collide, but it’s generally safe to say that it will at very least result in other unexpected things too.
No, actually what follows is the end of “The Project” part.
“Parting Thoughts” is not quoted at all (it’s an important part with interesting discussion, and Leopold introduces the approach he is calling AGI realism in that part).
I found this bit confusing. Why is hardware encryption supposed to help? Why would it be better than software encryption? Is the idea to prevent the model from reading its own weights via a physical side-channel?
What exactly is “many-key signoff” supposed to mean and how is it supposed to help?
Hardware encryption likely means dedicated on-chip hardware that handles keys and decrypts weights and activations on the fly.
The hardware/software divide here is likely a bit fuzzy but having dedicated hardware or a separate on-chip core makes it easier to isolate and accelerate the security critical operations. If security costs too much performance, people will be tempted to turn it off.
Encrypting data in motion and data at rest (in GPU memory) makes sense since this minimizes trust. An attacker with hardware access will have a hard time getting weights and activations unless they can get data directly off the chip.
Many-key signoff is nuclear-launch-style security where multiple keyholders must use their keys to approve an action. The idea is that a single rogue employee can't do something bad like copying model weights to an internet server, changing inference code to add a side channel that leaks model weights, or sabotaging inference misuse prevention/monitoring.
This is commonly done in high security fields like banking where several employees hold key shares that must be used together to sign code to be deployed on hardware security modules.
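To make that concrete, here's a minimal sketch of quorum-based signoff (hypothetical names, with stdlib HMACs standing in for real signatures; an actual deployment would use hardware tokens or an HSM with asymmetric keys): an action is released only once enough distinct keyholders have signed the exact payload being approved.

```python
import hmac, hashlib

# Hypothetical illustration: each keyholder holds a secret key.
# In a real deployment these would live in hardware tokens / an HSM
# and the "signatures" would be asymmetric, not HMACs.
KEYHOLDER_KEYS = {
    "alice": b"alice-secret-key",
    "bob":   b"bob-secret-key",
    "carol": b"carol-secret-key",
}
QUORUM = 2  # how many distinct keyholders must approve

def sign(keyholder: str, action: bytes) -> bytes:
    """A keyholder signs the exact action payload they are approving."""
    return hmac.new(KEYHOLDER_KEYS[keyholder], action, hashlib.sha256).digest()

def approve(action: bytes, signatures: dict[str, bytes]) -> bool:
    """Release the action only if a quorum of distinct keyholders signed it."""
    valid = {
        kh for kh, sig in signatures.items()
        if kh in KEYHOLDER_KEYS and hmac.compare_digest(sig, sign(kh, action))
    }
    return len(valid) >= QUORUM

action = b"deploy-inference-image:sha256:abc123"
sigs = {"alice": sign("alice", action), "bob": sign("bob", action)}
assert approve(action, sigs)                          # two keyholders: allowed
assert not approve(action, {"alice": sigs["alice"]})  # one rogue employee: blocked
```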
Thank you.
I think maybe my confusion here is related to the threat model. If a model gained root access to the device that it’s running on, it seems like it could probably subvert these security measures? Anyway I’d be interested to read a more detailed description of the threat model and how this stuff is supposed to help.
More specifically, it seems a bit weird to imagine an attacker who has physical access to a running server, yet isn’t able to gain de facto root access for the purpose of weight exfiltration. E.g. you can imagine using your physical access to copy the encrypted weights on to a different drive running a different OS, then boot from that drive, and the new OS has been customized to interface with the chip so as to exfiltrate the weights. Remembering that the chip can’t exactly be encrypting every result of its weight computations using some super-secret key, because if it did, the entire setup would effectively be useless. Seems to me like the OS has to be part of the TCB along with the chip?
What you're describing above is how BitLocker on Windows works on every modern Windows PC. The startup process involves a chain of trust, with various bootloaders verifying the next thing to start and handing off keys until Windows starts. Crucially, the keys are different if you start something that's not Windows (i.e. not signed by Microsoft). You can't just boot Linux and decrypt the drive, since different keys would be generated for Linux during boot and they won't decrypt the drive.
Mobile devices and game consoles are even more locked down. If there’s no bootloader unlock from your carrier or device manufacturer and no vulnerability (hardware or software) to be found, you’re stuck with the stock OS. You can’t boot something else because the chip will refuse to boot anything not signed by the OEM/Carrier. You can’t downgrade because fuses have been blown to prevent it and enforce a minimum revision number. Nothing booted outside the chip will have the keys locked up inside it needed to decrypt things and do remote attestation.
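A toy sketch of why booting something else yields different keys (this illustrates the measured-boot idea in general, not the exact mechanics of any particular TPM, phone, or console): each boot stage is hashed into a running measurement, and the sealing key is derived from a device secret plus that measurement, so changing any stage changes the key and the drive stays encrypted.

```python
import hashlib, hmac

DEVICE_SECRET = b"fused-into-the-chip-at-manufacture"  # hypothetical root secret

def measure(boot_chain: list[bytes]) -> bytes:
    """Extend a running hash with each boot stage, TPM-PCR style."""
    digest = b"\x00" * 32
    for stage in boot_chain:
        digest = hashlib.sha256(digest + hashlib.sha256(stage).digest()).digest()
    return digest

def sealing_key(boot_chain: list[bytes]) -> bytes:
    """Derive the disk-sealing key from the device secret and the boot measurement."""
    return hmac.new(DEVICE_SECRET, measure(boot_chain), hashlib.sha256).digest()

windows_chain = [b"firmware-v12", b"ms-bootloader", b"windows-kernel"]
linux_chain   = [b"firmware-v12", b"grub",          b"linux-kernel"]

# Same device, different boot chain -> different key -> the drive won't decrypt.
assert sealing_key(windows_chain) != sealing_key(linux_chain)
```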
Root isn’t full control
Having root on a device isn't the silver bullet it once was. Security is still kind of terrible and people don't do enough to lock everything down properly, but the modern approach seems to be: 1) isolate security-critical properties/code, 2) put it in a secure, armored, protected box somewhere inside the chip, and 3) make sure you don't stuff so much crap inside the box that the attacker can compromise it via a bug.
They tend to fail in a couple of ways:
Pointless isolation (EG:lock encryption keys inside a box but let the user ask it to encrypt/decrypt anything)
the box isn’t hack proof (examples below)
they forget about a chip debug feature that lets you read/write security critical device memory
you can mess with chip operating voltage/frequency to make security critical operations fail.
They stuff too much inside the box (EG: a full Java virtual machine and a webserver)
I expect ML accelerator security to be full of holes. They won’t get it right, but it is possible in principle.
ML accelerator security wish-list
As for what we might want for an ML accelerator:
Separating out enforcement of security critical properties:
restrict communications
supervisory code configures/approves communication channels such that all data in/out must be encrypted
There's still steganography. With complete control over the code that can read the weights or activations, an attacker can hide data in message timing, for example.
Still, protecting weights/activations in transit is a good start.
can enforce “who can talk to who”
example: ensure inference outputs must go through a supervision model that OKs them as safe.
The supervision chip can then send data to the API server and on to the customer.
The inference chip literally can't send data anywhere but to the supervision model.
can enforce network isolation for dangerous experimental system (though you really should be using airgaps for that)
Enforcing code signing like what Apple does. Current-gen GPUs support virtualisation and user/kernel modes. Control what code has read/write access to what data. Code should not have write access to itself. Attackers would have to find return-oriented-programming attacks or similar that work on GPU shader code. This makes life harder for the attacker.
Apple does this on newer SoCs to prevent execution of non-signed code.
Doing that would help a lot. Not sure how well that plays with InfiniBand/NVLink networking, but that can be encrypted too in principle. If a virtual memory system is implemented, it's not that hard to add a field to the page table for an encryption key index (a rough sketch of what that could look like is below).
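As an illustration only (a hypothetical bit layout, not any real GPU's page-table format), a per-page key index can ride along in an ordinary page-table entry so the memory controller knows which key to use for each page:

```python
# Hypothetical 64-bit page-table entry layout:
#   bits 0..1   : valid / writable flags
#   bits 2..9   : 8-bit encryption key index (up to 256 keys)
#   bits 12..63 : physical page number
VALID, WRITABLE = 1 << 0, 1 << 1
KEY_SHIFT, KEY_MASK = 2, 0xFF
PPN_SHIFT = 12

def make_pte(ppn: int, key_index: int, writable: bool) -> int:
    pte = (ppn << PPN_SHIFT) | (key_index << KEY_SHIFT) | VALID
    return pte | WRITABLE if writable else pte

def key_index_of(pte: int) -> int:
    """The memory controller would use this field to select the per-page key."""
    return (pte >> KEY_SHIFT) & KEY_MASK

pte = make_pte(ppn=0x1A2B3, key_index=7, writable=False)
assert key_index_of(pte) == 7
```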
Boundaries of the trusted system
You'll need to manage keys and securely communicate with all the accelerator chips. This likely involves hardware security modules that are extra super duper secure. Decryption keys for the data a chip must work on, and for inter-chip communication, are sent securely to individual accelerator chips, similar to how keys are sent to cable TV boxes to let them decrypt the programs they've paid for.
This is how you actually enforce access control.
If a message is not for you, if you aren’t supposed to read/write that part of the distributed virtual memory space, you don’t get keys to decrypt it. Simple and effective.
“You” the running code never touch the keys. The supervisory code doesn’t touch the keys. Specialised crypto hardware unwraps the key and then uses it for (en/de)cryption without any software in the chip ever having access to it.
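A tiny sketch of that "no key, no access" logic (all names hypothetical, and only key IDs are shown; real key material would be wrapped for each chip's crypto hardware and never exposed to software): the key server releases only the keys a chip is authorized for, so unauthorized chips simply can't decrypt the corresponding memory regions or links.

```python
# Hypothetical policy: which chips may use which keys.
# Key material itself would be wrapped to each chip's crypto hardware;
# here we track only key IDs to show the access-control logic.
REGION_KEYS = {
    "weights-shard-0":      "key-17",
    "activations-link-0-1": "key-42",
    "supervision-output":   "key-99",
}
CHIP_AUTHORIZATION = {
    "inference-chip-0": {"weights-shard-0", "activations-link-0-1"},
    "inference-chip-1": {"activations-link-0-1"},
    "supervision-chip": {"supervision-output"},
}

def keys_for_chip(chip_id: str) -> dict[str, str]:
    """Return only the key IDs this chip is authorized to receive."""
    allowed = CHIP_AUTHORIZATION.get(chip_id, set())
    return {region: key for region, key in REGION_KEYS.items() if region in allowed}

# The inference chip never receives the supervision-output key, so it can't
# produce traffic that the API server will accept directly.
assert "supervision-output" not in keys_for_chip("inference-chip-0")
```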
I appreciate your replies. I had some more time to think and now I have more takes. This isn’t my area, but I’m having fun thinking about it.
See https://en.wikipedia.org/wiki/File:ComputerMemoryHierarchy.svg
Disk encryption is table stakes. I’ll assume any virtual memory is also encrypted. I don’t know much about that.
I’m assuming no use of flash memory.
Absent homomorphic encryption, we have to decrypt in the registers, or whatever they’re called for a GPU.
So basically the question is how valuable is it to encrypt the weights in RAM and possibly in the processor cache. For the sake of this discussion, I’m going to assume reading from the processor cache is just as hard as reading from the registers, so there’s no point in encrypting the processor cache if we’re going to decrypt in registers anyway. (Also, encrypting the processor cache could really hurt performance!)
So that leaves RAM: how much added security do we get if we encrypt RAM in addition to encrypting disk?
One problem I notice: An attacker who has physical read access to RAM may very well also have physical write access to RAM. That allows them to subvert any sort of boot-time security, by rewriting the running OS in RAM.
If the processor can only execute signed code, that could help. But an attacker could still control which signed code the processor runs (by strategically changing the contents at an instruction pointer?). I suspect this is enough in practice.
A somewhat insane idea would be for the OS to run encrypted in RAM to make it harder for an attacker to tamper with it. I doubt this would help—an attacker could probably infer from the pattern of memory accesses which OS code does what. (Assuming they’re able to observe memory accesses.)
So overall it seems like with physical write access to RAM, an attacker can probably get de facto root access, and make the processor their puppet. At that point, I think exfiltrating the weights should be pretty straightforward. I’m assuming intermediate activations must be available for interpretability, so it seems possible to infer intermediate weights by systematically probing intermediate activations and solving for the weights.
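To illustrate that last step: if an attacker can feed chosen inputs to a linear layer and read off its activations, recovering the weights is just linear algebra. A toy sketch with a noiseless linear layer (no claim that this is easy for a full model, but it shows why exposed activations plus chosen inputs are dangerous):

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_out = 16, 8
W = rng.normal(size=(d_out, d_in))      # the "secret" weights

# Probe the layer with known inputs and record the observed activations.
X = rng.normal(size=(4 * d_in, d_in))   # probe inputs (rows)
Y = X @ W.T                             # observed pre-activations

# Solve Y = X @ W.T for W by least squares.
W_recovered = np.linalg.lstsq(X, Y, rcond=None)[0].T
assert np.allclose(W_recovered, W)
```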
If you could run the OS from ROM, so it can’t be tampered with, maybe that could help. I’m assuming no way to rewrite the ROM or swap in a new ROM while the system is running. Of course, that makes OS updates annoying since you have to physically open things up and swap them out. Maybe that introduces new vulnerabilities.
In any case, overall I suspect the benefit-to-effort ratio is higher elsewhere. I would focus on making sure the AI isn’t capable of reading its own RAM in the first place, and isn’t trying to.
TLDR: Memory encryption alone is indeed not enough. Modifications and rollback must be prevented too.
Memory encryption and authentication have come a long way
Unless there’s a massive shift in ML architectures to doing lots of tiny reads/writes, overheads will be tiny. I’d guesstimate the following:
negligible performance drop / chip area increase
~1% of DRAM and cache space[1]
It's hard to build hardware or datacenters that resist sabotage if you don't do this. You end up having to trust that the maintenance people aren't messing with the equipment and the factories haven't added any surprises to the PCBs. With the right security hardware, you trust TSMC and their immediate suppliers and no one else.
Not sure if we have the technical competence to pull it off. Apple’s likely one of the few that’s even close to secure and it took them more than a decade of expensive lessons to get there. Still, we should put in the effort.
Agreed that alignment is going to be the harder problem. Considering the amount of fail when it comes to building correct security hardware that operates using known principles … things aren’t looking great.
</TLDR> rest of comment is just details
Morphable Counters: Enabling Compact Integrity Trees For Low-Overhead Secure Memories
Performance cost
Overheads are usually quite low for CPU workloads:
<1% extra DRAM required[1]
<<10% execution time increase
Executable code can be protected with negligible overhead by increasing the size of the rewritable authenticated blocks for a given counter to 4KB or more. Overhead is then comparable to the page table.
For typical ML workloads, the smallest data block is already 2x larger (GPU cache lines 128 bytes vs 64 bytes on CPU gives 2x reduction). Access patterns should be nice too, large contiguous reads/writes.
Only some unusual workloads see significant slowdown (EG: large graph traversal/modification) but this can be on the order of 3x.[2]
A real example (Intel SGX)
Use case: launch an application in a "secure enclave" so that the host operating system can't read it or tamper with it.
It used an older memory protection scheme:
hash tree
Each 64 byte chunk of memory is protected by an 8 byte MAC
MACs are 8x smaller than the data they protect so each tree level is 8x smaller
split counter modes in the linked paper can do 128x per level (rough overhead math in the sketch below)
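Rough back-of-the-envelope math for how metadata overhead shrinks with tree fanout, comparing an 8x-per-level MAC tree (SGX-style) against a 128x-per-level split-counter tree (illustrative numbers only, following the ratios above):

```python
def tree_overhead(protected_bytes: int, fanout: int) -> float:
    """Total tree metadata as a fraction of protected memory,
    where each level is `fanout` times smaller than the one below."""
    total, level = 0, protected_bytes
    while level > 64:            # stop once a level fits in one cache line
        level //= fanout
        total += level
    return total / protected_bytes

GIB = 1 << 30
print(f"8x MAC tree       : {tree_overhead(16 * GIB, 8):.2%}")    # ~14%
print(f"128x split counter: {tree_overhead(16 * GIB, 128):.2%}")  # ~0.8%
```

With the MAC tags themselves folded into spare ECC bits (see the notes below), the split-counter case roughly lines up with the ~1% DRAM overhead guesstimated earlier.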
The memory encryption works.
If Intel SGX enclave memory is modified, this is detected and the whole CPU locks up.
SGX was not secure. The memory encryption/authentication is solid. The rest … not so much. Wikipedia lists 8 separate vulnerabilities, including ones that allow leaking of the remote attestation keys. That's before you get to attacks on other parts of the chip and security software that allow dumping all the keys stored on the chip, enabling complete emulation.
AMD didn't do any better, of course: One Glitch to Rule Them All: Fault Injection Attacks Against AMD's Secure Encrypted Virtualization
How low overheads are achieved
ECC (error correcting code) memory includes extra chips to store error correction data. Repurposing this to store MACs plus a smaller 10-bit ECC gets rid of the extra data accesses. As long as we have the counters cached for that section of memory, there's no extra overhead. The tradeoff is going from correcting 2 flipped bits to correcting only 1.
DRAM internally reads/writes lots of data at once as part of a row. Standard DDR memory reads 8KB rows rather than just a 64B cache line. We can store tree parts for a row inside that row to reduce read/write latency/cost, since row switches are expensive.
Memory that is rarely changed (EG:executable code) can be protected as a single large block. If we don’t need to read/write individual 64 byte chunks, then a single 4KiB page can be a rewritable unit. Overhead is then negligible if you can store MACs in place of an ECC code.
Counters don't need to be encrypted, just authenticated. Verification can be parallelized if you have the memory bandwidth to do so.
We can also delay verification and assume data is fine; if verification later fails, the entire chip shuts down to prevent bad results from getting out.
Technically we need +12.5% to store MAC tags. If we assume ECC (error correcting code) memory is in use, which already has +12.5% for ECC, we can store MAC tags + smaller ECC at the cost of 1 bit of error correction.
Random reads/writes bloat memory traffic by >3x since we need to deal with 2+ uncached tree levels. We can hide latency by delaying verification of higher tree levels and panicking if it fails before results can leave the chip (Intel SGX does exactly this). But if most traffic bloats, we bottleneck on memory bandwidth and perf drops a lot.