...we’ll need much more extreme security against model self-exfiltration across the board, from hardware encryption to many-key signoff.
I found this bit confusing. Why is hardware encryption supposed to help? Why would it be better than software encryption? Is the idea to prevent the model from reading its own weights via a physical side-channel?
What exactly is “many-key signoff” supposed to mean and how is it supposed to help?
Hardware encryption likely means dedicated on-chip hardware that handles keys and decrypts weights and activations on the fly.
The hardware/software divide here is likely a bit fuzzy, but having dedicated hardware or a separate on-chip core makes it easier to isolate and accelerate the security-critical operations. If security costs too much performance, people will be tempted to turn it off.
Encrypting data in motion and data at rest (in GPU memory) makes sense since this minimizes trust. An attacker with hardware access will have a hard time getting weights and activations unless they can get data directly off the chip.
Many-key signoff is nuclear-launch-style security where multiple keyholders must use their keys to approve an action. The idea is that a single rogue employee can’t do something bad on their own, like copy model weights to an internet server, change inference code to add a side channel that leaks the weights, or sabotage inference misuse prevention/monitoring.
This is commonly done in high-security fields like banking, where several employees hold key shares that must be used together to sign code before it can be deployed to hardware security modules.
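As a concrete illustration, here is a minimal sketch of a k-of-n signoff check using Ed25519 signatures from Python’s `cryptography` package. The keyholder names, threshold, and action string are invented for the example; a real deployment would keep the key material in HSMs and use threshold signatures or HSM-enforced key shares rather than a Python dict.

```python
# Sketch: require k distinct keyholder signatures before an action is approved.
# Keyholder names, threshold, and the action string are illustrative only.
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey
from cryptography.exceptions import InvalidSignature

THRESHOLD = 3  # e.g. 3-of-5 keyholders must approve

# In reality each private key lives with a different person/HSM; generated here for demo.
keyholders = {name: Ed25519PrivateKey.generate()
              for name in ["alice", "bob", "carol", "dave", "erin"]}
trusted_pubkeys = {name: key.public_key() for name, key in keyholders.items()}

def approve(action: bytes, signatures: dict[str, bytes]) -> bool:
    """True only if >= THRESHOLD distinct trusted keyholders signed `action`."""
    valid = set()
    for name, sig in signatures.items():
        pub = trusted_pubkeys.get(name)
        if pub is None:
            continue
        try:
            pub.verify(sig, action)   # raises InvalidSignature on mismatch
            valid.add(name)
        except InvalidSignature:
            pass
    return len(valid) >= THRESHOLD

action = b"deploy inference-server build 1234"
sigs = {n: keyholders[n].sign(action) for n in ["alice", "bob", "carol"]}
assert approve(action, sigs)                           # 3 valid signatures: approved
assert not approve(action, {"alice": sigs["alice"]})   # a single rogue keyholder is not enough
```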
Thank you.
I think maybe my confusion here is related to the threat model. If a model gained root access to the device that it’s running on, it seems like it could probably subvert these security measures? Anyway I’d be interested to read a more detailed description of the threat model and how this stuff is supposed to help.
More specifically, it seems a bit weird to imagine an attacker who has physical access to a running server, yet isn’t able to gain de facto root access for the purpose of weight exfiltration. E.g. you can imagine using your physical access to copy the encrypted weights onto a different drive running a different OS, then boot from that drive, where the new OS has been customized to interface with the chip so as to exfiltrate the weights. Remember that the chip can’t exactly be encrypting every result of its weight computations using some super-secret key, because if it did, the entire setup would effectively be useless. Seems to me like the OS has to be part of the TCB (trusted computing base) along with the chip?
What you’re describing above is how BitLocker works on every modern Windows PC. The startup process involves a chain of trust, with various bootloaders verifying the next thing to start and handing off keys until Windows starts. Crucially, the keys come out different if you start something that isn’t Windows (i.e. not signed by Microsoft). You can’t just boot Linux and decrypt the drive, since different keys would be generated for Linux during boot and they won’t decrypt the drive.
Mobile devices and game consoles are even more locked down. If there’s no bootloader unlock from your carrier or device manufacturer, and no vulnerability (hardware or software) to be found, you’re stuck with the stock OS. You can’t boot something else, because the chip will refuse to boot anything not signed by the OEM/carrier. You can’t downgrade, because fuses have been blown to prevent it and enforce a minimum revision number. Nothing booted outside the approved chain will have the keys locked up inside the chip that are needed to decrypt things and do remote attestation.
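A toy sketch of how the measured-boot idea hangs together. Everything here (device secret, version numbers, images) is invented, and real chains use a TPM/secure element plus OEM signature verification rather than bare hashing; the point is just that the disk key is derived from everything that booted, so a customized OS derives a key that opens nothing.

```python
# Toy sketch of a measured-boot chain of trust. All names/values are illustrative;
# real systems use a TPM/secure element, signed bootloaders, and fused version counters.
import hashlib, hmac

DEVICE_SECRET = b"fused-into-the-chip-at-manufacture"  # never leaves the chip
MIN_VERSION = 7  # enforced by blown fuses; blocks downgrades to vulnerable firmware

def extend(measurement: bytes, component: bytes) -> bytes:
    """PCR-style extend: the running measurement commits to every stage booted so far."""
    return hashlib.sha256(measurement + hashlib.sha256(component).digest()).digest()

def boot(stages: list[tuple[bytes, int]]) -> bytes:
    """Check anti-rollback, measure each stage, derive the disk key from the result."""
    measurement = b"\x00" * 32
    for image, version in stages:
        if version < MIN_VERSION:
            raise RuntimeError("rollback refused: version below fused minimum")
        # (A real chip would also verify an OEM signature over `image` here.)
        measurement = extend(measurement, image)
    # The key is sealed to the measurement: boot anything else, get a different key.
    return hmac.new(DEVICE_SECRET, measurement, hashlib.sha256).digest()

expected = [(b"bootloader-v9", 9), (b"windows-kernel-v12", 12)]
tampered = [(b"bootloader-v9", 9), (b"attacker-linux", 12)]
assert boot(expected) != boot(tampered)  # different OS -> different key -> drive stays sealed
```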
Root isn’t full control
Having root on a device isn’t the silver bullet it once was. Security is still kind of terrible and people don’t do enough to lock everything down properly, but the modern approach seems to be: 1) isolate the security-critical properties/code, 2) put them in a secure, armored, protected box somewhere inside the chip, and 3) make sure you don’t stuff so much into the box that an attacker can compromise it through a bug anyway.
These schemes tend to fail in a couple of ways:
Pointless isolation (e.g. lock the encryption keys inside a box, but let the user ask the box to encrypt/decrypt anything)
The box isn’t hack-proof (examples below):
they forget about a chip debug feature that lets you read/write security-critical device memory
you can mess with the chip’s operating voltage/frequency to make security-critical operations fail
They stuff too much inside the box (e.g. a full Java virtual machine and a web server)
I expect ML accelerator security to be full of holes. They won’t get it right, but it is possible in principle.
ML accelerator security wish-list
As for what we might want for an ML accelerator:
Separating out enforcement of security-critical properties:
restrict communications
supervisory code configures/approves communication channels such that all data in/out must be encrypted
There’s still steganography: with complete control over code that can read the weights or activations, an attacker can hide data in message timing, for example
Still, protecting weights/activations in transit is a good start.
can enforce “who can talk to who”
example: ensure inference outputs must go through a supervision model that OKs them as safe
The supervision model’s chip can then send data to the API server and on to the customer
Inference chip literally can’t send data anywhere but the supervision model
can enforce network isolation for dangerous experimental system (though you really should be using airgaps for that)
Enforcing code signing, like Apple does. Current-gen GPUs support virtualisation and user/kernel modes. Control what code has read/write access to what data. Code should not have write access to itself, so attackers would have to find return-oriented-programming attacks or similar that work on GPU shader code. This makes life harder for the attacker.
Apple does this on newer SoCs to prevent execution of unsigned code.
Doing that would help a lot. Not sure how well it plays with InfiniBand/NVLink networking, but that can be encrypted too in principle. If a virtual memory system is implemented, it’s not that hard to add a field to the page table for an encryption key index (rough sketch below).
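To make the page-table idea concrete, here is a rough sketch. All field names, key slots, and domain names are invented, and the actual decryption would happen in dedicated hardware (AES-XTS or similar) in the memory path, not in software; the point is that software only ever handles key indices, never keys.

```python
# Rough sketch of per-page encryption-key indices. Field names, key slots, and
# domain names are invented; real hardware would do AES in the memory path, not Python.
from dataclasses import dataclass

@dataclass
class PageTableEntry:
    phys_addr: int
    writable: bool
    key_index: int          # extra field: which key slot the crypto block must use

# Key slots are filled by the secure key-management path; software only sees indices.
KEY_SLOTS = {1: bytes(range(1, 17)), 2: bytes(range(16))}

# Which key slots each security domain has been granted by supervisory code.
DOMAIN_GRANTS = {"inference_kernel": {1, 2}, "debug_shell": set()}

def xor_cipher(data: bytes, key: bytes) -> bytes:
    """Stand-in for the real on-the-fly AES hardware."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

def read_page(domain: str, pte: PageTableEntry, ciphertext: bytes) -> bytes:
    """The crypto block only decrypts if the domain holds this page's key slot."""
    if pte.key_index not in DOMAIN_GRANTS.get(domain, set()):
        raise PermissionError("no key for this page: data stays ciphertext")
    return xor_cipher(ciphertext, KEY_SLOTS[pte.key_index])

weights_page = PageTableEntry(phys_addr=0x4000, writable=False, key_index=1)
enc = xor_cipher(b"model weights...", KEY_SLOTS[1])
assert read_page("inference_kernel", weights_page, enc) == b"model weights..."
# read_page("debug_shell", weights_page, enc) would raise PermissionError.
```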
Boundaries of the trusted system
You’ll need to manage keys and securely communicate with all the accelerator chips. This likely involves hardware security modules that are extra super duper secure. Decryption keys for the data a chip must work on, and for its inter-chip communication, are sent securely to individual accelerator chips, similar to how keys are sent to cable TV boxes to let them decrypt the programs they have paid for.
This is how you actually enforce access control.
If a message is not for you, or you aren’t supposed to read/write that part of the distributed virtual memory space, you don’t get the keys to decrypt it. Simple and effective.
“You”, the running code, never touch the keys. The supervisory code doesn’t touch the keys either. Specialised crypto hardware unwraps each key and then uses it for encryption/decryption without any software on the chip ever having access to it (sketch below).
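A sketch of that key-distribution step, assuming RFC 3394 AES key wrap (available in Python’s `cryptography` package). The chip IDs and keys are invented, and in a real system the unwrap happens inside the chip’s crypto block rather than in software.

```python
# Sketch: an HSM wraps a channel key under a chip's unique key, so only that chip's
# crypto hardware can recover it. Chip IDs and keys are illustrative.
import os
from cryptography.hazmat.primitives.keywrap import aes_key_wrap, aes_key_unwrap

# Burned into each accelerator at manufacture; known to the HSM, never exposed to software.
CHIP_UNIQUE_KEYS = {"accel-00": os.urandom(32), "accel-01": os.urandom(32)}

def hsm_grant(chip_id: str, channel_key: bytes) -> bytes:
    """HSM side: wrap the channel key so only `chip_id`'s hardware can recover it."""
    return aes_key_wrap(CHIP_UNIQUE_KEYS[chip_id], channel_key)

def chip_unwrap(chip_id: str, wrapped: bytes) -> bytes:
    """Chip side: done inside dedicated hardware; the plaintext key never reaches software."""
    return aes_key_unwrap(CHIP_UNIQUE_KEYS[chip_id], wrapped)

channel_key = os.urandom(32)               # key for, say, inference -> supervision traffic
blob = hsm_grant("accel-00", channel_key)  # this blob can travel over an untrusted network
assert chip_unwrap("accel-00", blob) == channel_key
# accel-01 was never granted this channel; unwrapping with its key fails the integrity check.
```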
I appreciate your replies. I had some more time to think and now I have more takes. This isn’t my area, but I’m having fun thinking about it.
See https://en.wikipedia.org/wiki/File:ComputerMemoryHierarchy.svg
Disk encryption is table stakes. I’ll assume any virtual memory is also encrypted. I don’t know much about that.
I’m assuming no use of flash memory.
Absent homomorphic encryption, we have to decrypt in the registers, or whatever they’re called for a GPU.
So basically the question is how valuable it is to encrypt the weights in RAM, and possibly in the processor cache. For the sake of this discussion, I’m going to assume reading from the processor cache is just as hard as reading from the registers, so there’s no point in encrypting the processor cache if we’re going to decrypt in registers anyway. (Also, encrypting the processor cache could really hurt performance!)
So that leaves RAM: how much added security do we get if we encrypt RAM in addition to encrypting disk?
One problem I notice: An attacker who has physical read access to RAM may very well also have physical write access to RAM. That allows them to subvert any sort of boot-time security, by rewriting the running OS in RAM.
If the processor can only execute signed code, that could help. But an attacker could still control which signed code the processor runs (by strategically changing the contents at the instruction pointer?). I suspect this is enough in practice.
A somewhat insane idea would be for the OS to run encrypted in RAM to make it harder for an attacker to tamper with it. I doubt this would help—an attacker could probably infer from the pattern of memory accesses which OS code does what. (Assuming they’re able to observe memory accesses.)
So overall it seems like with physical write access to RAM, an attacker can probably get de facto root access, and make the processor their puppet. At that point, I think exfiltrating the weights should be pretty straightforward. I’m assuming intermediate activations must be available for interpretability, so it seems possible to infer intermediate weights by systematically probing intermediate activations and solving for the weights.
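For the linear part of a layer, that last step really is just linear algebra: probe with enough independent inputs, record the activations, and solve. A toy numpy sketch (dimensions invented; real layers add nonlinearities, biases, quantization, and noise, so this is the easy case):

```python
# Toy sketch: recover a linear layer's weights from input/activation probe pairs.
# Dimensions are invented; real layers add nonlinearities, biases, and noise.
import numpy as np

rng = np.random.default_rng(0)
d_in, d_out = 64, 32
W_true = rng.normal(size=(d_out, d_in))   # the "secret" weights

X = rng.normal(size=(d_in, d_in))         # d_in linearly independent probe inputs
A = W_true @ X                            # observed intermediate activations

W_recovered = A @ np.linalg.pinv(X)       # least-squares solve for the weights
assert np.allclose(W_recovered, W_true)
```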
If you could run the OS from ROM, so it can’t be tampered with, maybe that could help. I’m assuming no way to rewrite the ROM or swap in a new ROM while the system is running. Of course, that makes OS updates annoying since you have to physically open things up and swap them out. Maybe that introduces new vulnerabilities.
In any case, overall I suspect the benefit-to-effort ratio is higher elsewhere. I would focus on making sure the AI isn’t capable of reading its own RAM in the first place, and isn’t trying to.
TLDR: Memory encryption alone is indeed not enough. Modifications and rollback must be prevented too.
Memory encryption and authentication have come a long way.
Unless there’s a massive shift in ML architectures to doing lots of tiny reads/writes, overheads will be tiny. I’d guesstimate the following:
negligible performance drop / chip area increase
~1% of DRAM and cache space[1]
It’s hard to build hardware or datacenters that resist sabotage if you don’t do this. You end up having to trust that the maintenance people aren’t messing with the equipment and that the factories haven’t added any surprises to the PCBs. With the right security hardware, you trust TSMC and their immediate suppliers and no one else.
Not sure if we have the technical competence to pull it off. Apple’s likely one of the few that’s even close to secure and it took them more than a decade of expensive lessons to get there. Still, we should put in the effort.
Agreed that alignment is going to be the harder problem. Considering the amount of fail when it comes to building correct security hardware that operates using known principles … things aren’t looking great.
</TLDR> rest of comment is just details
Morphable Counters: Enabling Compact Integrity Trees For Low-Overhead Secure Memories
From the paper:
Memory contents protected with MACs are still vulnerable to tampering through replay attacks. For example, an adversary can replace a tuple of { Data, MAC, Counter } in memory with older values without detection. Integrity-trees [7], [13], [20] prevent replay attacks using multiple levels of MACs in memory, with each level ensuring the integrity of the level below. Each level is smaller than the level below, with the root small enough to be securely stored on-chip.
[improvement TLDR for this paper: they find a data compression scheme for counters to increase the tree branching factor to 128 per level from 64 without increasing re-encryption when counters overflow]
Performance cost
Overheads are usually quite low for CPU workloads:
<1% extra DRAM required[1]
<<10% execution time increase
Executable code can be protected with negligible overhead by increasing the size of the rewritable authenticated blocks for a given counter to 4KB or more. Overhead is then comparable to the page table.
For typical ML workloads, the smallest data block is already 2x larger (GPU cache lines are 128 bytes vs 64 bytes on CPU, giving a 2x reduction). Access patterns should be nice too: large contiguous reads/writes.
Only some unusual workloads see significant slowdown (e.g. large graph traversal/modification), but this can be on the order of 3x.[2]
A real example (Intel SGX)
Use case: launch an application in a “secure enclave” so that host operating system can’t read it or tamper with it.
It used an older memory protection scheme:
hash tree
Each 64 byte chunk of memory is protected by an 8 byte MAC
MACs are 8x smaller than the data they protect so each tree level is 8x smaller
split counter modes in the linked paper can do 128x per level
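Back-of-the-envelope arithmetic for how tree arity turns into DRAM overhead (this is just the geometric series implied above, not a measurement):

```python
# How much metadata an integrity tree adds for a given branching factor (arity).
# Pure arithmetic illustrating the 8x-per-level vs 64x/128x-per-level claims above.
def tree_dram_overhead(data_blocks: int, arity: int, block_bytes: int = 64) -> float:
    """Fraction of extra DRAM for tree levels above the data (root lives on-chip)."""
    extra, level = 0, data_blocks
    while level > 1:
        level = -(-level // arity)        # ceil: blocks needed at the next level up
        extra += level * block_bytes
    return extra / (data_blocks * block_bytes)

blocks = (80 * 2**30) // 64               # e.g. an 80 GiB memory pool in 64-byte blocks
for arity, name in [(8, "SGX-style 8-ary"), (64, "64-ary split counters"),
                    (128, "128-ary morphable counters")]:
    print(f"{name}: {tree_dram_overhead(blocks, arity):.2%} extra DRAM for the tree")
# -> roughly 14%, 1.6%, and 0.8%. For the split-counter schemes the per-block MAC tags
# (another 12.5%) can be folded into ECC bits, which is how "<1% extra DRAM" is reached.
```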
The memory encryption works.
If Intel SGX enclave memory is modified, this is detected and the whole CPU locks up.
SGX was not secure. The memory encryption/authentication is solid. The rest … not so much. Wikipedia lists 8 separate vulnerabilities, including ones that allow leaking of the remote attestation keys. That’s before you get to attacks on other parts of the chip and security software that allow dumping all the keys stored on-chip, allowing complete emulation.
AMD didn’t do any better, of course: One Glitch to Rule Them All: Fault Injection Attacks Against AMD’s Secure Encrypted Virtualization
How low overheads are achieved
ECC (error correcting code) memory includes extra chips to store error correction data. Repurposing this to store MACs plus a 10-bit ECC gets rid of the extra data accesses. As long as we have the counters cached for that section of memory, there’s no extra overhead. The tradeoff is going from the ability to correct 2 flipped bits to only correcting 1 flipped bit.
DRAM internally reads/writes lots of data at once as part of a row. Standard DDR memory reads 8KB rows rather than just a 64B cache line. We can store a row’s tree parts inside that row to reduce read/write latency/cost, since row switches are expensive.
Memory that is rarely changed (EG:executable code) can be protected as a single large block. If we don’t need to read/write individual 64 byte chunks, then a single 4KiB page can be a rewritable unit. Overhead is then negligible if you can store MACs in place of an ECC code.
Counters don’t need to be encrypted, just authenticated. Verification can be parallelized if you have the memory bandwidth to do so.
We can also delay verification and assume the data is fine; if verification fails, the entire chip shuts down to prevent bad results from getting out.
Technically we need +12.5% to store MAC tags. If we assume ECC (error correcting code) memory is in use, which already has +12.5% for ECC, we can store MAC tags + smaller ECC at the cost of 1 bit of error correction.
Random reads/writes bloat memory traffic by >3x since we need to deal with 2+ uncached tree levels. We can hide latency by delaying verify of higher tree levels and panicking if it fails before results can leave chip (Intel SGX does exactly this). But if most traffic bloats, we bottleneck on memory bandwidth and perf drops a lot.