Also, are you picturing them gaining money from crimes, then buying compute legitimately? I think the “crimes” part is hard to stop but the “paying for compute” part is relatively easy to stop.
Either legitimately or illegitimately acquiring compute could plausibly be the best route; I’m uncertain.
It doesn’t seem that easy to lock down legitimate compute to me? I think you’d need to shut down people buying/renting 8xH100-style boxes, which seems potentially quite difficult.
The model weights probably don’t fit in VRAM on a single 8xH100 device (the weights might be 5 TB when 4-bit quantized), but you can maybe fit them across 8-12 of these. And you might not need amazing interconnect (e.g. normal 1 GB/s datacenter internet is fine) for somewhat-high-latency (but decently-good-throughput) pipeline-parallel inference. (You only need to send the residual stream per token.) I might do a BOTEC on this later. Unless you’re aware of a BOTEC on this?
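(Rough sketch of the fit estimate, not from the original thread: it just assumes 80 GB of HBM per H100 and ignores KV cache and activation memory, which is roughly why you'd land in the 8-12 box range rather than exactly 8.)

```python
# How many 8xH100 boxes are needed for the weights alone, assuming 80 GB HBM per H100.
weights_tb = 5.0                  # ~5 TB of weights at 4-bit quantization
hbm_per_box_tb = 8 * 0.080        # 8 x 80 GB = 0.64 TB per box

boxes_needed = weights_tb / hbm_per_box_tb
print(f"boxes needed (weights only): {boxes_needed:.1f}")
# ~7.8 boxes; KV cache and activations push this toward the 8-12 range.
```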
I did some BOTECs on this and think 1 GB/s is sort of borderline: it probably works, but not obviously.
E.g., I assumed an MoE model that is ~10 TB at fp8, with a sparsity factor of 4 and a hidden size of 32768.
With 32 kB per token (32768 hidden dims × 1 byte at fp8), you could send at most ~30k tokens/second over a 1 GB/s interconnect. Not quite sure what a realistic link utilization would be, but maybe we halve that to 15k?
If the model was split across 20 8xH100 boxes, then each box might do ~250 GFLOP/token (2 FLOP/parameter × 10T parameters / (4 × 20), where the 4 is the MoE sparsity factor and 20 is the number of boxes), so at 15k tokens/second each box would do at most ~3.75 PFLOP/second, which might be about 20-25% utilization.
This is not bad, but for a model with much more sparsity, GPUs with a different FLOP/s : VRAM ratio, a spottier connection, etc., the bandwidth constraint might become quite harsh.
(The above is somewhat hastily reconstructed from some old sheets; I might have messed something up.)
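(For reference, a minimal Python sketch of that BOTEC, not from the thread; the ~2 PFLOP/s dense fp8 peak per H100 is my assumption for the utilization denominator:)

```python
# Pipeline-parallel inference BOTEC: is a 1 GB/s link between boxes enough?
model_bytes = 10e12        # ~10 TB of weights at fp8 (~10T parameters at 1 byte each)
sparsity = 4               # MoE: ~1/4 of parameters active per token
hidden_size = 32768        # residual stream width
num_boxes = 20             # 8xH100 boxes the model is split across
link_bytes_per_s = 1e9     # 1 GB/s between boxes

# Interconnect bound: each token crosses a link as one fp8 residual stream.
bytes_per_token = hidden_size * 1                        # ~32 kB
peak_tokens_per_s = link_bytes_per_s / bytes_per_token   # ~30k tokens/s
tokens_per_s = peak_tokens_per_s / 2                     # ~15k, guessing ~50% link utilization

# Compute per box at that token rate.
params = model_bytes                                            # 1 byte/param at fp8
flop_per_token_per_box = 2 * params / (sparsity * num_boxes)    # ~250 GFLOP/token
flop_per_s_per_box = tokens_per_s * flop_per_token_per_box      # ~3.75 PFLOP/s

# Utilization vs. an assumed ~2 PFLOP/s dense fp8 peak per H100 (~16 PFLOP/s per box).
peak_flop_per_s_per_box = 8 * 2e15
print(f"tokens/s (link-bound): {tokens_per_s:,.0f}")
print(f"GFLOP/token per box:   {flop_per_token_per_box / 1e9:.0f}")
print(f"PFLOP/s per box:       {flop_per_s_per_box / 1e15:.2f}")
print(f"utilization:           {flop_per_s_per_box / peak_flop_per_s_per_box:.0%}")
```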
For reference, just last week I rented three 8xH100 boxes without any KYC.