Why do you believe AMD and Google make better hardware than Nvidia?
The easiest answer is to look at the specs. Of course specs are not super reliable, so take it all with many grains of salt. I’ll go through the AMD/Nvidia comparison here, because it’s a comparison I looked into a few months back.
MI300x vs H100
Techpowerup is a third-party site with specs for the MI300x and the H100, so we can do a pretty direct comparison between those two pages. (I don’t know if the site independently tested the two chips, but they’re at least trying to report comparable numbers.) The H200 would arguably be the “fairer comparison,” since the MI300x came out much later than the H100; we’ll get to that next. I’m starting with MI300x vs H100 because techpowerup has specs for both, so we don’t have to rely on either company’s bullshit-heavy marketing materials as a source of information. Also, even the H100 is priced 2-4x higher than the MI300x (~$30-45k vs ~$10-15k), so it’s not unfair to compare the two.
Key numbers (MI300x vs H100):
float32 TFLOPs: ~80 vs ~50
float16 TFLOPs: ~650 vs ~200
memory: 192 GB vs 80 GB (note that this is the main place where the H200 improves on the H100)
bandwidth: ~10 TB/s vs ~2 TB/s
… so the comparison isn’t even remotely close. The H100 is priced 2-4x higher but is utterly inferior in terms of hardware.
MI300x vs H200
I don’t know of a good third-party spec sheet for the H200, so we’ll rely on Nvidia’s page. Note that they report some numbers “with sparsity” which, to make a long story short, means those numbers are blatant marketing bullshit. Other than those numbers, I’ll take their claimed specs at face value.
Key numbers (MI300x vs H200):
float32 TFLOPs: ~80 vs ~70
float16 TFLOPs: don’t know, Nvidia conspicuously avoided reporting that number
memory: 192 GB vs 141 GB
bandwidth: ~10 TB/s vs ~5 TB/s
So they’re closer than the MI300x vs H100, but the MI300x still wins across the board. And pricewise, the H200 is probably around $40k, so 3-4x more expensive than the MI300x.
It’s worth noting that even if Nvidia is charging 2-4x more now, the ultimate question for competitiveness will be manufacturing cost for Nvidia vs AMD. If Nvidia has much lower manufacturing costs per unit performance than AMD (but presumably a higher markup), then Nvidia might win out even if their product is currently worse per dollar.
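A toy example of that dynamic (every number below is invented purely for illustration, not an estimate of either company’s actual costs):

```python
# Hypothetical unit economics -- all numbers invented for illustration.
mi300x = {"fp32_tflops": 80, "list_price": 12_500, "mfg_cost": 9_000}
h200   = {"fp32_tflops": 70, "list_price": 40_000, "mfg_cost": 7_000}

def cost_per_tflop(chip):
    """Manufacturing cost per unit of fp32 performance."""
    return chip["mfg_cost"] / chip["fp32_tflops"]

# At list price the H200 is ~3x worse per dollar, but if competition
# squeezed both prices toward cost, Nvidia (under these made-up costs)
# could undercut AMD and still break even:
print(cost_per_tflop(h200))    # 100.0
print(cost_per_tflop(mi300x))  # 112.5
```

So current price-per-performance tells you who wins today’s sale, but cost-per-performance is what determines who survives a price war.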
Note also that price discrimination might be a big part of Nvidia’s approach. Scaling labs willing to go to great effort to cut compute costs by a factor of two are exactly the subset of Nvidia’s customers to whom Nvidia would ideally prefer to offer lower prices, and I expect Nvidia will find a way to make that happen.