BGI Genomics, like the mothership in mainland China proper? I’m less familiar with the corporate structure, revenue, and dependency on overseas supply chains than I would hope you are, but my understanding was that BGI didn’t have much business overseas (having failed to compete with Illumina), was reliant on domestic demand mostly from agriculture & medicine, and didn’t depend on Illumina, having spent the past decade+ trying and mostly failing to surpass it (but at least it does have its own decent sequencers). So, having failed in those ways and been forced into autarky, it doesn’t seem like sanctions/embargoes can hurt BGI much more than it already has been? And macroeconomics-wise, sequencing seems like it would be a reasonably robust business, because farmers won’t stop needing genetics-related services nor will patients stop getting sick.
And politics-wise, I see no particular Xi-techlash angle for him doing things like overnight outlawing the industry or censors just refusing to approve any video game release for a year.
So overall, doesn’t seem too bad.
Honestly, UAVs seem like they’d be a worse place to go, simply because that sounds to me like it’ll be more disrupted by random chip problems and export issues due to being extremely military-linked dual-use tech. Drones are stuffed full of all sorts of random weird little chips (see: Russian problems getting UAVs, and resorting to sourcing them from Iran), and how much domestic Chinese demand for UAVs could there possibly be to make up for exports?
Thanks, I was just really worried because our entire sequencing pipeline uses Illumina products. But I asked around our sequencing division, and they think the difference between using Illumina and BGI products isn’t too big—what BGI lacks in quality it makes up for in lower costs. Apparently, the difference in Q30 (the % of bases called with an estimated error rate <0.1%) is ~90–95 for Illumina vs. ~80 for BGI, which is marginal. Switching wouldn’t be a major problem according to them. BGI also uses Chinese chipsets, which means sanctions aren’t going to impact it much. I don’t think this is going to be as bad as I first thought.
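(For reference, the Q30 arithmetic is easy to sanity-check: a Phred quality score Q corresponds to an error probability of 10^(−Q/10), so Q30 is exactly the 0.1% error threshold. A minimal Python sketch—the function names are my own, not from any sequencing toolkit:

```python
def phred_error_prob(q: int) -> float:
    """Probability that a base call with Phred score q is wrong: 10^(-q/10)."""
    return 10 ** (-q / 10)

def q30_fraction(quality_scores: list[int]) -> float:
    """Fraction of base calls at or above Q30, i.e. error probability <= 0.1%."""
    return sum(q >= 30 for q in quality_scores) / len(quality_scores)

print(phred_error_prob(30))                # 0.001, i.e. 0.1%
print(q30_fraction([35, 32, 28, 40, 20]))  # 0.6
```

So “Q30 of 90–95 vs. 80” means 90–95% of Illumina’s base calls clear that 0.1% error bar, against ~80% for BGI.)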
How likely do you think China getting cut off from Illumina is? Do you think consumer GPUs are going to be restricted?
Yes, that was my impression. BGI sequencing is not as good as Illumina, but it’s not like it’d destroy them to switch over. And sequencers don’t use any really high-end GPUs (even if you’d like to have them for bioinformatics), so it’s not like a chip embargo is an immediate halt-production problem the way it is for building a new supercomputer or building cars stuffed full of miscellaneous chips.
>Do you think consumer GPUs are going to be restricted?
Based on a plain reading, consumer GPUs already are restricted: it’s not a ban on A100s/H100s, it’s a ban on any system as powerful as an A100, to quote Nvidia:
>The license requirement also includes any future NVIDIA integrated circuit achieving both peak performance and chip-to-chip I/O performance equal to or greater than thresholds that are roughly equivalent to the A100, as well as any system that includes those circuits.
This is a static threshold, which makes no allowance for Moore’s law. It is an upheld hand: “Thus far, and no further!” And it is a threshold which may be biting already. EDIT: it looks like they are going to grandfather in consumer GPUs, ignoring their TFLOPS and focusing only on the interconnect bandwidth+latency, where I think all Nvidia consumer GPUs < A100. This may be a bad bet: people have had little incentive to figure out how to yoke together large numbers of consumer GPUs (as most people using >1 consumer GPU are either hobbyists, who can generally stick all the GPUs they can afford into a single box, or cryptominers, for whom interconnect is irrelevant because PoW mining is by design embarrassingly parallel), so if Chinese hyperscalers or major AI labs can purchase all the 4090s they want, now there will be vastly more incentive to figure out hardware hacks (will, say, TSMC ban Chinese designs for interconnect chips which do no computation…?) or low-communication strategies (even large constant-factor penalties, which would render them useless to Western groups, will eventually be worthwhile to Chinese groups barred from >=A100 GPUs.)
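To put a rough number on why interconnect matters so much for training (but not for mining): synchronous data-parallel training has to all-reduce the full gradient every step, and a ring all-reduce moves roughly 2x the gradient size per worker. A back-of-the-envelope sketch, assuming FP16 gradients (the 2x factor and byte counts are my simplifying assumptions):

```python
def allreduce_gb_per_step(n_params: float, bytes_per_grad: int = 2) -> float:
    """Approximate gradient bytes exchanged per worker per training step.

    A ring all-reduce sends/receives ~2x the gradient buffer per worker;
    bytes_per_grad=2 assumes FP16 gradients.
    """
    return 2 * n_params * bytes_per_grad / 1e9

# A GPT-3-sized model (175B parameters) in FP16:
print(allreduce_gb_per_step(175e9))  # 700.0 GB per worker per step
```

At hundreds of GB per step, consumer-grade PCIe or Ethernet links are the bottleneck immediately, which is why low-communication strategies would have to win back very large constant factors to make 4090 clusters competitive.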
What is the ‘peak performance’ roughly equivalent to an A100? Well, Nvidia’s product page tells me an A100 delivers 19 FP32 TFLOPS, where FP32 is probably a conservative number. (I don’t think there are very many cases in DL where you would still need FP64, and while everyone is trying to move to FP16 and lower, where an A100 does 312 TFLOPS instead, you still can’t always easily or reliably train every arch in FP16/BF16, and you can encounter some severe problems, especially with the important case of large Transformers.) 19 TFLOPS may sound like a lot… but the top-end gaming GPU Nvidia RTX 4090 goes as high as 86 TFLOPS (specced at 82 TFLOPS, with the GeForce RTX 3090 Ti at 40 TFLOPS and the GeForce RTX 2080 Ti at 14 TFLOPS). (The H100, incidentally, is 67 TFLOPS.)
So, if USG is banning ‘peak performance roughly equivalent to A100’, and gaming GPUs are turning in TFLOPS several times that of the A100, then the obvious interpretation would seem to be that yeah, lots of consumer GPUs are already restricted. All the way back to like 2019 GPUs, possibly.
(I was really surprised by this when I went to check the numbers; I did not think that gaming GPUs were that much faster than the A100. I knew that the A100 was already kinda obsolete with the H100 in the pipeline, that there was a ‘datacenter tax’ and ‘enterprise tax’, and also that everyone has kinda understated GPU progress because GPUs were so hard to get for so long and most people have been trucking along with GPUs years out of date—even the V100 still shows up in papers occasionally—etc, but I didn’t think the A100 would be that much slower, like 4x. Either I am badly misunderstanding units here, or I am greatly overrating A100s & underrating recent gaming GPUs like the absurdly large 4090, or I am making a mistake in looking at FP32 because Nvidia put all of the A100 advantages into lower precisions.)
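Taking the FP32 figures quoted above at face value, the ratios against the A100 ‘threshold’ work out as follows (a quick sketch; note the actual rule also weighs chip-to-chip interconnect performance, not TFLOPS alone):

```python
# FP32 TFLOPS figures as quoted in the thread.
fp32_tflops = {
    "A100": 19,
    "H100": 67,
    "RTX 4090": 82,
    "RTX 3090 Ti": 40,
    "RTX 2080 Ti": 14,
}

threshold = fp32_tflops["A100"]  # the 'roughly equivalent to the A100' line
for gpu, tf in sorted(fp32_tflops.items(), key=lambda kv: -kv[1]):
    status = "over" if tf >= threshold else "under"
    print(f"{gpu:12s} {tf:3d} TFLOPS = {tf / threshold:.1f}x A100 ({status})")
```

On raw FP32 alone, the 4090 comes out >4x the A100, and even the 3090 Ti is ~2x over the line; only the 2080 Ti-era cards fall under it.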
https://twitter.com/paul_scharre/status/1585684356819046400
Under Secretary of Commerce Alan Estevez confirmed at a public @CNASdc event this morning [2022-10-27] that Commerce intends to keep the same technical threshold for chips in place over time.
>Why are gaming GPUs faster than ML GPUs? Are the two somehow adapted to their special purposes, or should ML people just be using gaming GPUs?
They aren’t really that much faster; they are basically the same chips. It’s just that the pro version is 4X as expensive. It’s mostly a datacenter tax. The gaming GPUs do generally run way hotter and more power-hungry, especially boosting higher, and this puts them ahead of the equivalent ML GPUs in some scenarios.
Price difference is not only a tax, though—the ML GPUs do have differences, but they usually swing things by 10 to 30 percent, occasionally more. Additionally, the pro versions typically have 2X–4X the GPU memory, which is a huge qualitative difference in capability, and they are physically smaller and run cooler, so you can put a bunch of them inside one server and link them together with high-speed NVLink cables into configurations that aren’t practical with 3090s. 3090s have a single NVLink port; pro cards have three. 4090s curiously have zero—NVIDIA likely trying to stop the trend of using cheap gaming GPUs for research. Also, last gen’s ML GPUs were on a 7nm TSMC process, while the gaming GPUs were on Samsung’s 8nm process. This means the A100 using 250 watts outperforms the 3090 using 400 watts. But they are overall the same chip.
None of that accounts for a 4X or more cost multiplier, and the TLDR is the chips are not that different. If gaming GPUs came in higher memory configurations, and all supported NVLink, and were legally allowed to be sold in datacenters, nobody would pay the cost multiplier.
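As a back-of-the-envelope check on that multiplier, here is a cost-per-TFLOPS / cost-per-GB sketch. The street prices below are my own illustrative assumptions (not figures from this thread), and I’m taking the 3090 at roughly 36 FP32 TFLOPS / 24GB against the A100’s 19 FP32 TFLOPS / 80GB:

```python
# Illustrative street prices in USD (assumptions, not quotes from this thread).
prices_usd = {"A100 80GB": 15000, "RTX 3090": 1500}

# (FP32 TFLOPS, memory in GB); the 3090 TFLOPS figure is my rough estimate.
specs = {"A100 80GB": (19, 80), "RTX 3090": (36, 24)}

for gpu, price in prices_usd.items():
    tflops, mem_gb = specs[gpu]
    print(f"{gpu}: ${price / tflops:,.0f}/TFLOPS, ${price / mem_gb:,.0f}/GB")
```

Even granting the A100 its memory, NVLink, and efficiency advantages, the gaming card wins raw $/TFLOPS by an order of magnitude under these assumptions—which is presumably why the memory segmentation and datacenter-only licensing terms exist in the first place.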
I just realized that H100s are still available from online vendors for ~189,000 yuan (~$27,000), which is the international market price.
Well then. Time to cash out my savings and make some money. Do you guys think it’s feasible? How much are Chinese H100 prices likely to rise? Should I be trying to scoop up high-end GPUs instead?
Update: local computer shop says international GPU suppliers are still accepting Chinese orders and Chinese GPU prices are stable for now.