>Why are gaming GPUs faster than ML GPUs? Are the two somehow adapted to their special purposes, or should ML people just be using gaming GPUs?
They aren’t really that much faster; they are basically the same chips. It’s just that the pro version is 4X as expensive, which is mostly a datacenter tax. The gaming GPUs do generally run hotter and draw more power, and boost higher, which puts them ahead of the equivalent ML GPUs in some scenarios.
The price difference is not only a tax, though. The ML GPUs do have real differences, but those usually swing performance by 10 to 30 percent, occasionally more. Additionally, the pro versions typically have 2X-4X the GPU memory, which is a huge qualitative difference in capability, and they are physically smaller and run cooler, so you can put a bunch of them inside one server and link them together with high-speed NVLink into configurations that aren’t practical with 3090s. A 3090 has a single NVLink port; pro cards have three. The 4090 curiously has zero, with NVIDIA likely trying to stop the trend of using cheap gaming GPUs for research. Also, the ML GPUs last generation were on TSMC’s 7nm process, while the gaming GPUs were on Samsung’s 8nm process, which is why an A100 drawing 250 watts outperforms a 3090 drawing 400 watts. But they are overall the same chip.
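If you want to see the two practical differences (memory and peer-to-peer linking) on your own box, here is a minimal PyTorch sketch. It just reports per-GPU VRAM and whether GPU 0 and GPU 1 can access each other directly; the two-GPU assumption and device indices are illustrative, and peer access can also come from PCIe P2P rather than NVLink, so treat it as a rough check, not a definitive NVLink test.

```python
import torch

# Report the VRAM each visible GPU actually has.
for i in range(torch.cuda.device_count()):
    props = torch.cuda.get_device_properties(i)
    print(f"GPU {i}: {props.name}, {props.total_memory / 2**30:.0f} GiB VRAM")

# Peer access is what NVLink (or PCIe P2P) buys you for multi-GPU work;
# on consumer cards without a bridge this often comes back False.
if torch.cuda.device_count() >= 2:
    print("GPU 0 <-> GPU 1 peer access:",
          torch.cuda.can_device_access_peer(0, 1))
```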
None of that accounts for a 4X-or-more price multiplier, and the TL;DR is that the chips are not that different. If gaming GPUs came in higher memory configurations, all supported NVLink, and were legally allowed to be sold into datacenters, nobody would pay the multiplier.