Also, even if we could train and run a model the size of the human brain, it would still be many orders of magnitude less energy-efficient than an actual brain. The human brain uses only about 20 watts.
For inference on a GPT-4-level model, the GPUs use much less power than a human brain: about 1-2 watts (summed across all the GPUs involved), if we imagine slowing them down to human speed and splitting the power among the LLM instances being processed at the same time. Even for a 30-trillion-parameter model, that figure might only reach 30-60 watts in this sense.
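For concreteness, here is a minimal sketch of that arithmetic; the GPU count, per-GPU power, throughput, and "human speed" figures below are placeholder assumptions, not measurements from any particular deployment:

```python
# Back-of-the-envelope version of the per-instance estimate above.
# Every concrete number here is an illustrative assumption, not a measurement.

def watts_per_human_speed_instance(
    num_gpus: int,                    # GPUs serving one copy of the model (assumed)
    watts_per_gpu: float,             # whole-datacenter power attributed to each GPU (assumed)
    aggregate_tokens_per_s: float,    # total decoding throughput across the batch (assumed)
    human_tokens_per_s: float = 3.0,  # rough "human speed" of thinking/reading (assumed)
) -> float:
    """Total power divided by the number of human-speed LLM instances it sustains."""
    total_watts = num_gpus * watts_per_gpu
    human_speed_instances = aggregate_tokens_per_s / human_tokens_per_s
    return total_watts / human_speed_instances

# e.g. 16 GPUs at ~1400 W each (datacenter overhead included),
# decoding ~40,000 tokens/s in aggregate across a large batch:
print(watts_per_human_speed_instance(16, 1400, 40_000))  # ~1.7 W per instance
```

With these made-up inputs the result lands in the same low-single-digit-watt range; the point is only that batching spreads the hardware's power draw over many simultaneous human-speed "thinkers."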
Each H100 GPU alone uses 700 watts.
You should count the rest of the datacenter as well, which brings it up to 1200-1400 watts per H100, and about 2000 watts per B200 in GB200 systems. (It’s hilarious how some model training papers use 700 watts per GPU for their CO2 emission estimates. They feel obliged to make the calculation, but then cheat like there’s no tomorrow.)
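To see how much that accounting choice moves the answer, here is a toy comparison; the GPU-hours and grid emission factor are placeholder assumptions, not figures from any paper:

```python
# Toy comparison of CO2 estimates at GPU-only TDP vs. whole-datacenter power per GPU.
# GPU-hours and the grid emission factor are placeholder assumptions.

GPU_HOURS = 1_000_000    # hypothetical training run (assumed)
KG_CO2_PER_KWH = 0.4     # rough grid emission factor (assumed)

for label, watts in [("GPU TDP only", 700), ("whole datacenter per H100", 1400)]:
    kwh = GPU_HOURS * watts / 1000
    tonnes = kwh * KG_CO2_PER_KWH / 1000
    print(f"{label:>27}: {kwh:>12,.0f} kWh  ~{tonnes:,.0f} t CO2")

# Counting only the 700 W TDP roughly halves the reported footprint.
```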
I was not aware of these. Thanks!