One important thing, IMO, that isn’t mentioned here is scaling parameter count. Neural nets can be fairly straightforwardly improved simply by making them bigger. For LLMs and AGI there’s plenty of room to scale up, but for the neural nets that run on cars there isn’t. Tesla’s self-driving hardware, for example, has to fit on a single chip and has to consume only a small amount of energy (otherwise it’ll impact the range of the car). They cannot just add an OOM of parameters, much less three.
They cannot just add an OOM of parameters, much less three.
How about 2 OOMs?
HW2.5: 21 Tflops. HW3: 2×72 Tflops, but the duplication is for redundancy, so effectively 72 Tflops. HW4: 3×72 = 216 Tflops (not sure about redundancy there). And Elon said in June that the next-gen AI5 chip for FSD would be about 10× faster, so say ~2 Pflops.
By a rough approximation of brain processing power, you get about 0.1 Pflop per gram of brain, so HW2.5 might have been a 0.2 g baby-mouse brain; HW3, a 1 g baby-rat brain; HW4, perhaps an adult rat; and the upcoming AI5, a 20 g small-cat brain.
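The hardware-to-brain-mass mapping above is simple arithmetic; here’s a quick sketch of it. To be clear, both the flops figures and the 0.1 Pflop/gram density are this comment’s rough assumptions, not official specs:

```python
# Napkin math: map Tesla FSD hardware generations to rough brain-mass
# equivalents, using the ~0.1 Pflop per gram approximation above.
# Flops figures are the approximate effective values quoted in the comment.

PFLOP_PER_GRAM = 0.1  # assumed brain processing-power density

hardware_tflops = {
    "HW2.5": 21,
    "HW3": 72,          # 2x72 total, but duplicated for redundancy
    "HW4": 216,
    "AI5 (est.)": 2000,
}

for name, tflops in hardware_tflops.items():
    grams = (tflops / 1000) / PFLOP_PER_GRAM  # Tflops -> Pflops -> grams
    print(f"{name}: {tflops} Tflops ~ {grams:g} g of brain")
```

This reproduces the rough equivalents in the comment: ~0.2 g for HW2.5, ~0.7–1 g for HW3, ~2 g for HW4, and ~20 g for AI5.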
As a real-world analogue, the cat-to-dog range (25–100 g of brain) seems to me the minimum level of complexity, judging by behavioral capabilities, needed to do a decent job of driving: you need some ability to anticipate and predict the motivations and behavior of other road users, and something beyond dumb reactive handling (i.e. somewhat predictive) to understand the anomalous objects that exist on and around roads.
Nvidia’s Blackwell B200 can do up to about 10 Pflops of FP8, which is getting into the large-dog/wolf brain processing range, and it wouldn’t be unreasonable to package in a self-driving car once it’s down closer to manufacturing cost in a few years, at around 1 kW peak power consumption.
I don’t think the rat-brain HW4 is going to cut it, and I suspect that internally Tesla knows it too, but it’s going to be crazy expensive to own up to it; better to keep kicking the can down the road with promises until they can deliver the real thing. AI5 might just do it, but it wouldn’t be surprising to need a further OOM, up to a Nvidia Blackwell equivalent, and maybe $10k of extra cost to get there.
I agree about it having to fit on a single chip, but surely the on-board neural net would have a relatively negligible impact on range compared to how much the electric motor consumes in motion?
IIRC, in one of Tesla’s talks (I forget which one) they said that the chip’s energy consumption was a constraint because they didn’t want it to reduce the range of the car. A quick google seems to confirm this; 100 W is the limit they cite: FSD Chip—Tesla—WikiChip
IDK anything about engineering, but napkin math based on googling: the FSD chip currently consumes 36 watts. Over the course of 10 hours, that’s 0.36 kWh. A Tesla Model 3 battery holds about 55 kWh total and takes about ten hours of driving to use up (assuming you average 30 mph?). So the FSD chip currently uses about two-thirds of one percent of the total range of the vehicle. If they 10x’d it, then in addition to adding thousands of dollars of upfront cost (bigger chips, or more of them), there would be a ~6.5% range reduction. And if they 10x’d it again, the car would be crippled. This napkin math could be totally confused, tbc.
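The napkin math above can be checked directly. The inputs (36 W chip draw, 55 kWh pack, ~10 hours to drain it) are this comment’s rough googled figures, not measured values:

```python
# Napkin math: what fraction of a Model 3's battery does the FSD computer
# consume over one full discharge of driving? Inputs are rough assumptions.

chip_watts = 36.0     # current FSD chip power draw (approximate)
battery_kwh = 55.0    # Model 3 pack capacity (approximate)
drive_hours = 10.0    # ~10 h of driving to drain the pack at ~30 mph

def range_fraction(watts):
    """Fraction of total pack energy used by the chip over a full drive."""
    return (watts * drive_hours / 1000) / battery_kwh

print(f"current chip: {range_fraction(chip_watts):.2%}")       # ~0.65%
print(f"10x chip:     {range_fraction(10 * chip_watts):.2%}")  # ~6.5%
print(f"100x chip:    {range_fraction(100 * chip_watts):.2%}") # ~65%
```

The 100x case eating roughly two-thirds of the pack is what “crippled” means here.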
(This napkin math is making me think Tesla might be making a strategic mistake by not going for just one more OOM. It would reduce the range and add a lot to the cost of the car, but… maybe it would be enough to add an extra 9 or two of reliability… But it’s definitely not an obvious call and I can totally see why they wouldn’t want to risk it.)
(Maybe the real constraint is the cost of the chips. If each chip currently costs, say, $5,000, then 10x-ing would add $45,000 to the cost of the car...)