Thanks. Your post makes point #3 from my post, and it makes two additional points I’ll call #5 and #6:
#5: Onboard compute for Teslas is tightly limited, which constrains model size, whereas LLMs that live in the cloud don’t have to worry nearly as much about the physical space they take up, the cost of the hardware, or their power consumption.
#6: Self-driving cars don’t get to learn through trial and error and become gradually more reliable, whereas LLMs do.
Re: (5), I wonder why the economics of, say, making a ChatGPT Plus subscription profitable wouldn’t constrain inference compute for GPT-4 just as much as for a Tesla.
Re: (6), Tesla customers acting as safety drivers for the “Full Self-Driving Capability” software seems to contradict this point.
Curious to hear your thoughts.
Nice.
Re (5): That might be plausible a priori, but a posteriori it seems that people are willing to pay for GPT-4 despite it being way bigger and more expensive than a Tesla HW3 or HW4 computer can handle. Moreover, you can make an AI system that is bigger still, train it, and hope that it’ll pay for itself later (this worked for GPT-3 and GPT-4); you physically can’t put a bigger AI on your fleet of Teslas, because the hardware doesn’t support it, and the idea of rebuilding the cars to have 10-100x bigger onboard computers is laughable.
To emphasize point 5 more: think about all the progress in the field of AI that has come from scaling. Think about how dumb GPT-4 would be if it were only the size of GPT-2 (go to r/LocalLLaMA, scroll around, and maybe download some 2B-parameter models to play around with). Scaling is arguably the biggest source of progress in AI these days… and robotaxis are mostly unable to benefit from it. (Well, they can and do scale data, but they can’t scale parameters very much.)
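To put rough numbers on this, here is a back-of-the-envelope sketch. The onboard memory budget and the GPT-4-class parameter count are purely illustrative assumptions (Tesla doesn’t publish the former and OpenAI doesn’t publish the latter), but the arithmetic shows why a cloud model can scale parameters in a way an onboard computer can’t:

```python
# Back-of-envelope: can a model of a given size even fit on onboard hardware?
# All numbers are rough, illustrative assumptions, not Tesla or OpenAI specs.

BYTES_PER_PARAM_FP16 = 2        # fp16/bf16 weights, ignoring activations and KV cache
ONBOARD_MEMORY_BUDGET_GB = 8    # assumed memory available to the driving model

models = {
    "GPT-2-sized (1.5B params)": 1.5e9,
    "small local model (7B params)": 7e9,
    "GPT-4-class (~1T params, assumed)": 1e12,
}

for name, n_params in models.items():
    weight_gb = n_params * BYTES_PER_PARAM_FP16 / 1e9
    verdict = "fits" if weight_gb <= ONBOARD_MEMORY_BUDGET_GB else "does NOT fit"
    print(f"{name}: ~{weight_gb:,.0f} GB of weights -> {verdict} "
          f"in a {ONBOARD_MEMORY_BUDGET_GB} GB budget")
```

With these assumptions, only a GPT-2-scale model squeezes into the onboard budget; a cloud deployment just adds more GPUs instead.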
Re (6): It only partially contradicts the point; IMO the point still stands (though not as strongly as it would without that data!). The data Tesla gets from customers is mostly of the form “AI did X, customer intervened, no crash occurred,” with a smattering of “customer was driving, and a crash occurred.” There are precious few datapoints of “AI did X, causing a crash”; I’m not even sure there are 100 of them. Now, obviously “customer intervened” is a proxy for “AI did something dangerous,” but only a very poor proxy, since customers intervene all the time when they get nervous and the vast majority of interventions are unnecessary. And crashes during customer driving are useful data but still pretty rare, and anyhow learning from others’ mistakes just isn’t as good as learning from your own.
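To illustrate how noisy that proxy is, here’s a toy calculation with made-up rates (none of these numbers come from Tesla data): even if drivers catch most genuinely dangerous behavior, nervous interventions on benign behavior swamp the signal.

```python
# Toy illustration of why "customer intervened" is a noisy label for
# "the AI did something dangerous." All rates below are made-up assumptions.

dangerous_rate = 0.001            # assumed fraction of AI driving that is genuinely dangerous
p_intervene_if_dangerous = 0.9    # assumed: drivers usually catch real danger
p_intervene_if_benign = 0.02      # assumed: nervous drivers also intervene on benign behavior

# Precision: of all interventions, how many actually flag dangerous behavior?
p_intervene = (dangerous_rate * p_intervene_if_dangerous
               + (1 - dangerous_rate) * p_intervene_if_benign)
precision = dangerous_rate * p_intervene_if_dangerous / p_intervene

print(f"Fraction of interventions that flag real danger: {precision:.1%}")
# With these made-up numbers, only ~4% of interventions mark real danger;
# the label is dominated by false positives.
```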
I think the main hope here is to learn to avoid crashes in simulation, and then transfer/generalize to reality. That’s what Tesla is trying, and I think the other companies are too. If it works, then great, data problem solved. But sim-to-real transfer is tricky and not a fully solved problem, I think (though definitely partially solved? It seems to at least somewhat work in a bunch of domains.)
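For concreteness, here is a minimal sketch of domain randomization, one common sim-to-real technique. This is not a claim about Tesla’s or anyone else’s actual pipeline; all parameter names and ranges are invented for illustration.

```python
import random

def sample_sim_params():
    """Randomize simulator physics/sensor parameters for each training episode,
    so the learned policy can't overfit to one idealized simulator configuration."""
    return {
        "road_friction": random.uniform(0.4, 1.0),       # dry asphalt .. near-ice
        "camera_noise_std": random.uniform(0.0, 0.05),
        "sun_angle_deg": random.uniform(0.0, 180.0),
        "other_driver_aggressiveness": random.uniform(0.0, 1.0),
    }

if __name__ == "__main__":
    # In a real setup these parameters would configure the simulator before each
    # rollout; here we just print the kind of variation being injected.
    for episode in range(3):
        print(f"episode {episode}: {sample_sim_params()}")
```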
Overall, I think (5) is my main argument; (6) and (3) are weaker.