One question I sometimes see people asking is, if AGI is so close, where are the self-driving cars? I think the answer is much simpler, and much stupider, than you’d think.
Waymo is operating self-driving robotaxis in SF and a few other select cities, without safety drivers. They use LIDAR, so instead of the cognitive task of driving as a human would solve it, they have substituted the easier task “driving but your eyes are laser rangefinders”.
Tesla also has self-driving, but it isn’t reliable enough to work without close human oversight. Until less than a month ago, they were using 1.2-megapixel black-and-white cameras. So instead of the cognitive task of driving as a human would solve it, they substituted the harder task “driving with a vision impairment and no glasses”.
If my understanding is correct, this means that Tesla’s struggle to get neural nets to drive was probably not a problem with the neural nets, and doesn’t tell us much of anything about the state of AI.
My answer to this is quite different. The paradigm that is currently getting very close to AGI is basically having a single end-to-end trained system with tons of supervised learning.
Self-driving car AI is not actually operating in this current paradigm as far as I can tell, but is operating much more in the previous paradigm of “build lots of special-purpose AI modules and combine them with lots of special-case heuristics”. My sense is that a lot of this is historical momentum, but also a lot of it is that you really want your self-driving AI to be extremely reliable, so training it end-to-end is very scary.
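The contrast between the two paradigms can be sketched as toy code. Everything below is an illustrative stand-in (invented function names, fake detectors), not any real self-driving stack:

```python
# Toy sketch of the two paradigms. All names and logic are illustrative
# stand-ins, not any real system's modules or API.

def detect_lanes(frames):
    # stand-in "special-purpose module": pretend every frame shows 2 lanes
    return [2 for _ in frames]

def detect_objects(frames):
    # stand-in "special-purpose module": pretend one object per frame
    return [1 for _ in frames]

def modular_pipeline(frames):
    """Previous paradigm: separate modules glued by hand-written heuristics."""
    lanes = detect_lanes(frames)
    objects = detect_objects(frames)
    # special-case heuristic: slow down if any frame has more objects than lanes
    if any(o > l for o, l in zip(objects, lanes)):
        return "slow"
    return "cruise"

def end_to_end(frames, policy_net):
    """Current paradigm: one trained function straight from raw frames to controls."""
    return policy_net(frames)

frames = ["frame0", "frame1"]
print(modular_pipeline(frames))                 # output set by hand-coded heuristics
print(end_to_end(frames, lambda f: "cruise"))   # output is whatever the net learned
```

The point of the sketch: in the first function, reliability comes from auditable hand-written rules; in the second, it has to come entirely from the training distribution, which is exactly why going end-to-end is scary.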
I have outstanding bets that human self-driving performance will be achieved when people switch towards a more end-to-end trained approach without tons of custom heuristics and code.
My understanding is that they used to have a lot more special-purpose modules than they do now, but their “occupancy network” architecture has replaced a bunch of them. So they have one big end-to-end network doing most of the vision, which hands a volumetric representation over to a collection of smaller special-purpose modules for path planning. But path planning is the easier part (easier to generate synthetic data for, and easier to detect beforehand that something is going wrong and send a take-over alarm).
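The hand-off described above can be sketched as a toy: one big vision network emits a 3D occupancy grid, and smaller planning modules consume it. The grid shape, cell convention, and function names here are assumptions for illustration, not Tesla’s actual design:

```python
# Hypothetical sketch of the vision -> planning hand-off. Grid dimensions,
# names, and the obstacle are invented for illustration.

def occupancy_network(camera_frames):
    # stand-in for the end-to-end vision net: returns a 4x4x2 grid where
    # grid[x][y][z] == 1 means "this cell of space is occupied"
    grid = [[[0 for _ in range(2)] for _ in range(4)] for _ in range(4)]
    grid[1][2][0] = 1  # pretend the net saw an obstacle at (1, 2, ground level)
    return grid

def path_is_clear(grid, path):
    # one of the smaller special-purpose planning modules: check a candidate
    # path (list of (x, y) ground-level cells) against the occupancy grid
    return all(grid[x][y][0] == 0 for x, y in path)

grid = occupancy_network(["front_cam", "left_cam"])
print(path_is_clear(grid, [(0, 0), (0, 1), (0, 2)]))  # True: route avoids obstacle
print(path_is_clear(grid, [(1, 1), (1, 2)]))          # False: blocked at (1, 2)
```

This also illustrates why the planning side is easier to supervise: `path_is_clear` is a small, inspectable check over an explicit representation, so you can generate synthetic grids to test it and detect failures before they become dangerous.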
(Crossposted with Facebook, Twitter)
That… would be hilarious, if true. Do you think we will see self-driving cars soon, then?