I was thinking about Musk’s self-driving vs. AGI timeline predictions, wondering whether there was a way they might not be as in conflict as they seem. Like, maybe Musk believes AI will happen in such a way that large-scale retooling of the vehicle fleet and sales of new vehicles will still take years. Then it occurred to me that the “easiest” (and quickest and cheapest) way to build myself a self-driving car in Musk’s 2029 could be to buy smart glasses and exoskeleton arms and legs with actuators, then connect them to an AGI via my phone and let it move my limbs around.
So ok, let’s go over the additional features needed to go from right now (or the GPT refresh in summer 2024) to an AGI sufficient to satisfy 5/6 of:
and
Probably the following will do it:
a. A stronger base model
b. Post-deployment learning: when the model produces an output on a task where a right or wrong answer exists, it researches alternate solutions
c. An inner real-time model, called “system 1”, that is in direct control of robotics hardware
d. Robotics I/O to (c)
e. Realtime video perception
f. A physics world simulator. This is what Nvidia’s digital twins do; there are also DeepMind papers, and the suspected way Sora breaks the world into space-time slices probably also amounts to simulation
g. Possibly integrated plugin scaffolding, where the model can code up tools for its own use on earlier tasks and will have continued access to these tools in all future situations
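For (c) and (d), one way the pieces could fit together is a fast inner controller ticking every frame while a slow outer planner replans only occasionally. A minimal sketch, where every class and name is a hypothetical stand-in of mine, not anyone’s actual architecture:

```python
class System2Planner:
    """Slow deliberative model: sets goals; could run off-robot in a data center."""
    def next_goal(self, sensor_frame):
        # Stand-in for a large-model call; returns a canned navigation goal.
        return {"target_pose": (1.0, 0.5)}

class System1Controller:
    """Fast real-time model in direct control of the hardware, item (c)."""
    def act(self, goal, sensor_frame):
        # Proportional step toward the target (stand-in for a learned policy).
        x, y = sensor_frame["pose"]
        tx, ty = goal["target_pose"]
        return {"vx": 0.2 * (tx - x), "vy": 0.2 * (ty - y)}

class FakeRobot:
    """Toy robotics I/O layer, item (d)."""
    def __init__(self):
        self.pose = [0.0, 0.0]
    def read_sensors(self):
        return {"pose": tuple(self.pose)}
    def apply(self, command):
        self.pose[0] += command["vx"]
        self.pose[1] += command["vy"]

def control_loop(planner, controller, robot, steps=100, replan_every=25):
    goal = None
    for step in range(steps):
        frame = robot.read_sensors()           # (e) real-time perception feeds in here
        if goal is None or step % replan_every == 0:
            goal = planner.next_goal(frame)    # slow outer loop
        command = controller.act(goal, frame)  # fast inner loop, every tick
        robot.apply(command)                   # (d) robotics I/O

robot = FakeRobot()
control_loop(System2Planner(), System1Controller(), robot)
print(robot.pose)  # converges near the planner's target, (1.0, 0.5)
```

The point of the split is that only the inner loop has a hard real-time budget; the planner can tolerate data-center round-trip latency.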
The features above should cover the SAT (solved), Montezuma’s Revenge (solved), WinoGrande (solved), the Turing-test silver criterion, general robotic capabilities, the Q&A dataset, and top-1 on the APPS benchmark. The 2-hour adversarial Turing test may sit unsolved long after we have AGI.
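Item (b) could be as simple as a verify-and-remember loop on tasks with checkable answers. A toy sketch, where the candidate generator and checker are made-up stand-ins for much more capable components:

```python
def post_deployment_learn(task, candidates, checker, memory):
    """On a task with a verifiable answer, try alternate solutions until
    one passes the checker, then remember it for all future occurrences."""
    if task in memory:                      # reuse a previously verified solution
        return memory[task]
    for solution in candidates(task):      # research alternate solutions
        if checker(task, solution):
            memory[task] = solution        # consolidate what worked
            return solution
    return None                            # no verified solution found

# Illustrative stand-in task: find an integer square root by trial.
def candidates(task):
    return range(task + 1)

def checker(task, solution):
    return solution * solution == task

memory = {}
print(post_deployment_learn(9, candidates, checker, memory))  # 3
print(memory)  # {9: 3}
```

The second time the same task appears, the verified solution comes straight from memory, which is the “continued access” property item (g) also relies on.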
So get this: you can do all of the above. In human-free environments you can rapidly deploy robots, and rapidly rebuild the infrastructure to support this. Robotic self-replication is now mostly automated.
But you still don’t have a driver’s license: achieving the above requires racks of power-hungry GPUs mounted in data centers (so no mobile vehicles, except wireless robots with a high-quality access-point connection), and there are still edge cases where an autonomous vehicle will run someone over.
We could see vast automated factories without a single human employee even allowed to enter the building, while human truck drivers are still at the loading docks.
A surprising outcome; I have been a fan of self-driving for a decade.
achieving the above requires racks of power-hungry GPUs mounted in data centers
Inference with models trained for ternary quantization (which uses massively fewer multiplications and therefore less power) doesn’t significantly lose quality compared to full precision; it only needs hardware that can take advantage of it. Though I don’t know whether there is a good RNN-like block that enables large context while still mostly avoiding multiplications with ternary weights (as opposed to activations, which need to be more precise), and that seems crucial for video. A more pressing issue might be latency.
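To make the “fewer multiplications” point concrete, here is a toy sketch of mine (not taken from any particular paper) of a matrix-vector product where every weight is constrained to {-1, 0, +1}: each multiply collapses into an add, a subtract, or a skip, which is what dedicated hardware could exploit.

```python
def ternary_matvec(weights, x):
    """y = W @ x where every weight is -1, 0, or +1.

    No weight multiplications: each output element is built purely from
    additions and subtractions of the (higher-precision) activations.
    """
    y = []
    for row in weights:
        acc = 0.0
        for w, xi in zip(row, x):
            if w == 1:
                acc += xi      # add instead of multiply
            elif w == -1:
                acc -= xi      # subtract instead of multiply
            # w == 0: skip the activation entirely
        y.append(acc)
    return y

W = [[1, 0, -1],
     [-1, 1, 1]]
x = [0.5, 2.0, -1.5]
print(ternary_matvec(W, x))  # [2.0, 0.0], same result as a full matmul
```

On real accelerators the win comes from replacing multiplier circuits with adders and from the 1.58-bit weight storage, not from Python-level branching, but the arithmetic structure is the same.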
You’re right. I’ve also worked on the hardware for the pre-transformer SOTA in self-driving (2020–2022). We had enough DDR for smaller models, not 50B+ like https://robotics-transformer-x.github.io/ . Quantization support was int8 and fp16.
I’m thinking there are two factors working against self-driving that won’t apply to other domains:
1. Stationary hardware in a data center gets utilized better, will be faster than the embedded boards mounted in a vehicle, and gets upgraded more often. (The robots can pause operations to handle higher-priority requests; a factory doesn’t have to wait on people to summon an SDC taxi but can work 24/7.)
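A back-of-envelope illustration of that utilization gap; every number here is made up purely for illustration:

```python
# Hypothetical numbers, purely illustrative.
hours_per_week = 24 * 7

# A personal vehicle might drive ~10 hours/week, so its embedded board
# does useful work roughly 6% of the time.
vehicle_drive_hours = 10
vehicle_utilization = vehicle_drive_hours / hours_per_week

# A data-center accelerator time-shared across many robots in a 24/7
# factory can plausibly stay busy ~90% of the time.
datacenter_utilization = 0.90

print(f"vehicle board utilization:   {vehicle_utilization:.1%}")
print(f"data-center GPU utilization: {datacenter_utilization:.1%}")
print(f"ratio: {datacenter_utilization / vehicle_utilization:.0f}x")  # ~15x
```

Even with generous assumptions for the vehicle, the shared stationary hardware gets an order of magnitude more useful work per dollar of silicon.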
2. Liability/legal delays. What I was thinking is that if transformer models really do scale to general-purpose robots with human-like manipulation and generality, you could use them without any delays in human-free areas. I don’t think there are many legal requirements blocking this. Initially this would be in caged-off areas.
Later you would just set the safety switches at the doors and make all human workers leave the building before enabling the robots. You could restock and clean retail stores, for example: make all the customers and workers leave before letting the robots go to work.
Ideally the area the robots are in would count as an “appliance” and thus avoid building codes as well. I don’t know precisely how this works; I just don’t think Samsung has to cater to the (slightly different from everywhere else) requirements of Beaumont, Texas to sell a microwave at a store there, while anyone wanting to build a shed or factory does.
That seems entirely plausible to me.
I’d add that in many cases it might be much easier to make long-distance trucks self-driving than local distribution. My dad was a wholesaler to restaurants; I rode in the trucks to make deliveries sometimes, and I can tell you that making a self-driving truck work bringing food to pizzerias in NYC requires a willingness to double-park and get tickets, plus human-level maneuvering to carry stuff into weirdly shaped storage rooms. So even when we do have automated trucks, there will likely be humans carrying stuff on and off them for a while.