Mh. I do appreciate the correction, and you do seem to have knowledge here that I do not, but I am not convinced.
Right now, chatbots can perform at levels comparable to humans on writing-related tasks that humans actually do. Sure, they hallucinate, they get confused, their spatial reasoning is weak, their theory of mind is weak, etc., but they pass exams with decent grades, write essays that get into newspapers and universities and magazines, pass the Turing test, write a cover letter and correct a CV, etc. Your mileage will vary with whether they outperform a human or act like a pretty shitty human who is transparently an AI, but they are doing comparable things. And notably, the same system is doing all of these things: writing dialogues, writing code, giving advice, generating news articles.
Can you show me a robot that is capable of playing in a football and basketball match? And then dancing a tango with a partner in a crowded room? I am not saying perfectly. It is welcome to be a shitty player, who sometimes trips or misses the ball. 80% accuracy, if you like. Our chatbots can be beaten by 9-year-old kids at some tasks, so fair enough, let the robot play football and dance with nine-year-olds, compete with nine-year-olds. But I want it running, bipedal, across a rough field, kicking a ball into the goal (or at least the approximate direction, like a kid would) with one of the two legs it is running on, while evading players who are trying to snatch the ball away, and without causing anyone severe injury. I want the same robot responding to pressure cues from the dance partner, navigating them around other dancing couples, to the rhythm of the music, holding them enough to give them support without holding them so hard that it causes injury. I want the same robot walking into a novel building, and helping with tidying up and cleaning it, identifying stains and chemical bottles, selecting cleaning tools and scrubbing hard enough to get the dirt off without damaging the underlying material, then coating the surface evenly with disinfectant. Correct me if I am wrong; there is so much cool stuff happening in this field so quickly, and a lot of it is simply not remotely my area of expertise. But I am under the impression that we do not have robots that are remotely capable of this.
This is the crazy shit that sensory-motor coordination does. Holding objects hard enough that they do not slip, but without crushing them. Catching flying projectiles, and throwing them at targets, even though they are novel projectiles we have never handled before, and even when the targets are moving. Keeping our balance while bipedal, on uneven and moving ground, and while balancing heavy objects or supporting another person. Staying standing when someone is actively trying to trip you. Entering a novel, messy space, getting oriented, identifying its contents, even if it contains objects we have never seen in this form. Balancing on one leg. Chasing someone through the jungle. I am familiar with projects that have targeted these problems in isolation—heck, I saw the first robot that was capable of playing Jenga, like… nearly two decades ago? But all of this shit in coordination, within a shifting and novel environment?
In comparison, deploying a robot on a clearly marked road with clearly repeating signs, or in the air, is choosing ridiculously easy problems. Akin to programming software that does not have flexible conversations with you, but is capable of responding to a fixed set of specific prompts with specific responses, and clustering all other prompts into the existing categories or an error.
Part of it is not the difficulty of the task: many of the tasks you give as examples require very expensive hand-built (ironically) robotics hardware to even attempt. There are mere hundreds of instances of that hardware, and they cost hundreds of thousands of dollars each.
There is insufficient scale. Think of all the AI hype and weak results before labs had clusters of 2048 A100s and trillion-token text databases. Scale counts for everything. If in 1880, chemists had figured out how to release energy through fission, but didn’t have enough equipment and money to get weapons-grade fissionables until 1944, imagine how bored we would have been with nuclear bomb hype. Nature does not care if you know the answer, only that you have more than a kilogram of refined fissionables, or nothing interesting will happen.
The thing about your examples is that machines are trivially superhuman in all those tasks. Sure, not at the full set combined, but that’s from lack of trying: nobody has built anything with the necessary scale.
I am sure you have seen the demonstrations of a ball bearing on a rail and an electric motor keeping it balanced, or a double pendulum stabilized by a robot, or quadcopters remaining in flight with one propeller clipped, using a control algorithm that dynamically adjusts flight after the damage.
All easy RL problems, all completely impossible for human beings. (we react too slowly)
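To make the reaction-time point concrete, here is a minimal sketch of a balancing task where the only thing that changes is how stale the controller's measurements are. It is not RL and not taken from any of the demonstrations above: it uses a hand-tuned PD controller on a simulated inverted pendulum, and every number in it (the 10 cm stick, the gains, the 1 ms "machine" delay, the roughly 200 ms "human" delay) is an assumption chosen for illustration.

```python
import math
from collections import deque

def stays_upright(delay_s, kp=400.0, kd=40.0, length_m=0.1,
                  dt=0.001, duration_s=5.0, theta0=0.02):
    """Simulate an inverted pendulum under PD control that only sees
    measurements that are delay_s seconds old.

    Returns True if the pendulum never tips past 0.5 rad.
    All parameters are illustrative assumptions, not measured values.
    """
    g = 9.81
    delay_steps = int(round(delay_s / dt))
    # Queue of past (angle, angular velocity) readings. Until the first
    # real measurement arrives, the controller sees zeros.
    history = deque([(0.0, 0.0)] * delay_steps)
    theta, omega = theta0, 0.0
    for _ in range(int(duration_s / dt)):
        history.append((theta, omega))
        seen_theta, seen_omega = history.popleft()
        # PD control acting on the delayed measurement.
        u = -kp * seen_theta - kd * seen_omega
        # Gravity pushes the stick over; u pushes it back.
        alpha = (g / length_m) * math.sin(theta) + u
        omega += alpha * dt
        theta += omega * dt
        if abs(theta) > 0.5:
            return False  # tipped over
    return True

print("1 ms delay:", stays_upright(0.001))
print("200 ms delay:", stays_upright(0.2))
```

With the same gains, the controller holds the stick when it reacts within a millisecond and loses it when it reacts at an assumed human-like delay. A short stick falls on a timescale of about a tenth of a second, so feedback that arrives 200 ms late is acting on a state that no longer exists.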
The majority of what you mention are straightforward reinforcement learning problems and solvable with a general method. Most robotics manipulation tasks fall into this space.
Note that there is no economic incentive to solve many of the tasks you mention, so they won’t be. But general manufacturing robotics, where you can empty a bin of random parts in front of the machine(s), and they assemble as many fully built products of the design you provided as the parts pile allows? Very solvable, and the recent Google AI papers show it’s relatively easy. (I say easy because the solutions are not very complex in source code, and relatively small numbers of people are working on them.)
I assume at least for now, everyone will use nice precise industrial robot arms and overhead cameras and lidars mounted in optimal places to view the work space—there is no economic benefit to ‘embodiment’ or a robot janitor entering a building like you describe. Dancing with a partner is too risky.
But it’s not a problem of motion control or sensing; machinery is superhuman in all these ways. It’s a waste of components and compute to give a machine 2 legs or that many degrees of freedom (DOF). Nobody is going to do that for a while.
3 days later...
https://palm-e.github.io/ https://www.lesswrong.com/posts/sMZRKnwZDDy2sAX7K/google-s-palm-e-an-embodied-multimodal-language-model
From the paper: “Data efficiency. Compared to available massive language or vision-language datasets, robotics data is significantly less abundant”
As I was saying, the reason robotics wasn’t as successful as the other tasks is scale, and Google seems to hold this opinion.