A model/ensemble of models will achieve >90% on the MATH dataset using a no-calculator rule
A “no calculator rule”. If the model is just a giant neural network, it is pretty clear what this means (although it is unclear why you should care, since real-world neural nets are allowed to use calculators). Over the general space of all AI techniques, it's unclear what this means.
A robot that can, from beginning to end, reliably wash dishes, take them out of an ordinary dishwasher and stack them into a cabinet, without breaking any dishes, and at a comparable speed to humans (<120% the average time)
This sounds to me like it depends on the robotics hardware at least as much as it depends on the software.
It also has a lot of wiggle room. Suppose your robot only works under specific lighting conditions, with a specific design of dishwasher. All the cutlery and crockery has been laser scanned in advance. All the cutlery has an RFID chip glued to it. The robot doesn’t have legs, so it only works if the cabinet is within easy reach of the dishwasher. In an extreme case, imagine a large machine that consists of all sorts of vibrating ridged trays and spinning rubber cones. The metal cutlery is pulled out using a big electromagnet. At one point, different dishes are separated using a vertical wind tunnel. The equipment is too big to fit in a typical kitchen. It is almost entirely dumb. The mechanism is nothing like a human manually putting away dishes. Yet thanks to plentiful foam padding, this machine can put away even fragile dishes without breaking them.
(Cranberries aren’t harvested by careful robot arms that visually spot each berry. Growers flood the field with water, thrash the plants to knock the berries off, wait for the berries to float, and then skim them off.)
There is also the problem that maybe no one cares about that particular metric.
You could get a world where the latest AI techniques are easily good enough to do your dishes task, but nobody has actually bothered to do it yet. The top AI experts could program this dish robot in a weekend, but they don’t, either because they are busy or because, given the current state of public perception of AI, a photo of a robot holding a large knife would be a PR nightmare.