This is interesting, but I’m a bit stuck on the claim that there is already cat-level AI (and more generally, AI matching various animals). In my experience with cats, they are fairly dumb, but they seem to have the sort of general intelligence we have, just a lot less. My intuition is that no AI has yet achieved that generality.
For example, some cats can, with great patience from the trainer, learn to recognize commands and perform tricks, much like dogs (though with greater training difficulty). VPT can’t do that. In some sense, I’m not even sure what it would mean for VPT to be able to do that, since it doesn’t interact with the world in that way.
If you read the VPT post/paper, to get to diamond-crafting they actually used reinforcement learning on top of the behavior cloning, which is much more similar to how a cat is trained through rewards. I think it’s pretty clear that a cat could not be trained to play Minecraft to diamond level. Of course that’s not really a fair comparison, but let’s make it more fair...
Imagine if the cat brain were more directly wired into a VR Minecraft, so that its neural commands for looking around, moving, etc., were all translated into game inputs. Do you think that through reward-training we could get cats to diamond level? I doubt it: we are talking about a long, chained sequence of sub-tasks.
Now imagine the other way—as in this comment—with VPT adapted to a cat-sim. Could the VPT approach (behavior cloning + RL) learn to do well at a cat-sim game? I think so.
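To make “behavior cloning + RL” concrete, here’s a minimal sketch of the two-stage recipe. Everything in it is illustrative (the toy policy, the dimensions, and plain REINFORCE standing in for the actual fine-tuning), not the real VPT pipeline, which uses a large video-pretrained model and PPO:

```python
# Minimal sketch of the two-stage recipe: behavior cloning on demonstrations,
# then RL fine-tuning of the same policy on a reward signal. The toy policy,
# dimensions, and use of plain REINFORCE are illustrative assumptions.
import torch
import torch.nn as nn

OBS_DIM, N_ACTIONS = 32, 8
policy = nn.Sequential(nn.Linear(OBS_DIM, 64), nn.Tanh(), nn.Linear(64, N_ACTIONS))
opt = torch.optim.Adam(policy.parameters(), lr=1e-3)

def bc_step(obs, demo_actions):
    # Stage 1: supervised learning on (observation, action) pairs from demos.
    loss = nn.functional.cross_entropy(policy(obs), demo_actions)
    opt.zero_grad(); loss.backward(); opt.step()
    return loss.item()

def rl_step(obs, taken_actions, returns):
    # Stage 2: REINFORCE on logged on-policy rollouts -- increase the
    # log-probability of actions in proportion to the return they earned.
    dist = torch.distributions.Categorical(logits=policy(obs))
    loss = -(dist.log_prob(taken_actions) * returns).mean()
    opt.zero_grad(); loss.backward(); opt.step()
    return loss.item()

# Random tensors stand in for demonstrations and rollouts.
bc_step(torch.randn(64, OBS_DIM), torch.randint(0, N_ACTIONS, (64,)))
rl_step(torch.randn(64, OBS_DIM), torch.randint(0, N_ACTIONS, (64,)), torch.randn(64))
```

The point of the structure is that the RL stage starts from the BC-initialized policy rather than from scratch, which is what makes the long chain of sub-tasks tractable.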
I agree the cat brain is more general and more computationally efficient, and that we have more to learn from it, but I think it’s far from clear that the cat is more capable than VPT in the sense that matters for AGI.
I agree that VPT is (very plausibly) better at playing Minecraft than a trained cat would be, but to me that only demonstrates narrow intelligence (though, to be clear, farther along the narrow-to-general spectrum than AI used to be). LLMs seem like the clearest demonstration of generality so far, partly because of their strength at few-shot and zero-shot learning, but their abilities are so qualitatively different from animal abilities that it’s hard to compare.
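For what I mean by zero-shot vs few-shot, a toy illustration (the prompts are made up, not drawn from any particular benchmark):

```python
# Toy illustration of the zero-shot vs few-shot distinction: same task,
# but the few-shot prompt includes worked examples in-context.
zero_shot = "Translate English to French: 'The cat sleeps.' ->"

few_shot = (
    "Translate English to French.\n"
    "'Good morning.' -> 'Bonjour.'\n"
    "'Thank you.' -> 'Merci.'\n"
    "'The cat sleeps.' ->"
)
```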
A cat-sim sounds like a really interesting idea. In some ways it’s actually unfair to the AI, because cats are benefiting from instincts that the AI wouldn’t have, so if an AI did perform well at it, that would be very impressive.
After some of this feedback I’ve extended the initial comparison section to focus more on the well-grounded BNN vs ANN comparisons, where we can compare the two systems directly at the level of their functional computations and say “yes, X is mostly just computing a better version of Y.”
So you can compare ANN vision systems against the relevant subset of animal visual cortex that computes classification (or other relevant tasks), or compare linguistic cortex neural outputs against LLMs, and the results of those experiments are, in my opinion, fairly decisive against any idea that brains are mysteriously superior. The ANNs are clearly computing the same things in the same ways, and even training on the same objective now (self-supervised prediction), just predictably better when they have more data/compute.
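As a simplified sketch of the kind of comparison I mean: fit a cross-validated linear map from ANN activations to recorded neural responses and score held-out predictivity. The data below is synthetic; in the actual studies X would be model activations on the stimuli and Y would be visual-cortex or linguistic-cortex recordings:

```python
# Hedged sketch of a neural-predictivity comparison: closed-form ridge
# regression from ANN features to "neural" responses, scored on held-out
# stimuli. All data here is synthetic stand-in data.
import numpy as np

rng = np.random.default_rng(0)
n_stimuli, n_features, n_neurons = 200, 50, 10

X = rng.normal(size=(n_stimuli, n_features))            # ANN activations per stimulus
W_true = rng.normal(size=(n_features, n_neurons))
Y = X @ W_true + 0.5 * rng.normal(size=(n_stimuli, n_neurons))  # simulated recordings

def ridge_fit(X, Y, lam=1.0):
    # Closed-form ridge: W = (X^T X + lam I)^{-1} X^T Y
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ Y)

# Train/test split; predictivity = per-neuron correlation on held-out stimuli.
train, test = slice(0, 150), slice(150, None)
W = ridge_fit(X[train], Y[train])
pred = X[test] @ W
r = [np.corrcoef(pred[:, i], Y[test][:, i])[0, 1] for i in range(n_neurons)]
print(f"median held-out predictivity r = {np.median(r):.2f}")
```

When a model’s features linearly predict held-out neural responses about as well as the recordings’ own noise ceiling allows, that’s the sense in which “X is mostly just computing a better version of Y.”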