Computer vision is just scanning for high-probability matches between a region of the image and a set of tokenized segments, each with an assigned label. There is no conceptual understanding of the objects or actions in an image, no internal representation, and no expectation of what should “be there” a moment later. Nor is there any form of attention to drive focus toward an area of interest.
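To make the "scanning for matches" picture concrete, here is a minimal sketch. It assumes the labeled segments are small grayscale templates and that the match score is a normalized correlation; the function name `label_matches`, the template dictionary, and the 0.9 threshold are all hypothetical choices for illustration, not any particular system's method.

```python
import math


def _centered(values):
    """Subtract the mean so the score ignores overall brightness."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def label_matches(image, templates, threshold=0.9):
    """Slide each labeled template over the image and return
    (label, row, col, score) for windows scoring at or above threshold.

    image: 2D list of numbers; templates: dict of label -> 2D list.
    All names and the default threshold are illustrative assumptions.
    """
    matches = []
    rows, cols = len(image), len(image[0])
    for label, tmpl in templates.items():
        th, tw = len(tmpl), len(tmpl[0])
        t = _centered([v for row in tmpl for v in row])
        t_norm = math.sqrt(sum(v * v for v in t))
        for y in range(rows - th + 1):
            for x in range(cols - tw + 1):
                window = [image[y + dy][x + dx]
                          for dy in range(th) for dx in range(tw)]
                w = _centered(window)
                denom = math.sqrt(sum(v * v for v in w)) * t_norm
                if denom == 0:
                    continue  # flat window or flat template: no usable score
                score = sum(a * b for a, b in zip(w, t)) / denom
                if score >= threshold:
                    matches.append((label, y, x, score))
    return matches


# Toy image containing one diagonal pattern with its top-left corner at row 2, col 3.
image = [[0] * 6 for _ in range(6)]
image[2][3] = 1
image[3][4] = 1
templates = {"diagonal": [[1, 0], [0, 1]]}
print(label_matches(image, templates))
```

Note what the sketch leaves out: it keeps no state between frames, predicts nothing about the next one, and looks at every window with equal effort, which is the gap described above.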
Canned performances and human control just off camera give the false impression of animal behaviors in what we see today, but there has been little progress in behavior-driven research since the mid-1980s. Learning to play a video game with only 20 hours of real-time play would be a better measure than trying to understand (and match) animal minds, though good research in the direction of human-level intelligence will absolutely include that.