If brute force is more effective… then it’s better. There’s probably a post in the sequences about this. Do you care about machine vision, or do you care about fancy algorithms? If you care about machine vision, then you should want the most effective approach (the one that is best at “winning”), whatever its nature. There is no such thing as “cheating” in pursuit of such a goal. On the other hand, if you care about fancy algorithms (which is a legitimate thing to care about!), then brute force is by definition uninteresting.
I don’t think the problem is quite that easy to frame. In applications where I think the brute force SVM approach is fundamentally the more correct way to model and attack the problem, I’m all for its use. At the same time, though, I don’t care at all about “fancy algorithms.” What I think is that the landscape of modern research is far too risk-averse (there is a book coming out soon by Peter Thiel and Garry Kasparov, of all people, on this very topic: that human ingenuity and insight have actually slowed in the last few decades despite advances in computing technology).
I think that to solve hard machine vision problems, like perception and cognition, you have to depart from the machine learning paradigm. Machine learning can be a low-level tool for feature extraction and selection, which plays a role, but it is far from effective for higher-end problems. For example, I work on activity and behavior understanding. On any timescale above about 1–2 seconds, humans consistently recognize activities by projecting protagonist-style goal structures onto their observations (e.g. “the cue ball is trying to hit the black ball into that pocket”; the cue ball becomes an agent with volition instead of merely a pawn in some sort of physics experiment). Currently, machine vision researchers approach the task of activity understanding the way they approach everything else: just train a giant SVM (or some other kernel classifier, sparse coding, dictionary methods, and so on). The performance is state of the art for specific applications, but we don’t actually learn anything about how to perceive activities; it offers no insight whatsoever into the cognitive structure underlying the task of perception. There are obviously many controversial opinions about a question like this, but my perspective is that probabilistic reasoning and graphical models are a much better way to work on this sort of problem, and even those methods need to be researched and extended to a much more theoretically mature level than where they currently are.
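To make the contrast concrete, here is a minimal sketch of the pipeline I have in mind, written with scikit-learn. The feature vectors and labels are synthetic placeholders standing in for whatever spatio-temporal descriptors a real system would extract; this is not any particular published method, just the shape of the standard approach.

```python
# Sketch of the standard "extract features, train a big kernel classifier" recipe
# for activity recognition. All data here is synthetic placeholder material.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Pretend each video clip has already been reduced to a 500-dimensional
# feature vector and labeled with one of five activity classes.
X = rng.normal(size=(1000, 500))
y = rng.integers(0, 5, size=1000)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# The "giant SVM": a kernel classifier fit directly on the feature vectors,
# with no model of goals, agents, or anything else about the scene.
clf = SVC(kernel="rbf", C=10.0, gamma="scale")
clf.fit(X_train, y_train)

print("held-out accuracy:", clf.score(X_test, y_test))
```

Benchmark numbers come out of exactly this kind of loop, and nothing in it says anything about goals, agents, or the structure of the activity being recognized.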
But no one (and I mean no one) will pay you money to do that style of research. Unless you produce a widget that performs task X for company Y at state-of-the-art levels and can deliver it very soon, you get no funding and you get shunned by the major computer vision venues (ECCV, ICCV, CVPR, NIPS, SIGGRAPH). To get your work published and to win grants, you have to market it primarily on how well you can promise to deliver pretty pictures in the short term. Machine learning is much better at this than advanced mathematical abstractions, and so advanced mathematical abstraction approaches to vision (which will benefit us all very much further into the future) are not being funded by risk-averse funding sources.
Nobody demands that machine vision succeed at preposterously difficult tasks anymore. Most consumers of this sort of research take the view that computing power is now sufficient and that, as far as commercial applications go, we just need to hammer away at efficiency and performance in very controlled, specific settings. Being a “general vision theorist” no longer has any place, in academia or in corporate research.
This is the root of my specific issue. My goal is to study difficult problems in computer vision that cannot be well solved within the machine learning paradigm. I believe these need advanced theory to be well solved, theory which doesn’t exist yet. As with the research that preceded the A.I. winter, though, someone would have to fund me without knowing whether the end result will benefit them more than their competitors, or whether there will be any commercial benefit at all. My goal is to understand perception better, and only later to start worrying about what that better understanding will do for me.
Also, can you be more specific about how this works at Wolfram Research? I frequently attend their career presentations here at my university, then try to talk to the technical representatives afterward and learn about specific opportunities. It doesn’t seem to be related to the kind of tool-creating research I’m talking about. In fact, I think that proprietary programming platforms like Matlab, Mathematica, and Maple are a large part of the problem in some respects. These tools teach engineers to be bad programmers and to care more about hacks that produce pretty pictures than about genuinely insightful methods that solve a problem in a way that yields real knowledge.
When I mentioned “creating tools that go in the toolbox of an engineer,” what I meant was inventing new techniques in, say, the calculus of variations, or new strategies for simulated annealing and Markov random field analysis. I mean theoretical tools that offer fundamentally more insightful ways to do modeling for engineering problems. I did not mean that I want to create software tools that make easier interfaces for doing machine learning, and so on.
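For a sense of the level of machinery I mean, here is a bare-bones, textbook-style sketch of simulated annealing; the objective function and proposal distribution are toy placeholders. The basic loop is the part I think deserves deeper theory, not a friendlier software wrapper.

```python
# Textbook simulated annealing on a toy 1-D objective with many local minima.
# The objective, proposal, and cooling schedule are all placeholder choices.
import math
import random

def energy(x):
    # Toy objective: a parabola with a sinusoidal ripple added on.
    return x * x + 10.0 * math.sin(3.0 * x)

def simulated_annealing(x0, n_steps=10000, t_start=5.0, t_end=1e-3):
    x, e = x0, energy(x0)
    best_x, best_e = x, e
    for step in range(n_steps):
        # Geometric cooling schedule from t_start down toward t_end.
        t = t_start * (t_end / t_start) ** (step / n_steps)
        # Propose a small Gaussian move around the current state.
        x_new = x + random.gauss(0.0, 0.5)
        e_new = energy(x_new)
        # Always accept downhill moves; accept uphill moves with
        # Boltzmann probability exp(-(delta E) / t).
        if e_new < e or random.random() < math.exp(-(e_new - e) / t):
            x, e = x_new, e_new
            if e < best_e:
                best_x, best_e = x, e
    return best_x, best_e

print(simulated_annealing(x0=8.0))
```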
Machine learning is much better at this than advanced mathematical abstractions, and so advanced mathematical abstraction approaches to vision (which will benefit us all very much further into the future) are not being funded by risk-averse funding sources.
Okay, that’s a “local maximum” objection.
Also, can you be more specific about how this works at Wolfram Research?
I don’t work there. Mathematica was just the first example that came to my mind of something that might be used in the sciences.
I mean theoretical tools that offer fundamentally more insightful ways to do modeling for engineering problems.
Oh, I was confused by your terminology. When I hear “tools”, I think “software”.