The issue with this argument is that the architectures and techniques that are best at “human-like” data processing are now turning out to be very good at “inhuman” data processing. Some examples:
TABERT is a BERT-like transformer that interprets tabular data as a sequence of language tokens
Weather prediction (specific example)
Protein structure prediction (admittedly, humans are surprisingly good at this, but AlphaFold is better)
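To make the TABERT point concrete, here's a minimal sketch of the underlying idea: flattening a table row into a token sequence that a BERT-like model could consume alongside natural-language text. This is illustrative only; the function name, separator convention, and cell format are my own, not TABERT's actual preprocessing API.

```python
def linearize_row(headers, row):
    """Flatten one table row into "header | value" cells joined by [SEP],
    so a language-model tokenizer can treat the table as a token sequence."""
    cells = [f"{h} | {v}" for h, v in zip(headers, row)]
    return " [SEP] ".join(cells)

headers = ["gene", "fold_change", "p_value"]
row = ["TP53", "2.4", "0.001"]
print(linearize_row(headers, row))
# gene | TP53 [SEP] fold_change | 2.4 [SEP] p_value | 0.001
```

The real model also learns joint attention between the linearized table and an accompanying natural-language utterance, but the "table as token sequence" trick is the core of why a language architecture transfers to tabular data.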
Also, this paper shows that deep learning’s relative weakness on tabular data can be overcome with careful choice of regularization.
Protein folding problems are cool and all, but as a computational biologist, I do think people underrate simpler ML models on a variety of problems. Researchers have to make a lot of decisions—such as which hits to follow up on in a high throughput screen. The data to inform these kinds of tasks can be expressed as tabular data and decision trees can perform quite well!
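As a sketch of what that looks like in practice, here is a tree ensemble ranking hits from a synthetic screen. The features (assay z-score, replicate correlation, cytotoxicity flag) and the labeling rule are invented for illustration; a real screen would use domain-specific features and experimentally confirmed labels.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 500
# Synthetic tabular screening data: one row per compound.
X = np.column_stack([
    rng.normal(0, 1, n),     # assay z-score
    rng.uniform(0, 1, n),    # replicate correlation
    rng.integers(0, 2, n),   # cytotoxicity flag (1 = toxic)
])
# Toy ground truth: a "true hit" has a strong z-score and is not cytotoxic.
y = ((X[:, 0] > 1.0) & (X[:, 2] == 0)).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
clf = GradientBoostingClassifier(random_state=0).fit(X_tr, y_tr)
print(f"held-out accuracy: {clf.score(X_te, y_te):.2f}")

# Rank unscreened compounds by predicted hit probability for follow-up.
follow_up_order = np.argsort(-clf.predict_proba(X_te)[:, 1])
```

The appeal here is practical: trees handle mixed-scale tabular features without normalization, train in seconds, and the predicted probabilities give a natural priority ordering for which hits to follow up.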
Got any examples of this being used? I’m always on the lookout for these kinds of use cases.
I don’t disagree; as I said before, I’m focused on problem type, not method.
The fact that human-mimicking problems have loads of cheap training data and can lead to interesting architectures is something I hadn’t considered, and it does make them more worthwhile.