I think your response shows I understood it pretty well. I used an example that you directly admit is against what the bitter lesson tries to teach as my primary example. I also never said anything about being able to program something directly better.
I pointed out that I used the things people decided to let go of so that I could improve the results massively over the current state of the machine translation for my own uses, and then implied we should do things like give language models dictionaries and information about parts of speech that it can use as a reference or starting point. We can still use things as an improvement over pure deep learning, by simply letting the machine use them as a reference. It would have to be trained to do so, of course, but that seems relatively easy.
The bitter lesson is about ‘scale is everything,’ but AlphaGo and its follow-ups use massively less compute to get up to those levels! Their search is not an exhaustive one, but a heuristic one that requires very little compute comparatively. Heuristic searches are less general, not more. It should be noted that I only mentioned AlphaGo to show that even it wasn’t a victory of scale like some people commonly seem to believe. It involved taking advantage of the fact that we know the structure of the game to give it a leg up.
I think your response shows I understood it pretty well. I used an example that you directly admit is against what the bitter lesson tries to teach as my primary example. I also never said anything about being able to program something directly better.
I pointed out that I used the things people decided to let go of so that I could improve the results massively over the current state of the machine translation for my own uses, and then implied we should do things like give language models dictionaries and information about parts of speech that it can use as a reference or starting point. We can still use things as an improvement over pure deep learning, by simply letting the machine use them as a reference. It would have to be trained to do so, of course, but that seems relatively easy.
The bitter lesson is about ‘scale is everything,’ but AlphaGo and its follow-ups use massively less compute to get up to those levels! Their search is not an exhaustive one, but a heuristic one that requires very little compute comparatively. Heuristic searches are less general, not more. It should be noted that I only mentioned AlphaGo to show that even it wasn’t a victory of scale like some people commonly seem to believe. It involved taking advantage of the fact that we know the structure of the game to give it a leg up.