I agree with what Gwern said about things being behind-the-scenes, but it’s also worth noting that there are many impactful consumer technologies that use DL. In fact, some of the things that you don’t think exist actually do exist!
Google Translate: https://www.washingtonpost.com/news/innovations/wp/2016/10/03/google-translate-is-getting-really-really-accurate/
Google Search: https://blog.google/products/search/search-language-understanding-bert/
Photoshop: https://blog.adobe.com/en/publish/2020/10/20/photoshop-the-worlds-most-advanced-ai-application-for-creatives
Examples of other DL-powered consumer applications:
Grammarly: https://www.grammarly.com/blog/how-grammarly-uses-ai/
Apple FaceID: https://support.apple.com/en-us/HT208108
JP Morgan Chase: https://www.jpmorgan.com/technology/applied-ai-and-ml
Google search gets less usable every year, even for Scholar, which has a much less adversarial search space. It’s better for very common searches like popular TV shows, but approaching worthlessness for long-tail stuff. Maybe this is just “search is hard”, but improving the common case at the cost of the long tail is exactly what I’d expect AI search to do.
I wonder how we’d go about designing a reward signal for the long-tailed stuff.
One thing I’d really like to see is reward for diversity of results. Bringing me the same listicle with slight rewrites 10 times provides no value while pushing out better results.
A friend of mine doing an ML PhD claims it’s possible to train a search engine to identify the shitty pages that read as though GPT-3 wrote them, whether or not it literally did. I’m skeptical this can be done in a way that keeps up with the adversarial adaptation, but it would be cool if it did.
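To make the idea concrete, here’s a toy sketch of what “training a detector” could mean at its absolute simplest: a logistic regression over two hand-rolled features (density of listicle cue words, and repetitiveness of vocabulary). The features, cue list, and training loop are all my own invented stand-ins, not anyone’s actual system — a real detector would need vastly richer signals, and this is exactly the kind of shallow feature an adversary would route around:

```python
import math
import re

# Hypothetical cue words typical of content-farm listicles (illustrative only).
CUE_WORDS = {"top", "best", "amazing", "ultimate", "must-have"}

def features(text):
    """Two toy features: cue-word density and vocabulary repetitiveness."""
    words = re.findall(r"[a-z'-]+", text.lower())
    if not words:
        return [0.0, 0.0]
    cue_frac = sum(w in CUE_WORDS for w in words) / len(words)
    type_token = len(set(words)) / len(words)  # low ratio = repetitive text
    return [cue_frac, 1.0 - type_token]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train(texts, labels, lr=1.0, epochs=500):
    """Plain stochastic gradient descent on logistic loss."""
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for text, y in zip(texts, labels):
            x = features(text)
            err = sigmoid(w[0] * x[0] + w[1] * x[1] + b) - y
            w[0] -= lr * err * x[0]
            w[1] -= lr * err * x[1]
            b -= lr * err
    return w, b

def predict(w, b, text):
    """True = flagged as GPT-3-ish content-farm text."""
    x = features(text)
    return sigmoid(w[0] * x[0] + w[1] * x[1] + b) > 0.5
```

The obvious weakness is the one already flagged above: any fixed feature set is a target, and spam that rewrites itself to dodge the features wins by default.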
Just ran into the listicle problem myself; it has effectively killed searching Google for anything where I don’t already know most of what I need. It feels perverse that, in the name of ad revenue, the algorithm promotes junk whose sole purpose is also to generate ad revenue. The process seems to be cannibalizing itself somehow.
It would be cool to filter GPT-3-ish things. It seems like we could get most of the diversity without anything very sophisticated; something like negatively weighting hits according to how many other results have similar/very similar content. If all the pages containing some variation of “Top #INT #VERB #NOUN” could get kicked to the bottom of the rankings forever, I’d be a happy camper.
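The two ideas above — penalizing hits in proportion to how many other results share near-identical content, and demoting templated “Top N …” titles — are simple enough to sketch directly. Everything here (the shingle size, thresholds, and penalty weights) is an assumption I picked for illustration, not how any real search engine scores pages:

```python
import re

def shingles(text, k=3):
    """Set of k-word shingles; overlap between sets measures near-duplication."""
    words = text.lower().split()
    return {" ".join(words[i:i + k]) for i in range(len(words) - k + 1)}

def jaccard(a, b):
    return len(a & b) / len(a | b) if a | b else 0.0

# Crude stand-in for "Top #INT #VERB #NOUN"-style titles.
LISTICLE_RE = re.compile(r"^top \d+ \w+", re.IGNORECASE)

def rerank(results, sim_threshold=0.5, dup_penalty=0.3, listicle_penalty=5.0):
    """results: list of (title, body, base_score) tuples.

    Each result loses dup_penalty per other result whose body is
    near-identical, and listicle-shaped titles take a flat penalty.
    """
    sh = [shingles(body) for _, body, _ in results]
    scored = []
    for i, (title, body, score) in enumerate(results):
        dups = sum(1 for j in range(len(results))
                   if j != i and jaccard(sh[i], sh[j]) >= sim_threshold)
        score -= dup_penalty * dups
        if LISTICLE_RE.match(title):
            score -= listicle_penalty
        scored.append((score, title))
    scored.sort(reverse=True)
    return [title for _, title in scored]
```

With penalties this heavy, two lightly-reworded copies of the same listicle sink below a single substantive page even if their base retrieval scores were higher — which is exactly the behavior being asked for.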
If adversarial adaptation means that shitty pages need to appear as good pages with solid argumentation, that seems like a win to me.