One thing I’d really like to see is reward for diversity of results. Bringing me the same listicle with slight rewrites 10 times provides no value while pushing out better results.
A friend of mine doing an ML PhD claims it’s possible to train a search engine to identify the shitty pages that might as well have been written by GPT-3, even if that’s not literally true. I’m skeptical this can be done in a way that keeps up with the adversarial adaptation, but it would be cool if it did.
Just ran into the listicle problem myself; it has effectively killed searching Google for anything where I don’t already know most of what I need. It feels weird that, in the name of ad revenue, the algorithm promotes junk whose sole purpose is also to generate ad revenue. The process seems to be cannibalizing itself somehow.
It would be cool to filter GPT-3-ish things. It seems like we could get most of the diversity without anything very sophisticated; something like negatively weighting hits according to how many other results have similar/very similar content. If all the pages containing some variation of “Top #INT #VERB #NOUN” could get kicked to the bottom of the rankings forever, I’d be a happy camper.
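To make the "nothing very sophisticated" idea concrete, here is a rough sketch of what that negative weighting might look like. Everything in it is an assumption for illustration: the result format (dicts with `title`, `body`, `score`), the function names, the word-shingle Jaccard similarity, the thresholds and penalty factors, and the regex standing in for “Top #INT #VERB #NOUN”. It is not how any real search engine ranks pages.

```python
import re

# Hypothetical pattern for "Top 10 ..." / "Best 7 ..." style titles.
# A crude stand-in for the "Top #INT #VERB #NOUN" template.
LISTICLE_TITLE = re.compile(r"^\s*(the\s+)?(top|best)\s+\d+\b", re.IGNORECASE)


def shingles(text, k=3):
    """Return the set of k-word shingles (overlapping word tuples) in text."""
    words = re.findall(r"\w+", text.lower())
    if len(words) < k:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets; 0.0 if either set is empty."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def rerank(results, sim_threshold=0.5, dup_penalty=0.3, listicle_penalty=0.5):
    """Reorder results, down-weighting near-duplicates and listicle titles.

    results: list of dicts with 'title', 'body', and a relevance 'score'
    (an assumed format). Each hit's score is divided by
    1 + dup_penalty * (number of other hits with similar bodies), then
    multiplied by listicle_penalty if its title matches the pattern.
    """
    sets = [shingles(r["body"]) for r in results]
    adjusted = []
    for i, r in enumerate(results):
        near_dupes = sum(
            1
            for j, other in enumerate(sets)
            if j != i and jaccard(sets[i], other) >= sim_threshold
        )
        score = r["score"] / (1.0 + dup_penalty * near_dupes)
        if LISTICLE_TITLE.search(r["title"]):
            score *= listicle_penalty
        adjusted.append((score, r))
    # Sort by adjusted score only, highest first.
    adjusted.sort(key=lambda pair: pair[0], reverse=True)
    return [r for _, r in adjusted]
```

With this weighting, ten rewrites of the same listicle all drag each other down, so a single distinct page with a lower raw score can still come out on top. The comparison is quadratic in the number of hits, which seems fine for one page of results but would need something like MinHash to scale further.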
I wonder how we’d go about designing a reward signal for the long-tailed stuff.
If adversarial adaptation means that shitty pages need to appear as good pages with solid argumentation, that seems like a win to me.