Cool post! I think the minimum viable “guardian” implementation would be to:
1. embed each post/video/tweet into some high-dimensional space
2. find out which regions of that space are nasty (we can do this collectively; e.g. my clickbait is probably clickbaity for you too)
3. filter out those regions
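A minimal sketch of that pipeline in Python, just to make it concrete. The embedding model, the made-up posts and flags, the “at least two users flagged it” rule, and the distance threshold are all illustrative assumptions, not part of the proposal:

```python
# Sketch of the "embed -> locate nasty regions -> filter" idea.
# Assumptions (not from the original post): embeddings come from
# sentence-transformers' all-MiniLM-L6-v2, "nasty regions" are the
# neighborhoods of posts that several users flagged, and we drop
# anything within a cosine-distance radius of a flagged post.
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")

posts = [
    "You WON'T BELIEVE what this celebrity did next!!!",
    "A measured explainer on interest rates and inflation",
    "Doctors HATE this one weird trick",
    "Long-form interview about compiler design",
]
# Indices of posts flagged as clickbait by different users; flags are
# pooled collectively ("my clickbait is probably clickbaity for you too").
flags_by_user = {"alice": [0], "bob": [0, 2], "carol": [2]}

emb = model.encode(posts, normalize_embeddings=True)

# Pool the flags; a post counts as "nasty" if at least 2 users flagged it.
counts = np.zeros(len(posts))
for flagged in flags_by_user.values():
    counts[flagged] += 1
nasty = emb[counts >= 2]

def keep(post_vec, nasty_vecs, radius=0.35):
    """Keep a post unless it sits inside a flagged region of the space."""
    if len(nasty_vecs) == 0:
        return True
    # Embeddings are normalized, so cosine distance = 1 - dot product.
    dists = 1.0 - nasty_vecs @ post_vec
    return dists.min() > radius

feed = [p for p, v in zip(posts, emb) if keep(v, nasty)]
print(feed)
```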
I tried to do something along these lines for YouTube: https://github.com/filyp/yourtube
I couldn’t find a good way to embed videos using ML, so I just scraped which videos recommend each other and made a graph from that (which kinda works as an embedding). Then I let users narrow down to a particular region of that graph. So you can not only avoid some nasty regions, you can also decide what you want to watch right now, instead of the algorithm deciding for you. This gives the user more autonomy.
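Roughly, the graph part could look like the sketch below. The input format, the “keep only mutual recommendations” rule, and the community-detection step are my assumptions for illustration; yourtube itself may do it differently:

```python
# Sketch: turn scraped "who recommends whom" data into a graph and let the
# user pick a region of it to browse.
import networkx as nx

# video_id -> list of video_ids YouTube recommends next to it (scraped).
recommendations = {
    "cat_video": ["dog_video", "clickbait_1"],
    "dog_video": ["cat_video", "clickbait_1"],
    "clickbait_1": ["clickbait_2", "cat_video"],
    "clickbait_2": ["clickbait_1"],
    "lecture_1": ["lecture_2"],
    "lecture_2": ["lecture_1"],
}

# Connect two videos only if they recommend each other; mutual
# recommendation is a stronger similarity signal than a one-way link.
G = nx.Graph()
G.add_nodes_from(recommendations)
for vid, recs in recommendations.items():
    for rec in recs:
        if vid in recommendations.get(rec, []):
            G.add_edge(vid, rec)

# Split the graph into regions (clusters of mutually recommending videos).
regions = list(nx.algorithms.community.greedy_modularity_communities(G))
for i, region in enumerate(regions):
    print(f"region {i}: {sorted(region)}")

# The user can now pick one region to browse and ignore the rest,
# instead of the recommender deciding for them.
chosen = sorted(regions[0])
```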
The accuracy isn’t very satisfying yet. I think the biggest problem with systems like these is the network effect: with more users you could get much better results with some collaborative filtering.
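For a sense of what collaborative filtering over pooled ratings could look like, here is a toy low-rank-factorization sketch (all the numbers, and the choice of plain SVD rather than something like ALS, are made up for illustration):

```python
# Toy collaborative filtering: a low-rank factorization of the user x video
# rating matrix predicts how a user would rate videos they haven't seen.
import numpy as np

# Rows = users, columns = videos; 0 means "not rated yet".
ratings = np.array([
    [5, 4, 0, 1],
    [4, 5, 1, 0],
    [0, 1, 5, 4],
    [1, 0, 4, 5],
], dtype=float)

# Rank-2 approximation via SVD (missing entries are treated as 0 here just
# to keep the example short; a real system would model missingness properly).
U, s, Vt = np.linalg.svd(ratings, full_matrices=False)
k = 2
pred = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

# Predicted score for user 0 on the video they haven't rated (column 2).
print(round(pred[0, 2], 2))
```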
This assumes good faith. As soon as enough people learn about the Guardian AI, I expect Twitter threads coordinating people: “let’s flag all outgroup content as ‘clickbait’”.
Just like people are abusing current systems by falsely labeling the content they want removed as “spam” or “porn” or “original research” or whichever label effectively means “this will be hidden from the audience”.
Oh yeah, definitely. I think such a system shouldn’t try to enforce one “truth” about which content is objectively good or bad.
I’d much rather see people forming groups, each with its own moderation rules, and let people be part of multiple groups. There are a lot of methods that could be tried out; e.g. some groups could use algorithms like EigenTrust to decide how much to trust users.
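To sketch the EigenTrust idea within a single group: each member rates how much they trust the others, the ratings are normalized, and repeated aggregation converges to a global trust score per member. The numbers and the damping factor below are made up, and real EigenTrust has extra machinery (pre-trusted peers, defenses against malicious collectives) that this skips:

```python
# Minimal EigenTrust-style global trust within one group.
import numpy as np

# local_trust[i][j] = how much member i trusts member j (e.g. from
# agreeing with j's moderation decisions). The diagonal is ignored.
local_trust = np.array([
    [0.0, 4.0, 2.0, 0.0],
    [3.0, 0.0, 1.0, 0.0],
    [2.0, 2.0, 0.0, 1.0],
    [0.0, 0.0, 5.0, 0.0],
])

# Normalize each row into a probability distribution over whom i trusts.
C = local_trust / local_trust.sum(axis=1, keepdims=True)

n = len(C)
p = np.full(n, 1.0 / n)   # pre-trusted distribution (here: uniform)
t = p.copy()
alpha = 0.15              # weight given to the pre-trusted distribution

# Power iteration: t converges to the global trust vector.
for _ in range(50):
    t = (1 - alpha) * C.T @ t + alpha * p

print(np.round(t, 3))  # higher = weigh this member's flags more heavily
```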
But before we can get to that, I see a more prohibitive problem: it will be hard to get enough people to get such a system off the ground.