Cool post! I think the minimum viable “guardian” implementation would be to:
1. embed each post/video/tweet into some high-dimensional space
2. find out which regions of that space are nasty (we can do this collectively; e.g. my clickbait is probably clickbaity for you too)
3. filter out those regions
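A minimal sketch of that pipeline in Python, just to make it concrete. The embedding model, the made-up posts and flags, the “at least two users flagged it” rule, and the distance threshold are all illustrative assumptions, not part of the proposal:

```python
# Sketch of the "embed -> locate nasty regions -> filter" idea.
# Assumptions (not from the original post): embeddings come from
# sentence-transformers' all-MiniLM-L6-v2, "nasty regions" are the
# neighborhoods of posts that several users flagged, and we drop
# anything within a cosine-distance radius of a flagged post.
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")

posts = [
    "You WON'T BELIEVE what this celebrity did next!!!",
    "A measured explainer on interest rates and inflation",
    "Doctors HATE this one weird trick",
    "Long-form interview about compiler design",
]
# Indices of posts flagged as clickbait by different users; flags are
# pooled collectively ("my clickbait is probably clickbaity for you too").
flags_by_user = {"alice": [0], "bob": [0, 2], "carol": [2]}

emb = model.encode(posts, normalize_embeddings=True)

# Pool the flags; a post counts as "nasty" if at least 2 users flagged it.
counts = np.zeros(len(posts))
for flagged in flags_by_user.values():
    counts[flagged] += 1
nasty = emb[counts >= 2]

def keep(post_vec, nasty_vecs, radius=0.35):
    """Keep a post unless it sits inside a flagged region of the space."""
    if len(nasty_vecs) == 0:
        return True
    # Embeddings are normalized, so cosine distance = 1 - dot product.
    dists = 1.0 - nasty_vecs @ post_vec
    return dists.min() > radius

feed = [p for p, v in zip(posts, emb) if keep(v, nasty)]
print(feed)
```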
I tried to do something along these lines for YouTube: https://github.com/filyp/yourtube
I couldn’t find a good way to embed videos using ML, so I just scraped which videos recommend each other and made a graph from that (which kinda works as an embedding). Then I let users narrow down to a particular region of that graph. So you can not only avoid some nasty regions, you can also decide what you want to watch right now, instead of the algorithm deciding for you. This gives the user more autonomy.
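Roughly, the graph part could look like the sketch below. The input format, the “keep only mutual recommendations” rule, and the community-detection step are my assumptions for illustration; yourtube itself may do it differently:

```python
# Sketch: turn scraped "who recommends whom" data into a graph and let the
# user pick a region of it to browse.
import networkx as nx

# video_id -> list of video_ids YouTube recommends next to it (scraped).
recommendations = {
    "cat_video": ["dog_video", "clickbait_1"],
    "dog_video": ["cat_video", "clickbait_1"],
    "clickbait_1": ["clickbait_2", "cat_video"],
    "clickbait_2": ["clickbait_1"],
    "lecture_1": ["lecture_2"],
    "lecture_2": ["lecture_1"],
}

# Connect two videos only if they recommend each other; mutual
# recommendation is a stronger similarity signal than a one-way link.
G = nx.Graph()
G.add_nodes_from(recommendations)
for vid, recs in recommendations.items():
    for rec in recs:
        if vid in recommendations.get(rec, []):
            G.add_edge(vid, rec)

# Split the graph into regions (clusters of mutually recommending videos).
regions = list(nx.algorithms.community.greedy_modularity_communities(G))
for i, region in enumerate(regions):
    print(f"region {i}: {sorted(region)}")

# The user can now pick one region to browse and ignore the rest,
# instead of the recommender deciding for them.
chosen = sorted(regions[0])
```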
The accuracy isn’t very satisfying yet. I think the biggest problem with systems like these is the network effect: with more users you could get much better results with some collaborative filtering.
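For a sense of what collaborative filtering over pooled ratings could look like, here is a toy low-rank-factorization sketch (all the numbers, and the choice of plain SVD rather than something like ALS, are made up for illustration):

```python
# Toy collaborative filtering: a low-rank factorization of the user x video
# rating matrix predicts how a user would rate videos they haven't seen.
import numpy as np

# Rows = users, columns = videos; 0 means "not rated yet".
ratings = np.array([
    [5, 4, 0, 1],
    [4, 5, 1, 0],
    [0, 1, 5, 4],
    [1, 0, 4, 5],
], dtype=float)

# Rank-2 approximation via SVD (missing entries are treated as 0 here just
# to keep the example short; a real system would model missingness properly).
U, s, Vt = np.linalg.svd(ratings, full_matrices=False)
k = 2
pred = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

# Predicted score for user 0 on the video they haven't rated (column 2).
print(round(pred[0, 2], 2))
```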
This assumes good faith. As soon as enough people learn about the Guardian AI, I expect Twitter threads coordinating people: “let’s flag all outgroup content as ‘clickbait’”.
Just like people are abusing current systems by falsely labeling the content they want removed as “spam” or “porn” or “original research” or whichever label effectively means “this will be hidden from the audience”.
Oh yeah, definitely. I think such a system shouldn’t try to enforce one “truth” about which content is objectively good or bad.
I’d much rather see people forming groups, each with its own moderation rules, and let people be part of multiple groups. There are a lot of methods that could be tried out; e.g. some groups could use algorithms like EigenTrust to decide how much to trust users.
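To sketch the EigenTrust idea within a single group: each member rates how much they trust the others, the ratings are normalized, and repeated aggregation converges to a global trust score per member. The numbers and the damping factor below are made up, and real EigenTrust has extra machinery (pre-trusted peers, defenses against malicious collectives) that this skips:

```python
# Minimal EigenTrust-style global trust within one group.
import numpy as np

# local_trust[i][j] = how much member i trusts member j (e.g. from
# agreeing with j's moderation decisions). The diagonal is ignored.
local_trust = np.array([
    [0.0, 4.0, 2.0, 0.0],
    [3.0, 0.0, 1.0, 0.0],
    [2.0, 2.0, 0.0, 1.0],
    [0.0, 0.0, 5.0, 0.0],
])

# Normalize each row into a probability distribution over whom i trusts.
C = local_trust / local_trust.sum(axis=1, keepdims=True)

n = len(C)
p = np.full(n, 1.0 / n)   # pre-trusted distribution (here: uniform)
t = p.copy()
alpha = 0.15              # weight given to the pre-trusted distribution

# Power iteration: t converges to the global trust vector.
for _ in range(50):
    t = (1 - alpha) * C.T @ t + alpha * p

print(np.round(t, 3))  # higher = weigh this member's flags more heavily
```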
But before we can get to that, I see a more prohibitive problem: it will be hard to get enough people to get such a system off the ground.