agreed. the worst adversarial examples discoverable by coordination of mere humans are pretty severe, partly because, well, it’s not that hard to come up with an adversarial example for a heavily correlational thinker like a human. causal learning is difficult. I’m pretty confident I’m smarter than the youtube recommender, and I’ve been able to steer it fairly well to give me useful content. I encourage others to put in the effort to only watch stuff that is worth their time, because the recommender will take note. but the majority of videos on youtube are created by a very unhealthy feedback loop that coordinates human intelligence into a system naturally misaligned with human flourishing, agreed.
but! don’t underestimate how much worse this would be if it were dramatically smarter than the viewer. can you imagine how bad it would be if the recommender were able to think of everything you might want to do next, give you subtly corrupted versions of it that promote a particular kind of blind spot that resonates with your fear emotions but doesn’t tell you any way to solve the problem besides coming back for more later, and then give you entertainment that calms you down just right to recover enough for the next hit? I mean, this already happens, but what if it happened to even more resilient people :P