Present a diffusion model as a world of concepts, like Wolfram does. But remove the noise and display the generated concepts like pictures in an art gallery (make the 2D pictures stand upright, like paintings in a simulated 3D gallery), so that gamers and YouTubers will see how dreadful those models really are inside. There is a new monster on YT every month, and those videos get millions of views. We want the public to know that AI companies are building real-life Frankenstein monsters, with some very crazy stuff inside their electronic “brains” (inside the AI models). It can also help spread the outrage if people see that their personal photos are inside those models. If companies used the whole output of humanity to train their models, those models should benefit the whole of humanity, not cost $200/month like paid ChatGPT. People should be able to see what’s in the model; right now a chatbot is like a librarian who spits quotes at you but doesn’t let you enter the library (the AI model).
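A minimal sketch of how the “gallery pictures” could be generated, assuming the Hugging Face diffusers library; the model name, the two prompts, and the linear walk through embedding space are illustrative choices, not part of the original proposal:

```python
# Sketch: render points in Stable Diffusion's "concept space" as gallery images.
# The model checkpoint and prompts below are example choices (assumptions).
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

@torch.no_grad()
def embed(prompt: str) -> torch.Tensor:
    """Encode a prompt into the text-embedding space the model conditions on."""
    ids = pipe.tokenizer(
        prompt, padding="max_length",
        max_length=pipe.tokenizer.model_max_length,
        truncation=True, return_tensors="pt",
    ).input_ids.to("cuda")
    return pipe.text_encoder(ids)[0]

# Walk a straight line between two concepts and save each stop as a "painting"
# that a 3D gallery scene (e.g. in Unity or Godot) could texture onto a frame.
a, b = embed("a cat"), embed("a clock")
for i, t in enumerate(torch.linspace(0.0, 1.0, steps=8)):
    image = pipe(prompt_embeds=torch.lerp(a, b, float(t))).images[0]
    image.save(f"gallery_{i:02d}.png")
```

The in-between images, where the model is forced to render something that is neither concept, are presumably where the “monsters” would show up.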
Okay, so you propose a mechanistic interpretability program where you create a virtual gallery of AI concepts extracted from Stable Diffusion, represented as images. I am slightly skeptical that this would move the needle on AI safety significantly. We already have open datasets like LAION, which catalog links to the scraped images used to train these models, and I don’t see that much outrage over them. I mean, there is some outrage, but not nearly enough to be the cornerstone of an AI safety plan.
gamers and YouTubers will see how dreadful those models really are inside. There is a new monster on YT every month, and those videos get millions of views. We want the public to know that AI companies are building real-life Frankenstein monsters, with some very crazy stuff inside their electronic “brains” (inside the AI models).
What exactly do you envision being hidden inside these Stable Diffusion concepts? What “crazy stuff” is in them? I’m currently not aware of anything about their internal representations that is especially concerning.
It can also help spread the outrage if people see that their personal photos are inside those models.
It is probably a lot more efficient to show that by indexing the LAION dataset and putting some sort of image search on top of it, so people can check whether their pictures were used to train the model.
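Something close to this already exists: the LAION index can be queried by CLIP similarity. A minimal sketch using the clip-retrieval client, assuming the public LAION kNN endpoint is still up (the service URL and index name are assumptions and may have changed):

```python
# Sketch: check whether images similar to your photo appear in LAION.
# Assumes the clip-retrieval package (pip install clip-retrieval).
from clip_retrieval.clip_client import ClipClient

client = ClipClient(
    url="https://knn.laion.ai/knn-service",  # public LAION endpoint (assumed)
    indice_name="laion5B-L-14",
    num_images=20,
)

# Query with a local photo; results are nearest neighbors in CLIP space.
results = client.query(image="my_photo.jpg")
for r in results:
    print(r["similarity"], r["url"], r.get("caption", ""))
```

Services like haveibeentrained.com already wrap essentially this kind of search in a consumer-facing UI, which is part of why I doubt the gallery adds much.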
I guess it’s just that the censors have not seen it yet.
There are a lot of situations where a smaller website doesn’t get banned. E.g., Substack is banned in China, but if you host your Substack blog on a custom domain, people in China can still read it, presumably because the block targets the substack.com domain rather than the content itself.