There has been work on constructing adversarial examples for human brains, and there are some interesting demonstrations of considerable neural-level control even with our extremely limited ability to observe brains.
Do you have a source for this? I would be interested in looking into it. I could see this happening for isolated neurons, at least, but I would be curious whether it could happen for whole circuits in vivo.
Does this go beyond just manipulating how our brains process optical illusions? I don’t see how the brain would perceive the kind of pixel-level adversarial perturbation most of us think of (e.g. https://openai.com/blog/adversarial-example-research/) as anything other than noise, if it even crosses the threshold of perception at all. The illusions humans fall prey to are qualitatively different: they exploit our perceptual assumptions, such as structural continuity, how colors shift under changing lighting, or 3-dimensionality. We don’t tend to go from making good guesses about what something is to being wildly, confidently wrong when the texture changes microscopically.
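For concreteness, this is the kind of perturbation I have in mind: a minimal FGSM-style sketch in PyTorch, where `model`, `image`, and `label` are placeholders for whatever differentiable classifier and data you happen to have, and no pixel moves by more than a tiny eps:

```python
import torch
import torch.nn.functional as F

def fgsm_perturb(model, image, label, eps=2/255):
    """One FGSM step: nudge every pixel by at most eps in the direction
    that increases the classifier's loss."""
    image = image.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(image), label)
    loss.backward()
    adv = image + eps * image.grad.sign()   # imperceptible, worst-case "noise"
    return adv.clamp(0.0, 1.0).detach()
```

To a human the perturbed image looks identical to the original, yet a step like this can often flip the network's prediction with high confidence.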
My guess would be that you could remove a lot of the adversarial susceptibility of DL systems by adding the right kind of recurrent connectivity (as in predictive coding, where hypotheses about what the network is looking at help it interpret low-level features), or even by finding a less extremizing nonlinearity than ReLU (e.g. https://towardsdatascience.com/neural-networks-an-alternative-to-relu-2e75ddaef95c). Such changes might get us closer to how the brain does things.
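To make the predictive-coding idea concrete, here is a toy sketch of my own (not an established implementation): instead of trusting one feedforward pass, a latent hypothesis is iteratively refined until its top-down prediction explains the input, with softplus standing in for a less extremizing nonlinearity than ReLU:

```python
import torch
import torch.nn.functional as F

def iterative_inference(x, W, n_steps=20, lr=0.1):
    """Refine a latent hypothesis z until its top-down prediction
    softplus(W @ z) accounts for the input x."""
    z = torch.zeros(W.shape[1], requires_grad=True)
    for _ in range(n_steps):
        pred = F.softplus(W @ z)          # top-down hypothesis about the input
        error = ((x - pred) ** 2).sum()   # low-level prediction error
        (grad,) = torch.autograd.grad(error, z)
        z = (z - lr * grad).detach().requires_grad_(True)  # error corrects the hypothesis
    return z.detach()
```

The hope would be that this kind of recurrent cleanup makes it harder for a tiny, low-level perturbation to dominate the final interpretation.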
Overparameterization, such as making the network arbitrarily deep, might eventually get you around some of these limitations (just as a fully connected NN can do the same thing as a CNN in principle), but I think we’ll have to change how we design neural networks at a fundamental level to avoid these issues more effectively in the long term.
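As a side note on the "fully connected NN can do what a CNN does in principle" point, here is a tiny self-contained check (my own toy example): a 1-D convolution written out as an equivalent dense weight matrix, the catch being that the FC version would have to learn the sparsity and weight sharing rather than getting them for free:

```python
import torch
import torch.nn.functional as F

kernel = torch.tensor([1.0, -2.0, 1.0])
x = torch.randn(8)

# Dense matrix that applies the same kernel at every valid position:
# sparse rows plus weight sharing are exactly what the conv layer bakes in.
n_out = x.numel() - kernel.numel() + 1
W = torch.zeros(n_out, x.numel())
for i in range(n_out):
    W[i, i:i + kernel.numel()] = kernel

conv_out = F.conv1d(x.view(1, 1, -1), kernel.view(1, 1, -1)).flatten()
fc_out = W @ x
print(torch.allclose(conv_out, fc_out))  # True: the FC layer reproduces the conv
```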
Look through https://www.gwern.net/docs/ai/adversarial/index. The theoretical work is the isoperimetry paper: https://arxiv.org/abs/2105.12806
Here is a paper showing that humans can classify pixel-level adversarial examples that look like noise at better-than-chance levels; see Experiment 4 (and also Experiments 5–6): https://www.nature.com/articles/s41467-019-08931-6
Thanks for the links!