A priori thinking about the inductive biases of super-intelligent agents seems very unintuitive to me, as I cannot currently see how the ML literature could meaningfully inform the inductive biases of intelligent agents which could tweak themselves[1].
Also, the scaling hypothesis seems to belong more to the GPT category than to the superintelligence category; it's unclear to me how tools will achieve agency, and I'm not yet convinced of this / don't yet fully appreciate it.
I read that first sentence several times and it’s still not clear what you mean, or how the footnote helps clarify. What do you mean by ‘tweak’? A tweak is a small incremental change. DL is about training networks with some flavour of SGD/backprop, which approximates Bayesian updates, and is all about many small ‘tweaks’. So when you say “agents which could tweak themselves” at first glance you just seem to be saying “agents that can learn at all”, but that doesn’t seem to fit.
Your section on adversarial examples will not hold up well—that is a bet I am fairly confident on.
Adversarial examples are an artifact of the particular historical trajectory that DL took on GPUs where there is no performance advantage to sparsity. Adversarial attacks exploit the overfit, noisy internal representations that nearly all DL systems learn: such systems almost never regularize internal activations, sparse weight regularization is still a luxury rather than the default, and it certainly isn’t tuned for adversarial defense. Properly sparse, regularized internal weights and activations, which compress and thus filter out noise, can provide the same level of defense against adversarial perturbations that biological cortical vision/sensing provides.
I know this based on my own internal theory and experiments rather than any specific paper, but even a quick search of the literature reveals theoretical & experimental support: 1, 2, 3, 4
(all of those were found in just a few minutes while writing this comment)
The reason this isn’t more widely known/used is twofold: 1.) there isn’t much economic motivation, since few are actually concerned with adversarial attacks at the moment outside of theoretical curiosity and DL critics; 2.) sparsity regularization (over activations especially) is a rather expensive luxury on current GPU software/hardware.
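To make “sparsity regularization over weights and activations” concrete, here is a minimal PyTorch sketch (illustrative only, not the experiments referred to above; the penalty coefficients l1_act and l1_w are arbitrary placeholders):

```python
# Minimal sketch: L1 penalties on both weights and hidden activations,
# i.e. the kind of sparsity regularization discussed above. Illustrative only.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SparseMLP(nn.Module):
    def __init__(self, in_dim=784, hidden=512, classes=10):
        super().__init__()
        self.fc1 = nn.Linear(in_dim, hidden)
        self.fc2 = nn.Linear(hidden, classes)

    def forward(self, x):
        h = F.relu(self.fc1(x))
        return self.fc2(h), h  # also return activations so we can penalize them

def loss_fn(model, x, y, l1_act=1e-4, l1_w=1e-5):
    logits, h = model(x)
    ce = F.cross_entropy(logits, y)
    act_penalty = h.abs().mean()                                  # activation sparsity
    w_penalty = sum(p.abs().mean() for p in model.parameters())   # weight sparsity
    return ce + l1_act * act_penalty + l1_w * w_penalty

# one training step on random data, just to show the plumbing
model = SparseMLP()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
x, y = torch.randn(32, 784), torch.randint(0, 10, (32,))
opt.zero_grad()
loss_fn(model, x, y).backward()
opt.step()
```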
Hi! Thanks for reading, and for the interesting questions.

I read that first sentence several times and it’s still not clear what you mean, or how the footnote helps clarify. What do you mean by ‘tweak’? A tweak is a small incremental change.
That’s correct. What I meant is: say we state that an agent has ‘x, y, z biases’; it can then try to correct them. The changes cannot be arbitrary, since the constraint is that it has to stay competitive and robust. But I think it can reduce the strength of a heuristic by going against it whenever it can, to the point where those heuristics would have little usefulness. It’s likely, though, that I’m working with a wrong and weird conception of superintelligence here.
Your section on adversarial examples will not hold up well—that is a bet I am fairly confident on.
Huh. I would find that very surprising; I think it should hold up against sparsity. Let me clarify my reasoning, and then can you say what you think I might be missing, or why you still think so?
Properly sparse, regularized internal weights and activations, which compress and thus filter out noise, can provide the same level of defense against adversarial perturbations that biological cortical vision/sensing provides.
Notice that this noise is average-case, whereas adversarial examples are worst-case; this difference might be doing a lot of heavy lifting here. Conventional deep networks have really nice noise-stability properties, in the sense that they are able to filter out injected noise to a good extent, as illustrated in Stronger generalization bounds for deep nets via a compression approach (ICML ’18). In the worst case, though, despite 3D vision, the narrow focus and other biases/limitations of human vision give rise to a wide variety of adversarial examples: the ‘the the’ reading problem, not noticing large objects crossing the frame in a video, or falling prey to a good variety of illusions. I’m not sure human vision is a good example of a robust sensing pipeline.
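To make the average-case/worst-case distinction concrete, here is a small sketch (illustrative only; the model and the budget eps are placeholders): a random perturbation of a given L-infinity budget is the average case, while an FGSM-style step of the same budget, chosen by following the loss gradient, is the worst case to first order.

```python
# Sketch: equal perturbation budget eps, average case (random noise) vs
# worst case (FGSM-style, gradient-following). Illustrative only.
import torch
import torch.nn.functional as F

def random_perturb(x, eps):
    # average case: random sign noise with L-infinity norm eps
    return x + eps * torch.sign(torch.randn_like(x))

def fgsm_perturb(model, x, y, eps):
    # worst case (to first order): step in the direction that increases the loss
    x = x.detach().clone().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    grad, = torch.autograd.grad(loss, x)
    return (x + eps * grad.sign()).detach()

def accuracy(model, x, y):
    return (model(x).argmax(dim=1) == y).float().mean().item()

# usage, assuming a trained classifier `model` and a test batch `x`, `y`:
# accuracy(model, random_perturb(x, 8/255), y)          # typically barely drops
# accuracy(model, fgsm_perturb(model, x, 8/255), y)     # typically drops sharply
```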
I know this based on my own internal theory and experiments rather than any specific paper, but even a quick search of the literature reveals theoretical & experimental support: 1, 2, 3, 4
Err, I often find citing literature insufficient to meet a reasonable bar of support, especially in ML (a surprising number of papers accepted at top conferences are unfortunately lame). I usually have to carefully read and analyze them.
Having quickly read some of these papers, my comment is as follows. Notice that often: (a) they do not test against adaptive attacks; (b) the degree of robustness they provide even against weaker attacks is minimal; (c) a simple defense like Gaussian smoothing does a lot better. Hence, they provide little support for robustness.

For comparison, good empirical results on compression-gives-robustness look like Robustness via Deep Low-Rank Representations (arXiv), though even that is insufficient as a defense, since adaptive attacks are likely to break it. Reference for adaptive attacks: On Adaptive Attacks to Adversarial Example Defenses (NeurIPS ’20), one of my favourite works in this area.
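For concreteness, a sketch of the Gaussian-smoothing point and of what “adaptive” means here (illustrative only; gaussian_blur and defended are placeholder names): a non-adaptive attack takes gradients of the undefended model and gets partly washed out by the blur, while an adaptive attack simply differentiates through the blurred pipeline and is barely slowed down.

```python
# Sketch: a Gaussian-smoothing "defense" as input preprocessing, and why
# adaptive attacks matter: the whole pipeline is differentiable, so an attacker
# who knows about the blur just attacks defended(model, x) instead of model(x).
import torch
import torch.nn.functional as F

def gaussian_blur(x, sigma=1.0, ksize=5):
    # build a ksize x ksize Gaussian kernel and apply it per channel
    coords = (torch.arange(ksize) - ksize // 2).float()
    g = torch.exp(-coords ** 2 / (2 * sigma ** 2))
    kernel = g[:, None] * g[None, :]
    kernel = (kernel / kernel.sum()).to(x)      # match dtype/device of x
    c = x.shape[1]
    kernel = kernel.repeat(c, 1, 1, 1)          # shape (C, 1, k, k)
    return F.conv2d(x, kernel, padding=ksize // 2, groups=c)

def defended(model, x):
    return model(gaussian_blur(x))

# Non-adaptive evaluation: craft perturbations against `model`, then feed them
# through `defended` -> looks robust. Adaptive evaluation: craft them against
# `defended` itself (same gradient machinery) -> the apparent robustness largely
# disappears.
```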
Now, let me put forward the case against. Our current understanding of sparsity (the works cited in G5, AR3) is that sparsity allows us to reduce parameterization, but only to a certain extent. The effects in AR3 suggest we probably need more complex models, not simpler ones (with sparsity/regularization), for robustness; i.e. the direction you think we should go in seems to be the opposite of what the literature suggests (and this is indeed counter-intuitive!).
I think the reason people (including me) have been pessimistic about this direction and have switched to research on other things is that it doesn’t seem to give many benefits beyond a certain reduction in memory/parameterization, at the extra cost of code modifications.
Adversarial examples are an artifact of the particular historical trajectory that DL took on GPUs where there is no performance advantage to sparsity … sparsity regularization (over activations especially) is a rather expensive luxury on current GPU software/hardware
I don’t think this is true, at least not to a large degree. GPUs can take a lot of advantage of sparse patterns, or maybe I have a lower bar for ‘sparsity works!’ than you do. PyTorch saves memory and speeds up computation by a huge amount if you use sparse tensors! If you have structured sparsity (block-sparse structure, pointwise convolutions), it’s even better, and there are some very fast CUDA kernels to leverage that.
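A toy sketch of the kind of thing I mean (illustrative only; the actual speed/memory gains depend on the sparsity level, the layout, and the hardware):

```python
# Toy sketch: dense vs sparse (COO) matrix-vector multiply in PyTorch.
import torch

n = 4096
dense = torch.randn(n, n)
dense[torch.rand(n, n) > 0.01] = 0.0        # keep ~1% of entries, zero the rest
sparse = dense.to_sparse()                  # COO sparse tensor
v = torch.randn(n, 1)

out_dense = dense @ v                       # touches all n*n entries
out_sparse = torch.sparse.mm(sparse, v)     # only touches the ~1% nonzeros
assert torch.allclose(out_dense, out_sparse, atol=1e-4)
```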
That said, sparsity has limited upside here: it doesn’t contribute interesting/helpful inductive biases. It’s fairly common to sparsify and quantize deep networks in the deployment phase, although the non-sparse CUDA kernels often work fine, as they’re insanely optimized.