Dropout makes interpretation easier because it disincentivizes complicated features whose parts can only be understood through their high-order correlations with other parts. A feature that relies on such correlations is fragile: it breaks as soon as some of its pieces are dropped out.
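To put a rough number on that fragility, here is a minimal sketch. It assumes a simplified picture that is not spelled out above: the feature only works when all of its parts are present, and each part is dropped independently at the same rate. The name `intact_probability` is made up for illustration.

```python
def intact_probability(rate: float, n_parts: int) -> float:
    """Chance that none of a feature's n_parts units is dropped.

    Assumes each unit is dropped independently with probability `rate`.
    """
    return (1.0 - rate) ** n_parts


# The more units a feature is spread across, the less often it survives whole.
for n_parts in (1, 2, 4, 8):
    print(n_parts, intact_probability(0.5, n_parts))
```

At a dropout rate of 0.5, a feature that needs 8 units at once survives intact about 0.4% of the time, while a single-unit feature survives half the time.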
Anti-dropout promotes consolidation of similar features into one, but it also incentivizes that one feature to be maximally complicated and fragile.
Re: first idea. Yeah, something like that. Basically it's just an attempt to formalize “functionally similar neurons,” so that when you go to drop out a neuron, you actually drop out all the functionally similar ones too.
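One way this could look in practice, as a rough sketch and not a worked-out proposal: treat two neurons as "functionally similar" when their activations are strongly correlated over a batch, group neurons that are linked by such similarity, and then drop whole groups at once. The correlation criterion, the `threshold` value, and the names `similarity_groups` and `group_dropout` are all assumptions made up for illustration.

```python
import numpy as np


def similarity_groups(activations, threshold=0.9):
    """Assign a group id to each neuron based on activation correlation.

    activations: array of shape (n_samples, n_neurons).
    Neurons are linked when |correlation| >= threshold (an assumed stand-in
    for "functionally similar"); groups are connected components of that graph.
    A constant neuron yields NaN correlations and ends up in its own group.
    """
    corr = np.abs(np.corrcoef(activations, rowvar=False))
    n_neurons = corr.shape[0]
    group = -np.ones(n_neurons, dtype=int)
    next_id = 0
    for start in range(n_neurons):
        if group[start] != -1:
            continue
        # Flood-fill every neuron reachable through "similar" links.
        group[start] = next_id
        stack = [start]
        while stack:
            j = stack.pop()
            for k in np.nonzero(corr[j] >= threshold)[0]:
                if group[k] == -1:
                    group[k] = next_id
                    stack.append(k)
        next_id += 1
    return group


def group_dropout(activations, group, rate, rng):
    """Drop whole groups of neurons together, independently per sample.

    Uses the usual inverted-dropout rescaling by 1 / (1 - rate); rate must be < 1.
    """
    n_samples = activations.shape[0]
    n_groups = group.max() + 1
    keep_group = rng.random((n_samples, n_groups)) >= rate
    mask = keep_group[:, group]  # expand per-group decisions to per-neuron
    return activations * mask / (1.0 - rate)


rng = np.random.default_rng(0)
base = rng.normal(size=(200, 1))
# Neurons 0 and 1 carry the same signal; neuron 2 is unrelated.
acts = np.hstack([base, 2.0 * base, rng.normal(size=(200, 1))])
groups = similarity_groups(acts)
dropped = group_dropout(acts, groups, rate=0.5, rng=rng)
```

Here neurons 0 and 1 land in the same group, so they are always zeroed together, and a copy of a feature can no longer cover for its dropped twin.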