I think that linearly available XOR would occur if the model makes linearly available any boolean function which is “linearly independent” from the two values individually. So, maybe this could be implemented via something other than XOR, which is maybe more natural?
What is “this”? It sounds like you’re gesturing at the same thing I discuss in the section “Maybe a⊕b is represented “incidentally” because it’s possible to aggregate noisy signals from many features which are correlated with boolean functions of a and b”.
The thing that remains confusing here is that for arbitrary features like these, it’s not obvious why the model is computing any nontrivial boolean function of them and storing it along a different direction. And if the answer is “the model computes this boolean function of arbitrary features” then the downstream consequences are the same, I think.
I edited my comment. I’m just trying to say that, just as you get a∧b for free, you also get XOR for free if you compute anything else which is “linearly independent” from the components a and b. (For a slightly fuzzy notion of linear independence where we just need separability.)
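To make the “for free” claim concrete, here is a minimal sketch of the exact (not fuzzy) version. It is my own illustration, not something from the post: the function name `xor_readout` and the truth-table encoding are hypothetical. The idea is that any boolean f(a, b) can be written as f = f00 + (f10 − f00)·a + (f01 − f00)·b + c·ab with c = f00 − f10 − f01 + f11, so whenever c ≠ 0 you can solve for ab and plug it into a⊕b = a + b − 2ab, which gives XOR as an affine function of (a, b, f).

```python
from itertools import product

# All (a, b) inputs in a fixed order: (0,0), (0,1), (1,0), (1,1).
INPUTS = list(product([0, 1], repeat=2))


def xor_readout(f_table):
    """Given f's truth table (f00, f01, f10, f11), return weights
    (w0, wa, wb, wf) such that a XOR b == w0 + wa*a + wb*b + wf*f(a, b),
    or None if f is an affine function of a and b (so it adds nothing)."""
    f00, f01, f10, f11 = f_table
    c = f00 - f10 - f01 + f11  # coefficient of the a*b interaction term in f
    if c == 0:
        return None
    # Solve f = f00 + (f10-f00)*a + (f01-f00)*b + c*a*b for a*b,
    # then substitute into a XOR b = a + b - 2*a*b.
    return (
        2 * f00 / c,
        1 + 2 * (f10 - f00) / c,
        1 + 2 * (f01 - f00) / c,
        -2 / c,
    )


def reproduces_xor(f_table, weights):
    """Check the affine readout against the XOR truth table on all four inputs."""
    w0, wa, wb, wf = weights
    return all(
        w0 + wa * a + wb * b + wf * f == (a ^ b)
        for (a, b), f in zip(INPUTS, f_table)
    )


# Enumerate all 16 boolean functions of (a, b) and keep those that yield XOR.
usable = []
for f_table in product([0, 1], repeat=4):
    weights = xor_readout(f_table)
    if weights is not None and reproduces_xor(f_table, weights):
        usable.append(f_table)

print(len(usable), "of 16 boolean functions give an exact linear XOR readout")
```

Under this exact notion, 10 of the 16 boolean functions of a and b work (AND, OR, NAND, XOR itself, and so on); the 6 that fail are the constants, a, b, ¬a and ¬b. The fuzzier separability-based notion in the comment is not modeled here.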