On the other hand, if we’re designing a bridge and there’s one particular strut which fails under load, then we’d randomly try changing hundreds or thousands of other struts before we tried changing the one which is failing.
This bridge example seems to be using a different algorithm than the E. coli movement. The E. coli moves in a random direction, while the bridge adjustments always happen in the direction of a basis vector.
If you were altering the bridge in the same way that the E. coli moves, then every change to the bridge would alter that one particular strut by a little bit (in addition to altering every other aspect of the bridge).
Whereas if the E. coli moved in the same way that you describe the hypothetical bridge design, then it would move purely along a single coordinate (such as from (0,0,0,0,0) to (0,0,0,1,0)) rather than in a random direction.
My intuition is that the efficiency of the bridge algorithm is O(1/n) and that of the E. coli algorithm is O(1/sqrt(n)).
Which suggests that if you’re doing randomish exploration, you should try to shake things up and move in a bunch of dimensions at once rather than just moving along a single identified dimension.
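(A minimal Monte Carlo sketch of that intuition, assuming a toy setup where only one of the n coordinates matters and every trial takes a unit-length step; the dimension and trial count here are illustrative, not from the thread:)

```python
# Toy comparison: expected progress along the one coordinate that matters,
# per unit-length step, for basis-vector steps vs. random-direction steps.
import numpy as np

rng = np.random.default_rng(0)
n = 100          # dimensionality of the design space (illustrative)
trials = 50_000  # number of unit steps to average over

# Bridge-style: each trial perturbs one of the n basis directions at random.
# Progress along coordinate 0 is 1 when we happen to pick it, else 0.
basis_progress = (rng.integers(0, n, size=trials) == 0).mean()  # ~1/n

# E. coli-style: each trial steps in a uniformly random direction
# (a normalized Gaussian vector is uniform on the unit sphere).
steps = rng.standard_normal((trials, n))
steps /= np.linalg.norm(steps, axis=1, keepdims=True)
ecoli_progress = np.abs(steps[:, 0]).mean()  # ~sqrt(2/(pi*n)), i.e. O(1/sqrt(n))

print(f"basis-vector: {basis_progress:.4f}  (1/n = {1/n:.4f})")
print(f"random-dir:   {ecoli_progress:.4f}  (sqrt(2/(pi*n)) = {np.sqrt(2/(np.pi*n)):.4f})")
```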
Your intuition is exactly correct; the math indeed works out to O(1/sqrt(n)) for a random direction. I actually had a few paragraphs about that in an earlier draft, but cut it because high-dimensional problems in the real world basically never involve a space where Euclidean distance makes sense.
Seems like some changes are more like Euclidean distance while others are more like turning a single knob. If I go visit my cousin for a week and a bunch of aspects of my lifestyle shift towards his, that is more Euclidean than if I change my lifestyle by adding a new habit of jogging each morning. (Although both are in between the extremes of purely Euclidean and purely a single knob—you could think of it in terms of the dimensionality of the subspace that you’re moving in.)
And something similar can apply to work habits, thinking styles, etc.
What makes a Euclidean distance metric relevant is not “all the dimensions change a bit at once”. The relevant aspect is “cost of testing the change is a function of Euclidean distance”—e.g. in the E. coli case, the energy expended during straight-line swimming should be roughly proportional to the distance traveled.
I’m not sure about your visit-the-cousin example. My intuition is that O(1/n) is the default efficiency of E. coli-style optimization when there’s no special structure to the exploration cost which we can exploit, but I don’t have a strong mathematical argument for that which would apply in an obvious way to the cousin example. My weak argument would be “absent any special knowledge about the problem structure, we have to treat visiting-the-cousin as an independent knob to turn, the same as all the other knobs”—i.e. we don’t have a prior reason to believe that the effects of the visit are a combination of other (available) knobs.
Which suggests that if you’re doing randomish exploration, you should try to shake things up and move in a bunch of dimensions at once rather than just moving along a single identified dimension.
If you can only do randomish exploration this sounds right, but I think this often isn’t the right approach (not saying you would disagree with this, just pointing it out). When we change things along basis vectors, we’re implicitly taking advantage of the fact that we have a built-in basis for the world (namely, our general world model). This lets us reason about things like causality, constraints, etc., since we’re already parsing the world into a useful basis.
Care to elaborate on how you got to the O(1/sqrt(n)) formula?
The distance between the n-dimensional points (0,0,...,0) and (1,1,...,1) is sqrt(n). So if you move sqrt(n) units along that diagonal, you move 1 unit along the dimension that matters. Or if you move 1 unit along the diagonal, you move 1/sqrt(n) units along that dimension. 1/sqrt(n) efficiency.
If you instead move 1 unit in a random direction, then sometimes you’ll move more than 1/sqrt(n) along that dimension and sometimes less, but I figured that was unimportant enough on net to leave it at O(1/sqrt(n)).
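(A quick numeric check of both points—the exact diagonal component and the spread of a random direction—under the same toy assumptions as above:)

```python
# Numeric check: a unit step along the all-ones diagonal moves exactly
# 1/sqrt(n) along each coordinate; a unit step in a random direction moves
# O(1/sqrt(n)) along any fixed coordinate on average, with some spread.
import numpy as np

rng = np.random.default_rng(0)
n = 1000

diagonal = np.ones(n) / np.sqrt(n)   # unit vector along (1,1,...,1)
print(diagonal[0], 1 / np.sqrt(n))   # both ~0.0316 for n = 1000

steps = rng.standard_normal((10_000, n))
steps /= np.linalg.norm(steps, axis=1, keepdims=True)
component = np.abs(steps[:, 0])
print(component.mean(), component.std())  # mean ~ sqrt(2/(pi*n)), nonzero spread
```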
Why would the bridge algorithm be O(1/n)?
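(The thread leaves this one open; a hedged sketch of one way to see it: if each trial tweaks a single strut chosen uniformly from the n struts, only a 1-in-n fraction of trials touches the strut that matters, so expected progress per trial scales as 1/n. A toy sweep over n, illustrative only:)

```python
# Toy sweep: expected per-step progress of each scheme as n grows.
# Basis-vector trials shrink like 1/n; random-direction trials like 1/sqrt(n).
import numpy as np

rng = np.random.default_rng(0)
trials = 20_000

for n in (10, 100, 1000):
    basis = (rng.integers(0, n, size=trials) == 0).mean()   # hit rate ~ 1/n
    steps = rng.standard_normal((trials, n))
    steps /= np.linalg.norm(steps, axis=1, keepdims=True)
    random_dir = np.abs(steps[:, 0]).mean()                 # ~ sqrt(2/(pi*n))
    print(f"n={n:4d}  basis~{basis:.4f} (1/n={1/n:.4f})  "
          f"random~{random_dir:.4f} (sqrt(2/(pi*n))={np.sqrt(2/(np.pi*n)):.4f})")
```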