The second idea reminds me of a talk years back about swarm behavior. Some fish swim faster in the sunlight, which makes the entire swarm “seek out” the shady parts of the pond.
There is a second mechanism at play here: fish try to stay close to their neighbors, so the entire swarm turns toward the shade as soon as the part of the swarm already in the shade slows down.
This suggests an optimizer for parallel training which doesn’t completely synchronize the weights on the different machines, but instead only tries to keep all sets of weights reasonably close to some of the other sets of weights.
The effect should be that the swarm of different weights turns toward regions of low noise.
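To make that a bit more concrete, here is a rough sketch of one such update step in Python with NumPy. Everything in it is an assumption for illustration: the ring topology (each worker only looks at its two neighbors), the names `swarm_step`, `ring_neighbor_mean`, and `spread`, and the `coupling` parameter are all made up, not taken from an existing optimizer. It only shows the "local step plus soft pull toward neighbors" rule; whether the swarm actually drifts toward low-noise regions is the conjecture, not something this code demonstrates.

```python
import numpy as np

def ring_neighbor_mean(weights):
    # weights has shape (n_workers, dim). Workers sit on a ring (assumed
    # topology); each one sees only its left and right neighbor.
    return 0.5 * (np.roll(weights, 1, axis=0) + np.roll(weights, -1, axis=0))

def swarm_step(weights, grads, lr=0.1, coupling=0.2):
    # 1) Every worker takes its own (noisy) gradient step, unsynchronized.
    stepped = weights - lr * grads
    # 2) Every worker is pulled part of the way toward its neighbors' mean.
    #    coupling=0 is fully independent SGD; larger values tighten the swarm.
    return stepped + coupling * (ring_neighbor_mean(stepped) - stepped)

def spread(weights):
    # RMS distance of the workers from the swarm's mean weight vector.
    centered = weights - weights.mean(axis=0)
    return float(np.sqrt(np.mean(np.sum(centered ** 2, axis=1))))
```

With zero gradients the pull alone leaves the swarm's mean where it is and shrinks `spread`, so `coupling` is the knob that trades independence against cohesion. Workers in noisy regions get bigger random kicks per step, which is the analogue of the fish swimming faster in sunlight.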
Oh man, this is perfect. I’ve been looking for another very-different example of the phenomenon to think about, and this is exactly what I wanted. Thanks!