There are pretty much no use cases that benefit from high-latency clusters of computers. We’re talking hundreds or thousands of times less efficient. Nice idea in theory; it doesn’t hold up in practice.
Neural networks seem like they would benefit from high-latency clusters. If you divide the nodes up into 100 clusters during training, and you have ten layers, it might take each cluster 0.001 s to process a single sample, so the processing time per cluster is maybe 100-1000 times less than the total latency. That’s still acceptable if you have 10,000,000 samples to pipeline through and can allow some weight updates to be a bit out of order. Also, if you just want the forward pass of the network, that’s the ideal case, since there are no state updates.
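To make the "a bit out of order" point concrete, here is a toy sketch (my own illustration, not anyone's production setup) of delayed-gradient SGD on a synthetic linear-regression problem: each gradient is computed against a parameter snapshot a few updates old, mimicking updates that arrive late from high-latency workers, yet training still converges for mild staleness and a small learning rate. All names and numbers are made up for the demo.

```python
import numpy as np

# Illustrative sketch only: SGD where each gradient is evaluated at a
# stale parameter snapshot, then applied to the current weights, as
# happens when worker updates arrive out of order over a slow network.

rng = np.random.default_rng(0)

# Synthetic regression problem: y = X @ w_true + noise
n_samples, n_features = 2000, 5
X = rng.normal(size=(n_samples, n_features))
w_true = rng.normal(size=n_features)
y = X @ w_true + 0.01 * rng.normal(size=n_samples)

def gradient(w, xb, yb):
    # Mean-squared-error gradient on one mini-batch.
    return 2.0 * xb.T @ (xb @ w - yb) / len(yb)

def delayed_sgd(staleness=3, steps=500, lr=0.05, batch=20):
    # Gradients are computed from weights `staleness` updates old
    # (the "out of order" case), then applied to the current weights.
    w = np.zeros(n_features)
    history = [w.copy()]
    for _ in range(steps):
        idx = rng.integers(0, n_samples, size=batch)
        w_stale = history[max(0, len(history) - 1 - staleness)]
        w = w - lr * gradient(w_stale, X[idx], y[idx])
        history.append(w.copy())
    return w

w_est = delayed_sgd()
print("parameter error:", float(np.linalg.norm(w_est - w_true)))
```

With a few steps of staleness the error still shrinks to near the noise floor; make the staleness large relative to the learning rate and the iteration destabilizes, which is the latency limit the comment above is gesturing at.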
In general, long computations tend to be either stateless or have slowly changing state relative to the latency, so parallelism can work.
Sorry, I was using “high-latency clusters” as a term for heterogeneous at-home consumer hardware networked over WANs, as the term is sometimes used in this field. The problem isn’t always latency (although for some workloads it is), but rather efficiency. Consumer hardware is simply not energy efficient for most categories of scientific work. The typical computer plugged into such a system is not going to have a top-of-the-line GTX 1080 or Titan X card with lots of RAM. At best it will be a gaming system optimized for a different use case, one that probably trades off energy efficiency at peak usage in favor of lower idle power draw. It almost certainly doesn’t have the right hardware for the particular workload.

SETI@Home, for example, is an ideal use case for high-latency clusters, and by some metrics it is one of the most powerful ‘supercomputers’ in existence. However, it has also been estimated that the entire network could be replaced by a single rack of FPGAs processing in real time at the source. SETI@Home and related projects work because the computation is “free”. But as soon as you start charging for the use of your computer equipment, it stops making any kind of economic sense.
I would be interested in a cite on that estimate.
Personal conversation with SETI.