[Question] Is the speed of training large models going to increase significantly in the near future due to Cerebras Andromeda?
Cerebras recently unveiled Andromeda (https://www.cerebras.net/andromeda/), an AI supercomputer that it says achieves near-linear scaling. Do I understand correctly that this could have a big impact on large (language) model research, since it would significantly speed up training? E.g. if a current model takes 30+ days to train, could we just 10x the number of machines and have it done in roughly three days? Andromeda also appears to be much simpler to program, which would reduce development cost and the hassle of distributed computing.
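To make the arithmetic behind that example explicit, here's a minimal back-of-the-envelope sketch. The `efficiency` factor is my own assumption for how far real scaling falls short of perfectly linear, not a Cerebras figure:

```python
# Back-of-the-envelope check of the "10x machines -> ~3 days" claim.
# "Near-linear scaling" means speedup ~= efficiency * n_machines,
# with efficiency close to 1.0. All numbers are illustrative
# assumptions, not Cerebras benchmarks.

def training_days(base_days: float, n_machines: int, efficiency: float = 1.0) -> float:
    """Estimated wall-clock training time under near-linear scaling."""
    return base_days / (n_machines * efficiency)

for eff in (1.0, 0.9, 0.7):
    print(f"10x machines, efficiency {eff:.1f}: "
          f"{training_days(30, 10, eff):.1f} days")
# efficiency 1.0 -> 3.0 days; 0.9 -> 3.3 days; 0.7 -> 4.3 days
```

So even with somewhat-imperfect scaling, the speedup would still be close to an order of magnitude.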
If so, it seems almost certain that large companies would adopt it, which in turn would significantly speed up research, training, and algorithm development for large models such as GPT, Gato, and similar. It seems like this kind of development should factor into the discussion about timelines, yet I haven't seen it mentioned anywhere else.