I note that this is how Falcon from Abu Dhabi was trained. To quote:
Falcon is a 40 billion parameters autoregressive decoder-only model trained on 1 trillion tokens. It was trained on 384 GPUs on AWS over the course of two months.