Facebook's models use maybe 1/4 the compute (rough guess), and have more implementation issues and worse fine-tuning