Over all 150 tasks [in BIG-bench], 25% of tasks had discontinuity greater than +10%, and 15% of tasks had a discontinuity greater than +20%.
Discontinuity = (actual accuracy for 540B model) - (log-linear projection using 8B → 62B)
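
The definition above can be sketched in a few lines. This is a minimal illustration, not the authors' code: it assumes "log-linear projection" means a straight line in accuracy versus log(parameter count), fitted through the 8B and 62B points and extended to 540B. The function names and the example accuracies are hypothetical.

```python
import math


def log_linear_projection(acc_8b, acc_62b, n_small=8e9, n_mid=62e9, n_large=540e9):
    """Extrapolate accuracy to n_large parameters.

    Assumption: accuracy is linear in log(parameter count), with the line
    passing through the two smaller models. The log base does not matter
    for a two-point fit.
    """
    slope = (acc_62b - acc_8b) / (math.log(n_mid) - math.log(n_small))
    return acc_62b + slope * (math.log(n_large) - math.log(n_mid))


def discontinuity(acc_8b, acc_62b, acc_540b):
    """Actual 540B accuracy minus the log-linear projection from 8B and 62B."""
    return acc_540b - log_linear_projection(acc_8b, acc_62b)


# Hypothetical accuracies for a single task (not taken from the source).
d = discontinuity(acc_8b=0.20, acc_62b=0.30, acc_540b=0.60)
print(f"discontinuity: {d:+.1%}")
```

Under this reading, a task counts toward the "greater than +10%" group when `discontinuity(...)` exceeds 0.10, that is, when the 540B model beats the trend set by the two smaller models by more than ten accuracy points.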