Agree that this discussion is surprisingly often confusing and people use the terms interchangeably. Unfortunately, readers often referred to our training compute measurement as a measure of performance, rather than a quantity of executed operations. However, I don’t think that this is necessarily due to the abbreviations but also due to the lack of understanding of what one measures. Next to making the distinction more clear with the terms, one should probably also explain it more and use terms such as quantity and performance.
For my research, I’ve been trying to be consistent with FLOPs (smaller case s) referring to the quantity. While FLOPS or FLOP/s refer to the performance: operations per second. (FWIW, during my time in computer engineering, it has been the norm to use FLOPs for quantity and FLOPS for performance.)
The term Petaflop/s-days also helps—outlining how long the performance (Petaflop/s) runs for how many days, therefore measuring a quantity of operations.
Note it gets even more complicated once we take the number representation (floating point 32bit, 16bit, or even bfloat16) into consideration. Therefore, I’m also in favor of maybe switching at one point towards OPs and OP/s and also document the used number representation for actual technical documentation (such as reporting the compute of ML models).
Agree that this discussion is surprisingly often confusing and people use the terms interchangeably. Unfortunately, readers often referred to our training compute measurement as a measure of performance, rather than a quantity of executed operations. However, I don’t think that this is necessarily due to the abbreviations but also due to the lack of understanding of what one measures. Next to making the distinction more clear with the terms, one should probably also explain it more and use terms such as quantity and performance.
For my research, I’ve been trying to be consistent with FLOPs (smaller case s) referring to the quantity. While FLOPS or FLOP/s refer to the performance: operations per second. (FWIW, during my time in computer engineering, it has been the norm to use FLOPs for quantity and FLOPS for performance.)
The term Petaflop/s-days also helps—outlining how long the performance (Petaflop/s) runs for how many days, therefore measuring a quantity of operations.
Note it gets even more complicated once we take the number representation (floating point 32bit, 16bit, or even bfloat16) into consideration. Therefore, I’m also in favor of maybe switching at one point towards OPs and OP/s and also document the used number representation for actual technical documentation (such as reporting the compute of ML models).
Note that @lennart has since recommended using FLOP for quantity and FLOP/s (or FLOPS) for performance.