I think this framing conflates the question of input with that of presentation. The ‘e’ notation seems easiest to input (simple, unambiguous, reliable to parse, and enterable everywhere), but it’s not a good one to read as output, because if nothing else it now looks like it’s multiplying variables & numbers.
They don’t have to be the same. If numbers are written uniformly, they can be parsed & rendered differently.
And they should be. For example, I think that one of the things that makes calculations or arguments hard to follow is that they shamelessly break human subitizing and intuitive numeracy by promiscuously mixing units, which makes it hard to do one of the most common things we do with numbers—compare them—while not really making anything easier. This leads to “number numbness”. (‘Microsecond or millisecond’? Well, what’s a factor of a thousand between friends?)
In much the same way that people will sloppily quote dollar amounts from decades apart as if they were the same thing (which is why I inflation-adjust them automatically into current dollars), they will casually talk about “10 million” vs “20 billion”, imposing a burden of constant mental arithmetic as one tries to juggle all of these different base units. Sure, decimal numbers or metric units may not be as bad as trying to convert hogsheads to long fathoms or swapping between binary and decimal, but it’s still not ideal.
It is no wonder that people are constantly off by orders of magnitude and embarrass themselves on social media when they turn out to be a factor of 10 off because they accidentally converted by 100 instead of 1,000, or mix up milligrams and grams and poison themselves on film. If someone is complaining about the US federal government, which is immediately more understandable: “of $20 billion, $10 million was spent on engineering a space pen”, or “of $20,000 million, $10 million was spent on a space pen”? (And this is an easy case, with about the most familiar possible units. As soon as it becomes something like milligrams and grams...)
I mean, imagine if this were normal practice with statistical graphs: “oh, the blue and red bar columns, even though they are the same size in the image and describe the same thing, dollars, are actually 10× different. Didn’t you see in the legend where it clearly says that ‘blue = 1; red = 10’?” “Er, OK, but if they’re the same sort of thing, then why are some blue and others the larger red?” “No reason. I just didn’t feel like multiplying the blue datapoints by 10 before graphing.” “...I see.”
So while it might look a little odd, I try to write with a single base-unit throughout a passage of writing, to enable immediate comparison.
(I think this helps a lot with DL scaling too, because somehow when you talk about a model having ‘50 million parameters’ and are comparing it to multi-billion-parameter models like a “GPT-3-175b”, that seems a lot bigger than if you had written ‘0.05b parameters’. Or if you compare, say, a Gato with 1b parameters to a GPT-4 with 1,400b parameters, the comparison feels a lot more intuitive than if I had written ‘a GPT-4 with 1.4 trillion parameters’.)
This practice might seem too annoying for the author (although if it is, that should be a warning sign: if it’s hard for you, the author, to corral these units while carefully writing them, how do you expect the reader to handle them while skimming?), but it could just be automated. Write all numbers in a standard machine-readable format, whether that’s ‘10e2’ or ‘1,000’; a program can then parse the text for numbers, take the first number, extract the largest base that leaves it a single-digit number (“thousand”), and rewrite all following numbers with that base as the unit, formatted in your preferred style as ‘1 × 10³’ or whatever.
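As a rough sketch of that rewriting pass (in TypeScript; the regex, the base table, and the name `normalizePassage` are my own assumptions rather than a finished tool):

```typescript
// Sketch of the automated rewriting pass described above. Assumes
// numbers arrive in a uniform machine-readable format ("10e2", "1,000");
// `normalizePassage` is a hypothetical name, not an existing library.

const NUMBER = /\d[\d,]*(?:\.\d+)?(?:e[+-]?\d+)?/gi;

const BASES: [number, string][] = [
  [1e12, "trillion"],
  [1e9, "billion"],
  [1e6, "million"],
  [1e3, "thousand"],
];

function parseNum(token: string): number {
  return Number(token.replace(/,/g, "")); // "1,000" -> 1000, "10e2" -> 1000
}

function normalizePassage(text: string): string {
  const matches = text.match(NUMBER);
  if (!matches) return text;
  // Find the largest base that leaves the first number single-digit.
  const anchor = parseNum(matches[0]);
  const base = BASES.find(([b]) => anchor / b >= 1 && anchor / b < 10);
  if (!base) return text; // no named base fits; leave the passage alone
  const [scale, name] = base;
  // Rewrite every following number in terms of that one base unit.
  return text.replace(NUMBER, token => `${parseNum(token) / scale} ${name}`);
}

console.log(normalizePassage("a budget of 2e9 dollars, of which 10e6 went to the pen"));
// -> "a budget of 2 billion dollars, of which 0.01 billion went to the pen"
```

One design choice worth noting: if no named base leaves the first number single-digit, this sketch leaves the passage untouched rather than guessing.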
(And, for HTML, you can make them copy-paste as regular full-length numbers through a trick similar to the one we use to provide the original LaTeX for math formulas converted from LaTeX, so it can be fully compatible with copy-pasting into a REPL or other application.)
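A sketch of how such a copy handler might work in the browser (the `.num` class and `data-full` attribute are markup conventions I am inventing for illustration, not the actual implementation):

```typescript
// Sketch: render "2 billion" on screen but copy "2000000000", analogous
// to shipping the original LaTeX alongside rendered math. Assumed markup:
// <span class="num" data-full="2000000000">2 billion</span>

document.addEventListener("copy", (event: ClipboardEvent) => {
  const selection = window.getSelection();
  if (!selection || selection.isCollapsed) return;

  // Clone the selected fragment and swap each formatted number
  // for its machine-readable data-full attribute.
  const fragment = selection.getRangeAt(0).cloneContents();
  fragment.querySelectorAll<HTMLElement>(".num[data-full]").forEach(el => {
    el.textContent = el.dataset.full ?? el.textContent;
  });

  const div = document.createElement("div");
  div.appendChild(fragment);
  event.clipboardData?.setData("text/plain", div.textContent ?? "");
  event.preventDefault(); // use our rewritten text instead of the default
});
```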
Good ideas.

A gripe of mine in the same vein is that my old employer had this idea that in any public-facing communication “numbers up to ten must be written in words, 11 or higher in digits”. I think it’s a common rule in (for example) newspapers. But it leads to ludicrous sentences like “There are either nine, ten or 11 devolved administrations depending on how they are counted.” It drives me completely crazy: either the whole list should be words or numerals, not a mix.
I’d like to second this comment, at least broadly. I’ve seen the e notation in blog posts and the like and I’ve struggled to put the × 10 in the right place.
One of the reasons I dislike trying to understand numbers written in scientific notation is that I have trouble mapping them to normal numbers with lots of commas in them. Engineering notation helps a lot with this (at least for numbers greater than 1) by forcing the exponent to be a multiple of 3, so the mantissa lines up with the named base units: thousand, million, billion. Oftentimes, losing significant figures isn’t an issue in anything but the most technical scientific writing.
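For concreteness, a small sketch of the conversion (the helper `toEngineering` is a made-up name, and the rounding is deliberately crude):

```typescript
// Sketch: format a number in engineering notation, i.e. with the
// exponent forced to a multiple of 3 so the mantissa lines up with
// "thousand"/"million"/"billion". `toEngineering` is a hypothetical name.

function toEngineering(x: number): string {
  if (x === 0) return "0";
  const exp3 = Math.floor(Math.log10(Math.abs(x)) / 3) * 3;
  // toPrecision() papers over floating-point noise like 499.999...94
  const mantissa = Number((x / 10 ** exp3).toPrecision(12));
  return `${mantissa}e${exp3}`;
}

console.log(toEngineering(123_000_000)); // "123e6"  -- reads as "123 million"
console.log(toEngineering(0.0005));      // "500e-6" -- reads as "500 micro-units"
```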