Why can’t I do some variant of ls piped through cut to get just the file sizes of all the files in a directory?
Nerd sniped. After some fiddling, the problem with ls | cut is that cut in delimiter mode treats multiple spaces in a row as multiple delimiters. You could put cut in bytes or character mode instead, but then you have the problem that ls uses “as much as necessary” spacing, which means that if the largest file in your directory needs one more digit to represent then ls will push everything to the right one more digit.

If you want to handle ls output then awk would be easier, because it collapses multiple successive delimiters [1], but normally I’d just use du [2]. Though I have a vague memory that du and ls -l define file size differently.
(This doesn’t counter your point at all—unix tools are kind of a mess—but I was curious.)
[1] ls -l | awk '{print $5}'
[2] du -hs *
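For anyone who wants to see the delimiter behavior without depending on their own directory listing, here is a small sketch. The two sample lines are made up to look like ls -l output (the file names, owner, and dates are hypothetical), and the variable names are only for illustration.

```shell
# Two hypothetical lines shaped like `ls -l` output. The size column is
# right-aligned, so the smaller size is padded with extra spaces.
sample='-rw-r--r-- 1 user group     42 Jan  1 12:00 small.txt
-rw-r--r-- 1 user group 123456 Jan  1 12:00 large.bin'

# cut treats every single space as a delimiter, so a run of spaces yields
# empty fields and field 5 is empty on the padded line.
cut_sizes=$(printf '%s\n' "$sample" | cut -d' ' -f5)

# awk's default field splitting collapses runs of blanks, so $5 is the
# size on both lines.
awk_sizes=$(printf '%s\n' "$sample" | awk '{print $5}')

# Squeezing repeated spaces with tr first makes cut usable as well.
squeezed_sizes=$(printf '%s\n' "$sample" | tr -s ' ' | cut -d' ' -f5)

printf 'cut:      [%s]\n' "$cut_sizes"
printf 'awk:      [%s]\n' "$awk_sizes"
printf 'tr + cut: [%s]\n' "$squeezed_sizes"
```

The tr -s variant is just another way to get the same collapsing that awk does by default; it is still parsing ls output, so odd file names or locale-dependent date formats can shift the columns.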
Your vague memory is probably that ls -l gives file size, while du gives “disk usage”, that is, the number of blocks used. On my computer the blocksize is 4k, so du only reports multiples of this size. (In particular, the default behavior is to report in units of the historical 512-byte block, so it only reports multiples of 8.)
A huge difference that I doubt you forgot is how they define the size of directories: just the directory's own metadata vs. everything underneath it, recursively. But recursing means du has to walk the whole tree, so it is expensive. I use it all the time, but not everywhere.
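A rough way to see both differences side by side, assuming mktemp -d is available (it is not in POSIX, but GNU and BSD systems have it). The scratch directory and file name are arbitrary. The du numbers depend on the filesystem, so I am not going to claim what they will be on your machine.

```shell
# Scratch directory holding a single 1-byte file.
tmpdir=$(mktemp -d)
printf 'x' > "$tmpdir/one_byte"

# ls -l reports the file's length in bytes (column 5); -n keeps the
# owner and group numeric so the columns stay predictable.
ls_bytes=$(ls -ln "$tmpdir/one_byte" | awk '{print $5}')

# du reports allocated space; -k asks for 1024-byte units rather than
# the default block unit.
du_kib=$(du -k "$tmpdir/one_byte" | awk '{print $1}')

# For a directory, ls -ld shows only the size of the directory itself,
# while du -s walks everything underneath and sums it.
dir_ls_bytes=$(ls -ldn "$tmpdir" | awk '{print $5}')
dir_du_kib=$(du -sk "$tmpdir" | awk '{print $1}')

printf 'file: ls=%s bytes, du=%s KiB\n' "$ls_bytes" "$du_kib"
printf 'dir:  ls=%s bytes, du=%s KiB\n' "$dir_ls_bytes" "$dir_du_kib"

rm -rf "$tmpdir"
```

On a filesystem with 4k blocks I would expect the file's du figure to be a multiple of 4 KiB even though ls says 1 byte, but filesystems that store tiny files inline or delay allocation can show something else.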