FWIW, I tried replacing gzip with better compressors. These files are so small that bzip and LZMA do worse, but paq was consistently 10% better than gzip. In particular, it didn’t change the order. (I’m surprised that LZMA ever does worse than LZ. Maybe xz, the implementation I used, has big headers?)
Maybe. Does performance improve when you disable the default CRC-64 integrity check with --check=none?
I tried and that’s only 8 bytes, a single checksum. I’m seeing 50-100 more bytes for xz than gz on all of them, from the smallest hex-sm.py to the largest checkers.py.
Then I read the man page. It mentions “the filter chain...which normally would be stored in the container headers,” which could be it, but doesn’t sound like a lot of bytes. Also, I discovered another option: --format=lzma gets checkers almost all the way down to gz (xz, lzma, gz = 1864, 1826, 1821), but gets hex-sm only halfway (xz, lzma, gz = 428, 391, 356). (Here xz means xz with --check=none.)
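For anyone who wants to poke at the container overhead themselves, here is a rough sketch using Python's standard gzip, bz2 and lzma modules rather than the command-line tools. The function name compressed_sizes and the dictionary keys are my own invention, and the byte counts will not match the numbers above exactly: as far as I know, the gzip CLI stores the original file name by default while gzip.compress does not, and the presets may differ. The "raw-lzma2" entry is a bare LZMA2 stream with no container, so the gap between it and the xz entries should approximate what the .xz headers, index and footer cost.

```python
import bz2
import gzip
import lzma


def compressed_sizes(data: bytes) -> dict:
    """Return the compressed size in bytes of `data` under several formats."""
    # A raw stream records nothing about how it was made, so the filter
    # chain has to be passed explicitly here.
    raw_filters = [{"id": lzma.FILTER_LZMA2, "preset": lzma.PRESET_DEFAULT}]
    return {
        "gz": len(gzip.compress(data)),
        "bz2": len(bz2.compress(data)),
        # .xz container with the default CRC-64 integrity check
        "xz": len(lzma.compress(data, format=lzma.FORMAT_XZ,
                                check=lzma.CHECK_CRC64)),
        # .xz container without a check (like --check=none)
        "xz-nocheck": len(lzma.compress(data, format=lzma.FORMAT_XZ,
                                        check=lzma.CHECK_NONE)),
        # legacy .lzma container (like --format=lzma)
        "lzma": len(lzma.compress(data, format=lzma.FORMAT_ALONE)),
        # bare LZMA2 stream, no container at all
        "raw-lzma2": len(lzma.compress(data, format=lzma.FORMAT_RAW,
                                       filters=raw_filters)),
    }
```

Calling it on the bytes of a file, e.g. compressed_sizes(open("hex-sm.py", "rb").read()), gives one size per format to compare side by side.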