Java – find the size of the original file inside a gzip file
Is there any way to find out the size of the original file in the gzip file in Java?
As in, I have a 15 MB file a.txt, which has been compressed by gzip to a 3 MB a.gz. I want to know the size of the a.txt inside a.gz without decompressing a.gz.
Solution
There is no truly reliable way other than decompressing the stream. You do not need to save the decompressed result, so you can determine the size by simply reading and decoding the entire file, without using any space for the decompressed data.
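As a minimal sketch of this approach, the Java below decodes the whole file with the standard java.util.zip.GZIPInputStream and only counts the bytes it produces. The class and method names (GzipUncompressedSize, uncompressedSize) are made up for illustration, and the buffer size is an arbitrary choice.

```java
import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.zip.GZIPInputStream;

// Hypothetical helper class, named for this example only.
class GzipUncompressedSize {

    // Decodes the entire gzip file and counts the decompressed bytes.
    // The decompressed data is discarded, so only the buffer is held in memory.
    static long uncompressedSize(Path gzipFile) throws IOException {
        byte[] buffer = new byte[64 * 1024];
        long total = 0;
        try (InputStream in = new GZIPInputStream(
                new BufferedInputStream(Files.newInputStream(gzipFile)))) {
            int n;
            while ((n = in.read(buffer)) != -1) {
                total += n;
            }
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        // Usage: java GzipUncompressedSize a.gz
        System.out.println(uncompressedSize(Paths.get(args[0])));
    }
}
```

This costs a full pass over the compressed file, but it uses a long counter, so it is not limited to 32 bits.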
There is an unreliable way to determine the uncompressed size, which is to look at the last four bytes of the gzip file. They hold the uncompressed length modulo 2^32, stored in little-endian order.
This is unreliable because a) the uncompressed data may be longer than 2^32 bytes, and b) the gzip file may contain multiple gzip streams, in which case you will find the length of only the last stream.
If you control the source of the gzip files, you know that each one consists of a single gzip stream, and you know that the uncompressed data is less than 2^32 bytes, then and only then can you use the last four bytes with confidence.
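For the quick but unreliable route, a rough sketch in Java could read the last four bytes (the ISIZE field in the gzip format) and assemble them as a little-endian unsigned value. The names GzipTrailerSize and trailerSize are invented for this example, and the result is only meaningful under the conditions above: a single gzip stream holding less than 2^32 bytes of uncompressed data.

```java
import java.io.IOException;
import java.io.RandomAccessFile;

// Hypothetical helper class, named for this example only.
class GzipTrailerSize {

    // Returns the value stored in the last four bytes of the file:
    // the uncompressed length modulo 2^32 of the last gzip stream.
    static long trailerSize(String gzipPath) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(gzipPath, "r")) {
            long length = raf.length();
            // A gzip stream has at least a 10-byte header and an 8-byte trailer.
            if (length < 18) {
                throw new IOException("File is too short to be a gzip file");
            }
            raf.seek(length - 4);
            byte[] b = new byte[4];
            raf.readFully(b);
            // Little-endian: the first byte is the least significant.
            // Masking with 0xFFL keeps the result unsigned in a long.
            return (b[0] & 0xFFL)
                    | (b[1] & 0xFFL) << 8
                    | (b[2] & 0xFFL) << 16
                    | (b[3] & 0xFFL) << 24;
        }
    }

    public static void main(String[] args) throws IOException {
        // Usage: java GzipTrailerSize a.gz
        System.out.println(trailerSize(args[0]));
    }
}
```

This reads only four bytes regardless of the file size, which is why it is fast. It does not check that the file is actually a gzip file.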
pigz (available at http://zlib.net/pigz/) can do it both ways. pigz -l will give you the unreliable length very quickly. pigz -lt decodes the entire input and gives you the reliable length.