Java – how to parse / unpack / decompress Maven repository indexes generated by Nexus
I've downloaded the index generated for Maven Central from http://mirrors.ibiblio.org/pub/mirrors/maven2/dot-index/nexus-maven-repository-index.gz
I want to list artifact information from these index files (for example, groupId, artifactId, version). I've seen that there is a high-level API for this. It seems that I must use the Maven dependency below, but I don't know which entry point to use (which class?) or how to use it to access these files:
<dependency>
    <groupId>org.sonatype.nexus</groupId>
    <artifactId>nexus-indexer</artifactId>
    <version>3.0.4</version>
</dependency>
Solution
Have a look at the https://github.com/cstamas/maven-indexer-examples project.
In short, you don't need to download the GZ / ZIP (new / old format) manually; the indexer does it for you (and it also handles incremental updates, if possible).
GZ is the "new" format. It contains only data and is independent of the Lucene index format (and therefore of the Lucene version). ZIP is the "old" format, which is actually a plain Lucene 2.4.x index zipped up. At present there is no change in data content, but one is planned for the future.
As I said, there is no difference in data content between the two, but some fields (as you noticed) are indexed but not stored in the index. So if you consume the ZIP format, those fields will be searchable but not retrievable.
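To illustrate why the GZ format can be read without Lucene, here is a minimal sketch that peeks at the start of the decompressed stream using only the JDK. The layout it assumes (one version byte, then an 8-byte timestamp, then a 4-byte document count, all big-endian) is an assumption that is not stated in the answer above, so verify it against the indexer source for your version. The class name IndexHeaderPeek is hypothetical. For listing artifacts in practice, the indexer API shown in the examples project remains the intended route.

```java
import java.io.BufferedInputStream;
import java.io.DataInputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Date;
import java.util.zip.GZIPInputStream;

// Hypothetical helper, not part of nexus-indexer.
final class IndexHeaderPeek {

    /** Values read from the start of the decompressed stream. */
    static final class Header {
        final int version;
        final long timestamp;
        final int documentCount;

        Header(int version, long timestamp, int documentCount) {
            this.version = version;
            this.timestamp = timestamp;
            this.documentCount = documentCount;
        }
    }

    /**
     * Reads the assumed header: 1 version byte, 8-byte timestamp (millis),
     * 4-byte document count. This layout is an assumption; check it against
     * the indexer source before relying on it.
     */
    static Header readHeader(InputStream gzipped) throws IOException {
        DataInputStream in = new DataInputStream(
                new GZIPInputStream(new BufferedInputStream(gzipped)));
        int version = in.readUnsignedByte();
        long timestamp = in.readLong();
        int documentCount = in.readInt();
        return new Header(version, timestamp, documentCount);
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java IndexHeaderPeek.java <index.gz>");
            return;
        }
        try (InputStream file = new FileInputStream(args[0])) {
            Header h = readHeader(file);
            System.out.println("version:   " + h.version);
            System.out.println("timestamp: " + new Date(h.timestamp));
            System.out.println("documents: " + h.documentCount);
        }
    }
}
```

Run it against the downloaded nexus-maven-repository-index.gz. If the version byte or the counts look implausible, the assumed layout does not match your file and the sketch should not be trusted further.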