Java – A C/C++ alternative to Apache Tika

I am looking for a C/C++ alternative to the Java-based Apache Tika framework. Specifically, I need file metadata extraction and structured text extraction under a single framework. After some searching and browsing online, the closest I have found is GNU libextractor plus a collection of separate file filters that parse documents to extract text data (pdftotext, xls2csv, etc.).

Can anyone recommend a good library comparable to Apache's Tika?

Thank you.

Solution

Tika has a server mode, so you could start Tika as a server and send it HTTP requests from your C code.

Alternatively, Tika has a CLI mode, so you can start a new Tika process for each file and read its output from a pipe.
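A minimal sketch of the pipe approach in C, assuming a POSIX system (for `popen`), a `java` binary on the PATH, and a Tika application jar you have downloaded yourself. The helper names `build_tika_command` and `run_and_capture` are made up for illustration and are not part of Tika. The `--text` and `--metadata` flags are the Tika CLI options for plain text and metadata output.

```c
#define _POSIX_C_SOURCE 200809L  /* for popen/pclose */
#include <stdio.h>

/* Hypothetical helper: build the Tika CLI command line.
 * jar_path is wherever you saved the Tika app jar (an assumption),
 * mode_flag is e.g. "--text" or "--metadata".
 * Paths are wrapped in single quotes, so a path that itself contains
 * a single quote is NOT handled by this sketch.
 * Returns 0 on success, -1 if the command does not fit in out. */
static int build_tika_command(char *out, size_t cap, const char *jar_path,
                              const char *mode_flag, const char *file_path)
{
    int n = snprintf(out, cap, "java -jar '%s' %s '%s'",
                     jar_path, mode_flag, file_path);
    return (n < 0 || (size_t)n >= cap) ? -1 : 0;
}

/* Hypothetical helper: run a shell command and copy its standard output
 * into buf, NUL-terminated and truncated to cap - 1 bytes.
 * Returns the status reported by pclose, or -1 on failure. */
static int run_and_capture(const char *cmd, char *buf, size_t cap)
{
    if (cap == 0)
        return -1;

    FILE *pipe = popen(cmd, "r");
    if (pipe == NULL)
        return -1;

    size_t used = 0;
    size_t got;
    while (used < cap - 1 &&
           (got = fread(buf + used, 1, cap - 1 - used, pipe)) > 0)
        used += got;
    buf[used] = '\0';

    /* Drain anything that did not fit so the child can exit cleanly. */
    char scratch[4096];
    while (fread(scratch, 1, sizeof scratch, pipe) > 0)
        ;

    return pclose(pipe);
}
```

To use it, build a command with something like `build_tika_command(cmd, sizeof cmd, "tika-app.jar", "--text", "report.pdf")` and pass `cmd` to `run_and_capture`. Keep in mind that every call starts a new JVM, so for many files the server mode is likely to be faster.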
