Java – Lucene highlighter
I'm using Lucene 4.3. How does the highlighter work? I want to print the search results from each document as the matched search term plus the 8 words that follow it. How do I use the Highlighter class to do this? I have put complete TXT, HTML and XML documents in one folder and added them to my index. This is my current search method, to which I would like to add highlighting:
String index = "index";
String field = "contents";
String queries = null;
int repeat = 1;
boolean raw = true; //not sure what raw really does???
String queryString = null; //keep null,prompt user later for it
int hitsPerPage = 10; //leave it at 10,go from there later
//need to add all files to same directory
index = "C:\\Users\\plib\\Documents\\index";
repeat = 4;
IndexReader reader = DirectoryReader.open(FSDirectory.open(new File(index)));
IndexSearcher searcher = new IndexSearcher(reader);
Analyzer analyzer = new StandardAnalyzer(Version.LUCENE_43);
BufferedReader in = null;
if (queries != null) {
  in = new BufferedReader(new InputStreamReader(new FileInputStream(queries),"UTF-8"));
} else {
  in = new BufferedReader(new InputStreamReader(System.in,"UTF-8"));
}
QueryParser parser = new QueryParser(Version.LUCENE_43,field,analyzer);
while (true) {
  if (queries == null && queryString == null) {                        // prompt the user
    System.out.println("Enter query. 'quit' = quit: ");
  }
  String line = queryString != null ? queryString : in.readLine();
  if (line == null) {                         // end of input; length() can never be -1
    break;
  }
  line = line.trim();
  if (line.length() == 0 || line.equalsIgnoreCase("quit")) {
    break;
  }
  Query query = parser.parse(line);
  System.out.println("Searching for: " + query.toString(field));
  if (repeat > 0) {                           // repeat & time as benchmark
    Date start = new Date();
    for (int i = 0; i < repeat; i++) {
      searcher.search(query,null,100);
    }
    Date end = new Date();
    System.out.println("Time: "+(end.getTime()-start.getTime())+"ms");
  }
  doPagingSearch(in,searcher,query,hitsPerPage,raw,queries == null && queryString == null);
  if (queryString != null) {
    break;
  }
}
reader.close();
}
Solution
I had the same problem and finally came across this article
http://vnarcher.blogspot.ca/2012/04/highlighting-text-with-lucene.html
The key part is that, as you iterate over the results, you call getHighlightedField on each field value you want to highlight:
private String getHighlightedField(Query query,Analyzer analyzer,String fieldName,String fieldValue) throws IOException,InvalidTokenOffsetsException {
    // Wrap every matched term in <span class="MatchedText">...</span>
    Formatter formatter = new SimpleHTMLFormatter("<span class=\"MatchedText\">","</span>");
    QueryScorer queryScorer = new QueryScorer(query);
    Highlighter highlighter = new Highlighter(formatter,queryScorer);
    // Integer.MAX_VALUE keeps the whole field value as a single fragment
    highlighter.setTextFragmenter(new SimpleSpanFragmenter(queryScorer,Integer.MAX_VALUE));
    highlighter.setMaxDocCharsToAnalyze(Integer.MAX_VALUE);
    // Returns null when no query term occurs in fieldValue
    return highlighter.getBestFragment(analyzer,fieldName,fieldValue);
}
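This assumes the output will be HTML: each highlighted term is simply wrapped in a <span> with the CSS class MatchedText. You can then define custom CSS rules to style the highlight however you like. Note that because the fragment size is Integer.MAX_VALUE, the whole field value comes back as one fragment; pass a smaller size to the fragmenter if you want a shorter excerpt.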
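The question also asks for exactly the matched word plus the 8 words after it. Lucene's fragmenters work with a fragment size rather than a word count, so as a rough illustration of that output, here is a minimal sketch in plain Java that does not use the Lucene Highlighter at all. The class name KeywordWindow and the method snippets are made up for this example, it assumes you already have the field text as a String, and it splits on whitespace only instead of using your analyzer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper, not part of Lucene: plain string processing only.
public class KeywordWindow {

    /**
     * Finds every whitespace-separated word equal to term (ignoring case and
     * surrounding punctuation) and returns it wrapped in a span, followed by
     * up to wordsAfter of the words that come next.
     */
    public static List<String> snippets(String text, String term, int wordsAfter) {
        List<String> result = new ArrayList<>();
        String trimmed = text.trim();
        if (trimmed.isEmpty()) {
            return result;
        }
        String[] words = trimmed.split("\\s+");
        String needle = term.toLowerCase(Locale.ROOT);
        for (int i = 0; i < words.length; i++) {
            // Strip leading/trailing punctuation before comparing
            String bare = words[i].replaceAll("^\\W+|\\W+$", "").toLowerCase(Locale.ROOT);
            if (!bare.equals(needle)) {
                continue;
            }
            StringBuilder sb = new StringBuilder();
            // Note: the text is not HTML-escaped here; escape it before real use
            sb.append("<span class=\"MatchedText\">").append(words[i]).append("</span>");
            int end = Math.min(words.length, i + 1 + wordsAfter);
            for (int j = i + 1; j < end; j++) {
                sb.append(' ').append(words[j]);
            }
            result.add(sb.toString());
        }
        return result;
    }

    public static void main(String[] args) {
        String text = "Lucene is a search library. The Lucene highlighter marks "
                + "matched terms in the stored text of a field.";
        for (String s : snippets(text, "highlighter", 8)) {
            System.out.println(s);
        }
    }
}
```

For real search results you would still want the Lucene highlighter, since it understands the parsed query (phrases, wildcards, analyzed terms); this sketch only handles a single literal word.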
