Java – how to index and search numbers in Lucene 4.1
In my 3.6 code, I add numeric fields to my index as follows:
    public void addNumericField(IndexField field, Integer value) {
        addField(field, NumericUtils.intToPrefixCoded(value));
    }

But in 4.1 this method needs a BytesRef argument, and it is unclear what you are meant to do with the value afterwards, so I changed it to the following (work in progress):

    public void addNumericField(IndexField field, Integer value) {
        FieldType ft = new FieldType();
        ft.setStored(true);
        ft.setIndexed(true);
        ft.setNumericType(FieldType.NumericType.INT);
        doc.add(new IntField(field.getName(), value, ft));
    }

This looks cleaner.
In 3.6 I had also overridden the QueryParser to make it work for numeric range searches:
    package org.musicbrainz.search.servlet;

    import org.apache.lucene.index.Term;
    import org.apache.lucene.queryparser.classic.MultiFieldQueryParser;
    import org.apache.lucene.search.Query;
    import org.apache.lucene.search.TermQuery;
    import org.apache.lucene.search.TermRangeQuery;
    import org.apache.lucene.util.NumericUtils;
    import org.musicbrainz.search.LuceneVersion;
    import org.musicbrainz.search.index.LabelIndexField;
    import org.musicbrainz.search.servlet.mmd1.LabelType;

    public class LabelQueryParser extends MultiFieldQueryParser {

        public LabelQueryParser(java.lang.String[] strings, org.apache.lucene.analysis.Analyzer analyzer) {
            super(LuceneVersion.LUCENE_VERSION, strings, analyzer);
        }

        protected Query newTermQuery(Term term) {
            // Compare field names with equals(), not ==, so the check does not rely on string interning
            if (term.field().equals(LabelIndexField.CODE.getName())) {
                try {
                    int number = Integer.parseInt(term.text());
                    TermQuery tq = new TermQuery(new Term(term.field(), NumericUtils.intToPrefixCoded(number)));
                    return tq;
                } catch (NumberFormatException nfe) {
                    // If not provided numeric argument just leave as is,
                    // won't give matches
                    return super.newTermQuery(term);
                }
            } else {
                return super.newTermQuery(term);
            }
        }

        /**
         * Convert Numeric Fields
         *
         * @param field
         * @param part1
         * @param part2
         * @param inclusive
         * @return
         */
        @Override
        public Query newRangeQuery(String field, String part1, String part2, boolean inclusive) {
            if (field.equals(LabelIndexField.CODE.getName())) {
                part1 = NumericUtils.intToPrefixCoded(Integer.parseInt(part1));
                part2 = NumericUtils.intToPrefixCoded(Integer.parseInt(part2));
            }
            TermRangeQuery query = (TermRangeQuery) super.newRangeQuery(field, part1, part2, inclusive);
            return query;
        }
    }
So I took this out, figuring I probably did not need it anymore, but unfortunately no queries on this IntField work now.
On further reading, it seems that IntField is only meant for range queries, so I don't know how you are supposed to do exact-match queries, or whether NumericRangeQuery is compatible with the classic query parser I'm using.
So I then tried to add my value as an encoded string instead:
    public void addNumericField(IndexField field, Integer value) {
        FieldType fieldType = new FieldType();
        fieldType.setStored(true);
        fieldType.setIndexed(true);
        BytesRef bytes = new BytesRef(NumericUtils.BUF_SIZE_INT);
        NumericUtils.intToPrefixCoded(value, bytes);
        doc.add(new Field(field.getName(), bytes, fieldType));
    }
But now I get an error at runtime:
java.lang.IllegalArgumentException: Fields with BytesRef values cannot be indexed
But I need the field to be indexed, so how do I index a numeric field, as I did in 3.6, so that I can search on it?
Solution
Just use the appropriate field type, for example IntField, LongField, etc.
See, for example, http://lucene.apache.org/core/4_1_0/core/org/apache/lucene/document/IntField.html
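As a minimal sketch of what that might look like for the indexing side, assuming Lucene 4.1 on the classpath: the class name NumericFieldExample and the method buildDocument are hypothetical and only serve the illustration. IntField with Field.Store.YES gives a field that is both stored and indexed for numeric queries, so no manual BytesRef encoding should be needed.

```java
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.IntField;

// Hypothetical helper class, not part of Lucene or of the original code.
class NumericFieldExample {

    // Builds a document holding one int field that is indexed
    // (as a numeric/trie field) and stored.
    static Document buildDocument(String fieldName, int value) {
        Document doc = new Document();
        doc.add(new IntField(fieldName, value, Field.Store.YES));
        return doc;
    }
}
```

The resulting document can then be passed to IndexWriter.addDocument as usual.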
For information about querying these fields, see the question "Lucene LongField exact search with Query".
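For the classic query parser specifically, one possible approach is to override the same two hooks as in the 3.6 code and build NumericRangeQuery instances instead of prefix-coded terms. The following is only a sketch under the assumption of Lucene 4.1; the class name NumericAwareQueryParser and its numericField parameter are hypothetical. An exact match is expressed as a range whose lower and upper bounds are the same value.

```java
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.index.Term;
import org.apache.lucene.queryparser.classic.QueryParser;
import org.apache.lucene.search.NumericRangeQuery;
import org.apache.lucene.search.Query;
import org.apache.lucene.util.Version;

// Hypothetical parser that treats one named field as an int field.
class NumericAwareQueryParser extends QueryParser {

    private final String numericField;

    NumericAwareQueryParser(Version matchVersion, String defaultField,
                            String numericField, Analyzer analyzer) {
        super(matchVersion, defaultField, analyzer);
        this.numericField = numericField;
    }

    @Override
    protected Query newTermQuery(Term term) {
        if (numericField.equals(term.field())) {
            try {
                int n = Integer.parseInt(term.text());
                // Exact match: inclusive range with identical bounds.
                return NumericRangeQuery.newIntRange(term.field(), n, n, true, true);
            } catch (NumberFormatException nfe) {
                // Not a number: fall through to a plain term query,
                // which will not match an IntField.
            }
        }
        return super.newTermQuery(term);
    }

    @Override
    protected Query newRangeQuery(String field, String part1, String part2,
                                  boolean startInclusive, boolean endInclusive) {
        if (numericField.equals(field)) {
            // A null bound means the range is open on that side.
            // Non-numeric bounds throw NumberFormatException here.
            Integer min = (part1 == null) ? null : Integer.valueOf(part1);
            Integer max = (part2 == null) ? null : Integer.valueOf(part2);
            return NumericRangeQuery.newIntRange(field, min, max,
                                                 startInclusive, endInclusive);
        }
        return super.newRangeQuery(field, part1, part2, startInclusive, endInclusive);
    }
}
```

With such a parser, query strings like code:42 or code:[1 TO 100] should be turned into numeric range queries for a field named "code", while all other fields keep the default behaviour. The same idea should carry over to a MultiFieldQueryParser subclass, since it inherits these hooks.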