Java – how to index and search numbers in Lucene 4.1
In my 3.6 code, I add numeric fields to my index as follows:
    public void addNumericField(IndexField field, Integer value) {
        addField(field, NumericUtils.intToPrefixCoded(value));
    }
But in 4.1, NumericUtils.intToPrefixCoded() has to be passed a BytesRef parameter, and it is not clear what you are meant to do with the resulting value, so I changed it to this (work in progress):
    public void addNumericField(IndexField field, Integer value) {
        FieldType ft = new FieldType();
        ft.setStored(true);
        ft.setIndexed(true);
        ft.setNumericType(FieldType.NumericType.INT);
        doc.add(new IntField(field.getName(), value, ft));
    }
This looks cleaner.
In 3.6, I had also overridden the query parser so that it worked for numeric term and range searches:
    package org.musicbrainz.search.servlet;

    import org.apache.lucene.index.Term;
    import org.apache.lucene.queryparser.classic.MultiFieldQueryParser;
    import org.apache.lucene.search.Query;
    import org.apache.lucene.search.TermQuery;
    import org.apache.lucene.util.NumericUtils;
    import org.musicbrainz.search.LuceneVersion;
    import org.musicbrainz.search.index.LabelIndexField;

    public class LabelQueryParser extends MultiFieldQueryParser {

        public LabelQueryParser(String[] strings, org.apache.lucene.analysis.Analyzer analyzer) {
            super(LuceneVersion.LUCENE_VERSION, strings, analyzer);
        }

        @Override
        protected Query newTermQuery(Term term) {
            // Compare field names with equals(); == only compares references.
            if (term.field().equals(LabelIndexField.CODE.getName())) {
                try {
                    int number = Integer.parseInt(term.text());
                    return new TermQuery(new Term(term.field(), NumericUtils.intToPrefixCoded(number)));
                } catch (NumberFormatException nfe) {
                    // Not a numeric argument: leave the term as is (it will not match anything).
                    return super.newTermQuery(term);
                }
            }
            return super.newTermQuery(term);
        }

        /**
         * Converts the endpoints of range queries on numeric fields to prefix-coded form.
         *
         * @param field     name of the field being queried
         * @param part1     lower end of the range
         * @param part2     upper end of the range
         * @param inclusive whether the endpoints are part of the range
         * @return the range query, with encoded endpoints for numeric fields
         */
        @Override
        public Query newRangeQuery(String field, String part1, String part2, boolean inclusive) {
            if (field.equals(LabelIndexField.CODE.getName())) {
                part1 = NumericUtils.intToPrefixCoded(Integer.parseInt(part1));
                part2 = NumericUtils.intToPrefixCoded(Integer.parseInt(part2));
            }
            return super.newRangeQuery(field, part1, part2, inclusive);
        }
    }
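The range override above encodes both endpoints because a term range query compares terms as strings, and plain decimal strings do not sort in numeric order. The following is only a minimal, stdlib-only sketch of that idea. The class SortableIntDemo and its encode/decode helpers are hypothetical, and the encoding used (sign bit flipped, fixed-width hex) is a simplified stand-in, not Lucene's actual prefix-coded format.

```java
public class SortableIntDemo {

    // Hypothetical helper: flip the sign bit so that negative values order
    // before positive ones, then write the result as 8 zero-padded hex digits.
    public static String encode(int value) {
        return String.format("%08x", value ^ 0x80000000);
    }

    // Inverse of encode(): parse the hex digits and flip the sign bit back.
    public static int decode(String encoded) {
        return (int) Long.parseLong(encoded, 16) ^ 0x80000000;
    }

    public static void main(String[] args) {
        // As plain decimal strings, "10" sorts before "9".
        System.out.println("10".compareTo("9") < 0);
        // With the order-preserving encoding, 9 sorts before 10 ...
        System.out.println(encode(9).compareTo(encode(10)) < 0);
        // ... and negative numbers sort before positive ones.
        System.out.println(encode(-5).compareTo(encode(3)) < 0);
        // The encoding is reversible.
        System.out.println(decode(encode(-5)));
    }
}
```

All three comparisons print true, and the last line prints -5. Any encoding whose string order matches numeric order lets an ordinary term range work on numbers, which is the job the prefix-coded terms did in the 3.6 code.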
So I assumed I would no longer need this custom parser, but unfortunately queries on this IntField now match nothing.
On further reading, it seems that IntField is only meant for range queries, so I don't know how to do an exact-match query, or whether NumericRangeQuery is compatible with the classic query parser I am using.
So I then tried adding my value as an encoded string:
    public void addNumericField(IndexField field, Integer value) {
        FieldType fieldType = new FieldType();
        fieldType.setStored(true);
        fieldType.setIndexed(true);
        BytesRef bytes = new BytesRef(NumericUtils.BUF_SIZE_INT);
        NumericUtils.intToPrefixCoded(value, bytes);
        doc.add(new Field(field.getName(), bytes, fieldType));
    }
But at runtime I now get an error:
java.lang.IllegalArgumentException: Fields with BytesRef values cannot be indexed
But I need the field to be indexed, so how do I index numeric fields as I did in 3.6, so that I can search them?
Solution
Just use the appropriate field type, for example IntField, LongField, etc.
See, e.g., http://lucene.apache.org/core/4_1_0/core/org/apache/lucene/document/IntField.html
For information about querying these fields, see "Lucene LongField exact search with Query".
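As a rough sketch of how exact-match and range queries on an IntField might be wired into the classic query parser, the class below builds NumericRangeQuery instances instead of term queries. It assumes Lucene 4.1 with the lucene-queryparser module on the classpath. The class name NumericAwareQueryParser, its numericField constructor argument and the exactInt helper are illustrative names, not part of Lucene or of the original code.

```java
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.index.Term;
import org.apache.lucene.queryparser.classic.MultiFieldQueryParser;
import org.apache.lucene.search.NumericRangeQuery;
import org.apache.lucene.search.Query;
import org.apache.lucene.util.Version;

public class NumericAwareQueryParser extends MultiFieldQueryParser {

    // Name of the one field that was indexed as an IntField (an assumption of this sketch).
    private final String numericField;

    public NumericAwareQueryParser(String[] fields, Analyzer analyzer, String numericField) {
        super(Version.LUCENE_41, fields, analyzer);
        this.numericField = numericField;
    }

    // An exact match is expressed as a range whose two inclusive endpoints are equal.
    // Uses the default precision step, which is also what IntField uses by default.
    public static NumericRangeQuery<Integer> exactInt(String field, int value) {
        return NumericRangeQuery.newIntRange(field, value, value, true, true);
    }

    @Override
    protected Query newTermQuery(Term term) {
        if (numericField.equals(term.field())) {
            try {
                return exactInt(term.field(), Integer.parseInt(term.text()));
            } catch (NumberFormatException nfe) {
                // Not a number: fall through to an ordinary term query.
            }
        }
        return super.newTermQuery(term);
    }

    @Override
    protected Query newRangeQuery(String field, String part1, String part2,
                                  boolean startInclusive, boolean endInclusive) {
        if (numericField.equals(field)) {
            // A null endpoint means an open-ended range.
            Integer min = (part1 == null) ? null : Integer.valueOf(part1);
            Integer max = (part2 == null) ? null : Integer.valueOf(part2);
            return NumericRangeQuery.newIntRange(field, min, max, startInclusive, endInclusive);
        }
        return super.newRangeQuery(field, part1, part2, startInclusive, endInclusive);
    }
}
```

With a parser like this, a query string such as code:123 would become an exact numeric match, and code:[100 TO 200] a numeric range, provided the field was added to the index as an IntField with the default precision step.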
