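PhraseQuery uses position information to match terms that appear close together. A term-based search for "我們" ("we") and "祖國" ("motherland"), for example, returns every document that contains both words, however far apart they are. Sometimes, though, we only want documents in which the two words are separated by just one or two characters rather than by an arbitrary distance, and that is what PhraseQuery is for. Consider the following case: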
doc.add(Field.Text("field", "the quick brown fox jumped over the lazy dog"));
Then:
String[] phrase = new String[] {"quick", "fox"};
assertFalse("exact phrase not found", matched(phrase, 0));
assertTrue("close enough", matched(phrase, 1));
With multiple terms:
assertFalse("not close enough", matched(new String[] {"quick", "jumped", "lazy"}, 3));
assertTrue("just enough", matched(new String[] {"quick", "jumped", "lazy"}, 4));
assertFalse("almost but not quite", matched(new String[] {"lazy", "jumped", "quick"}, 7));
assertTrue("bingo", matched(new String[] {"lazy", "jumped", "quick"}, 8));
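The number is the slop, which is set as shown below. For terms listed in document order, it is the number of other terms allowed between the first term and the second. In general, it is the maximum number of position moves allowed to bring the query terms into the order of the phrase.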
query.setSlop(slop);
Order matters:
String[] phrase = new String[] {"fox", "quick"};
assertFalse("hop flop", matched(phrase, 2));
assertTrue("hop hop slop", matched(phrase, 3));
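The figure below shows the principle:
For the query terms quick and fox, fox only has to move one position to match "quick brown fox". For the terms fox and quick, fox has to move three positions. The farther the terms have to move, the lower the document's score, and the less likely it is to rank among the top results.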
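The slop values in the assertions above can be reproduced with a small model in plain Java. This is only a sketch under simplifying assumptions, not Lucene's actual scorer: the text is split on single spaces, and `SlopSketch`, `requiredSlop`, and `matched` are hypothetical names introduced for illustration. For each phrase term it subtracts the term's index in the phrase from its position in the text, and takes the spread (maximum minus minimum) of those values as the slop the phrase needs.

```java
import java.util.ArrayList;
import java.util.List;

class SlopSketch {
    // Smallest slop at which the phrase matches under this simplified model,
    // or -1 if the phrase is empty or one of its terms is missing from the text.
    static int requiredSlop(String text, String[] phrase) {
        if (phrase.length == 0) return -1;
        String[] tokens = text.split(" ");
        List<List<Integer>> shifted = new ArrayList<>();
        for (int i = 0; i < phrase.length; i++) {
            // For every occurrence of term i, record (text position - phrase index).
            List<Integer> values = new ArrayList<>();
            for (int pos = 0; pos < tokens.length; pos++) {
                if (tokens[pos].equals(phrase[i])) values.add(pos - i);
            }
            if (values.isEmpty()) return -1;
            shifted.add(values);
        }
        return smallestSpread(shifted, 0, Integer.MAX_VALUE, Integer.MIN_VALUE);
    }

    // Tries every combination of occurrences and keeps the smallest max - min.
    static int smallestSpread(List<List<Integer>> shifted, int term, int min, int max) {
        if (term == shifted.size()) return max - min;
        int best = Integer.MAX_VALUE;
        for (int value : shifted.get(term)) {
            best = Math.min(best, smallestSpread(shifted, term + 1,
                    Math.min(min, value), Math.max(max, value)));
        }
        return best;
    }

    static boolean matched(String text, String[] phrase, int slop) {
        int required = requiredSlop(text, phrase);
        return required >= 0 && required <= slop;
    }

    public static void main(String[] args) {
        String text = "the quick brown fox jumped over the lazy dog";
        System.out.println(requiredSlop(text, new String[] {"quick", "fox"}));            // prints 1
        System.out.println(requiredSlop(text, new String[] {"fox", "quick"}));            // prints 3
        System.out.println(requiredSlop(text, new String[] {"quick", "jumped", "lazy"})); // prints 4
        System.out.println(requiredSlop(text, new String[] {"lazy", "jumped", "quick"})); // prints 8
    }
}
```

Under this model a reversed pair costs more because the second term has to travel past the first one, which agrees with the 1 versus 3 difference described above.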
SpanQuery uses position information to support more interesting queries:
SpanQuery type    Description
SpanTermQuery     Used in conjunction with the other span query types. On its own, it's functionally equivalent to TermQuery.
SpanFirstQuery    Matches spans that occur within the first part of a field.
SpanNearQuery     Matches spans that occur near one another.
SpanNotQuery      Matches spans that don't overlap one another.
SpanOrQuery       Aggregates matches of span queries.
SpanFirstQuery: To query for spans that occur within the first n positions of a field, use SpanFirstQuery.
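// One SpanTermQuery per term of field "f"; the other span queries are built from these.
SpanTermQuery quick = new SpanTermQuery(new Term("f", "quick"));
SpanTermQuery brown = new SpanTermQuery(new Term("f", "brown"));
SpanTermQuery red = new SpanTermQuery(new Term("f", "red"));
SpanTermQuery fox = new SpanTermQuery(new Term("f", "fox"));
SpanTermQuery lazy = new SpanTermQuery(new Term("f", "lazy"));
SpanTermQuery sleepy = new SpanTermQuery(new Term("f", "sleepy"));
SpanTermQuery dog = new SpanTermQuery(new Term("f", "dog"));
SpanTermQuery cat = new SpanTermQuery(new Term("f", "cat"));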
SpanFirstQuery sfq = new SpanFirstQuery(brown, 2);
assertNoMatches(sfq);
sfq = new SpanFirstQuery(brown, 3);
assertOnlyBrownFox(sfq);
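Why 2 fails and 3 succeeds for brown can be seen in a small plain-Java model. This is a sketch rather than Lucene code: it assumes the field text is the sentence used earlier ("the quick brown fox jumped over the lazy dog"), that a term at position p covers the span from p up to but not including p + 1, and that the span must end at or before the given limit. `SpanFirstSketch` and `withinFirst` are hypothetical names.

```java
class SpanFirstSketch {
    // True if some occurrence of the term ends within the first `end` positions.
    // A term at position p is modeled as the span [p, p + 1).
    static boolean withinFirst(String text, String term, int end) {
        String[] tokens = text.split(" ");
        for (int pos = 0; pos < tokens.length; pos++) {
            int spanEnd = pos + 1; // exclusive end of a one-term span
            if (tokens[pos].equals(term) && spanEnd <= end) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String text = "the quick brown fox jumped over the lazy dog";
        System.out.println(withinFirst(text, "brown", 2)); // prints false
        System.out.println(withinFirst(text, "brown", 3)); // prints true
    }
}
```

In this model brown sits at position 2 (counting from 0), so its span ends at 3, which is why a limit of 2 is too small.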
SpanNearQuery:
Matches spans that are near one another.
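The idea of "near" can be illustrated with a plain-Java model for two terms. This is a simplified sketch, not the Lucene implementation: `SpanNearSketch` and `near` are hypothetical names, the sample sentence is assumed from the earlier example, and the slop is modeled as the number of positions lying between the two terms, with an optional in-order requirement.

```java
class SpanNearSketch {
    // True if some occurrence of `first` and some occurrence of `second` have
    // at most `slop` positions between them. With inOrder set, `first` must
    // also appear before `second` in the text.
    static boolean near(String text, String first, String second, int slop, boolean inOrder) {
        String[] tokens = text.split(" ");
        for (int i = 0; i < tokens.length; i++) {
            if (!tokens[i].equals(first)) continue;
            for (int j = 0; j < tokens.length; j++) {
                if (j == i || !tokens[j].equals(second)) continue;
                if (inOrder && j < i) continue;
                int gap = Math.abs(j - i) - 1; // positions strictly between the two terms
                if (gap <= slop) return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String text = "the quick brown fox jumped over the lazy dog";
        System.out.println(near(text, "quick", "fox", 0, true));  // prints false
        System.out.println(near(text, "quick", "fox", 1, true));  // prints true
        System.out.println(near(text, "fox", "quick", 1, true));  // prints false
        System.out.println(near(text, "fox", "quick", 1, false)); // prints true
    }
}
```

Unlike the PhraseQuery slop shown earlier, reversing the terms in this model does not raise the cost when order is not required. It only excludes the match when order is required.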
3. PhrasePrefixQuery is mainly used for synonym-style queries, where any one of several terms may fill a position in the phrase:
IndexWriter writer = new IndexWriter(directory, new WhitespaceAnalyzer(), true);
Document doc1 = new Document();
doc1.add(Field.Text("field", "the quick brown fox jumped over the lazy dog"));
writer.addDocument(doc1);
Document doc2 = new Document();
doc2.add(Field.Text("field","the fast fox hopped over the hound"));
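writer.addDocument(doc2);
writer.close(); // close the writer so that both documents are visible to the searcher
IndexSearcher searcher = new IndexSearcher(directory);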
PhrasePrefixQuery query = new PhrasePrefixQuery();
query.add(new Term[] {new Term("field", "quick"), new Term("field", "fast")});
query.add(new Term("field", "fox"));
Hits hits = searcher.search(query);
assertEquals("fast fox match", 1, hits.length());
query.setSlop(1);
hits = searcher.search(query);
assertEquals("both match", 2, hits.length());
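The two hit counts above can be reproduced with a plain-Java model in which each phrase position accepts a set of alternative terms. This is a sketch under the same simplifying assumptions as a basic slop model (whitespace tokens, slop measured as the spread of position minus phrase index), not Lucene's implementation. `MultiPhraseSketch`, `matched`, and `countMatches` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class MultiPhraseSketch {
    // alternatives[i] lists the terms accepted at phrase position i.
    static boolean matched(String text, String[][] alternatives, int slop) {
        if (alternatives.length == 0) return false;
        String[] tokens = text.split(" ");
        List<List<Integer>> shifted = new ArrayList<>();
        for (int i = 0; i < alternatives.length; i++) {
            List<String> accepted = Arrays.asList(alternatives[i]);
            List<Integer> values = new ArrayList<>();
            for (int pos = 0; pos < tokens.length; pos++) {
                if (accepted.contains(tokens[pos])) values.add(pos - i);
            }
            if (values.isEmpty()) return false; // no alternative occurs at all
            shifted.add(values);
        }
        return smallestSpread(shifted, 0, Integer.MAX_VALUE, Integer.MIN_VALUE) <= slop;
    }

    // Tries every combination of occurrences and keeps the smallest max - min.
    static int smallestSpread(List<List<Integer>> shifted, int term, int min, int max) {
        if (term == shifted.size()) return max - min;
        int best = Integer.MAX_VALUE;
        for (int value : shifted.get(term)) {
            best = Math.min(best, smallestSpread(shifted, term + 1,
                    Math.min(min, value), Math.max(max, value)));
        }
        return best;
    }

    static int countMatches(String[] docs, String[][] alternatives, int slop) {
        int count = 0;
        for (String doc : docs) {
            if (matched(doc, alternatives, slop)) count++;
        }
        return count;
    }

    public static void main(String[] args) {
        String[] docs = {
            "the quick brown fox jumped over the lazy dog",
            "the fast fox hopped over the hound"
        };
        String[][] query = {{"quick", "fast"}, {"fox"}};
        System.out.println(countMatches(docs, query, 0)); // prints 1
        System.out.println(countMatches(docs, query, 1)); // prints 2
    }
}
```

With slop 0 only "fast fox" is an exact phrase, and with slop 1 "quick brown fox" also qualifies because one term separates quick from fox.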