
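          1. Implementing a simple search feature

          This chapter covers only the basic Lucene search API, which involves the following classes.

          Lucene basic search API:

          Class | Purpose
          IndexSearcher | The entry point for searching an index. All searches go through one of the overloaded search methods of an IndexSearcher instance.
          Query (and subclasses) | Each subclass encapsulates the logic of a particular type of search. Query instances are passed to IndexSearcher's search method.
          QueryParser | Parses a human-readable expression into a concrete Query instance.
          Hits | Holds the search results. Returned by IndexSearcher's search method.

          Let's look at a few examples from the book.

          LiaTestCase.java is an abstract class that extends TestCase. The examples below all inherit from it.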

          package lia.common;

          import junit.framework.TestCase;
          import org.apache.lucene.store.FSDirectory;
          import org.apache.lucene.store.Directory;
          import org.apache.lucene.search.Hits;
          import org.apache.lucene.document.Document;

          import java.io.IOException;
          import java.util.Date;
          import java.text.ParseException;
          import java.text.SimpleDateFormat;

          /**
           * LIA base class for test cases.
           */
          public abstract class LiaTestCase extends TestCase {
            private String indexDir = System.getProperty("index.dir");  // the test index has already been built
            protected Directory directory;

            protected void setUp() throws Exception {
              directory = FSDirectory.getDirectory(indexDir, false);
            }

            protected void tearDown() throws Exception {
              directory.close();
            }

            /**
             * For troubleshooting: prints the score and title of every hit.
             */
            protected final void dumpHits(Hits hits) throws IOException {
              if (hits.length() == 0) {
                System.out.println("No hits");
              }

              for (int i=0; i < hits.length(); i++) {
                Document doc = hits.doc(i);
                System.out.println(hits.score(i) + ":" + doc.get("title"));
              }
            }

            protected final void assertHitsIncludeTitle(
                                                     Hits hits, String title)
              throws IOException {
              for (int i=0; i < hits.length(); i++) {
                Document doc = hits.doc(i);
                if (title.equals(doc.get("title"))) {
                  assertTrue(true);
                  return;
                }
              }

              fail("title '" + title + "' not found");
            }

            protected final Date parseDate(String s) throws ParseException {
                return new SimpleDateFormat("yyyy-MM-dd").parse(s);
            }
          }

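          I. Searching for a specific Term and parsing user-entered expressions with QueryParser

          To search on a specific term, TermQuery is all you need. A single term is especially well suited to Keyword searches. Parsing an expression typed by the user fits better with how users actually search, and that parsing is done by QueryParser. If the expression cannot be parsed, an exception is thrown, and its detailed error message can be used to give the user an appropriate hint. Parsing also needs an Analyzer to analyze the user's input. The Terms produced depend on the Analyzer used, and those Terms make up the Query instance.

          Here is an example, BasicSearchingTest.java: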

          package lia.searching;

          import lia.common.LiaTestCase;
          import org.apache.lucene.analysis.SimpleAnalyzer;
          import org.apache.lucene.document.Document;
          import org.apache.lucene.index.Term;
          import org.apache.lucene.queryParser.QueryParser;
          import org.apache.lucene.search.Hits;
          import org.apache.lucene.search.IndexSearcher;
          import org.apache.lucene.search.Query;
          import org.apache.lucene.search.TermQuery;

          public class BasicSearchingTest extends LiaTestCase {

            public void testTerm() throws Exception {
              IndexSearcher searcher = new IndexSearcher(directory);
              Term t = new Term("subject", "ant");                // build a Term
              Query query = new TermQuery(t);
              Hits hits = searcher.search(query);                 // search
              assertEquals("JDwA", 1, hits.length());             // check the result

              t = new Term("subject", "junit");
              hits = searcher.search(new TermQuery(t));
              assertEquals(2, hits.length());

              searcher.close();
            }

            public void testKeyword() throws Exception {  // keyword search
              IndexSearcher searcher = new IndexSearcher(directory);
              Term t = new Term("isbn", "1930110995");                 // keyword term
              Query query = new TermQuery(t);
              Hits hits = searcher.search(query);
              assertEquals("JUnit in Action", 1, hits.length());
            }

            public void testQueryParser() throws Exception {  // QueryParser
              IndexSearcher searcher = new IndexSearcher(directory);

              Query query = QueryParser.parse("+JUNIT +ANT -MOCK",
                                              "contents",
                                              new SimpleAnalyzer());  // parse the search expression into a Query instance
              Hits hits = searcher.search(query);
              assertEquals(1, hits.length());
              Document d = hits.doc(0);
              assertEquals("Java Development with Ant", d.get("title"));

              query = QueryParser.parse("mock OR junit",
                                        "contents",
                                        new SimpleAnalyzer());              // parse the search expression into a Query instance
              hits = searcher.search(query);
              assertEquals("JDwA and JIA", 2, hits.length());
            }
          }

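          2. Using IndexSearcher

          Since IndexSearcher is so important, let's see how to use it. An IndexSearcher can be constructed in two ways:

          ■ By Directory
          ■ By a file system path

          Using a Directory is recommended, because the code then does not depend on where the index is stored. In LiaTestCase.java above we created a Directory:

             directory = FSDirectory.getDirectory(indexDir, false);

          and it is used to construct an IndexSearcher:

             IndexSearcher searcher = new IndexSearcher(directory);

          You can then search with the searcher's search method (there are six overloads; see the API docs for when each one is appropriate) and get back a Hits object containing the results. Let's look at Hits next.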

          I. Working with Hits

          Hits has four methods:

          Hits methods for efficiently accessing search results

          Hits method | Return value
          length() | Number of documents in the Hits collection
          doc(n) | Document instance of the nth top-scoring document
          id(n) | Document ID of the nth top-scoring document
          score(n) | Normalized score (based on the score of the topmost document) of the nth top-scoring document, guaranteed to be greater than 0 and less than or equal to 1

          These methods give access to information about the search results. Hits also caches some Documents to improve performance. By default it caches 100 documents that are considered likely to be used.

          Note:

          The methods doc(n), id(n), and score(n) require documents to be loaded
          from the index when they aren’t already cached. This leads us to recommend
          only calling these methods for documents you truly need to display or access;
          defer calling them until needed.

          II. Paging through Hits

          There are two ways to page through Hits:

          ■ Keep the original Hits and IndexSearcher instances available while the user is navigating the search results.
          ■ Requery each time the user navigates to a new page.

          The second approach is recommended. It is simpler when working over a stateless protocol such as HTTP, as in a web search like Google's.
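A minimal sketch of the requery approach: on each request the query is run again and only the slice of hits for the requested page is displayed. The class and method names (PageWindow, start, end) are hypothetical and not part of Lucene. Only the index arithmetic is shown, with totalHits standing in for hits.length().

```java
// Hypothetical helper (not part of Lucene): works out which hit indexes
// belong to a page, so that doc(n) is called only for hits actually shown.
class PageWindow {
    /** Index of the first hit on the given zero-based page. */
    static int start(int page, int pageSize) {
        return page * pageSize;
    }

    /** One past the index of the last hit on the page, clamped to totalHits. */
    static int end(int page, int pageSize, int totalHits) {
        return Math.min(start(page, pageSize) + pageSize, totalHits);
    }

    public static void main(String[] args) {
        int totalHits = 25;   // stands in for hits.length() after re-running the query
        int pageSize = 10;
        for (int page = 0; page < 3; page++) {
            System.out.println("page " + page + ": hits "
                + start(page, pageSize) + " to "
                + (end(page, pageSize, totalHits) - 1));
        }
    }
}
```

Inside a real request handler, the loop body would call hits.doc(i) for i from start to end - 1, which keeps document loading limited to the page being shown.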

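          III. Reading an index into memory

          Sometimes, to make full use of system resources and improve performance, the index can be loaded into memory and searched there:

          RAMDirectory ramDir = new RAMDirectory(dir);

          This constructor has several overloads that build a RAMDirectory from different sources; see the API docs.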

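          3. Understanding Lucene scoring

          The results in the Hits returned by a Lucene search are sorted by score by default. The score of a document d for a query q is computed as:

          score(q, d) = coord(q, d) * queryNorm(q) * SUM over t in q of [ tf(t in d) * idf(t) * boost(t.field in d) * lengthNorm(t.field in d) ]

          The factors in the formula are: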

          Factor | Description
          tf(t in d) | Term frequency factor for the term (t) in the document (d).
          idf(t) | Inverse document frequency of the term.
          boost(t.field in d) | Field boost, as set during indexing.
          lengthNorm(t.field in d) | Normalization value of a field, given the number of terms within the field. This value is computed during indexing and stored in the index.
          coord(q, d) | Coordination factor, based on the number of query terms the document contains.
          queryNorm(q) | Normalization value for a query, given the sum of the squared weights of each of the query terms.

          For more on scoring, see the docs of the Similarity class.

          The Explanation class exposes the details of each factor behind a document's score, and its toString method prints them. An Explanation is obtained from IndexSearcher, as follows:

          package lia.searching;

          import org.apache.lucene.analysis.SimpleAnalyzer;
          import org.apache.lucene.document.Document;
          import org.apache.lucene.queryParser.QueryParser;
          import org.apache.lucene.search.Explanation;
          import org.apache.lucene.search.Hits;
          import org.apache.lucene.search.IndexSearcher;
          import org.apache.lucene.search.Query;
          import org.apache.lucene.store.FSDirectory;

          public class Explainer {
            public static void main(String[] args) throws Exception {
              if (args.length != 2) {
                System.err.println("Usage: Explainer <index dir> <query>");
                System.exit(1);
              }

              String indexDir = args[0];
              String queryExpression = args[1];

              FSDirectory directory =
                  FSDirectory.getDirectory(indexDir, false);

              Query query = QueryParser.parse(queryExpression,
                  "contents", new SimpleAnalyzer());

              System.out.println("Query: " + queryExpression);

              IndexSearcher searcher = new IndexSearcher(directory);
              Hits hits = searcher.search(query);

              for (int i = 0; i < hits.length(); i++) {
                Explanation explanation =                  // Generate Explanation of single Document for query
                                        searcher.explain(query, hits.id(i));

                System.out.println("----------");
                Document doc = hits.doc(i);
                System.out.println(doc.get("title"));
                System.out.println(explanation.toString());  // print the explanation
              }
            }
          }

          The output:

          Query: junit
          ----------
          JUnit in Action
          0.65311843 = fieldWeight(contents:junit in 2), product of:
              1.4142135 = tf(termFreq(contents:junit)=2)   // (1) junit appears twice in contents
              1.8472979 = idf(docFreq=2)
              0.25 = fieldNorm(field=contents, doc=2)
          ----------
          Java Development with Ant
          0.46182448 = fieldWeight(contents:junit in 1), product of:
              1.0 = tf(termFreq(contents:junit)=1)   // (2) junit appears once in contents
              1.8472979 = idf(docFreq=2)
              0.25 = fieldNorm(field=contents, doc=1)

          (1) JUnit in Action has the term junit twice in its contents field. The contents field in our index is an aggregation of the title and subject fields to allow a single field for searching.

          (2) Java Development with Ant has the term junit only once in its contents field.

          The toHtml method produces the same explanation as HTML. The Nutch project uses Explanation at its core (see the Nutch project documentation).
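The numbers in the Explanation output can be reproduced by hand. The sketch below assumes the formulas of Lucene's default Similarity, tf = sqrt(termFreq) and idf = ln(numDocs / (docFreq + 1)) + 1, and assumes an index of 7 documents, a figure inferred from the printed idf value rather than stated in the post. FieldWeightSketch is a hypothetical name.

```java
// Sketch of the arithmetic behind the fieldWeight lines in the output.
// Assumptions: tf = sqrt(termFreq), idf = ln(numDocs / (docFreq + 1)) + 1,
// and an index of 7 documents (inferred from idf = 1.8472979).
class FieldWeightSketch {
    static double tf(int termFreq) {
        return Math.sqrt(termFreq);
    }

    static double idf(int docFreq, int numDocs) {
        return Math.log(numDocs / (double) (docFreq + 1)) + 1.0;
    }

    /** fieldWeight = tf * idf * fieldNorm, as in the Explanation output. */
    static double fieldWeight(int termFreq, int docFreq, int numDocs, double fieldNorm) {
        return tf(termFreq) * idf(docFreq, numDocs) * fieldNorm;
    }

    public static void main(String[] args) {
        System.out.println(fieldWeight(2, 2, 7, 0.25)); // JUnit in Action
        System.out.println(fieldWeight(1, 2, 7, 0.25)); // Java Development with Ant
    }
}
```

Under these assumptions the two results agree with 0.65311843 and 0.46182448 to about seven decimal places, which shows why the document containing junit twice ranks first: only the tf factor differs.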

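          4. Creating queries programmatically

          IndexSearcher's search method takes a Query instance. Query has several subclasses, each suited to a different situation. Let's go through them.

          TermQuery
          TermQuery is the simplest (it was mentioned above): Term t = new Term("contents", "junit"); new TermQuery(t) constructs one. TermQuery treats the search condition as a single keyword that must match the indexed term exactly. Fields of type Field.Keyword, for example, can be searched with TermQuery.

          RangeQuery
          As the name suggests, RangeQuery expresses a range condition: RangeQuery query = new RangeQuery(begin, end, included); The boolean argument says whether the boundary terms themselves are included. In query syntax this is written "[begin TO end]" (boundaries included) or "{begin TO end}" (boundaries excluded).

          PrefixQuery
          As the name suggests, PrefixQuery matches terms that start with a given prefix. In query syntax it is written "something*".

          BooleanQuery
          BooleanQuery is a logical combination of queries: you can add any kind of Query to it and state how the clauses relate. Clauses are added with this method:

          public void add(Query query, boolean required, boolean prohibited)

          The two boolean arguments express the AND, OR and NOT relationships (setting both to true makes no logical sense). In query syntax these are written with "AND", "OR", "NOT" or with "+" and "-". A BooleanQuery can hold many clauses, but adding more than the limit set by setMaxClauseCount(int) (1024 by default) throws a TooManyClauses exception.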

          Table 3: combinations of the two arguments

                             | required = false      | required = true
          prohibited = false | Clause is optional    | Clause must match
          prohibited = true  | Clause must not match | Invalid
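The table can be read as a small decision function. The sketch below is not Lucene code: ClauseKind and describe are hypothetical names, and the choice to throw IllegalArgumentException for the invalid pair is an assumption made for illustration.

```java
// Hypothetical helper (not Lucene code): names the meaning of each
// (required, prohibited) pair passed to BooleanQuery.add.
class ClauseKind {
    static String describe(boolean required, boolean prohibited) {
        if (required && prohibited) {
            // The combination the table marks as invalid.
            throw new IllegalArgumentException(
                "a clause cannot be both required and prohibited");
        }
        if (required) {
            return "must match";       // +term or AND in query syntax
        }
        if (prohibited) {
            return "must not match";   // -term or NOT in query syntax
        }
        return "optional";             // a bare term or OR in query syntax
    }

    public static void main(String[] args) {
        System.out.println(describe(false, false));
        System.out.println(describe(true, false));
        System.out.println(describe(false, true));
    }
}
```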

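          PhraseQuery
          PhraseQuery matches phrases loosely: for example, "quick fox" can be made to match "quick brown fox" or "quick brown high fox". For this, PhraseQuery provides setSlop(). While searching, Lucene tries moving the terms toward the positions the phrase asks for, and the slop is the maximum number of position moves it will accept. If the indexed text can be turned into an exact match within that many moves, it counts as a match.

          The default slop is 0, so by default only exact phrases match. Raising the slop enables looser matching. For "quick fox" to match "quick brown fox", a slop of 1 is needed, to move fox by one position.

          Note that a PhraseQuery with enough slop does not insist on term order. In the example above, the reversed phrase "fox quick" needs a slop of 3, so with a slop of 3 or more "fox quick" is also considered a match. For phrases of more than two terms, the slop is the maximum total number of moves allowed. An example makes this clearer: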

          package lia.searching;

          import junit.framework.TestCase;
          import org.apache.lucene.analysis.WhitespaceAnalyzer;
          import org.apache.lucene.document.Document;
          import org.apache.lucene.document.Field;
          import org.apache.lucene.index.IndexWriter;
          import org.apache.lucene.index.Term;
          import org.apache.lucene.search.Hits;
          import org.apache.lucene.search.IndexSearcher;
          import org.apache.lucene.search.PhraseQuery;
          import org.apache.lucene.store.RAMDirectory;

          import java.io.IOException;

          public class PhraseQueryTest extends TestCase {
            private IndexSearcher searcher;

            protected void setUp() throws IOException {
              // set up sample document
              RAMDirectory directory = new RAMDirectory();
              IndexWriter writer = new IndexWriter(directory,
                  new WhitespaceAnalyzer(), true);
              Document doc = new Document();
              doc.add(Field.Text("field",
                        "the quick brown fox jumped over the lazy dog"));
              writer.addDocument(doc);
              writer.close();

              searcher = new IndexSearcher(directory);
            }

            // true if the phrase matches the sample document with the given slop
            private boolean matched(String[] phrase, int slop)
                throws IOException {
              PhraseQuery query = new PhraseQuery();
              query.setSlop(slop);

              for (int i=0; i < phrase.length; i++) {
                query.add(new Term("field", phrase[i]));
              }

              Hits hits = searcher.search(query);
              return hits.length() > 0;
            }

            public void testSlopComparison() throws Exception {
              String[] phrase = new String[] {"quick", "fox"};

              assertFalse("exact phrase not found", matched(phrase, 0));

              assertTrue("close enough", matched(phrase, 1));
            }

            public void testReverse() throws Exception {
              String[] phrase = new String[] {"fox", "quick"};

              assertFalse("hop flop", matched(phrase, 2));
              assertTrue("hop hop slop", matched(phrase, 3));
            }

            public void testMultiple() throws Exception {     // searching with more than two Terms
              assertFalse("not close enough",
                  matched(new String[] {"quick", "jumped", "lazy"}, 3));

              assertTrue("just enough",
                  matched(new String[] {"quick", "jumped", "lazy"}, 4));

              assertFalse("almost but not quite",
                  matched(new String[] {"lazy", "jumped", "quick"}, 7));

              assertTrue("bingo",
                  matched(new String[] {"lazy", "jumped", "quick"}, 8));

            }

          }
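One way to reason about the thresholds asserted in PhraseQueryTest (1, 3, 4 and 8): for each phrase term, take its position in the text minus its position in the phrase, then subtract the smallest such offset from the largest. The sketch below is a simplified model and not Lucene's scorer. SlopSketch and requiredSlop are hypothetical names, and the model assumes each phrase term occurs in the text and looks only at its first occurrence.

```java
import java.util.Arrays;
import java.util.List;

// Simplified model of slop (not Lucene's implementation). Assumes every
// phrase term occurs in the text and uses its first occurrence only.
class SlopSketch {
    /** Smallest slop at which the phrase would match the tokenized text. */
    static int requiredSlop(List<String> tokens, String[] phrase) {
        int min = Integer.MAX_VALUE;
        int max = Integer.MIN_VALUE;
        for (int i = 0; i < phrase.length; i++) {
            int pos = tokens.indexOf(phrase[i]);
            if (pos < 0) {
                throw new IllegalArgumentException("term not in text: " + phrase[i]);
            }
            // How far this term sits from where an exact phrase would place it.
            int offset = pos - i;
            min = Math.min(min, offset);
            max = Math.max(max, offset);
        }
        return max - min;
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList(
            "the quick brown fox jumped over the lazy dog".split(" "));
        System.out.println(requiredSlop(tokens, new String[] {"quick", "fox"}));
        System.out.println(requiredSlop(tokens, new String[] {"fox", "quick"}));
        System.out.println(requiredSlop(tokens, new String[] {"quick", "jumped", "lazy"}));
        System.out.println(requiredSlop(tokens, new String[] {"lazy", "jumped", "quick"}));
    }
}
```

For the four phrases in the test this model gives 1, 3, 4 and 8, the same values at which the assertions flip from assertFalse to assertTrue.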

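          WildcardQuery
          WildcardQuery uses ? (any single character) and * (zero or more characters). For example, ?ild* matches wild, mild, wildcard and so on. Note that with a wildcard query every matching term gets the same relevance: wildcard and mild, for instance, are equally relevant to ?ild*.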
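A rough sketch of how such a pattern selects terms, written as a translation to a regular expression. It assumes ? stands for exactly one character and * for zero or more, as the WildcardQuery documentation describes. WildcardSketch is a hypothetical class and says nothing about how Lucene enumerates terms internally.

```java
import java.util.regex.Pattern;

// Sketch only: translates a wildcard pattern into a regular expression.
// Assumes ? stands for exactly one character and * for zero or more.
class WildcardSketch {
    static Pattern toRegex(String wildcard) {
        StringBuilder regex = new StringBuilder();
        for (char c : wildcard.toCharArray()) {
            if (c == '*') {
                regex.append(".*");
            } else if (c == '?') {
                regex.append('.');
            } else {
                // Quote everything else so it is matched literally.
                regex.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return Pattern.compile(regex.toString());
    }

    static boolean matches(String wildcard, String term) {
        return toRegex(wildcard).matcher(term).matches();
    }

    public static void main(String[] args) {
        for (String term : new String[] {"wild", "mild", "wildcard", "child"}) {
            System.out.println(term + " -> " + matches("?ild*", term));
        }
    }
}
```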

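          FuzzyQuery
          FuzzyQuery matches terms that are similar to the one given. The words fuzzy and wuzzy, for example, are treated as similar. It is of some use for English tense changes and plural forms, and unlike a wildcard query the matches do not all get the same relevance. In query syntax it is written "fuzzy~". It is most useful when you have forgotten how a word is spelled: search Google for liceue and, finding no results, it asks whether you meant Lucene. For Chinese text, however, this query is of little use.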
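FuzzyQuery measures similarity with the Levenshtein (edit) distance between terms. The sketch below is a standard dynamic-programming version of that distance, given only to show why fuzzy and wuzzy count as close. EditDistanceSketch is a hypothetical name, and how Lucene turns the distance into a score is not shown.

```java
// Sketch of Levenshtein (edit) distance: the number of single-character
// insertions, deletions and substitutions needed to turn a into b.
class EditDistanceSketch {
    static int distance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;                       // distance from the empty prefix of a
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;                       // distance to the empty prefix of b
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1),
                                   prev[j - 1] + cost);
            }
            int[] tmp = prev;                  // reuse the two rows
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    public static void main(String[] args) {
        System.out.println(distance("fuzzy", "wuzzy"));
        System.out.println(distance("liceue", "lucene"));
    }
}
```

The first pair differs by one substitution and the second by two, so both are near misses of the kind a fuzzy search is meant to catch.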

          5. Parsing query expressions: QueryParser

          For a product aimed at ordinary users, a search expression is a fairly friendly interface. Let's see how to handle search expressions with QueryParser.

          Note: Whenever special characters are used in a query expression, you need to provide an escaping mechanism so that the special characters can be used in a normal fashion. QueryParser uses a backslash (\) to escape special characters within terms. The escapable characters are as follows: \ + - ! ( ) : ^ ] { } ~ * ? (in other words, special characters must be written with an escape character).
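A small sketch of the escaping rule from the note: each listed character gets a backslash in front of it before the text is handed to the parser. EscapeSketch and escape are hypothetical names for illustration, and the character set is copied from the list in the note as written.

```java
// Hypothetical helper: prefixes each character from the list in the note
// with a backslash so that it is treated as literal text.
class EscapeSketch {
    // The escapable characters exactly as listed in the note.
    private static final String SPECIAL = "\\+-!():^]{}~*?";

    static String escape(String input) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (SPECIAL.indexOf(c) >= 0) {
                out.append('\\');
            }
            out.append(c);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(escape("(1+1):2"));
        System.out.println(escape("junit"));
    }
}
```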

          QueryParser turns the conditions typed by the user into a Query. Query's toString method prints the equivalent of what QueryParser produced, which lets you check whether QueryParser is doing what you intended. Note that QueryParser uses an Analyzer, and some Analyzers drop stop words, so the toString of a parsed Query is not necessarily identical to the original string.

          Boolean operators:

          Written with OR, AND and NOT (or with + and -). These are easy to understand.

          Grouping:
          For example "(a AND b) OR c": parentheses group clauses, which is also easy to understand.

          Field selection:
          QueryParser applies conditions to a default field, which is specified in code when the expression is parsed. To search another field from within the expression, use the syntax fieldname:a. To apply a field to a group of terms, write fieldname:(a b c).

          Range search:

          Use [begin TO end] (boundaries included) or {begin TO end} (boundaries excluded).

          Note: Nondate range queries use the beginning and ending terms as the user entered them, without modification. In other words, the beginning and ending terms are not analyzed. Start and end terms must not contain whitespace, or parsing fails. In our example index, the field pubmonth isn’t a date field; it’s text of the format YYYYMM.
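Because the boundary terms are plain text, a range over a field such as pubmonth comes down to string comparison, and the YYYYMM format makes string order agree with date order. The sketch below illustrates the inclusive and exclusive forms. TermRangeSketch and inRange are hypothetical names, and the month values are made up for the example.

```java
// Sketch: a range over un-analyzed text terms is a plain string comparison.
// Hypothetical helper, not Lucene code.
class TermRangeSketch {
    static boolean inRange(String term, String begin, String end, boolean inclusive) {
        int lower = term.compareTo(begin);
        int upper = term.compareTo(end);
        return inclusive ? (lower >= 0 && upper <= 0)
                         : (lower > 0 && upper < 0);
    }

    public static void main(String[] args) {
        // Example values in YYYYMM form (made up for illustration).
        System.out.println(inRange("198805", "198805", "198810", true));   // [198805 TO 198810]
        System.out.println(inRange("198805", "198805", "198810", false));  // {198805 TO 198810}
        System.out.println(inRange("198807", "198805", "198810", false));
    }
}
```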

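          When handling dates, QueryParser's setLocale method sets the locale, which takes care of I18N issues.

          Phrase query:

          A string enclosed in double quotes creates a PhraseQuery. The text between the quotes is analyzed before the Query is built, so stop words may be dropped, as here: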

            public void testPhraseQuery() throws Exception {
              Query q = QueryParser.parse("\"This is Some Phrase*\"",  // "this" and "is" are stop words for StandardAnalyzer
                  "field", new StandardAnalyzer());
              assertEquals("analyzed",
                  "\"some phrase\"", q.toString("field"));   // "this" and "is" are gone

              q = QueryParser.parse("\"term\"", "field", analyzer);
              assertTrue("reduced to TermQuery", q instanceof TermQuery);
            }

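          Wildcard search:
          Note that by default QueryParser does not allow * at the start of a term. This is mainly to keep users from entering a leading * by mistake and causing serious performance problems.

          Fuzzy query:

          A trailing ~ on a term makes it a fuzzy query.

          Wildcard and fuzzy searches each have their own performance issues, which will be discussed later.

          Boosting queries:

          A ^ followed by a floating-point value sets the boost of that term. For example, junit^2.0 testing sets the boost of the junit TermQuery to 2.0, while the testing TermQuery keeps the default boost of 1.0. You can check whether Google search has this feature. :)

          QueryParser is certainly handy, but it does not suit every situation. Here is the authors' view: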

          To QueryParse or not to QueryParse?

          QueryParser is a quick and effortless way to give users powerful query construction, but it isn’t right for all scenarios. QueryParser can’t create every type of query that can be constructed using the API. In chapter 5, we detail a handful of API-only queries that have no QueryParser expression capability. You must keep in mind all the possibilities available when exposing free-form query parsing to an end user; some queries have the potential for performance bottlenecks, and the syntax used by the built-in QueryParser may not be suitable for your needs. You can exert some limited control by subclassing QueryParser (see section 6.3.1).

          Should you require different expression syntax or capabilities beyond what QueryParser offers, technologies such as ANTLR and JavaCC are great options. We don’t discuss the creation of a custom query parser; however, the source code for Lucene’s QueryParser is freely available for you to borrow from.

          You can often obtain a happy medium by combining a QueryParser-parsed query with API-created queries as clauses in a BooleanQuery. This approach is demonstrated in section 5.5.4. For example, if users need to constrain searches to a particular category or narrow them to a date range, you can have the user interface separate those selections into a category chooser or separate date-range fields.

          OK, that wraps up chapter 3. You can now add basic search to your application. Time to celebrate!

          And a summary :)

          Lucene rapidly provides highly relevant search results to queries. Most applications need only a few Lucene classes and methods to enable searching. The most fundamental things for you to take from this chapter are an understanding of the basic query types (of which TermQuery, RangeQuery, and BooleanQuery are the primary ones) and how to access search results.

          Although it can be a bit daunting, Lucene’s scoring formula (coupled with the index format discussed in appendix B and the efficient algorithms) provides the magic of returning the most relevant documents first. Lucene’s QueryParser parses human-readable query expressions, giving rich full-text search power to end users. QueryParser immediately satisfies most application requirements; however, it doesn’t come without caveats, so be sure you understand the rough edges. Much of the confusion regarding QueryParser stems from unexpected analysis interactions; chapter 4 goes into great detail about analysis, including more on the QueryParser issues.

          And yes, there is more to searching than we’ve covered in this chapter, but understanding the groundwork is crucial. Chapter 5 delves into Lucene’s more elaborate features, such as constraining (or filtering) the search space of queries and sorting search results by field values; chapter 6 explores the numerous ways you can extend Lucene’s searching capabilities for custom sorting and query parsing.
          posted on 2007-01-05 10:11 by Lansing. Category: Search engines