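First, download mysql-5.0.45-sphinxse-0.9.8-win32.zip and sphinx-0.9.8.1-win32.zip from the Sphinx website, http://www.sphinxsearch.com/downloads.html. This walkthrough assumes MySQL is already installed.
Stop the MySQL service, extract mysql-5.0.45-sphinxse-0.9.8-win32.zip, and overwrite the bin and share directories of your MySQL installation with the bin and share directories from the archive. Then extract sphinx-0.9.8.1-win32.zip into a directory of its own, for example d:/www/sphinx/.
Next, start the MySQL service again, create a database named "test", and import the following SQL: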
-----------------------------------------------------------
CREATE TABLE `documents` (
`id` int(11) NOT NULL auto_increment,
`group_id` int(11) NOT NULL,
`group_id2` int(11) NOT NULL,
`date_added` datetime NOT NULL,
`title` varchar(255) NOT NULL,
`content` text NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=5;
INSERT INTO `documents` VALUES ('1', '1', '5', '2008-09-13 21:37:47', 'test one', 'this is my test document number one. also checking search within phrases.');
INSERT INTO `documents` VALUES ('2', '1', '6', '2008-09-13 21:37:47', 'test two', 'this is my test document number two');
INSERT INTO `documents` VALUES ('3', '2', '7', '2008-09-13 21:37:47', 'another doc', 'this is another group');
INSERT INTO `documents` VALUES ('4', '2', '8', '2008-09-13 21:37:47', 'doc number four', 'this is to test groups');
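-----------------------------------------------------------
This table is in fact the one defined in the example.sql file that ships with Sphinx.
With the test table in place, the next step is to configure sphinx-doc.conf (this part is important).
Copy sphinx-min.conf from the Sphinx directory, rename the copy to sphinx-doc.conf, and edit it as follows: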
----------------------------------------------------------------------
#
# Minimal Sphinx configuration sample (clean, simple, functional)
#
# type            database type; mysql and pgsql are currently supported
# strip_html      whether to strip HTML tags
# sql_host        database host address
# sql_user        database user name
# sql_pass        database password
# sql_db          database name
# sql_port        database port
# sql_query_pre   statement run before the main query, used here to set the character set; for utf8 it must be SET NAMES utf8
# sql_query       the query that fetches the rows to index. Avoid WHERE and GROUP BY here where possible and leave filtering and grouping to Sphinx, which is more efficient.
#   Note: the selected columns must include a unique primary key (such as ARTICLESID; id in this example) and the columns to be full-text indexed. Columns you planned to use in WHERE must be selected too.
#   ORDER BY is not needed here.
# sql_attr_*      attribute columns. Any column you planned to use in WHERE, ORDER BY, or GROUP BY must be declared here. (Lines starting with # are comments added by the author.)
# source <data source name>:
source documents
{
type = mysql
sql_host = localhost
sql_user = root
sql_pass = yourpassword
sql_db = test
sql_port = 3306 # optional, default is 3306
sql_query_pre = SET NAMES utf8
sql_query = \
SELECT id, group_id, UNIX_TIMESTAMP(date_added) AS date_added, title, content \
FROM documents
sql_attr_uint = group_id
sql_attr_timestamp = date_added
sql_query_info = SELECT * FROM documents WHERE id=$id
}
index documents
{
source = documents
# path: where the index files are stored. With a value such as d:/sphinx/data/cgfinal, the files go into the d:/sphinx/data directory as several files named cgfinal with different extensions.
path = d:/www/sphinx/data/doc
docinfo = extern
enable_star = 1
min_word_len = 3
min_prefix_len = 0
min_infix_len = 3
charset_type = sbcs
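# Other settings such as min_word_len, charset_type, charset_table, ngram_chars, and ngram_len are the ones to adjust to support Chinese search.
# For languages other than Chinese, charset_table, ngram_chars, and min_word_len need different values; the forum on the official site has many examples worth searching for.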
}
# mem_limit: the maximum memory the indexer may use; set it according to your machine. The default is 32M, and a value that is too small hurts indexing performance.
indexer
{
mem_limit = 32M
}
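# searchd (search daemon) settings
# searchd must be running before any full-text search: only then can MySQL connect to Sphinx, which performs the search and returns the results to MySQL.
# address: the address to listen on; if unset, searchd listens on all addresses
# port: the port to listen on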
searchd
{
port = 3312
log =d:/www/sphinx/logs/searched_doc.log
query_log = d:/www/sphinx/logs/query_doc.log
read_timeout = 5
max_children = 30
pid_file = d:/www/sphinx/logs/searched-doc.pid
max_matches = 1000
seamless_rotate = 0
preopen_indexes = 0
unlink_old = 1
}
----------------------------------------------------------------------
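To make the split between full-text fields and attributes concrete, here is a rough Python sketch of how the columns returned by sql_query end up being treated: the first column is the document ID, columns named in sql_attr_* directives become attributes, and the remaining columns are indexed as full-text fields. The function name classify_columns and its arguments are hypothetical and exist only for illustration; none of this is part of Sphinx itself.

```python
def classify_columns(columns, attributes):
    """Split sql_query result columns into the roles Sphinx assigns them.

    columns:    column names in SELECT order (the first must be the document ID)
    attributes: names declared through sql_attr_* directives
    Both arguments are illustrative inputs, not a Sphinx API.
    """
    if not columns:
        raise ValueError("sql_query must select at least the document ID")
    doc_id, rest = columns[0], columns[1:]
    unknown = [name for name in attributes if name not in rest]
    if unknown:
        raise ValueError(f"attributes not selected by sql_query: {unknown}")
    return {
        "id": doc_id,
        "attributes": [name for name in rest if name in attributes],
        "fields": [name for name in rest if name not in attributes],
    }


# Columns and attributes taken from the source section of sphinx-doc.conf.
roles = classify_columns(
    ["id", "group_id", "date_added", "title", "content"],
    ["group_id", "date_added"],
)
print(roles)
```

With the configuration in this walkthrough, title and content come out as the full-text fields, while group_id and date_added are available for filtering, sorting, and grouping.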
The Sphinx configuration file for the test is now ready. Make sure the MySQL service is running; if it is not, start it from cmd with "net start mysql".
Now the test itself begins:
1. Build or rebuild the index:
(It is best to put a copy of sphinx-doc.conf in the bin folder; the examples below assume you have done so.)
At the cmd prompt, enter:
d:/www/sphinx/bin/indexer.exe --config d:/www/sphinx/bin/sphinx-doc.conf documents
2. Start the search daemon, searchd.exe:
d:/www/sphinx/bin/searchd.exe --config d:/www/sphinx/bin/sphinx-doc.conf
If neither step reports an error, Sphinx is up and running. You can run netstat -an to check whether port 3312 is in the listening state.
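As an alternative to reading the netstat output by eye, the same check can be scripted. The following is a minimal Python sketch that simply tries to open a TCP connection to the searchd port; the function name is_port_listening is made up for this example, and a successful connection only shows that something is listening on the port, not that it is searchd.

```python
import socket


def is_port_listening(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host unreachable.
        return False


if __name__ == "__main__":
    # 3312 is the port set in the searchd section of sphinx-doc.conf.
    print("searchd listening:", is_port_listening("127.0.0.1", 3312))
```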
3. Now test it with search.exe, the tool that ships with Sphinx:
Test:
Search keywords: this is m
D:\www\sphinx\bin>search.exe -c d:/www/sphinx/bin/sphinx-doc.conf this is m
Result:
Sphinx 0.9.8-release (r1371)
Copyright (c) 2001-2008, Andrew Aksyonoff
using config file 'd:/www/sphinx/bin/sphinx-doc.conf'...
WARNING: index 'documents': invalid morphology option 'extern' - IGNORED
index 'documents': query 'this is m ': returned 4 matches of 4 total in 0.000 sec
displaying matches:
1. document=1, weight=1, group_id=1, date_added=Sat Sep 13 21:37:47 2008
id=1
group_id=1
group_id2=5
date_added=2008-09-13 21:37:47
title=test one
content=this is my test document number one. also checking search within phrases.
2. document=2, weight=1, group_id=1, date_added=Sat Sep 13 21:37:47 2008
id=2
group_id=1
group_id2=6
date_added=2008-09-13 21:37:47
title=test two
content=this is my test document number two
3. document=3, weight=1, group_id=2, date_added=Sat Sep 13 21:37:47 2008
id=3
group_id=2
group_id2=7
date_added=2008-09-13 21:37:47
title=another doc
content=this is another group
4. document=4, weight=1, group_id=2, date_added=Sat Sep 13 21:37:47 2008
id=4
group_id=2
group_id2=8
date_added=2008-09-13 21:37:47
title=doc number four
content=this is to test groups
words:
1. 'this': 4 documents, 4 hits
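Only 'this' shows up under words even though the query was "this is m". That appears consistent with min_word_len = 3 in the index section, under which words shorter than three characters are not indexed. Below is a simplified Python illustration of that filtering step; the function name keep_indexable_words is hypothetical, and splitting on whitespace is far cruder than Sphinx's real tokenizer.

```python
def keep_indexable_words(query, min_word_len=3):
    """Drop query words shorter than min_word_len (whitespace split only)."""
    return [word for word in query.lower().split() if len(word) >= min_word_len]


print(keep_indexable_words("this is m"))              # ['this']
print(keep_indexable_words("this is another group"))  # ['this', 'another', 'group']
```

The second call uses the keywords from the next test, whose words list likewise omits "is".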
-------------------
Search keywords: this is another group
D:\www\sphinx\bin>search.exe -c d:/www/sphinx/bin/sphinx-doc.conf this is another group
Result:
Sphinx 0.9.8-release (r1371)
Copyright (c) 2001-2008, Andrew Aksyonoff
using config file 'd:/www/sphinx/bin/sphinx-doc.conf'...
WARNING: index 'documents': invalid morphology option 'extern' - IGNORED
index 'documents': query 'this is another group ': returned 1 matches of 1 total in 0.000 sec
displaying matches:
1. document=3, weight=4, group_id=2, date_added=Sat Sep 13 21:37:47 2008
id=3
group_id=2
group_id2=7
date_added=2008-09-13 21:37:47
title=another doc
content=this is another group
words:
1. 'this': 4 documents, 4 hits
2. 'another': 1 documents, 2 hits
3. 'group': 1 documents, 1 hits
-------------------
That completes a working Sphinx setup on Windows. The sphinx-doc.conf file is quite flexible, so adapt it to the database you need to index to get the results you want.
If you run into problems with configuration parameters, see doc/sphinx.html, which documents each parameter in detail.