WEBUS2.0 In Action - Getting Started with Search [Code Example]
2013-09-07 18:18
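Once the index has been built, implementing basic search with WEBUS2.0 requires at least the following classes and interfaces:
Webus.Index.IQueriable (interface)
Webus.Index.IndexManager (class, implements IQueriable)
Webus.Analysis.IAnalyzer (interface)
Webus.Analysis.MyWordAnalyzer (class, implements IAnalyzer)
Webus.Search.ISearcher (interface)
Webus.Search.IndexSearcher (class, implements ISearcher)
Webus.Search.Query (class, used to build search expressions)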
The small file-search sample below shows how to develop the search feature:
FileSearcher.exe
- First, select the files to index (text files, source code, and so on):
![](http://images.cnblogs.com/cnblogs_com/iamzyf/FileSearcher_0.JPG)
- Then type keywords (separate multiple keywords with spaces) to search:
![](http://images.cnblogs.com/cnblogs_com/iamzyf/FileSearcher_1.JPG)
Implementing this is straightforward. First, declare the objects we need:

    IIndexer writer;    // index writer
    ISearcher searcher; // searcher
Then construct them in the form's constructor:

    public frmMain()
    {
        InitializeComponent();

        writer = new IndexManager(new MyWordAnalyzer());
        searcher = new IndexSearcher(writer);
    }
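Note that the writer is constructed with an analyzer. WEBUS needs one because, when indexing, it splits the file content into tokens and then indexes each token. Different analyzers produce different tokens, which allows for a variety of analysis behaviors. The analyzer described here is the simplest built-in one, SimpleWordAnalyzer (the code above passes MyWordAnalyzer): it splits English text into words, splits Chinese text into single characters, and converts every token to lowercase.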
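
To make the tokenizing behavior described above concrete, here is a minimal standalone sketch. It is not the WEBUS analyzer: the class name SketchAnalyzer, the CJK range check, and the rule that every other non-alphanumeric character is a separator are all assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Illustrative only: NOT the WEBUS SimpleWordAnalyzer implementation.
public static class SketchAnalyzer
{
    // Assumption: only CJK Unified Ideographs (U+4E00..U+9FFF) count as "Chinese".
    private static bool IsCjk(char c)
    {
        return c >= '\u4E00' && c <= '\u9FFF';
    }

    // Splits text into lowercase words; each Chinese character becomes its own token.
    public static List<string> Tokenize(string text)
    {
        var tokens = new List<string>();
        var word = new StringBuilder();

        foreach (char c in text)
        {
            if (IsCjk(c))
            {
                Flush(word, tokens);        // end any pending word
                tokens.Add(c.ToString());   // one token per Chinese character
            }
            else if (char.IsLetterOrDigit(c))
            {
                word.Append(char.ToLowerInvariant(c));
            }
            else
            {
                Flush(word, tokens);        // whitespace or punctuation ends a word
            }
        }

        Flush(word, tokens);
        return tokens;
    }

    // Emits the pending word, if any, and resets the buffer.
    private static void Flush(StringBuilder word, List<string> tokens)
    {
        if (word.Length > 0)
        {
            tokens.Add(word.ToString());
            word.Clear();
        }
    }
}
```

For example, SketchAnalyzer.Tokenize("Hello World 你好") yields the four tokens hello, world, 你 and 好.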
Once the objects are constructed, read the contents of the selected files and add them to the index:

    writer.New(indexPath);
    foreach (string file in openFileDialog1.FileNames)
    {
        using (StreamReader sr = new StreamReader(file))
        {
            Document doc = new Document();
            // FileName uses the default field attributes
            doc.Fields.Add(new Field("FileName", file));
            // Content is analyzed, indexed, and stored compressed
            doc.Fields.Add(new Field("Content", sr.ReadToEnd(),
                FieldAttributes.Analyse | FieldAttributes.Index | FieldAttributes.Compress));
            writer.Add(doc);
        }
    }
    writer.Close();
The FileName field is indexed with the Default attributes (Default = Index | Sort). The Content field is first analyzed (Analyse), then indexed (Index), and stored compressed (Compress).
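
The attribute combinations above are bit flags joined with the | operator. The following sketch shows how such flags compose and how a single flag can be tested. SketchFieldAttributes is a hypothetical stand-in, and its numeric values are assumptions; the article does not give the SDK's actual values.

```csharp
using System;

// Hypothetical stand-in for the SDK's FieldAttributes.
// The member names follow the article; the numeric values are assumed.
[Flags]
public enum SketchFieldAttributes
{
    None     = 0,
    Index    = 1,
    Sort     = 2,
    Analyse  = 4,
    Compress = 8,
    Default  = Index | Sort   // matches "Default = Index | Sort" in the text
}

public static class FieldAttributeDemo
{
    // True when every bit of 'flag' is set in 'value'.
    public static bool Has(SketchFieldAttributes value, SketchFieldAttributes flag)
    {
        return (value & flag) == flag;
    }
}
```

With these definitions, the combination Analyse | Index | Compress contains Compress but not Sort, while Default contains both Index and Sort.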
Once indexing is complete, the writer must be closed (writer.Close()) before the index can be opened for reading:

    writer.Open(indexPath, IndexOpenMode.Read);
Everything is now in place. Add the search code to the TextBox control's TextChanged event handler so that matching results appear as soon as a keyword is typed:

    tvResult.Nodes.Clear();
    // Split the user's input into keywords on spaces
    string[] keys = txtKeyword.Text.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
    // Build a query of the form "key1 AND key2 AND key3 ..."
    Query query = null;
    foreach (string key in keys)
    {
        Query q = new TermQuery(new Term("Content", key));
        if (query == null)
        {
            query = q;
        }
        else
        {
            query &= q; // "&" (AND) combines two Query objects into a new Query object
        }
    }
    if (query == null) return; // no keywords entered, nothing to search
    Hits hits = searcher.Search(query); // run the search
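
As a rough illustration of what a "key1 AND key2 AND key3" search amounts to, the sketch below intersects per-term document sets from a toy inverted index. It makes no claim about how WEBUS evaluates queries internally; SketchAndSearch, the dictionary-based index, and the lowercasing of keywords are assumptions for this example.

```csharp
using System;
using System.Collections.Generic;

// Illustrative only: a toy AND search, NOT the WEBUS search implementation.
public static class SketchAndSearch
{
    // Splits user input on spaces and drops empty entries.
    public static string[] SplitKeywords(string input)
    {
        return input.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
    }

    // 'index' maps a lowercase term to the ids of the documents containing it.
    // Returns the ids of the documents that contain every keyword.
    public static SortedSet<int> SearchAll(Dictionary<string, HashSet<int>> index, string input)
    {
        SortedSet<int> result = null;

        foreach (string key in SplitKeywords(input))
        {
            HashSet<int> postings;
            if (!index.TryGetValue(key.ToLowerInvariant(), out postings))
            {
                return new SortedSet<int>(); // an unknown term empties an AND query
            }

            if (result == null)
            {
                result = new SortedSet<int>(postings);  // first keyword seeds the result
            }
            else
            {
                result.IntersectWith(postings);         // each further keyword narrows it
            }
        }

        return result ?? new SortedSet<int>();          // no keywords: no results
    }
}
```

Returning an empty set for empty input mirrors the guard in the event handler: with no keywords there is no query to run.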
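Note how Query objects are combined. WEBUS currently provides 7 kinds of Query:
TermQuery: basic term search
PrefixQuery: prefix search
PostfixQuery: suffix search
WildcardQuery: wildcard search
RegexQuery: regular-expression search
RangeQuery: range search
BooleanQuery: Boolean search
They support the following operators:
+ - ! & |
where !a means NOT a;
a + b means a AND b;
a - b means a AND !b;
a | b means a OR b;
& has the same effect as +.
Combining any two Query objects with one of these operators always yields a BooleanQuery object, which makes it possible to express quite complex searches.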
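
The operator semantics listed above can be modeled with C# operator overloading. The sketch below is a standalone toy model and not the WEBUS Query hierarchy: the names SketchQuery, SketchTerm, AndQuery, OrQuery and NotQuery are hypothetical, and a document is represented simply as a set of tokens.

```csharp
using System;
using System.Collections.Generic;

// Illustrative only: a tiny query model, NOT the WEBUS Query classes.
public abstract class SketchQuery
{
    // True when the document (given as its set of tokens) satisfies the query.
    public abstract bool Matches(ISet<string> tokens);

    // a & b and a + b both mean "a AND b"
    public static SketchQuery operator &(SketchQuery a, SketchQuery b) { return new AndQuery(a, b); }
    public static SketchQuery operator +(SketchQuery a, SketchQuery b) { return new AndQuery(a, b); }
    // a | b means "a OR b"
    public static SketchQuery operator |(SketchQuery a, SketchQuery b) { return new OrQuery(a, b); }
    // a - b means "a AND NOT b"
    public static SketchQuery operator -(SketchQuery a, SketchQuery b) { return new AndQuery(a, new NotQuery(b)); }
    // !a means "NOT a"
    public static SketchQuery operator !(SketchQuery a) { return new NotQuery(a); }
}

public sealed class SketchTerm : SketchQuery
{
    private readonly string term;
    public SketchTerm(string term) { this.term = term; }
    public override bool Matches(ISet<string> tokens) { return tokens.Contains(term); }
}

public sealed class AndQuery : SketchQuery
{
    private readonly SketchQuery left, right;
    public AndQuery(SketchQuery left, SketchQuery right) { this.left = left; this.right = right; }
    public override bool Matches(ISet<string> tokens) { return left.Matches(tokens) && right.Matches(tokens); }
}

public sealed class OrQuery : SketchQuery
{
    private readonly SketchQuery left, right;
    public OrQuery(SketchQuery left, SketchQuery right) { this.left = left; this.right = right; }
    public override bool Matches(ISet<string> tokens) { return left.Matches(tokens) || right.Matches(tokens); }
}

public sealed class NotQuery : SketchQuery
{
    private readonly SketchQuery inner;
    public NotQuery(SketchQuery inner) { this.inner = inner; }
    public override bool Matches(ISet<string> tokens) { return !inner.Matches(tokens); }
}
```

Because C# derives compound assignment from the binary operator, overloading & is also what makes a statement such as query &= q compile in this model.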
Most of the code is now done. All that remains is to display the search results:

    for (int i = 0; i < hits.Count; i++)
    {
        Document doc = hits.GetDoc(i); // get the Document object
        // Read the FileName field from the document
        TreeNode node = tvResult.Nodes.Add(doc.GetField("FileName").Value.ToString());
        // Read the Content field
        string content = doc.GetField("Content").Value.ToString();
        foreach (Position pos in hits[i].Positions)
        {
            // Locate the line that contains the hit
            int h = content.LastIndexOf("\n", pos.Start);
            int t = content.IndexOf("\n", pos.Start + pos.Length);
            h = h >= 0 ? h : 0;
            t = t >= 0 ? t : content.Length; // the hit is on the last line
            int l = t - h;
            l = l >= 0 ? l : 0;
            node.Nodes.Add(string.Format("[Position: {0}] {1}", pos.Start, content.Substring(h, l).Trim()));
        }
    }
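
The line-extraction step in the loop above can be isolated into a small helper, which makes its edge cases easier to see. This is a sketch under stated assumptions: SketchSnippet and LineAround are hypothetical names, lines are assumed to be separated by "\n", and the start and length arguments play the role of pos.Start and pos.Length.

```csharp
using System;

// Illustrative only: extracts the line of text surrounding a hit.
public static class SketchSnippet
{
    // Returns the trimmed line of 'content' that contains the hit
    // covering the character range [start, start + length).
    public static string LineAround(string content, int start, int length)
    {
        if (string.IsNullOrEmpty(content)) return string.Empty;

        // Clamp the hit range so the index searches below never go out of bounds.
        start = Math.Max(0, Math.Min(start, content.Length - 1));
        int end = Math.Min(start + Math.Max(length, 0), content.Length);

        // Nearest newline at or before the hit, and nearest newline at or after its end.
        int h = content.LastIndexOf('\n', start);
        int t = end < content.Length ? content.IndexOf('\n', end) : -1;

        int from = h >= 0 ? h + 1 : 0;           // first line: start of text
        int to = t >= 0 ? t : content.Length;    // last line: end of text
        if (to < from) return string.Empty;      // degenerate hit sitting on a newline

        return content.Substring(from, to - from).Trim();
    }
}
```

For the text "first line\nsecond hit here\nthird", a hit at position 18 with length 3 returns "second hit here", and a hit at position 27 with length 5 returns "third".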
Full source download: FileSearch (C#, .NET 4.0)
Related information and WEBUS2.0 SDK download: 继续我的代码,分享我的快乐 - WEBUS2.0