On Dirty-Word Dictionary Filtering: Using Regular Expressions to Filter Dirty Data
2007-09-26 09:06
Method 1: Using a regular expression
```csharp
using System;
using System.IO;
using System.Text;
using System.Text.RegularExpressions;
using System.Web;

public class ValidDirty
{
    // File that stores the dirty-word dictionary
    private static readonly string FILE_NAME = "zang.txt";
    // Dirty-word alternation, e.g. word1|word2|word3
    public static string dirtyStr = "";

    public ValidDirty()
    {
        GetRegex();
    }

    // Returns the cached regex, rebuilding it when the cache entry is missing or has expired
    private static Regex GetRegex()
    {
        Regex validateReg = (Regex)HttpRuntime.Cache["Regex"];
        if (validateReg == null)
        {
            dirtyStr = ReadDic();
            // Matches only text in which no position starts or ends a dirty word.
            // Singleline lets '.' match newlines, so multi-line text is not rejected outright.
            validateReg = new Regex("^((?!" + dirtyStr + ").(?<!" + dirtyStr + "))*$",
                RegexOptions.Compiled | RegexOptions.ExplicitCapture | RegexOptions.Singleline);
            HttpRuntime.Cache.Insert("Regex", validateReg, null, DateTime.Now.AddMinutes(20), TimeSpan.Zero);
        }
        return validateReg;
    }

    private static string ReadDic()
    {
        // Build the path in a local variable; reassigning FILE_NAME would prepend the directory again on every call
        string path = Path.Combine(Environment.CurrentDirectory, FILE_NAME);

        if (!File.Exists(path))
        {
            Console.WriteLine("{0} does not exist.", path);
            return "";
        }
        StringBuilder input = new StringBuilder();
        // Method 2 reads this file as GB2312, so the same encoding is used here
        using (StreamReader sr = new StreamReader(path, Encoding.GetEncoding("gb2312")))
        {
            string line;
            while ((line = sr.ReadLine()) != null)
            {
                // Escape regex metacharacters in each word and join all words with '|',
                // so words on separate lines do not run together
                foreach (string word in line.Split(new char[] { '|' }, StringSplitOptions.RemoveEmptyEntries))
                {
                    if (input.Length > 0) input.Append('|');
                    input.Append(Regex.Escape(word));
                }
            }
        }
        return input.ToString();
    }

    // Returns true when str contains no dirty word
    public bool ValidByReg(string str)
    {
        return GetRegex().IsMatch(str);
    }
}
```
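This approach does not feel very efficient. In a quick test on a 1000-character article against a dictionary of more than 800 keywords, it took 1.238 seconds. If anyone knows a better way, please share it!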
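One variant that may be cheaper, sketched under assumptions: instead of asserting at every position that no keyword starts or ends there, search for any keyword with a plain alternation and negate the result. `DirtyWordRegex`, `Build`, and `IsClean` are hypothetical names, and the three-word list stands in for the real dictionary. Whether this is actually faster with 800+ keywords would need to be measured.

```csharp
using System;
using System.Text.RegularExpressions;

public static class DirtyWordRegex
{
    // Builds "word1|word2|..." with regex metacharacters escaped.
    public static Regex Build(string[] words)
    {
        if (words.Length == 0)
        {
            // An empty pattern matches every text, so refuse to build one.
            throw new ArgumentException("At least one word is required.", "words");
        }
        string[] escaped = new string[words.Length];
        for (int i = 0; i < words.Length; i++)
        {
            escaped[i] = Regex.Escape(words[i]);
        }
        return new Regex(string.Join("|", escaped), RegexOptions.Compiled);
    }

    // True when the text contains none of the words.
    public static bool IsClean(Regex dirty, string text)
    {
        return !dirty.IsMatch(text);
    }

    public static void Main()
    {
        Regex dirty = Build(new string[] { "foo", "bar", "a.b" });
        Console.WriteLine(IsClean(dirty, "hello world")); // no keyword present
        Console.WriteLine(IsClean(dirty, "a foo here"));  // contains "foo"
        Console.WriteLine(IsClean(dirty, "axb"));         // "a.b" is escaped, so "axb" is not a hit
    }
}
```

Because the engine can stop at the first keyword it finds, dirty text is rejected without scanning the rest of the input.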
Method 2: Plain loop search
```csharp
// Member of ValidDirty; needs using System, System.Collections, System.IO and System.Text.
// Returns true when str contains no dirty word.
public bool ValidGeneral(string str)
{
    if (!File.Exists(FILE_NAME))
    {
        Console.WriteLine("Dictionary file not found: {0}", FILE_NAME);
        return false;
    }

    // Read every line of the dictionary file
    ArrayList arrText = new ArrayList();
    using (StreamReader objReader = new StreamReader(FILE_NAME, Encoding.GetEncoding("gb2312")))
    {
        string sLine;
        while ((sLine = objReader.ReadLine()) != null)
        {
            arrText.Add(sLine);
        }
    }

    foreach (string sOutput in arrText)
    {
        // Skip empty entries: IndexOf("") returns 0 and would flag every text as dirty
        string[] strArr = sOutput.Split(new char[] { '|' }, StringSplitOptions.RemoveEmptyEntries);

        for (int i = 0; i < strArr.Length; i++)
        {
            // Ordinal comparison avoids culture-sensitive matching
            if (str.IndexOf(strArr[i], StringComparison.Ordinal) != -1)
            {
                return false;
            }
        }
    }
    return true;
}
```
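`ValidGeneral` reads the dictionary file again on every call. A minimal sketch of parsing the words once and reusing them, assuming the dictionary rarely changes while the program runs. `DirtyWordList` and its members are hypothetical names, and the words are passed in as lines so the sketch does not depend on `zang.txt`.

```csharp
using System;
using System.Collections.Generic;

public class DirtyWordList
{
    private readonly List<string> words = new List<string>();

    // Parses '|'-separated lines once; empty entries are dropped.
    public DirtyWordList(IEnumerable<string> lines)
    {
        foreach (string line in lines)
        {
            foreach (string word in line.Split(new char[] { '|' }, StringSplitOptions.RemoveEmptyEntries))
            {
                words.Add(word);
            }
        }
    }

    public int Count
    {
        get { return words.Count; }
    }

    // True when text contains none of the words (ordinal comparison).
    public bool IsClean(string text)
    {
        foreach (string word in words)
        {
            if (text.IndexOf(word, StringComparison.Ordinal) != -1)
            {
                return false;
            }
        }
        return true;
    }

    public static void Main()
    {
        // In real use the lines could come from File.ReadAllLines with the file's encoding.
        DirtyWordList list = new DirtyWordList(new string[] { "foo|bar|", "", "baz" });
        Console.WriteLine(list.Count);              // three words survive parsing
        Console.WriteLine(list.IsClean("hello"));   // no keyword present
        Console.WriteLine(list.IsClean("a bar b")); // contains "bar"
    }
}
```

A single instance could be kept in a static field or in the same cache that Method 1 uses for its regex.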
Below is the test code. If you spot any problems, please point them out!
```csharp
// Body of a console Main method; needs using System, System.Diagnostics and System.Text.
Stopwatch sw = Stopwatch.StartNew();

// "213" followed by 200 repetitions of the sample phrase (1203 characters in total)
StringBuilder sb = new StringBuilder("213");
for (int i = 0; i < 200; i++)
{
    sb.Append("珍惜水晶之恋");
}
string str = sb.ToString();

ValidDirty vd = new ValidDirty();
Console.WriteLine(vd.ValidByReg(str));

sw.Stop();
// The elapsed time also covers building the text and constructing ValidDirty
// (reading the dictionary and compiling the regex), not just the match itself
Console.WriteLine(sw.Elapsed.TotalMilliseconds);
Console.Read();
```
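The test starts its timer before the text is built and before `ValidDirty` is constructed, so the figure it prints mixes one-time setup with the match itself. A sketch of timing the two separately with `Stopwatch` follows. `TimingSketch`, `AverageMs`, and `BuildText` are hypothetical names, and the two-word regex is only a stand-in for the real dictionary pattern.

```csharp
using System;
using System.Diagnostics;
using System.Text;
using System.Text.RegularExpressions;

public static class TimingSketch
{
    // Runs action `runs` times and returns the average elapsed milliseconds per run.
    public static double AverageMs(Action action, int runs)
    {
        Stopwatch sw = Stopwatch.StartNew();
        for (int i = 0; i < runs; i++)
        {
            action();
        }
        sw.Stop();
        return sw.Elapsed.TotalMilliseconds / runs;
    }

    // "213" followed by `repeats` copies of the six-character sample phrase.
    public static string BuildText(int repeats)
    {
        StringBuilder sb = new StringBuilder("213");
        for (int i = 0; i < repeats; i++)
        {
            sb.Append("珍惜水晶之恋");
        }
        return sb.ToString();
    }

    public static void Main()
    {
        string text = BuildText(200);

        // Setup cost: constructing the regex plus one warm-up match, measured once.
        Stopwatch setup = Stopwatch.StartNew();
        Regex dirty = new Regex("foo|bar", RegexOptions.Compiled);
        dirty.IsMatch(text);
        setup.Stop();

        // Matching cost: averaged over many calls on the already-built regex.
        double perMatch = AverageMs(delegate { dirty.IsMatch(text); }, 1000);

        Console.WriteLine("setup: {0} ms", setup.Elapsed.TotalMilliseconds);
        Console.WriteLine("match: {0} ms per call", perMatch);
    }
}
```

Averaging over many calls also smooths out timer resolution, which matters when a single match takes well under a millisecond.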
Elapsed time (ms) by length of the text searched:

| Algorithm | 10 Chinese characters | 100 Chinese characters | 1000 Chinese characters |
|---|---|---|---|
| Regular expression | 980 | 999 | 1234 |
| Plain loop | 234 | 234 | 265 |
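In both rows the time barely changes between 10 and 1000 characters, which suggests that most of each figure is a fixed cost rather than time spent matching. The arithmetic can be sketched as below: it works out to roughly 0.26 ms per extra character for the regex and 0.03 ms for the plain loop. Reading the fixed part as setup (loading the dictionary, building the regex) is an assumption, not something these measurements prove. `TableArithmetic` and `MsPerChar` are hypothetical names.

```csharp
using System;

public static class TableArithmetic
{
    // Extra milliseconds per additional character between two measurements.
    public static double MsPerChar(int chars1, double ms1, int chars2, double ms2)
    {
        return (ms2 - ms1) / (chars2 - chars1);
    }

    public static void Main()
    {
        // Figures taken from the 10-character and 1000-character columns of the table.
        double regex = MsPerChar(10, 980, 1000, 1234);
        double plain = MsPerChar(10, 234, 1000, 265);
        Console.WriteLine("regex: {0:F3} ms per extra character", regex);
        Console.WriteLine("plain: {0:F3} ms per extra character", plain);
    }
}
```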