抓取必应在线词典的英语音标
2015-04-03 15:12
1271 查看
最近要做一个音标词典,却苦于找不到好的音标词库,而必应词典的音标比较好,所以写了这个自动抓取必应词典的英语词汇音标程序。
基本原理:
1. WEBBROWSER元素内容的抓取
2.正则表达式精确提取目标内容
编程语言:
VB6
准备工作:
1. 引用WEBBROWSER控件
2. 引用正则表达式功能
3. 引用EXCEL功能
4. 在程序同一目录中创建一个dictionary.xls的EXCEL文件,表中第一列存储好英语单词,自动搜索后的结果会保存到第二列
界面FORM:
1. 浏览器webbrowser1
2. 按钮command1,command2
3. 文本框text1,text2,text3
全部代码:
Dim xlApp As Object
Dim xlBook As Object
Dim xlSheet As Object
Dim pointer As Long
Dim word As String
Dim actionFlag As Boolean
Dim startNumber, endNumber As Long
Function getPhoneticString() As String
'从WEBBROWSER中获取音标
Dim i As Object
Dim strHtml, resultStr As String
For Each i In WebBrowser1.Document.All
strHtml = i.innerHTML
If bTest(strHtml, "英\[[^\]]*\]") = True Then
resultStr = getMatch(strHtml, "英\[[^\]]*\]")
Exit For
End If
If bTest(strHtml, "美\[[^\]]*\]") = True Then
resultStr = getMatch(strHtml, "美\[[^\]]*\]")
Exit For
End If
Next
getPhoneticString = resultStr
End Function
Function mainLoop() As Integer
xlSheet.Cells(pointer, 2).Value = getPhoneticString()
xlBook.Save
Text1.Text = getPhoneticString()
If pointer < endNumber Then
pointer = pointer + 1
Else
Beep
MsgBox "搜索完毕!"
Unload Me
End
End If
word = xlSheet.Cells(pointer, 1).Value
searchWord (word)
mainLoop = 0
End Function
Function bTest(ByVal s As String, ByVal p As String) As Boolean
Dim re As RegExp
Set re = New RegExp
re.IgnoreCase = False
re.Pattern = p
bTest = re.Test(s)
End Function
Function getMatch(ByVal s As String, ByVal p As String) As String
Dim reg As RegExp
Dim mhs As MatchCollection
Dim mh As Match
Set reg = New RegExp
reg.IgnoreCase = True
reg.MultiLine = True
reg.Global = True
reg.Pattern = p
Dim str
str = s
If reg.Test(str) Then
Set mhs = reg.Execute(str)
Set mh = mhs(0)
getMatch = mh
End If
End Function
Function searchWord(ByVal w As String)
Dim searchString As String
searchString = "http://cn.bing.com/dict/search?q=" & w
WebBrowser1.Navigate searchString
End Function
Private Sub Form_Load()
'启动程序准备工作
WebBrowser1.Navigate "about:blank"
Set xlApp = Nothing
Set xlBook = Nothing
Set xlSheet = Nothing
Set xlApp = CreateObject("Excel.Application")
Set xlBook = xlApp.Workbooks.Open(App.Path & "/dictionary.xls")
Set xlSheet = xlBook.Worksheets(1)
actionFlag = False
startNumber = Val(Text2.Text)
endNumber = Val(Text3.Text)
pointer = 1
pointer = startNumber
End Sub
Private Sub Command1_Click()
'开始搜索
startNumber = Val(Text2.Text)
endNumber = Val(Text3.Text)
pointer = startNumber
actionFlag = True
word = xlSheet.Cells(pointer, 1).Value
searchWord (word)
End Sub
Private Sub Command2_Click()
'停止搜索
Unload Me
End
actionFlag = False
End Sub
Private Sub WebBrowser1_DownloadComplete()
'每次网页加载完毕执行主搜索程序
If actionFlag = True Then
Dim a As Integer
a = mainLoop()
End If
End Sub
Private Sub Form_Unload(Cancel As Integer)
'退出程序要保存并关闭EXCEL
xlBook.Save
xlBook.Close
xlApp.Quit
Set xlApp = Nothing
Set xlBook = Nothing
Set xlSheet = Nothing
End Sub
基本原理:
1. WEBBROWSER元素内容的抓取
2.正则表达式精确提取目标内容
编程语言:
VB6
准备工作:
1. 引用WEBBROWSER控件
2. 引用正则表达式功能
3. 引用EXCEL功能
4. 在程序同一目录中创建一个dictionary.xls的EXCEL文件,表中第一列存储好英语单词,自动搜索后的结果会保存到第二列
界面FORM:
1. 浏览器webbrowser1
2. 按钮command1,command2
3. 文本框text1,text2,text3
全部代码:
Dim xlApp As Object
Dim xlBook As Object
Dim xlSheet As Object
Dim pointer As Long
Dim word As String
Dim actionFlag As Boolean
Dim startNumber, endNumber As Long
Function getPhoneticString() As String
'从WEBBROWSER中获取音标
Dim i As Object
Dim strHtml, resultStr As String
For Each i In WebBrowser1.Document.All
strHtml = i.innerHTML
If bTest(strHtml, "英\[[^\]]*\]") = True Then
resultStr = getMatch(strHtml, "英\[[^\]]*\]")
Exit For
End If
If bTest(strHtml, "美\[[^\]]*\]") = True Then
resultStr = getMatch(strHtml, "美\[[^\]]*\]")
Exit For
End If
Next
getPhoneticString = resultStr
End Function
Function mainLoop() As Integer
xlSheet.Cells(pointer, 2).Value = getPhoneticString()
xlBook.Save
Text1.Text = getPhoneticString()
If pointer < endNumber Then
pointer = pointer + 1
Else
Beep
MsgBox "搜索完毕!"
Unload Me
End
End If
word = xlSheet.Cells(pointer, 1).Value
searchWord (word)
mainLoop = 0
End Function
Function bTest(ByVal s As String, ByVal p As String) As Boolean
Dim re As RegExp
Set re = New RegExp
re.IgnoreCase = False
re.Pattern = p
bTest = re.Test(s)
End Function
Function getMatch(ByVal s As String, ByVal p As String) As String
Dim reg As RegExp
Dim mhs As MatchCollection
Dim mh As Match
Set reg = New RegExp
reg.IgnoreCase = True
reg.MultiLine = True
reg.Global = True
reg.Pattern = p
Dim str
str = s
If reg.Test(str) Then
Set mhs = reg.Execute(str)
Set mh = mhs(0)
getMatch = mh
End If
End Function
Function searchWord(ByVal w As String)
Dim searchString As String
searchString = "http://cn.bing.com/dict/search?q=" & w
WebBrowser1.Navigate searchString
End Function
Private Sub Form_Load()
'启动程序准备工作
WebBrowser1.Navigate "about:blank"
Set xlApp = Nothing
Set xlBook = Nothing
Set xlSheet = Nothing
Set xlApp = CreateObject("Excel.Application")
Set xlBook = xlApp.Workbooks.Open(App.Path & "/dictionary.xls")
Set xlSheet = xlBook.Worksheets(1)
actionFlag = False
startNumber = Val(Text2.Text)
endNumber = Val(Text3.Text)
pointer = 1
pointer = startNumber
End Sub
Private Sub Command1_Click()
'开始搜索
startNumber = Val(Text2.Text)
endNumber = Val(Text3.Text)
pointer = startNumber
actionFlag = True
word = xlSheet.Cells(pointer, 1).Value
searchWord (word)
End Sub
Private Sub Command2_Click()
'停止搜索
Unload Me
End
actionFlag = False
End Sub
Private Sub WebBrowser1_DownloadComplete()
'每次网页加载完毕执行主搜索程序
If actionFlag = True Then
Dim a As Integer
a = mainLoop()
End If
End Sub
Private Sub Form_Unload(Cancel As Integer)
'退出程序要保存并关闭EXCEL
xlBook.Save
xlBook.Close
xlApp.Quit
Set xlApp = Nothing
Set xlBook = Nothing
Set xlSheet = Nothing
End Sub
相关文章推荐
- “不作死就不会死”等词收入知名在线英语词典
- JAVA写的网页爬虫爬取必应词典的词汇音标
- ecosphere是什么意思_ecosphere的翻译_音标_读音_用法_例句 - 必应 Bing 词典
- 一个英语在线小词典 -- 使用WPF开发
- 市场分析(一)——现今各大在线英语词典比较
- on call是什么意思_on call的翻译_音标_读音_用法_例句 - 必应 Bing 词典
- 必应词典--英语学习APP案例分析
- 中文英语词典太烂,推荐两个在线的纯英文字典
- 利用在线词典批量查询英语单词
- 英语在线词典
- 用AHK做个强大的在线英语词典
- 英语音标-元音学习小结
- 必应词典手机版(IOS版)与有道词典(IOS版)之软件分析【功能篇】【用户体验篇】
- 必应词典使用体验及改进建议
- 适合初学者开发的C#在线英汉词典小程序
- 解决金山词霸2007中美国传统词典音标乱码问题
- 【C#】WinForm 之 DOTA2英雄搭配助手(网页抓取+在线绿色版+源码开放)
- 在线词典2月1日计划
- 模拟有道词典在线查询
- BlackBerry 9850 应用:英语词典 海词