
A survey paper: A brief survey of web data extraction tools

2009-01-07 14:54
A classic survey; scholar.google.cn shows that it has been cited more than 300 times.

Laender, A. H. F.; Ribeiro-Neto, B. A.; da Silva, A. S. & Teixeira, J. S. A brief survey of web data extraction tools. SIGMOD Rec., ACM, 2002, 31, 84-93

Abstract: In the last few years, several works in the literature have addressed
the problem of data extraction from Web pages. The importance of this
problem derives from the fact that, once extracted, the data can be
handled in a way similar to instances of a traditional database. The
approaches proposed in the literature to address the problem of Web
data extraction use techniques borrowed from areas such as natural
language processing, languages and grammars, machine learning,
information retrieval, databases, and ontologies. As a consequence,
they present very distinct features and capabilities which make a
direct comparison difficult to be done. In this paper, we propose a
taxonomy for characterizing Web data extraction tools, briefly survey
major Web data extraction tools described in the literature, and
provide a qualitative analysis of them. Hopefully, this work will
stimulate other studies aimed at a more comprehensive analysis of data
extraction approaches and tools for Web data.