A General-Purpose Schema Definition Format for Data Collection
2015-02-09 16:39
{
  "name": "凤凰金融",
  "notice": {
    "data": "attribute",
    "matcher": [
      { "match": "xpath", "pattern": "//*[@id=\"page-financing\"]/div[1]/div[5]/div/div/div[3]" }
    ],
    "comments": "Site notice"
  },
  "url": {
    "data": "attribute",
    "value": "http://www.fengjr.com/financing/list?type=cx",
    "comments": "URL from which this platform's data is collected"
  },
  "project": {
    "data": "url",
    "url": {
      "data": "attribute",
      "matcher": [
        { "match": "xpath", "pattern": "//*[@id=\"page-financing\"]/div[1]/div[5]/div/div/div[3]" }
      ],
      "template": ""
    },
    "title": {
      "data": "attribute",
      "matcher": [
        { "match": "xpath", "pattern": "//*[@id=\"page-financing\"]/div[1]/div[5]/div/div/div[3]" }
      ]
    },
    "detail": {
      "title": {
        "data": "attribute",
        "matcher": [
          { "match": "xpath", "pattern": "//*[@id=\"page-financing\"]/div[1]/div[5]/div/div/div[3]" }
        ]
      },
      "amount": {
        "data": "attribute",
        "matcher": [
          { "match": "xpath", "pattern": "//*[@id=\"page-financing\"]/div[1]/div[5]/div/div/div[3]" }
        ]
      }
    }
  },
  "member": {
    "data": "sub_item",
    "sub_item": {
      "matcher": [
        { "match": "xpath", "pattern": "//*[@id=\"page-financing\"]/div[1]/div[5]/div/div/div[3]" }
      ],
      "src-save": 0,
      "url": {
        "matcher": [
          { "match": "xpath", "pattern": "//*[@id=\"page-financing\"]/div[1]/div[5]/div/div/div[3]" }
        ],
        "template": ""
      }
    },
    "detail": {
      "title": {
        "data": "attribute",
        "matcher": [
          { "match": "xpath", "pattern": "//*[@id=\"page-financing\"]/div[1]/div[5]/div/div/div[3]" }
        ]
      },
      "amount": {
        "data": "attribute",
        "matcher": [
          { "match": "xpath", "pattern": "//*[@id=\"page-financing\"]/div[1]/div[5]/div/div/div[3]" }
        ]
      }
    }
  },
  "src-save": 1
}
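As a rough illustration of how a schema of this shape might be consumed, the sketch below walks a parsed schema and lists every field that carries a `matcher`, together with its match type and pattern. The post does not show any consuming code, so the function name `iter_matchers`, the dotted field paths, and the trimmed-down sample schema are all assumptions made for this example.

```python
import json


def iter_matchers(node, path=()):
    """Yield (dotted_field_path, match_type, pattern) for each matcher rule.

    Hypothetical helper: the traversal order and the dotted-path naming
    are choices made for this sketch, not something the schema prescribes.
    """
    if isinstance(node, dict):
        # A field may declare several matcher rules; report each of them.
        for rule in node.get("matcher", []):
            yield ".".join(path), rule["match"], rule["pattern"]
        # Recurse into nested fields such as "project" -> "detail" -> "title".
        for key, child in node.items():
            if key != "matcher":
                yield from iter_matchers(child, path + (key,))


# Trimmed-down sample in the same shape as the schema above (assumed data).
SCHEMA_TEXT = r'''
{
  "name": "demo",
  "notice": {
    "data": "attribute",
    "matcher": [ { "match": "xpath", "pattern": "//*[@id=\"notice\"]" } ]
  },
  "project": {
    "data": "url",
    "detail": {
      "title": {
        "data": "attribute",
        "matcher": [ { "match": "xpath", "pattern": "//h1" } ]
      }
    }
  }
}
'''

schema = json.loads(SCHEMA_TEXT)
fields = list(iter_matchers(schema))
for field, kind, pattern in fields:
    print(f"{field}: {kind} -> {pattern}")
```

A listing like this is one way to check that every field you expect to extract actually has a rule before any page is fetched.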
Addendum:
{
  "name": "凤凰金融",
  "notice": {
    "data": "attribute",
    "matcher": [
      { "match": "xpath", "pattern": "//*[@id=\"page-financing\"]/div[1]/div[5]/div/div/div[3]" }
    ]
  },
  "url": {
    "data": "attribute",
    "value": "http://www.fengjr.com/financing/list?type=cx"
  },
  "project": {
    "data": "url",
    "url": {
      "data": "attribute",
      "matcher": [
        { "match": "xpath", "pattern": "//*[@id=\"page-financing\"]/div[1]/div[5]/div/div/div[3]" }
      ],
      "template": ""
    },
    "title": {
      "data": "attribute",
      "matcher": [
        { "match": "xpath", "pattern": "//*[@id=\"page-financing\"]/div[1]/div[5]/div/div/div[3]" }
      ]
    },
    "detail": {
      "name": "Online loan list",
      "title": {
        "data": "attribute",
        "matcher": [
          { "match": "xpath", "pattern": "//*[@id=\"page-financing\"]/div[1]/div[5]/div/div/div[3]" }
        ]
      },
      "amount": {
        "data": "attribute",
        "matcher": [
          { "match": "xpath", "pattern": "//*[@id=\"page-financing\"]/div[1]/div[5]/div/div/div[3]" }
        ]
      }
    }
  },
  "member": {
    "data": "sub_item",
    "sub_item": {
      "matcher": [
        { "match": "xpath", "pattern": "//*[@id=\"page-financing\"]/div[1]/div[5]/div/div/div[3]" }
      ],
      "src-save": 0,
      "url": {
        "data": "attribute",
        "matcher": [
          { "match": "xpath", "pattern": "//*[@id=\"page-financing\"]/div[1]/div[5]/div/div/div[3]" }
        ],
        "template": ""
      }
    },
    "detail": {
      "name": "Member materials",
      "title": {
        "data": "attribute",
        "matcher": [
          { "match": "xpath", "pattern": "//*[@id=\"page-financing\"]/div[1]/div[5]/div/div/div[3]" }
        ]
      },
      "amount": {
        "data": "attribute",
        "matcher": [
          { "match": "xpath", "pattern": "//*[@id=\"page-financing\"]/div[1]/div[5]/div/div/div[3]" }
        ]
      }
    }
  },
  "src-save": 1,
  "crawler": {
    "handler": "httpClient|selenium",
    "results": "html|json|text",
    "next_page": {
      "matcher": [
        { "match": "xpath", "pattern": "//*[@id=\"page-financing\"]/div[1]/div[5]/div/div/div[3]" }
      ],
      "template": ""
    },
    "history": "re-crawl|skip|stop"
  }
}
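The `crawler` block writes `handler`, `results`, and `history` as pipe-separated strings. The post does not spell out their meaning; one plausible reading, assumed here, is that each string enumerates the values a concrete configuration may pick from. Under that assumption, a minimal validation sketch could look like the following (the names `parse_options`, `CRAWLER_SPEC`, and `validate_crawler` are hypothetical).

```python
def parse_options(spec):
    """Split an 'a|b|c' option string into the set of allowed values."""
    return {opt.strip() for opt in spec.split("|") if opt.strip()}


# Option strings copied from the "crawler" block; treating them as
# enumerations of allowed values is an assumption of this sketch.
CRAWLER_SPEC = {
    "handler": "httpClient|selenium",
    "results": "html|json|text",
    "history": "re-crawl|skip|stop",
}


def validate_crawler(config, spec=CRAWLER_SPEC):
    """Return a list of error messages; an empty list means the config passes."""
    errors = []
    for key, options in spec.items():
        allowed = parse_options(options)
        value = config.get(key)
        if value not in allowed:
            errors.append(f"{key}: {value!r} not in {sorted(allowed)}")
    return errors


# A concrete configuration that picks one value per option.
print(validate_crawler({"handler": "selenium", "results": "json", "history": "skip"}))
# A configuration with an unsupported handler.
print(validate_crawler({"handler": "curl", "results": "json", "history": "skip"}))
```

Checking the chosen values up front keeps a typo such as an unknown handler from surfacing only after a crawl has started.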