[Nutch 2.2.1 Basic Tutorial, Part 1] Nutch-Related Exceptions
2015-06-16 15:59
1. The following error appears as soon as the job starts, while the URLs are being injected.
InjectorJob: Injecting urlDir: urls
InjectorJob: Using class org.apache.gora.hbase.store.HBaseStore as the Gora storage class.
InjectorJob: java.lang.RuntimeException: job failed: name=[20140000]inject urls, jobid=job_local1629320149_0001
at org.apache.nutch.util.NutchJob.waitForCompletion(NutchJob.java:54)
at org.apache.nutch.crawl.InjectorJob.run(InjectorJob.java:233)
at org.apache.nutch.crawl.InjectorJob.inject(InjectorJob.java:251)
at org.apache.nutch.crawl.InjectorJob.run(InjectorJob.java:273)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at org.apache.nutch.crawl.InjectorJob.main(InjectorJob.java:282)
The cause is a misconfigured regex-urlfilter.txt.
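
The post does not say which part of regex-urlfilter.txt was wrong, so the following is only a minimal sketch for checking the file before running the inject job. It assumes the usual file format: blank lines and lines starting with # are ignored, and every other line is a '+' (include) or '-' (exclude) sign followed by a Java regular expression. The class name UrlFilterCheck and the method check are hypothetical helpers written for this illustration; they are not part of Nutch.

```java
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

// Hypothetical helper, not part of Nutch: checks the lines of a
// regex-urlfilter.txt file under the format assumed in the text above.
public class UrlFilterCheck {

    /** Returns one message per problematic line; an empty list means nothing was flagged. */
    static List<String> check(List<String> lines) {
        List<String> problems = new ArrayList<>();
        for (int i = 0; i < lines.size(); i++) {
            String line = lines.get(i);
            // Assumption: blank lines, comments, and lines starting with a space are skipped.
            if (line.isEmpty() || line.charAt(0) == '#' || line.charAt(0) == ' ') {
                continue;
            }
            char sign = line.charAt(0);
            if (sign != '+' && sign != '-') {
                problems.add("line " + (i + 1) + ": must start with '+' or '-'");
                continue;
            }
            try {
                // Everything after the sign must compile as a java.util.regex pattern.
                Pattern.compile(line.substring(1));
            } catch (PatternSyntaxException e) {
                problems.add("line " + (i + 1) + ": invalid regex: " + e.getDescription());
            }
        }
        return problems;
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 1) {
            System.err.println("usage: java UrlFilterCheck <path to regex-urlfilter.txt>");
            return;
        }
        List<String> problems = check(Files.readAllLines(Paths.get(args[0])));
        if (problems.isEmpty()) {
            System.out.println("no problems found");
        } else {
            problems.forEach(System.out::println);
        }
    }
}
```

A file that passes this check can still cause trouble if its rules reject every seed URL, so it is also worth confirming that at least one '+' rule matches the URLs in the urls directory.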