
Attempt time threshold for jobs running in Hadoop

2014-01-03 02:19
The attempt time threshold is a useful feature for catching attempts that hang or run too slowly. However, you may sometimes need to run a script by forking a sub-process, and while it runs you may have no way to report the progress of the parent task. In Hadoop, if a task does not report progress within a given interval (600 seconds by default), the framework considers the attempt timed out, kills the current attempt process, and forks a new attempt.

With Hadoop Streaming, you can set mapred.task.timeout to whatever value you need on the command line when running your job, e.g.,

hadoop jar test.jar -jobconf mapred.task.timeout=3600000
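In context, a streaming job also needs input/output paths and a mapper/reducer, so the full command looks roughly like this (the paths, script names, and streaming-jar location below are placeholders; adjust them to your cluster layout):

```shell
# Placeholder paths and scripts; the streaming jar location varies by installation.
hadoop jar $HADOOP_HOME/contrib/streaming/hadoop-streaming.jar \
    -jobconf mapred.task.timeout=3600000 \
    -input /user/me/input \
    -output /user/me/output \
    -mapper mymapper.py \
    -reducer myreducer.py
```

Note that in newer Hadoop versions `-jobconf` is deprecated in favor of the generic `-D key=value` option.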

When developing your own jar, you can set it on the job configuration like this:

// 3,600,000 ms = 1 hour
long milliSeconds = 1000 * 60 * 60;
conf.setLong("mapred.task.timeout", milliSeconds);

NOTICE: The unit of the timeout property is milliseconds.
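Since the property is in milliseconds, the conversion is easy to get wrong. Here is a minimal cross-check of the arithmetic used above (plain Java, no Hadoop dependency):

```java
import java.util.concurrent.TimeUnit;

public class TimeoutMillis {
    public static void main(String[] args) {
        // One hour expressed in milliseconds, as in the snippet above.
        long milliSeconds = 1000 * 60 * 60;
        // Cross-check against the standard library conversion.
        System.out.println(milliSeconds == TimeUnit.HOURS.toMillis(1)); // true
        System.out.println(milliSeconds); // 3600000
    }
}
```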

Alternatively, if you can, report progress from within the task to avoid the timeout, e.g.,

context.progress();
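For example, a mapper that forks a long-running external script can keep the attempt alive by polling the sub-process and calling `context.progress()` while it waits. This is only a sketch: the class name and `myscript.sh` are hypothetical, it assumes the `org.apache.hadoop.mapreduce` API and Java 8+ (for `Process.isAlive()`), and it will not compile without the Hadoop libraries on the classpath.

```java
import java.io.IOException;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

// Hypothetical mapper that runs an external script for each input record.
public class ScriptMapper extends Mapper<LongWritable, Text, Text, Text> {
    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        // Fork the sub-process; "myscript.sh" is a placeholder.
        Process p = Runtime.getRuntime()
                .exec(new String[] {"sh", "myscript.sh", value.toString()});
        // Poll the sub-process and report progress so the framework
        // does not consider the attempt hung while we wait.
        while (p.isAlive()) {
            context.progress();    // tell the framework we are still alive
            Thread.sleep(10_000L); // check every 10 seconds
        }
        context.write(value, new Text(Integer.toString(p.exitValue())));
    }
}
```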

This article is from the "maxwell" blog; please keep this attribution: http://drmaxwell.blog.51cto.com/394635/1347904