Uploading a JAR to Hadoop from the command line and running the max-temperature example from Hadoop: The Definitive Guide, 2nd Edition
2014-06-24 19:29
My environment is as follows:
Ubuntu 13.10
Hadoop 1.1.2
JDK 8
1. The max-temperature program, following Hadoop: The Definitive Guide, 2nd Edition:
package com.sun.hadoop.mapreduce;

import java.io.IOException;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class NewMaxTemperature {

    static class NewMaxTemperatureMapper
            extends Mapper<LongWritable, Text, Text, IntWritable> {

        private static final int MISSING = 9999;

        @Override
        public void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String line = value.toString();
            String year = line.substring(15, 19);
            int airTemperature;
            if (line.charAt(87) == '+') { // skip the sign column for positive values
                airTemperature = Integer.parseInt(line.substring(88, 92));
            } else {
                airTemperature = Integer.parseInt(line.substring(87, 92));
            }
            String quality = line.substring(92, 93);
            if (airTemperature != MISSING && quality.matches("[01459]")) {
                context.write(new Text(year), new IntWritable(airTemperature));
            }
        }
    }

    static class NewMaxTemperatureReduce
            extends Reducer<Text, IntWritable, Text, IntWritable> {

        @Override
        public void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int maxValue = Integer.MIN_VALUE;
            for (IntWritable value : values) {
                maxValue = Math.max(maxValue, value.get());
            }
            context.write(key, new IntWritable(maxValue));
        }
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 2) {
            System.out.println("Usage: MaxTemperature <input path> <output path>");
            System.exit(-1);
        }
        Job job = new Job();
        job.setJarByClass(NewMaxTemperature.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        job.setMapperClass(NewMaxTemperatureMapper.class);
        job.setReducerClass(NewMaxTemperatureReduce.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
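The mapper above depends on the fixed-width layout of NCDC weather records: the year sits at columns 15-18, the temperature sign at column 87, the temperature (in tenths of a degree Celsius) at columns 88-91, and the quality flag at column 92. As a quick way to see this logic in isolation, here is a minimal standalone sketch; the class name ParseSketch and the synthetic 93-character record are my own illustration, not part of the original post, and real NCDC lines carry many more fields:

```java
// Standalone sketch of the parsing done in NewMaxTemperatureMapper,
// run against a synthetic fixed-width record (filler columns are made up).
public class ParseSketch {

    static String year(String line) {
        return line.substring(15, 19);
    }

    static int airTemperature(String line) {
        // Older JDKs' Integer.parseInt rejects a leading '+',
        // so skip the sign column when the value is positive.
        if (line.charAt(87) == '+') {
            return Integer.parseInt(line.substring(88, 92));
        }
        return Integer.parseInt(line.substring(87, 92));
    }

    static boolean goodQuality(String line) {
        return line.substring(92, 93).matches("[01459]");
    }

    static String sampleLine() {
        StringBuilder sb = new StringBuilder();
        sb.append("006701199099999");                 // columns 0-14: filler
        sb.append("1950");                            // columns 15-18: year
        for (int i = 0; i < 68; i++) sb.append('9');  // columns 19-86: filler
        sb.append("+0122");                           // columns 87-91: +12.2 deg C in tenths
        sb.append("1");                               // column 92: quality flag
        return sb.toString();                         // 93 characters total
    }

    public static void main(String[] args) {
        String line = sampleLine();
        System.out.println(year(line) + "\t" + airTemperature(line)
                + "\t" + goodQuality(line));
        // prints: 1950	122	true
    }
}
```

Running this prints the (year, temperature) pair the mapper would emit for that record, which helps confirm the column offsets before submitting the full job.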
2. Generate the JAR file directly from Eclipse:
right-click the Java project --> Export --> Java --> JAR file
3. Copy the JAR to the Hadoop installation directory.
My path is /usr/local/hadoop.
4. Put the test data into HDFS (the sample data can be downloaded from https://github.com/tomwhite/hadoop-book).
From the Hadoop directory: bin/hadoop dfs -put /home/test/sample.txt /input/sample.txt
5. Run the JAR file.
From the Hadoop directory: bin/hadoop jar newtemperature.jar com.sun.hadoop.mapreduce.NewMaxTemperature /input/sample.txt /output/
Here newtemperature.jar is the name of the generated JAR, com.sun.hadoop.mapreduce is the package name, and NewMaxTemperature is the class name. If the class is not in a package, pass just the class name with no package prefix.
If you see output like the following, the job ran successfully:
14/06/24 19:28:45 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
14/06/24 19:28:45 INFO input.FileInputFormat: Total input paths to process : 1
14/06/24 19:28:45 INFO util.NativeCodeLoader: Loaded the native-hadoop library
14/06/24 19:28:45 WARN snappy.LoadSnappy: Snappy native library not loaded
14/06/24 19:28:46 INFO mapred.JobClient: Running job: job_201406241439_0006
14/06/24 19:28:47 INFO mapred.JobClient:  map 0% reduce 0%
14/06/24 19:28:52 INFO mapred.JobClient:  map 100% reduce 0%
14/06/24 19:28:59 INFO mapred.JobClient:  map 100% reduce 33%
14/06/24 19:29:01 INFO mapred.JobClient:  map 100% reduce 100%
14/06/24 19:29:01 INFO mapred.JobClient: Job complete: job_201406241439_0006
14/06/24 19:29:01 INFO mapred.JobClient: Counters: 29
14/06/24 19:29:01 INFO mapred.JobClient:   Map-Reduce Framework
14/06/24 19:29:01 INFO mapred.JobClient:     Spilled Records=16
14/06/24 19:29:01 INFO mapred.JobClient:     Map output materialized bytes=94
14/06/24 19:29:01 INFO mapred.JobClient:     Reduce input records=8
14/06/24 19:29:01 INFO mapred.JobClient:     Virtual memory (bytes) snapshot=3776598016
14/06/24 19:29:01 INFO mapred.JobClient:     Map input records=8
14/06/24 19:29:01 INFO mapred.JobClient:     SPLIT_RAW_BYTES=97
14/06/24 19:29:01 INFO mapred.JobClient:     Map output bytes=72
14/06/24 19:29:01 INFO mapred.JobClient:     Reduce shuffle bytes=94
14/06/24 19:29:01 INFO mapred.JobClient:     Physical memory (bytes) snapshot=270008320
14/06/24 19:29:01 INFO mapred.JobClient:     Reduce input groups=1
14/06/24 19:29:01 INFO mapred.JobClient:     Combine output records=0
14/06/24 19:29:01 INFO mapred.JobClient:     Reduce output records=1
14/06/24 19:29:01 INFO mapred.JobClient:     Map output records=8
14/06/24 19:29:01 INFO mapred.JobClient:     Combine input records=0
14/06/24 19:29:01 INFO mapred.JobClient:     CPU time spent (ms)=2070
14/06/24 19:29:01 INFO mapred.JobClient:     Total committed heap usage (bytes)=240123904
14/06/24 19:29:01 INFO mapred.JobClient:   File Input Format Counters
14/06/24 19:29:01 INFO mapred.JobClient:     Bytes Read=1080
14/06/24 19:29:01 INFO mapred.JobClient:   FileSystemCounters
14/06/24 19:29:01 INFO mapred.JobClient:     HDFS_BYTES_READ=1177
14/06/24 19:29:01 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=104412
14/06/24 19:29:01 INFO mapred.JobClient:     FILE_BYTES_READ=94
14/06/24 19:29:01 INFO mapred.JobClient:     HDFS_BYTES_WRITTEN=9
14/06/24 19:29:01 INFO mapred.JobClient:   Job Counters
14/06/24 19:29:01 INFO mapred.JobClient:     Launched map tasks=1
14/06/24 19:29:01 INFO mapred.JobClient:     Launched reduce tasks=1
14/06/24 19:29:01 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=8655
14/06/24 19:29:01 INFO mapred.JobClient:     Total time spent by all reduces waiting after reserving slots (ms)=0
14/06/24 19:29:01 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=4923
14/06/24 19:29:01 INFO mapred.JobClient:     Total time spent by all maps waiting after reserving slots (ms)=0
14/06/24 19:29:01 INFO mapred.JobClient:     Data-local map tasks=1
14/06/24 19:29:01 INFO mapred.JobClient:   File Output Format Counters
14/06/24 19:29:01 INFO mapred.JobClient:     Bytes Written=9