hadoop-mapreduce (1): Counting Words
2017-11-22 11:35
Write the map program
package com.cvicse.ump.hadoop.mapreduce.map;

import java.io.IOException;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class WordCountMap extends Mapper<LongWritable, Text, Text, IntWritable> {

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        // Split each input line on spaces and emit a (word, 1) pair for every word.
        String line = value.toString();
        String[] words = line.split(" ");
        for (String word : words) {
            context.write(new Text(word), new IntWritable(1));
        }
    }
}
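To see what the map step produces without a cluster, here is a minimal plain-Java sketch (no Hadoop dependencies, hypothetical class name MapSketch) that applies the same split-on-space logic and collects the (word, 1) pairs the mapper would emit:

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map.Entry;

public class MapSketch {
    // Mirrors WordCountMap: split the line on spaces, pair each word with 1.
    static List<Entry<String, Integer>> map(String line) {
        List<Entry<String, Integer>> pairs = new ArrayList<>();
        for (String word : line.split(" ")) {
            pairs.add(new SimpleEntry<>(word, 1));
        }
        return pairs;
    }

    public static void main(String[] args) {
        // The line "hello world hello" yields three pairs, with "hello" appearing twice.
        for (Entry<String, Integer> e : map("hello world hello")) {
            System.out.println(e.getKey() + "\t" + e.getValue());
        }
    }
}
```

Note that duplicate words are emitted as separate pairs; it is the shuffle and reduce phases, not the mapper, that combine them.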
Write the reduce program
package com.cvicse.ump.hadoop.mapreduce.reduce;

import java.io.IOException;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class WordCountReduce extends Reducer<Text, IntWritable, Text, IntWritable> {

    @Override
    protected void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
        // The framework has grouped all values for this word; sum the 1s to get its count.
        int count = 0;
        for (IntWritable value : values) {
            count += value.get();
        }
        context.write(key, new IntWritable(count));
    }
}
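The shuffle phase groups the mapper's pairs by key before the reducer sums them. A plain-Java sketch of that end-to-end grouping and summing (hypothetical class name ReduceSketch, no Hadoop dependencies) looks like this:

```java
import java.util.HashMap;
import java.util.Map;

public class ReduceSketch {
    // Simulates map + shuffle + reduce in one pass: split each line into words,
    // group by word, and accumulate the same per-word sum the reducer computes.
    static Map<String, Integer> wordCount(String[] lines) {
        Map<String, Integer> counts = new HashMap<>();
        for (String line : lines) {
            for (String word : line.split(" ")) {
                counts.merge(word, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = wordCount(new String[]{"hello world", "hello hadoop"});
        System.out.println(counts);
    }
}
```

In the real job this grouping is distributed: each reducer receives one word together with the full list of 1s emitted for it across all mappers.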
Write the main function
package com.cvicse.ump.hadoop.mapreduce;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

import com.cvicse.ump.hadoop.mapreduce.map.WordCountMap;
import com.cvicse.ump.hadoop.mapreduce.reduce.WordCountReduce;

public class WordCount {

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "wordCount");
        job.setJarByClass(WordCount.class);

        // Wire up the mapper and reducer written above.
        job.setMapperClass(WordCountMap.class);
        job.setReducerClass(WordCountReduce.class);

        // Key/value types for the map output and the final job output.
        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(IntWritable.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);

        // Input file and output directory come from the command line.
        FileInputFormat.setInputPaths(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));

        boolean success = job.waitForCompletion(true);
        if (!success) {
            System.out.println("wordcount task failed!");
        } else {
            System.out.println("wordcount task succeeded!");
        }
    }
}
Put wordcount.txt in the /dyh/data/input/ directory on HDFS.
Run: hadoop jar hdfs.jar com.cvicse.ump.hadoop.mapreduce.WordCount /dyh/data/input/wordcount.txt /dyh/data/output/1
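The upload and run steps above can be sketched as the following command sequence (a sketch assuming a running HDFS/YARN cluster; the jar name and paths are taken from this article, and the output directory must not already exist). part-r-00000 is the standard output file name when the job runs with a single reducer:

```shell
# Create the input directory on HDFS and upload the input file.
hdfs dfs -mkdir -p /dyh/data/input
hdfs dfs -put wordcount.txt /dyh/data/input/

# Submit the job: args[0] is the input file, args[1] the output directory.
hadoop jar hdfs.jar com.cvicse.ump.hadoop.mapreduce.WordCount \
    /dyh/data/input/wordcount.txt /dyh/data/output/1

# Inspect the result: one "word<TAB>count" line per distinct word.
hdfs dfs -cat /dyh/data/output/1/part-r-00000
```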