Hadoop demo: finding common friends
2017-06-21 16:59
Requirement
Below is QQ friend-list data. Before the colon is a user; after the colon are all of that user's friends (the friend relationships in the data are one-directional):
A:B,C,D,F,E,O
B:A,C,E,K
C:F,A,D,I
D:A,E,F,L
E:B,C,D,M,L
F:A,B,C,D,E,O,M
G:A,C,D,E,F
H:A,C,D,E,O
I:A,O
J:B,O
K:A,C,D
L:D,E,F
M:E,F,G
O:A,H,I,J
Find every pair of users that have common friends, and list who those common friends are.
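Before turning to MapReduce, it helps to pin down the definition: X is a common friend of A and B exactly when X appears in both A's and B's friend lists. The following plain-Java brute-force check is illustrative only and not part of the original solution; the class name is made up and the dataset is hard-coded instead of read from a file. It computes all pairs by set intersection and is handy for sanity-checking the MapReduce output later.

import java.util.*;

// Brute-force check of the "common friends" definition on the small dataset above.
public class CommonFriendsBruteForce {
    public static void main(String[] args) {
        String[] lines = {
            "A:B,C,D,F,E,O", "B:A,C,E,K", "C:F,A,D,I", "D:A,E,F,L",
            "E:B,C,D,M,L", "F:A,B,C,D,E,O,M", "G:A,C,D,E,F", "H:A,C,D,E,O",
            "I:A,O", "J:B,O", "K:A,C,D", "L:D,E,F", "M:E,F,G", "O:A,H,I,J"
        };
        // user -> that user's friend set
        Map<String, Set<String>> friends = new TreeMap<>();
        for (String line : lines) {
            String[] parts = line.split(":");
            friends.put(parts[0], new TreeSet<>(Arrays.asList(parts[1].split(","))));
        }
        // for every pair of users, print the intersection of their friend sets
        List<String> users = new ArrayList<>(friends.keySet());
        for (int i = 0; i < users.size(); i++) {
            for (int j = i + 1; j < users.size(); j++) {
                Set<String> common = new TreeSet<>(friends.get(users.get(i)));
                common.retainAll(friends.get(users.get(j)));
                if (!common.isEmpty()) {
                    System.out.println(users.get(i) + "-" + users.get(j) + "\t" + common);
                }
            }
        }
    }
}

For example, it prints A-B followed by [C, E], which is what the two-step MapReduce solution below should also produce.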
Step 1
Map
Read one line: A:B,C,D,F,E,O
Output: <B,A> <C,A> <D,A> <F,A> <E,A> <O,A>
Read another line: B:A,C,E,K
Output: <A,B> <C,B> <E,B> <K,B>
Reduce
For one key, the data received looks like <C,A> <C,B> <C,E> <C,F> <C,G> ... (a plain-Java simulation of how this grouping comes about follows the output list below)
Output:
<A-B,C>
<A-E,C>
<A-F,C>
<A-G,C>
<B-E,C>
<B-F,C>.....
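To make the map and shuffle of Step 1 concrete, here is a small plain-Java simulation; it is illustrative only and not part of the original programs, and the class name StepOneSimulation and the hard-coded input are assumptions. Each line is inverted into <friend, user> pairs, and a map from friend to user list stands in for the grouping that the MapReduce shuffle performs:

import java.util.*;

// Simulates Step 1: map emits <friend, user>, the shuffle groups values by friend.
public class StepOneSimulation {
    public static void main(String[] args) {
        String[] lines = { "A:B,C,D,F,E,O", "B:A,C,E,K" /* ... remaining lines of the dataset ... */ };
        // friend -> users who list that friend (imitates the shuffle grouping)
        Map<String, List<String>> grouped = new TreeMap<>();
        for (String line : lines) {
            String[] parts = line.split(":");
            String user = parts[0];
            for (String friend : parts[1].split(",")) {
                grouped.computeIfAbsent(friend, k -> new ArrayList<>()).add(user);
            }
        }
        // with the full dataset, key "C" ends up with the users [A, B, E, F, G, H, K]
        grouped.forEach((friend, users) -> System.out.println(friend + "\t" + users));
    }
}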
Step 2
Map
Read one line: <A-B,C>
Output it unchanged: <A-B,C>
Reduce
Data received: <A-B,C> <A-B,F> <A-B,G> ...
Output: A-B C,F,G,...
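The detail that makes this step work is that the user list is sorted before pairs are formed, so a pair is always written the same way (A-B, never B-A); otherwise the same pair would land under two different keys and its common friends would not be merged. Below is a plain-Java sketch of the pair generation and final aggregation; it is illustrative only, and the class and method names are assumptions:

import java.util.*;

// Simulates Step 2: for each friend's user list, emit <sorted pair, friend>,
// then group by pair to collect that pair's common friends.
public class StepTwoSimulation {
    // groupedByFriend is the Step 1 output: friend -> users who list that friend
    static Map<String, List<String>> commonFriends(Map<String, List<String>> groupedByFriend) {
        Map<String, List<String>> byPair = new TreeMap<>();
        for (Map.Entry<String, List<String>> e : groupedByFriend.entrySet()) {
            String friend = e.getKey();
            List<String> users = new ArrayList<>(e.getValue());
            Collections.sort(users);  // guarantees "A-B", never "B-A"
            for (int i = 0; i < users.size() - 1; i++) {
                for (int j = i + 1; j < users.size(); j++) {
                    String pair = users.get(i) + "-" + users.get(j);
                    byPair.computeIfAbsent(pair, k -> new ArrayList<>()).add(friend);
                }
            }
        }
        return byPair;  // e.g. "A-B" -> [C, E] for the dataset above
    }
}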
Step 1 code (note: in this implementation the Step 1 reducer only joins, for each friend, the list of users who have that friend; the pair generation described above is done by the Step 2 mapper)
package com.asin.hdp.commfriend;

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class CommFriendDemo {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf);

        job.setJarByClass(CommFriendDemo.class);
        job.setMapperClass(CommFriendMapper.class);
        job.setReducerClass(CommFriendReduce.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(Text.class);

        FileInputFormat.addInputPath(job, new Path("F:/friend.txt"));
        FileOutputFormat.setOutputPath(job, new Path("F:/outputFriend1"));

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}

// Map: for each line "user:f1,f2,...", emit <friend, user> so the shuffle groups
// together all users who list the same friend.
class CommFriendMapper extends Mapper<LongWritable, Text, Text, Text> {
    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        String[] split = value.toString().split(":");
        String user = split[0];
        String[] friends = split[1].split(",");
        for (String friend : friends) {
            context.write(new Text(friend), new Text(user));
        }
    }
}

// Reduce: key is a friend, values are all users who list that friend;
// join them into one comma-separated string.
class CommFriendReduce extends Reducer<Text, Text, Text, Text> {
    @Override
    protected void reduce(Text key, Iterable<Text> values, Context context)
            throws IOException, InterruptedException {
        StringBuilder users = new StringBuilder();
        for (Text text : values) {
            users.append(text).append(",");
        }
        context.write(key, new Text(users.toString()));
    }
}
Partial results
A I,K,C,B,G,F,H,O,D,
B A,F,J,E,
C A,E,B,H,F,G,K,
D G,C,K,A,L,F,E,H,
E G,M,L,H,A,F,B,D,
F L,M,D,C,G,A,
Step 2 code
package com.asin.hdp.commfriend;

import java.io.IOException;
import java.util.Arrays;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class CommFriendDemo2 {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf);

        job.setJarByClass(CommFriendDemo2.class);
        job.setMapperClass(CommFriendMapperS.class);
        job.setReducerClass(CommFriendReduceS.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(Text.class);

        FileInputFormat.addInputPath(job, new Path("F:/outputFriend1/part-r-00000"));
        FileOutputFormat.setOutputPath(job, new Path("F:/outputFriend2"));

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}

// Map: each input line is "friend<TAB>user1,user2,...". Sort the users so a pair is
// always emitted under the same key (A-B, never B-A), then emit <pair, friend>
// for every pair of users.
class CommFriendMapperS extends Mapper<LongWritable, Text, Text, Text> {
    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        String[] split = value.toString().split("\t");
        String friend = split[0];
        String[] userArr = split[1].split(",");
        Arrays.sort(userArr);
        // enumerate every pair i < j (the bounds "length - 2" / "length - 1" used
        // originally skipped the pairs containing the last user)
        for (int i = 0; i < userArr.length - 1; i++) {
            for (int j = i + 1; j < userArr.length; j++) {
                String user_user = userArr[i] + "-" + userArr[j];
                context.write(new Text(user_user), new Text(friend));
            }
        }
    }
}

// Reduce: key is a user pair, values are all of their common friends;
// join the friends into one comma-separated string.
class CommFriendReduceS extends Reducer<Text, Text, Text, Text> {
    @Override
    protected void reduce(Text key, Iterable<Text> values, Context context)
            throws IOException, InterruptedException {
        StringBuilder friends = new StringBuilder();
        for (Text text : values) {
            friends.append(text).append(",");
        }
        context.write(key, new Text(friends.toString()));
    }
}
Partial results
A-B C,E,
A-C F,D,
A-D E,F,
A-E B,C,D,
A-F C,D,B,E,O,
A-G D,E,F,C,
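In the two programs above, the second job's input path is hard-coded to part-r-00000 under the first job's output directory. A common alternative is a single driver that runs both jobs in sequence and only starts the second one after the first succeeds. The following is a minimal sketch, assuming the mapper and reducer classes above sit in the same package; the class name CommFriendDriver and the command-line argument layout are assumptions, not part of the original post:

package com.asin.hdp.commfriend;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

// Sketch of a driver that chains the two jobs (illustrative only).
public class CommFriendDriver {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Path input = new Path(args[0]);        // raw friend list
        Path intermediate = new Path(args[1]); // output of job 1 / input of job 2
        Path output = new Path(args[2]);       // final pair -> common friends

        Job job1 = Job.getInstance(conf, "invert friend list");
        job1.setJarByClass(CommFriendDriver.class);
        job1.setMapperClass(CommFriendMapper.class);
        job1.setReducerClass(CommFriendReduce.class);
        job1.setOutputKeyClass(Text.class);
        job1.setOutputValueClass(Text.class);
        FileInputFormat.addInputPath(job1, input);
        FileOutputFormat.setOutputPath(job1, intermediate);

        if (!job1.waitForCompletion(true)) {
            System.exit(1);  // stop if the first job failed
        }

        Job job2 = Job.getInstance(conf, "common friends per pair");
        job2.setJarByClass(CommFriendDriver.class);
        job2.setMapperClass(CommFriendMapperS.class);
        job2.setReducerClass(CommFriendReduceS.class);
        job2.setOutputKeyClass(Text.class);
        job2.setOutputValueClass(Text.class);
        FileInputFormat.addInputPath(job2, intermediate);
        FileOutputFormat.setOutputPath(job2, output);

        System.exit(job2.waitForCompletion(true) ? 0 : 1);
    }
}

It would be invoked with three paths: the raw friend file, an intermediate directory for the first job's output, and the final output directory.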