Simple merging of Hive data files
2015-12-05 13:38
MR code:
package merge;

import java.io.IOException;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;
import org.apache.hadoop.mapred.TextInputFormat;
import org.apache.hadoop.mapred.TextOutputFormat;

public class merge {

    // Emits every input line unchanged as the output key, with an empty value.
    public static class Map extends MapReduceBase
            implements Mapper<LongWritable, Text, Text, Text> {

        private Text word = new Text("");

        public void map(LongWritable key, Text value,
                OutputCollector<Text, Text> output, Reporter reporter)
                throws IOException {
            output.collect(value, word);
        }
    }

    public static void main(String[] args) throws Exception {
        JobConf conf = new JobConf(merge.class);
        conf.setJobName("hive-merge");

        conf.setOutputKeyClass(Text.class);
        conf.setOutputValueClass(Text.class);

        // No reducer or reduce count is set, so the old mapred API defaults
        // apply (identity reducer, one reduce task unless the cluster
        // configuration overrides it) and the lines end up in one output file.
        conf.setMapperClass(Map.class);

        conf.setInputFormat(TextInputFormat.class);
        conf.setOutputFormat(TextOutputFormat.class);

        // args[0] is the input directory, args[1] the output directory.
        FileInputFormat.setInputPaths(conf, new Path(args[0]));
        FileOutputFormat.setOutputPath(conf, new Path(args[1]));

        JobClient.runJob(conf);
    }
}
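To make it easier to see what the job does to the data, the sketch below imitates it in plain Java with no Hadoop dependency. It assumes the old mapred defaults apply (identity reducer, a single reduce task, Text keys sorted by their raw UTF-8 bytes) and that TextOutputFormat writes the key, a tab, then the value. The class LocalMergeSketch and its methods are made up for illustration and are not part of Hadoop or of the original job.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical local stand-in for the merge job; not part of Hadoop.
class LocalMergeSketch {

    // Orders strings by their unsigned UTF-8 bytes, which is assumed to
    // match how Text keys are sorted during the shuffle.
    static final Comparator<String> BYTE_ORDER = (a, b) -> {
        byte[] x = a.getBytes(StandardCharsets.UTF_8);
        byte[] y = b.getBytes(StandardCharsets.UTF_8);
        int n = Math.min(x.length, y.length);
        for (int i = 0; i < n; i++) {
            int diff = (x[i] & 0xff) - (y[i] & 0xff);
            if (diff != 0) {
                return diff;
            }
        }
        return x.length - y.length;
    };

    // Takes the lines of several small files and returns the lines of the
    // single merged file the job is expected to write.
    static List<String> merge(List<List<String>> smallFiles) {
        List<String> keys = new ArrayList<>();
        for (List<String> file : smallFiles) {
            keys.addAll(file);          // map step: every line becomes a key
        }
        keys.sort(BYTE_ORDER);          // shuffle step: keys arrive sorted
        List<String> out = new ArrayList<>();
        for (String key : keys) {
            out.add(key + "\t" + "");   // output step: key, tab, empty value
        }
        return out;
    }

    public static void main(String[] args) {
        List<List<String>> files = Arrays.asList(
                Arrays.asList("3,c", "1,a"),
                Arrays.asList("2,b", "1,a"));
        for (String line : merge(files)) {
            // Make the trailing tab visible when printing.
            System.out.println(line.replace("\t", "<TAB>"));
        }
    }
}
```

Under these assumptions two things follow: the merged file is sorted by line content rather than kept in the original row order (duplicate rows are preserved), and each line gains a trailing tab because the value is an empty Text rather than a null value. If the trailing tab matters for the table's row format, emitting NullWritable as the value (and declaring it as the output value class) is one possible way to avoid it.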
Eclipse generates the .class files automatically. Packaging command, run from the project's bin directory:
Dev-Fac:bin ce-pc$ jar -cvf hive-merge.jar -C ../ .
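For a class named merge in package merge to be found at run time, the class file has to sit at merge/merge.class inside the jar. Whether the `-C ../ .` form yields that layout depends on how the project directories are arranged, so it can be worth checking the jar before submitting it. The sketch below uses only java.util.jar; JarLayoutCheck and its methods are hypothetical helpers, not part of Hadoop or of the original post.

```java
import java.io.IOException;
import java.util.jar.JarFile;

// Hypothetical helper for inspecting a jar; not part of Hadoop.
class JarLayoutCheck {

    // Converts a fully qualified class name to its expected jar entry path,
    // e.g. "merge.merge" becomes "merge/merge.class".
    static String entryNameFor(String className) {
        return className.replace('.', '/') + ".class";
    }

    // Returns true if the jar holds the class at the path a class loader
    // would look it up under.
    static boolean containsClass(String jarPath, String className) throws IOException {
        try (JarFile jar = new JarFile(jarPath)) {
            return jar.getEntry(entryNameFor(className)) != null;
        }
    }

    public static void main(String[] args) throws IOException {
        // Example invocation: java JarLayoutCheck hive-merge.jar merge.merge
        System.out.println(containsClass(args[0], args[1]));
    }
}
```

If the check prints false, listing the archive with `jar -tf hive-merge.jar` shows where the class actually landed, for example under a leading bin/ prefix.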
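Merge command (the first path is the input directory, the second the output directory):
hadoop jar /tmp/hive-merge.jar merge.merge /user/hive/warehouse/table1 /user/hive/warehouse/table1/out   # merge.merge is the merge class in the merge package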