Defining a custom OutputFormat
2012-07-13 21:23
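This post defines a custom OutputFormat that writes <Text, MapWritable> records.
Each MapWritable holds [Text: LongWritable] entries.
Output format: [url url2:times2,url3:times3,...] (the key and the entry list are separated by a tab)
The class is modeled on TextOutputFormat, modified and simplified.
Java code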
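import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UnsupportedEncodingException;
import java.util.Iterator;
import java.util.Map;

import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.MapWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.io.Writable;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.RecordWriter;
import org.apache.hadoop.mapred.Reporter;
import org.apache.hadoop.util.Progressable;

public class TextAndMapWritableOutputFormat extends
        FileOutputFormat<Text, MapWritable> {

    @Override
    public RecordWriter<Text, MapWritable> getRecordWriter(FileSystem ignored,
            JobConf job, String name, Progressable progress) throws IOException {
        // Resolve this task's output file and open it for writing.
        Path file = FileOutputFormat.getTaskOutputPath(job, name);
        FileSystem fs = file.getFileSystem(job);
        FSDataOutputStream fileOut = fs.create(file, progress);
        return new TextAndMapWritableRecordWriter(fileOut);
    }

    protected static class TextAndMapWritableRecordWriter
            implements RecordWriter<Text, MapWritable> {
        private static final String utf8 = "UTF-8";
        private static final byte[] newline;
        private static final byte[] keyValueSeparator;
        private static final byte[] colon;
        private static final byte[] comma;

        // Encode the separators once instead of on every write.
        static {
            try {
                newline = "\n".getBytes(utf8);
                keyValueSeparator = "\t".getBytes(utf8);
                colon = ":".getBytes(utf8);
                comma = ",".getBytes(utf8);
            } catch (UnsupportedEncodingException uee) {
                throw new IllegalArgumentException("can't find " + utf8
                        + " encoding");
            }
        }

        protected DataOutputStream out;

        public TextAndMapWritableRecordWriter(DataOutputStream out) {
            this.out = out;
        }

        @Override
        public synchronized void write(Text key, MapWritable value)
                throws IOException {
            // Text.getBytes() returns the backing array, which may be longer
            // than the content, so always bound the write with getLength().
            out.write(key.getBytes(), 0, key.getLength());
            out.write(keyValueSeparator);
            Iterator<Map.Entry<Writable, Writable>> it = value.entrySet().iterator();
            while (it.hasNext()) {
                Map.Entry<Writable, Writable> entry = it.next();
                Text k = (Text) entry.getKey();
                LongWritable v = (LongWritable) entry.getValue();
                out.write(k.getBytes(), 0, k.getLength());
                out.write(colon);
                out.write(v.toString().getBytes(utf8));
                // Separate entries with a comma, but do not end the line with one.
                if (it.hasNext()) {
                    out.write(comma);
                }
            }
            out.write(newline);
        }

        @Override
        public synchronized void close(Reporter reporter) throws IOException {
            out.close();
        }
    }
}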
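The record layout can be tried without a Hadoop cluster. Below is a minimal sketch that uses only the JDK: a plain Map<String, Long> stands in for the MapWritable, and the names LineFormatSketch, writeRecord and formatRecord are hypothetical, introduced only for this illustration. It joins entries with commas and leaves no comma at the end of the line, which is how the format [url url2:times2,url3:times3,...] reads.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

public class LineFormatSketch {

    // Writes one record as: key TAB k1:v1,k2:v2 NEWLINE (no trailing comma).
    static void writeRecord(OutputStream out, String key, Map<String, Long> counts)
            throws IOException {
        out.write(key.getBytes(StandardCharsets.UTF_8));
        out.write('\t');
        Iterator<Map.Entry<String, Long>> it = counts.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Long> e = it.next();
            out.write(e.getKey().getBytes(StandardCharsets.UTF_8));
            out.write(':');
            out.write(Long.toString(e.getValue()).getBytes(StandardCharsets.UTF_8));
            if (it.hasNext()) {
                out.write(',');
            }
        }
        out.write('\n');
    }

    // Convenience wrapper that returns the record as a String.
    static String formatRecord(String key, Map<String, Long> counts) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try {
            writeRecord(buf, key, counts);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String(buf.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps insertion order, so the output is predictable.
        Map<String, Long> counts = new LinkedHashMap<>();
        counts.put("url2", 3L);
        counts.put("url3", 7L);
        System.out.print(formatRecord("url", counts));
    }
}
```

Note that MapWritable is backed by a hash map, so in a real job the order of the url:times pairs within a line is not guaranteed; the sketch uses a LinkedHashMap only to make the example deterministic.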
Reposted from: http://roserouge.iteye.com/blog/347857