Flume: character set error when using the spooling directory source
2016-04-26 13:43
1. The error
2016-04-21 02:23:05,508 (pool-3-thread-1) [ERROR - org.apache.flume.source.SpoolDirectorySource$SpoolDirectoryRunnable.run(SpoolDirectorySource.java:256)] FATAL: Spool Directory source source1: { spoolDir: /home/hadoop_admin/movielog/ }: Uncaught exception in SpoolDirectorySource thread. Restart or reconfigure Flume to continue processing.
java.nio.charset.MalformedInputException: Input length = 1
    at java.nio.charset.CoderResult.throwException(CoderResult.java:277)
    at org.apache.flume.serialization.ResettableFileInputStream.readChar(ResettableFileInputStream.java:195)
    at org.apache.flume.serialization.LineDeserializer.readLine(LineDeserializer.java:134)
    at org.apache.flume.serialization.LineDeserializer.readEvent(LineDeserializer.java:72)
    at org.apache.flume.serialization.LineDeserializer.readEvents(LineDeserializer.java:91)
    at org.apache.flume.client.avro.ReliableSpoolingFileEventReader.readEvents(ReliableSpoolingFileEventReader.java:238)
    at org.apache.flume.source.SpoolDirectorySource$SpoolDirectoryRunnable.run(SpoolDirectorySource.java:227)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
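2. The fix
The cause is that the source's inputCharset property defaults to UTF-8, while the log files being read are encoded in GBK. Setting inputCharset to GBK on the source resolves the error: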
agent1.sources = source1
agent1.channels = channel1
agent1.sinks = sink1

# For each one of the sources, the type is defined
agent1.sources.source1.type = spooldir
agent1.sources.source1.spoolDir = /home/hadoop_admin/movielog/
agent1.sources.source1.inputCharset = GBK
agent1.sources.source1.fileHeader = true
agent1.sources.source1.deletePolicy = immediate
agent1.sources.source1.batchSize = 1000
agent1.sources.source1.channels = channel1

# Each sink's type must be defined
agent1.sinks.sink1.type = hdfs
agent1.sinks.sink1.hdfs.path = hdfs://master:9000/flumeTest
agent1.sinks.sink1.hdfs.filePrefix = master-
agent1.sinks.sink1.hdfs.writeFormat = Text
agent1.sinks.sink1.hdfs.fileType = DataStream
agent1.sinks.sink1.hdfs.rollInterval = 0
agent1.sinks.sink1.hdfs.rollSize = 10240
agent1.sinks.sink1.hdfs.batchSize = 100
agent1.sinks.sink1.hdfs.callTimeout = 30000
agent1.sinks.sink1.channel = channel1

# Each channel's type is defined.
agent1.channels.channel1.type = memory
agent1.channels.channel1.capacity = 100000
agent1.channels.channel1.transactionCapacity = 100000
agent1.channels.channel1.keep-alive = 30
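
The charset mismatch can also be reproduced outside Flume, which may help when checking what encoding a log file really uses before pointing a source at it. The following is only a minimal sketch using the JDK's own strict decoder, not Flume's code: the class name CharsetCheck, the method decodesCleanly, and the sample text are all hypothetical, and it assumes the JVM ships the GBK charset (standard JDK builds normally do).

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Flume.
public class CharsetCheck {

    // Returns true if the bytes decode without error in the given charset.
    // CodingErrorAction.REPORT makes the decoder throw on bad input
    // instead of silently substituting a replacement character.
    public static boolean decodesCleanly(byte[] data, Charset charset) {
        CharsetDecoder decoder = charset.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            decoder.decode(ByteBuffer.wrap(data));
            return true;
        } catch (CharacterCodingException e) {
            // MalformedInputException is a subclass of CharacterCodingException
            return false;
        }
    }

    public static void main(String[] args) {
        Charset gbk = Charset.forName("GBK");
        // Two Chinese characters, written as Unicode escapes so the source
        // file's own encoding does not matter; stands in for a GBK log line.
        byte[] sample = "\u7535\u5f71".getBytes(gbk);

        System.out.println("decodes as UTF-8: " + decodesCleanly(sample, StandardCharsets.UTF_8));
        System.out.println("decodes as GBK:   " + decodesCleanly(sample, gbk));
    }
}
```

With GBK-encoded Chinese text, the UTF-8 check should fail and the GBK check should succeed, which mirrors the MalformedInputException in the log above. To test a real file, the sample bytes could be replaced with the result of java.nio.file.Files.readAllBytes on that file.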