Big Data Enterprise Learning Series 05 -- A First Look at Flume
2017-12-20 13:41
1. Flume Architecture
<1>Flume is a distributed, reliable, and available service for efficiently collecting, aggregating, and moving large amounts of log data.
<2>It has a simple and flexible architecture based on streaming data flows. It is robust and fault tolerant, with tunable reliability mechanisms and many failover and recovery mechanisms.
<3>It uses a simple, extensible data model that allows for online analytic applications (well suited to scenarios with high real-time requirements).
<4>Flume data flow model
<5>Roles in Flume
<6>Data transfer in Flume
<7>The three key components of Flume (source, channel, sink)
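The data flow model above moves events source → channel → sink, which is essentially a buffered pipeline. A loose shell analogy (hypothetical, no Flume involved, purely to show the shape of the flow):

```shell
# A source emits events, a channel buffers them, a sink delivers them.
# Here printf plays the source, cat the channel, sed the sink:
printf 'evt1\nevt2\n' | cat | sed 's/^/sink got: /'
# prints:
#   sink got: evt1
#   sink got: evt2
```

The real value of the channel is that it decouples the source's write rate from the sink's drain rate, which is what gives Flume its tunable reliability.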
2. Getting Started with Flume
<1>Unpack the archive, then configure flume-env.sh:
export JAVA_HOME=/opt/software/jdk1.7.0_67
<2>Common flume-ng commands
bin/flume-ng
Usage: bin/flume-ng <command> [options]...

commands:
  agent                     run a Flume agent

global options:
  --conf,-c <conf>          use configs in <conf> directory
  -Dproperty=value          sets a Java system property value

agent options:
  --name,-n <name>          the name of this agent (required)
  --conf-file,-f <file>     specify a config file (required if -z missing)
<3>Starting an agent
An agent is started using a shell script called flume-ng which is located in the bin directory of the Flume distribution. You need to specify the agent name, the config directory, and the config file on the command line:
bin/flume-ng agent --conf conf --name agent-test --conf-file test.conf
Now the agent will start running the sources and sinks configured in the given properties file.
<4>Installing telnet
* Install the rpm packages
rpm -ivh ./*.rpm
* Restart the xinetd service
/etc/rc.d/init.d/xinetd restart
<5>A simple example
* Create a1.conf under conf
* Write a1.conf (four steps: agent, sources, channels, sinks)
# example.conf: A single-node Flume configuration

# Name the components on this agent
a1.sources = r1
a1.sinks = k1
a1.channels = c1

# Describe/configure the source
a1.sources.r1.type = netcat
a1.sources.r1.bind = localhost
a1.sources.r1.port = 44444

# Describe the sink
a1.sinks.k1.type = logger

# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
* Run
bin/flume-ng agent \
-c conf \
-n a1 \
-f conf/a1.conf \
-Dflume.root.logger=DEBUG,console
* Check that the listening port is up
netstat -nltp
* Start a client
telnet localhost 44444
3. Collecting Hive's Run Log with Flume
<1>Plan
* Collect the log: Hive writes its run log to
/opt/cdh-5.3.6/hive-0.13.1-cdh5.3.6/logs/hive.log
and the source follows it with
tail -F
* Channel: memory
* Sink: HDFS, under
/user/beifeng/flume/hive-logs/
<2>To use the HDFS sink, the following jar packages need to be placed under flume/lib
<3>Writing the agent configuration file
# The configuration file needs to define the sources,
# the channels and the sinks.
# Sources, channels and sinks are defined per agent,
# in this case called 'agent'

### define agent ###
a2.sources = r2
a2.channels = c2
a2.sinks = k2

### define sources ###
a2.sources.r2.type = exec
a2.sources.r2.command = tail -F /opt/cdh-5.3.6/hive-0.13.1-cdh5.3.6/logs/hive.log

### define channels ###
a2.channels.c2.type = memory

### define sinks ###
a2.sinks.k2.type = hdfs
a2.sinks.k2.hdfs.path = hdfs://hadoop-senior.ibeifeng.com:8020/user/beifeng/flume/hive.log
a2.sinks.k2.hdfs.fileType = DataStream
a2.sinks.k2.hdfs.batchSize = 10

### bind sources and sinks ###
a2.sources.r2.channels = c2
a2.sinks.k2.channel = c2
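The exec source simply runs the configured command and turns each line the command writes to stdout into one Flume event (tail -F, unlike tail -f, keeps following the file across log rotation). A quick local simulation of that line-to-event mapping, using a made-up demo file and no Flume at all:

```shell
# Simulate what the exec source sees: each stdout line becomes one event.
# /tmp/hive.log.demo is a throwaway file standing in for hive.log.
printf '2017-12-20 INFO ok\n2017-12-20 WARN slow\n' > /tmp/hive.log.demo
tail -n 2 /tmp/hive.log.demo | while read -r line; do
  echo "event: $line"
done
# prints:
#   event: 2017-12-20 INFO ok
#   event: 2017-12-20 WARN slow
```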
<4>Run
bin/flume-ng agent \
-c conf \
-n a2 \
-f conf/a2.conf \
-Dflume.root.logger=DEBUG,console
4. Flume Project Architecture
5. A Hands-On Flume Example
<1>Writing the agent
# The configuration file needs to define the sources,
# the channels and the sinks.
# Sources, channels and sinks are defined per agent,
# in this case called 'agent'

### define agent ###
a3.sources = r3
a3.channels = c3
a3.sinks = k3

### define sources ###
a3.sources.r3.type = spooldir
a3.sources.r3.spoolDir = /opt/datas
a3.sources.r3.ignorePattern = ^(.)*\\.txt$

### define channels ###
a3.channels.c3.type = file
a3.channels.c3.checkpointDir = /opt/datas/check_dir
a3.channels.c3.dataDirs = /opt/datas/flume_data

### define sinks ###
a3.sinks.k3.type = hdfs
a3.sinks.k3.hdfs.path = hdfs://hadoop-senior.ibeifeng.com:8020/user/beifeng/flume/%Y%m%d
a3.sinks.k3.hdfs.useLocalTimeStamp = true

### bind sources and sinks ###
a3.sources.r3.channels = c3
a3.sinks.k3.channel = c3
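Two details in this config are easy to miss. First, ignorePattern is a regex, and the doubled backslash in the properties file becomes a single backslash in the effective pattern, i.e. ^(.)*\.txt$ — so .txt files dropped into the spool directory are skipped. Second, the %Y%m%d escapes in hdfs.path are date escapes filled from the event timestamp, which is why useLocalTimeStamp=true is set. A quick sketch of both behaviors, using grep and date as stand-ins (the file names are made up):

```shell
# Effective regex after properties-file unescaping:
pattern='^(.)*\.txt$'
echo 'notes.txt'  | grep -Eq "$pattern" && echo 'notes.txt: ignored'
echo 'access.log' | grep -Eq "$pattern" || echo 'access.log: collected'
# prints:
#   notes.txt: ignored
#   access.log: collected

# %Y%m%d expands like the same strftime-style escape in date:
date +%Y%m%d
```

With this layout, each day's events land in their own HDFS directory, e.g. .../flume/20171220.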
<2>Test run
bin/flume-ng agent \
-c conf \
-n a3 \
-f conf/a3.conf \
-Dflume.root.logger=DEBUG,console