Enabling LZO compression on Hadoop and HBase clusters
2014-04-09 13:03
Prerequisites:
# yum -y install lzo-devel zlib-devel gcc autoconf automake libtool
1. (all nodes) Install the LZO library on every node in the cluster:
tar -zxvf lzo-2.06.tar.gz
cd lzo-2.06
# export CFLAGS=-m64
# ./configure --enable-shared
# make
# make install
The library files are installed to /usr/local/lib by default, so we need to tell the system where to find them. Either method works:
1) Copy the LZO library files from /usr/local/lib to /usr/lib (32-bit platforms) or /usr/lib64 (64-bit platforms):
# cp /usr/local/lib/liblzo2.* /usr/lib64
2) Create an lzo.conf file under /etc/ld.so.conf.d/ containing the path to the LZO library files, then run /sbin/ldconfig -v to apply it:
# vi /etc/ld.so.conf.d/lzo.conf
/usr/local/lib
# /sbin/ldconfig -v
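Whichever method you chose, the dynamic linker should now resolve liblzo2. A quick sanity check (paths assume the default /usr/local prefix used above):

```shell
# Any matching line in the linker cache means the library is registered
ldconfig -p | grep liblzo2

# Or inspect the installed files directly under the install prefix
ls -l /usr/local/lib/liblzo2.so*
```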
2. Build and install the Hadoop-LZO native library and jar. We use the fork maintained by Twitter, downloaded from:
https://github.com/twitter/hadoop-lzo
Edit pom.xml, changing
<properties>
<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
<hadoop.current.version>2.1.0-beta</hadoop.current.version>
<hadoop.old.version>1.0.4</hadoop.old.version>
</properties>
to
<properties>
<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
<hadoop.current.version>2.2.0</hadoop.current.version>
<hadoop.old.version>1.0.4</hadoop.old.version>
</properties>
unzip hadoop-lzo-master.zip
cd hadoop-lzo-master
export CFLAGS=-m64
export CXXFLAGS=-m64
export C_INCLUDE_PATH=/usr/local/include/lzo
export LIBRARY_PATH=/usr/local/lib
mvn clean package -Dmaven.test.skip=true
Then copy everything under target/native/Linux-amd64-64/lib to ${HADOOP_HOME}/lib/native, and copy the jar into Hadoop's common lib directory:
cp target/native/Linux-amd64-64/lib/* $HADOOP_HOME/lib/native/
cp target/hadoop-lzo-0.4.20-SNAPSHOT.jar /opt/hadoop-2.2.0/share/hadoop/common/lib
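Before moving on, it is worth confirming that both artifacts landed where Hadoop looks for them. A minimal check (the native library built by hadoop-lzo is named libgplcompression; the jar version may differ in your build):

```shell
# The hadoop-lzo build produces libgplcompression.so and friends
ls $HADOOP_HOME/lib/native/libgplcompression*

# And the jar should now be on Hadoop's common classpath
ls $HADOOP_HOME/share/hadoop/common/lib/hadoop-lzo-*.jar
```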
Enabling LZO for HBase:
cp $HADOOP_HOME/lib/native/Linux-amd64-64/* $HBASE_HOME/lib/native/Linux-amd64-64
Edit hbase-env.sh:
export HBASE_LIBRARY_PATH=$HBASE_LIBRARY_PATH:$HBASE_HOME/lib/native/Linux-amd64-64/:/usr/local/lib/
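With the library path in place, HBase's built-in CompressionTest utility can verify that the codec actually round-trips data, and an LZO-compressed column family can then be created. The paths and table/family names below are placeholders:

```shell
# Ask HBase to write and re-read a test file through the LZO codec
hbase org.apache.hadoop.hbase.util.CompressionTest file:///tmp/lzo_testfile lzo

# Create a table whose column family 'f1' stores its data LZO-compressed
echo "create 'lzo_test', {NAME => 'f1', COMPRESSION => 'LZO'}" | hbase shell
```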
Sync these changes across the Hadoop and HBase clusters.
3. Edit the configuration files
Add to hadoop-env.sh:
export JAVA_LIBRARY_PATH=$HADOOP_HOME/lib/native
core-site.xml:
<!-- Hadoop compression codecs -->
<property>
<name>io.compression.codecs</name>
<value>org.apache.hadoop.io.compress.DefaultCodec,org.apache.hadoop.io.compress.GzipCodec,org.apache.hadoop.io.compress.BZip2Codec,com.hadoop.compression.lzo.LzoCodec,com.hadoop.compression.lzo.LzopCodec</value>
</property>
<property>
<name>io.compression.codec.lzo.class</name>
<value>com.hadoop.compression.lzo.LzoCodec</value>
</property>
mapred-site.xml:
<property>
<name>mapred.compress.map.output</name>
<value>true</value>
</property>
<property>
<name>mapred.map.output.compression.codec</name>
<value>com.hadoop.compression.lzo.LzoCodec</value>
</property>
<property>
<name>mapred.child.env</name>
<value>LD_LIBRARY_PATH=/usr/local/lib</value>
</property>
Sync hadoop-env.sh, core-site.xml, and mapred-site.xml across the cluster.
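After syncing, you can sanity-check the effective client-side configuration on each node with hdfs getconf (available in Hadoop 2.x); the values printed should match what was written to core-site.xml:

```shell
# Print the codec list and LZO codec class the cluster will actually use
hdfs getconf -confKey io.compression.codecs
hdfs getconf -confKey io.compression.codec.lzo.class
```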
4. Install lzop
lzop is a program built on the LZO library that compresses and decompresses files directly from the shell.
tar zxvf lzop-1.03.tar.gz
cd lzop-1.03
# export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/lib
# ./configure
# make
# make install
Once installed, try out lzop's compression and decompression; you can now run the lzop command on files directly.
# compress
[hadoop@master1 ~]$ lzop -v test1.txt
compressing test1.txt into test1.txt.lzo
# upload to HDFS
[hadoop@master1 ~]$ hadoop fs -put *.lzo /in
# build an index for the LZO file
hadoop jar /opt/hadoop-2.2.0/share/hadoop/common/lib/hadoop-lzo-0.4.20-SNAPSHOT.jar com.hadoop.compression.lzo.LzoIndexer /in
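For a directory containing many .lzo files, hadoop-lzo also ships a MapReduce-based indexer that distributes the work across the cluster (class name as shipped in the 0.4.x line; adjust the jar path to your build):

```shell
# Index all .lzo files under /in as a MapReduce job instead of locally
hadoop jar /opt/hadoop-2.2.0/share/hadoop/common/lib/hadoop-lzo-0.4.20-SNAPSHOT.jar \
  com.hadoop.compression.lzo.DistributedLzoIndexer /in
```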
# run a wordcount job
hadoop jar /home/hadoop/wordcount.jar org.apache.hadoop.examples.WordCount /input1 /out1
On success, the job log shows:
14/02/23 18:53:14 INFO lzo.GPLNativeCodeLoader: Loaded native gpl library from the embedded binaries
14/02/23 18:53:14 INFO lzo.LzoCodec: Successfully loaded & initialized native-lzo library [hadoop-lzo rev 478aa845e11bbbeeb9b8326e733cd20a06d2cb3a]
……