Hadoop: CentOS + hadoop-2.5.2 Distributed Environment Setup
2016-05-30 10:41
Please credit the source when reposting: http://blog.csdn.net/l1028386804/article/details/51536051
I. Basic Environment Preparation
OS: CentOS-6.5-x86_64-bin-DVD1.iso (running under VMware)
Hadoop version: hadoop-2.5.2
JDK version: jdk-7u72-linux-x64.tar.gz
1. Cluster machines
A three-node test cluster: one master (liuyazhuang-01) and two slaves (liuyazhuang-02, liuyazhuang-03). Put all three in /etc/hosts:
192.168.1.112 liuyazhuang-01
192.168.1.113 liuyazhuang-02
192.168.1.114 liuyazhuang-03
Note: do not keep the 127.0.0.1 localhost line.
Sync this configuration to the other two machines:
scp /etc/hosts root@192.168.1.113:/etc/hosts
scp /etc/hosts root@192.168.1.114:/etc/hosts
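The two scp commands can also be wrapped in a small loop. The helper below is a sketch of mine, not part of the original post, and it only prints the commands (remove the `echo` to execute them):

```shell
# Hypothetical helper: print (or run) the scp command for each slave IP.
SLAVE_IPS="192.168.1.113 192.168.1.114"

push_to_slaves() {
  # $1 = file to copy, $2 = remote user; remove `echo` to actually copy.
  for ip in $SLAVE_IPS; do
    echo "scp $1 ${2}@${ip}:$1"
  done
}

push_to_slaves /etc/hosts root
# Prints:
# scp /etc/hosts root@192.168.1.113:/etc/hosts
# scp /etc/hosts root@192.168.1.114:/etc/hosts
```

The same pattern can be reused for any file that must be identical across the cluster.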
2. Set up passwordless SSH login
$ ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
$ cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
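For a real cluster, the public key generated above must also reach each slave's ~/.ssh/authorized_keys, or start-dfs.sh will still prompt for passwords. A dry-run sketch (my addition; hostnames are the ones used in this post, and the leading `echo` only prints each command):

```shell
# Dry run: print the ssh-copy-id command for each slave; delete `echo` to run.
for host in liuyazhuang-02 liuyazhuang-03; do
  echo ssh-copy-id -i ~/.ssh/id_dsa.pub hadoop@"$host"
done
```

After copying the key, `ssh liuyazhuang-02` from the master should log in without a password.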
3. Java environment
1) Set JAVA_HOME to /usr/local/java/jdk1.7.0_72
2) Append the following to /etc/profile (the path must match where the JDK was actually unpacked):
JAVA_HOME=/usr/local/java/jdk1.7.0_72
CLASS_PATH=$JAVA_HOME/lib
PATH=$JAVA_HOME/bin:$PATH
export PATH JAVA_HOME CLASS_PATH
II. Download and Unpack hadoop-2.5.2.tar.gz
hadoop@liuyazhuang-01:~/data$ pwd
/home/hadoop/data
hadoop@liuyazhuang-01:~/data$ wget http://mirror.bit.edu.cn/apache/hadoop/common/hadoop-2.5.2/hadoop-2.5.2.tar.gz
hadoop@liuyazhuang-01:~/data$ tar zxvf hadoop-2.5.2.tar.gz
III. Configure Environment Variables
hadoop@liuyazhuang-01:~/data$ gedit /etc/profile
Append the following:
#HADOOP VARIABLES START
export HADOOP_INSTALL=/home/hadoop/data/hadoop-2.5.2
export PATH=$PATH:$HADOOP_INSTALL/bin
export PATH=$PATH:$HADOOP_INSTALL/sbin
export HADOOP_MAPRED_HOME=$HADOOP_INSTALL
export HADOOP_COMMON_HOME=$HADOOP_INSTALL
export HADOOP_HDFS_HOME=$HADOOP_INSTALL
export YARN_HOME=$HADOOP_INSTALL
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_INSTALL/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_INSTALL/lib"
#HADOOP VARIABLES END
Make the changes take effect:
hadoop@liuyazhuang-01:~/data$ source /etc/profile
($HADOOP_HOME below refers to this same installation directory, /home/hadoop/data/hadoop-2.5.2.)
You also need to set JAVA_HOME in $HADOOP_HOME/etc/hadoop/hadoop-env.sh:
export JAVA_HOME=/usr/local/java/jdk1.7.0_72
IV. Edit $HADOOP_HOME/etc/hadoop/core-site.xml
Add the following (fs.default.name is the older alias of fs.defaultFS; both are accepted in 2.5.2):
<property>
  <name>fs.default.name</name>
  <value>hdfs://liuyazhuang-01:9000</value>
</property>
<property>
  <name>hadoop.tmp.dir</name>
  <value>/home/hadoop/data/hadoop-2.5.2/hadoop-${user.name}</value>
</property>
V. Edit $HADOOP_HOME/etc/hadoop/yarn-site.xml
Add the following:
<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce_shuffle</value>
</property>
<property>
  <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
  <value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
<property>
  <name>yarn.resourcemanager.hostname</name>
  <value>liuyazhuang-01</value>
</property>
More yarn-site.xml parameters are documented at:
http://hadoop.apache.org/docs/r2.5.2/hadoop-yarn/hadoop-yarn-common/yarn-default.xml
VI. Edit $HADOOP_HOME/etc/hadoop/mapred-site.xml
There is no mapred-site.xml by default; copy the template first:
# cp etc/hadoop/mapred-site.xml.template etc/hadoop/mapred-site.xml
Then add the following:
<property>
  <name>mapreduce.framework.name</name>
  <value>yarn</value>
  <final>true</final>
</property>
VII. Configure hdfs-site.xml (optional; the defaults also work)
$HADOOP_HOME/etc/hadoop/hdfs-site.xml applies to every host in the cluster; it specifies the local directories used on each host by the NameNode and DataNodes.
<property>
  <name>dfs.name.dir</name>
  <value>/home/hadoop/data/hadoop-2.5.2/name1,/home/hadoop/data/hadoop-2.5.2/name2</value>
</property>
<property>
  <name>dfs.data.dir</name>
  <value>/home/hadoop/data/hadoop-2.5.2/data1,/home/hadoop/data/hadoop-2.5.2/data2</value>
</property>
<property>
  <name>dfs.replication</name>
  <value>2</value>
</property>
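The name1/name2/data1/data2 paths above live on each node's local filesystem. A small helper (an addition of mine, not from the post) to pre-create them, so that permission problems surface before the first format:

```shell
# Create the local dfs.name.dir / dfs.data.dir paths on one node.
make_dfs_dirs() {
  base="$1"  # e.g. /home/hadoop/data/hadoop-2.5.2
  mkdir -p "$base/name1" "$base/name2" "$base/data1" "$base/data2"
}

# Run on every node, e.g.:
# make_dfs_dirs /home/hadoop/data/hadoop-2.5.2
```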
VIII. Configure slaves
This file tells Hadoop about the worker nodes, so that starting the cluster from the master automatically starts the DataNode, NodeManager, etc. on the other machines. Edit $HADOOP_HOME/etc/hadoop/slaves
with the following content:
liuyazhuang-02
liuyazhuang-03
IX. Sync the Hadoop Directory to Each Slave
Because passwordless SSH is configured, no password is required:
hadoop@liuyazhuang-01:~/data/hadoop-2.5.2$ scp -r /home/hadoop/data/hadoop-2.5.2 hadoop@192.168.1.113:/home/hadoop/data/hadoop-2.5.2
hadoop@liuyazhuang-01:~/data/hadoop-2.5.2$ scp -r /home/hadoop/data/hadoop-2.5.2 hadoop@192.168.1.114:/home/hadoop/data/hadoop-2.5.2
X. Format HDFS
hadoop@liuyazhuang-01:~/data/hadoop-2.5.2$ ./bin/hdfs namenode -format
or, with $HADOOP_HOME/bin on the PATH:
hadoop@liuyazhuang-01:~$ hdfs namenode -format
(The older hadoop namenode -format still works in 2.x but prints a deprecation warning.)
XI. Start the Hadoop Cluster
hadoop@liuyazhuang-01:~/data/hadoop-2.5.2$ ./sbin/start-dfs.sh
hadoop@liuyazhuang-01:~/data/hadoop-2.5.2$ ./sbin/start-yarn.sh
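After both scripts finish, `jps` on the master should list NameNode, SecondaryNameNode and ResourceManager (each slave shows DataNode and NodeManager). A quick check script for the master, my addition rather than part of the post:

```shell
# Report which of the expected master daemons appear in jps output.
check_master_daemons() {
  running=$(jps 2>/dev/null | awk '{print $2}')
  for d in NameNode SecondaryNameNode ResourceManager; do
    if echo "$running" | grep -qx "$d"; then
      echo "$d: up"
    else
      echo "$d: MISSING"
    fi
  done
}

check_master_daemons
```

Any MISSING daemon usually means a configuration or permission error; check the corresponding log under $HADOOP_HOME/logs.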
XII. Check in the Browser
Open http://liuyazhuang-01:50070/ for the HDFS web UI.
Open http://liuyazhuang-01:8088/ for the YARN ResourceManager UI;
http://liuyazhuang-01:8088/cluster shows the cluster overview.
XIII. Verification (WordCount)
1. Create an input directory on HDFS
hadoop@liuyazhuang-01:~/data/hadoop-2.5.2$ bin/hadoop fs -mkdir -p input
2. Copy README.txt from the Hadoop directory into the new input directory
hadoop@liuyazhuang-01:~/data/hadoop-2.5.2$ bin/hadoop fs -copyFromLocal README.txt input
3. Run WordCount
hadoop@liuyazhuang-01:~/data/hadoop-2.5.2$ bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.5.2.jar wordcount input output
(The original post pointed at the sources jar under share/hadoop/mapreduce/sources; that jar contains only .java files, so use the compiled examples jar as shown here.)
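As a sanity check, the same word count can be reproduced with plain shell on a tiny local sample (an illustrative addition of mine; it uses a here-doc instead of README.txt so it runs anywhere):

```shell
# Local, single-machine equivalent of the WordCount job on a tiny sample,
# handy for seeing what the MapReduce output should look like.
cat > sample.txt <<'EOF'
hello hadoop
hello world
EOF
tr -s ' ' '\n' < sample.txt | sort | uniq -c | awk '{print $2"\t"$1}'
# Prints (word, tab, count):
# hadoop  1
# hello   2
# world   1
```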
4. When the job finishes, view the word counts:
hadoop@liuyazhuang-01:~/data/hadoop-2.5.2$ bin/hadoop fs -cat output/*
If the job's output path (output here) already exists, delete it before rerunning:
hadoop@liuyazhuang-01:~/data/hadoop-2.5.2$ bin/hadoop fs -rm -r output
(The older hadoop dfs -rmr form still works but is deprecated.)
References:
Installing Hadoop 2.4.0 on Ubuntu 14.04 (standalone mode)
http://www.cnblogs.com/kinglau/p/3794433.html
Installing Hadoop 2.4.0 on Ubuntu 14.04 (pseudo-distributed mode)
http://www.cnblogs.com/kinglau/p/3796164.html
Fixing errors when running the WordCount example in pseudo-distributed mode
http://www.cnblogs.com/kinglau/p/3364928.html
Setting up a Hadoop 2.4.0 development environment in Eclipse
http://www.cnblogs.com/kinglau/p/3802705.html
Hadoop study, part 30: debugging CentOS Hadoop 2.2 MapReduce from Eclipse on Win7
http://zy19982004.iteye.com/blog/2024467
Distributed installation and deployment of Hadoop 2.5.0 on CentOS
http://my.oschina.net/yilian/blog/310189
Building and installing Hadoop 2.5.1 from source on CentOS 6.5
http://www.myhack58.com/Article/sort099/sort0102/2014/54025.htm
Analysis of two common Hadoop MapReduce fault-tolerance scenarios
http://www.chinacloud.cn/show.aspx?id=15793&cid=17
Hadoop 2.2.0 cluster installation
http://blog.csdn.net/bluishglc/article/details/24591185
Apache Hadoop 2.2.0 HDFS HA + YARN multi-node deployment
http://blog.csdn.net/u010967382/article/details/20380387
Hadoop cluster configuration (a comprehensive summary)
http://blog.csdn.net/hguisu/article/details/7237395
Hadoop hdfs-site.xml configuration reference
http://he.iori.blog.163.com/blog/static/6955953520138107638208/
http://slaytanic.blog.51cto.com/2057708/1101111
The three Hadoop installation modes
http://blog.csdn.net/liumm0000/article/details/13408855