
Hadoop 2.5.1 Fully Distributed Installation

Machine type: 64-bit CentOS. Host list:
master

slave1
slave2
slave3
slave4


0 Preparation

0.1 Set each machine's hostname to its name in the list

For master, for example, make the following changes:
[root@master ~]# vi /etc/sysconfig/network
NETWORKING=yes
HOSTNAME=master


[root@master ~]# vi /etc/hosts
#127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
#::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
127.0.0.1       localhost
192.168.11.99   master

192.168.12.174  slave1
192.168.12.178  slave2
192.168.12.18   slave3
192.168.11.94   slave4


Make the same /etc/hosts entries on every node so the hosts can resolve each other by name. Save the files; after a reboot the hostname shows the name we set.
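If you would rather not reboot right away, the running hostname can also be changed on the spot (a quick sketch; the permanent value still comes from /etc/sysconfig/network):

hostname master
hostname        # verify the new name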

0.2 Set up passwordless SSH login from master to the slaves

A brief outline of how passwordless SSH login works: to log in from host A to host B without a password, first run ssh-keygen on A to generate the private key file id_rsa and the matching public key file id_rsa.pub, then append the contents of the public key file to ~/.ssh/authorized_keys on B. That completes the setup for A to log in to B without a password.
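Concretely, the commands on master look roughly like this (a minimal sketch of the procedure above; repeat the second command for slave2 through slave4, entering the slave's password when prompted):

ssh-keygen -t rsa
cat ~/.ssh/id_rsa.pub | ssh root@slave1 'mkdir -p ~/.ssh && cat >> ~/.ssh/authorized_keys'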

Check the result of logging in from master to slave1:
[root@master ~]# ssh slave1
Last login: Mon Nov 24 16:45:15 2014 from master
[root@slave1 ~]#


0.3 Install the JDK on every machine

Download the JDK rpm package from Oracle and install it. I installed 1.8, although 1.7 would probably be the more suitable version:
rpm -ivh jdk-8u25-linux-x64.rpm


Run vi /etc/profile to set the environment variables, adding PATH and CLASSPATH:
export JAVA_HOME=/usr/java/jdk1.8.0_25
export PATH=/usr/hbase/bin:/usr/hadoop/bin:/usr/hadoop/sbin:$JAVA_HOME/bin:$PATH
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar


Apply the changes:
source /etc/profile
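A quick check that the new PATH picks up the JDK:

java -version    # should report 1.8.0_25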


1 Install and configure Hadoop

1.1 Download and unpack Hadoop

Strictly speaking you should compile a 64-bit native build yourself; I didn't get around to it (the bundled 32-bit native libraries just cause a harmless warning about failing to load the native-hadoop library).

cp hadoop-2.5.1.tar.gz /usr/
cd /usr
tar -zxf hadoop-2.5.1.tar.gz
mv hadoop-2.5.1 hadoop


Update the environment variable so the Hadoop bin and sbin directories are on the PATH:
export PATH=/usr/hbase/bin:/usr/hadoop/bin:/usr/hadoop/sbin:$JAVA_HOME/bin:$PATH
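After another source /etc/profile, a quick sanity check that the Hadoop binaries are found:

hadoop version    # should report Hadoop 2.5.1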


1.2 Edit the configuration files

The configuration files:
[root@slave1 hadoop]# pwd
/usr/hadoop/etc/hadoop
[root@slave1 hadoop]# ll
total 140
drwxr-xr-x 2 root  root   4096 Nov 15 14:31 .
drwxr-xr-x 3 10011 10011  4096 Oct 28 16:42 ..
-rw-r--r-- 1 root  root   3589 Nov 13 13:57 capacity-scheduler.xml
-rw-r--r-- 1 root  root   1335 Nov 13 13:57 configuration.xsl
-rw-r--r-- 1 root  root    318 Nov 13 13:57 container-executor.cfg
-rw-r--r-- 1 root  root   1339 Nov 13 13:57 core-site.xml
-rw-r--r-- 1 root  root   3670 Nov 13 13:57 hadoop-env.cmd
-rw-r--r-- 1 root  root   3452 Nov 13 13:57 hadoop-env.sh
-rw-r--r-- 1 root  root   1774 Nov 13 13:57 hadoop-metrics2.properties
-rw-r--r-- 1 root  root   2490 Nov 13 13:57 hadoop-metrics.properties
-rw-r--r-- 1 root  root   9201 Nov 13 13:57 hadoop-policy.xml
-rw-r--r-- 1 root  root   2372 Nov 13 13:57 hdfs-site.xml
-rw-r--r-- 1 root  root   1449 Nov 13 13:57 httpfs-env.sh
-rw-r--r-- 1 root  root   1657 Nov 13 13:57 httpfs-log4j.properties
-rw-r--r-- 1 root  root     21 Nov 13 13:57 httpfs-signature.secret
-rw-r--r-- 1 root  root    620 Nov 13 13:57 httpfs-site.xml
-rw-r--r-- 1 root  root  11118 Nov 13 13:57 log4j.properties
-rw-r--r-- 1 root  root    938 Nov 13 13:57 mapred-env.cmd
-rw-r--r-- 1 root  root   1383 Nov 13 13:57 mapred-env.sh
-rw-r--r-- 1 root  root   4113 Nov 13 13:57 mapred-queues.xml.template
-rw-r--r-- 1 root  root    844 Nov 13 13:57 mapred-site.xml
-rw-r--r-- 1 root  root    758 Nov 13 13:57 mapred-site.xml.template
-rw-r--r-- 1 root  root     29 Nov 13 13:57 slaves
-rw-r--r-- 1 root  root   2316 Nov 13 13:57 ssl-client.xml.example
-rw-r--r-- 1 root  root   2268 Nov 13 13:57 ssl-server.xml.example
-rw-r--r-- 1 root  root   2237 Nov 13 13:57 yarn-env.cmd
-rw-r--r-- 1 root  root   4606 Nov 13 13:57 yarn-env.sh
-rw-r--r-- 1 root  root   1875 Nov 13 13:57 yarn-site.xml
-rw-r--r-- 1 root  root   1087 Oct 31 16:42 yarn-site.xml.bak


Files to modify:
hadoop-env.sh
yarn-env.sh
core-site.xml
hdfs-site.xml
mapred-site.xml
slaves
yarn-site.xml


a. hadoop-env.sh
export JAVA_HOME=/usr/java/jdk1.8.0_25


b. yarn-env.sh
export JAVA_HOME=/usr/java/jdk1.8.0_25


c. slaves
Add the slave hostnames, one per line:
slave1
slave2
slave3
slave4


d. core-site.xml
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://master:8020</value>
  </property>
  <property>
    <name>io.file.buffer.size</name>
    <value>131072</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/root/tmp</value>
    <description>A base for other temporary directories.</description>
  </property>
</configuration>


e. hdfs-site.xml
<configuration>
  <property>
    <name>dfs.namenode.secondary.http-address</name>
    <value>master:50090</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:/root/dfs/name</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:/root/dfs/data</value>
  </property>
</configuration>


f. mapred-site.xml (copy it from mapred-site.xml.template if it does not exist yet)
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>


g. yarn-site.xml
<configuration>

  <!-- Site specific YARN configuration properties -->
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.pmem-check-enabled</name>
    <value>false</value>
  </property>
  <property>
    <name>yarn.nodemanager.vmem-check-enabled</name>
    <value>false</value>
  </property>
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>master</value>
  </property>
</configuration>


Once the edits are done, copy the same configuration to every machine (slave* below is shorthand for each slave in turn):
scp -r /usr/hadoop/etc/hadoop root@slave*:/usr/hadoop/etc/
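Since scp does not expand slave* as a hostname pattern, a small loop does the copying (a sketch assuming the four slave names above):

for host in slave1 slave2 slave3 slave4; do
  scp -r /usr/hadoop/etc/hadoop root@$host:/usr/hadoop/etc/
done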


2. Testing
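Before the first start, format HDFS on master (only once; reformatting wipes the NameNode metadata under /root/dfs/name):

hdfs namenode -format

Then start HDFS and YARN on master and list the running Java processes: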

start-dfs.sh
start-yarn.sh
jps
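On master, jps should show NameNode, SecondaryNameNode, and ResourceManager; on each slave it should show DataNode and NodeManager. The NameNode web UI is served on http://master:50070 and the ResourceManager UI on http://master:8088, the default ports in this release. For a further check, report the cluster state and run the bundled example job (the jar path is the default location inside the 2.5.1 binary release):

hdfs dfsadmin -report
hadoop jar /usr/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.5.1.jar pi 2 10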