Hadoop YARN Installation
2014-07-30 10:22
Sources: http://blog.csdn.net/zlcd1988/article/details/36008681 http://www.cnblogs.com/toughhou/p/3864273.html
Environment: Linux, 8 GB RAM, 60 GB disk, Hadoop 2.2.0
To build a Spark cluster on top of YARN, a Hadoop cluster has to be installed first. For future reference, the concrete steps of this installation are recorded below.
Preparation
1. Machine preparation
Three hosts; the role of each is noted after the #:
10.64.245.152 #hadooptest1 : master
10.64.247.157 #hadooptest2 : datanode1
10.64.253.197 #hadooptest3 : datanode2
Change the hostnames:
On hadooptest1, run vi /etc/sysconfig/network and set HOSTNAME=hadoop1
On hadooptest2, run vi /etc/sysconfig/network and set HOSTNAME=hadoop2
On hadooptest3, run vi /etc/sysconfig/network and set HOSTNAME=hadoop3
Add IP-to-hostname mappings:
On all three machines, append to /etc/hosts:
10.64.245.152 hadoop1
10.64.247.157 hadoop2
10.64.253.197 hadoop3
On hadoop1, run hostname hadoop1
On hadoop2, run hostname hadoop2
On hadoop3, run hostname hadoop3
After exiting and reconnecting, the hostname becomes hadoop[1-3]. The benefit is that ssh hadoop2 then automatically resolves to 10.64.247.157, which is convenient later on. This is also how short domain names are implemented.
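A quick way to confirm the mapping works (getent reads the same resolver sources that ssh would use):
$ getent hosts hadoop2
10.64.247.157 hadoop2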
2. Create directories
$mkdir -p /hadoop/hdfs
$mkdir -p /hadoop/tmp
$mkdir -p /hadoop/log
$mkdir -p /usr/java ### Java install path
$mkdir -p /usr/hadoop ### Hadoop install path
$chmod -R 777 /hadoop
Choose install paths to suit your own environment.
Install Java
1. Download and install the JDK; JDK 1.7 is recommended. This installation uses jdk-7u65-linux-x64.tar.gz, downloaded from:
http://www.oracle.com/technetwork/java/javase/downloads/index.html
download.oracle.com/otn-pub/java/jdk/7u65-b17/jdk-7u65-linux-x64.tar.gz
$tar -zxvf jdk-7u65-linux-x64.tar.gz
$mv jdk1.7.0_65 java
Note: the installation differs slightly depending on the type of Java package downloaded.
2. Configure the Java environment
You can edit /etc/profile, or the ~/.profile (ksh) or ~/.bash_profile (bash) in your home directory. This installation uses bash, so append to .bash_profile:
export JAVA_HOME=/usr/java/java
export CLASSPATH=.:$JAVA_HOME/lib/tools.jar:$JAVA_HOME/lib/dt.jar
export PATH=$JAVA_HOME/bin:$PATH
To make the environment take effect immediately, run:
$source .bash_profile
3. Check that Java installed successfully
$ java -version
java version "1.7.0_65"
Java(TM) SE Runtime Environment (build 1.7.0_65-b17)
Java HotSpot(TM) 64-Bit Server VM (build 24.65-b04, mixed mode)
Configure passwordless SSH login
On hadoop1:
$ mkdir .ssh
$ cd .ssh
$ ssh-keygen -t rsa ##generate key
Generating public/private rsa key pair.
Enter file in which to save the key (/export/home/*******/.ssh/id_rsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in ~/.ssh/id_rsa.
Your public key has been saved in ~/.ssh/id_rsa.pub.
The key fingerprint is:
b0:76:89:6a:44:8b:cd:fc:23:a4:3f:69:55:3f:83:e3 ...
$ ls -lrt
total 2
-rw------- 1 887 Jun 30 02:10 id_rsa
-rw-r--r-- 1 232 Jun 30 02:10 id_rsa.pub
$ touch authorized_keys
$ cat id_rsa.pub >> authorized_keys
On hadoop2 and hadoop3, generate key pairs in the same way.
[hadoop2]$ mv id_rsa.pub pub2
[hadoop3]$ mv id_rsa.pub pub3
scp pub2 and pub3 to hadoop1, then:
$ cat pub2 >> authorized_keys
$ cat pub3 >> authorized_keys
scp authorized_keys to hadoop2 and hadoop3, and passwordless login is enabled.
In short: generate a key pair on every node, concatenate all the public keys into one authorized_keys file, and distribute that authorized_keys to the same directory on every node in the cluster. Every node then holds the public keys of the whole cluster, and all nodes can log in to each other without a password.
Note: the permissions on authorized_keys must be set strictly; an overly permissive mode such as 666 can cause authentication to fail.
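The whole exchange can also be scripted from hadoop1, with the permission fix applied in the same pass (a sketch assuming the same user name and home directory on every node; each scp/ssh prompts for a password, since the keys are not distributed yet):
# collect every node's public key into one authorized_keys on hadoop1
for h in hadoop2 hadoop3; do
  scp $h:~/.ssh/id_rsa.pub /tmp/pub_$h
  cat /tmp/pub_$h >> ~/.ssh/authorized_keys
done
# distribute the merged file and tighten permissions everywhere
chmod 700 ~/.ssh && chmod 600 ~/.ssh/authorized_keys
for h in hadoop2 hadoop3; do
  scp ~/.ssh/authorized_keys $h:~/.ssh/
  ssh $h 'chmod 700 ~/.ssh && chmod 600 ~/.ssh/authorized_keys'
done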
To verify passwordless login, on hadoop1:
$ ssh haoop1
ssh: Could not resolve hostname haoop1: Name or service not known
[username@hadoop3 hadoop]$ ssh hadoop1
The authenticity of host 'hadoop1 (192.168.1.1)' can't be established.
RSA key fingerprint is 18:85:c6:50:0c:15:36:9c:55:34:d7:ab:0e:1c:c7:0f.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'hadoop1' (RSA) to the list of known hosts.
(system login banner omitted)
[hadoop1 ~]$
Install Hadoop
1. Download and unpack (on all nodes). An alternative mirror: http://mirrors.cnnic.cn/apache/hadoop/common/hadoop-2.4.0/hadoop-2.4.0.tar.gz
$ wget http://mirror.bit.edu.cn/apache/hadoop/common/hadoop-2.2.0/hadoop-2.2.0.tar.gz
$ tar -zxvf hadoop-2.2.0.tar.gz
$ mv hadoop-2.2.0/* /usr/hadoop/ ### /usr/hadoop already exists, so move the contents so that HADOOP_HOME=/usr/hadoop resolves correctly
All of the following runs on hadoop1.
2. Configure environment variables; append to .bash_profile:
export HADOOP_HOME=/usr/hadoop
export HADOOP_MAPRED_HOME=${HADOOP_HOME}
export HADOOP_COMMON_HOME=${HADOOP_HOME}
export HADOOP_HDFS_HOME=${HADOOP_HOME}
export YARN_HOME=${HADOOP_HOME}
export HADOOP_CONF_DIR=${HADOOP_HOME}/etc/hadoop
export HDFS_CONF_DIR=${HADOOP_HOME}/etc/hadoop
export YARN_CONF_DIR=${HADOOP_HOME}/etc/hadoop
export PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH
source .bash_profile
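A quick check that the new variables took effect (hadoop version ships with the distribution):
$ hadoop version ## the first line should read: Hadoop 2.2.0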
3. Append to $HADOOP_HOME/etc/hadoop/hadoop-env.sh:
export JAVA_HOME=/usr/java/java
4. Add to $HADOOP_HOME/etc/hadoop/core-site.xml (fs.default.name is the deprecated Hadoop 1.x name for fs.defaultFS; both still work in 2.2.0):
<property>
<name>hadoop.tmp.dir</name>
<value>/hadoop/tmp</value>
<description>A base for other temporary directories.</description>
</property>
<property>
<name>fs.default.name</name>
<value>hdfs://10.64.245.152:9000</value>
</property>
5. Set the contents of $HADOOP_HOME/etc/hadoop/slaves to the datanodes:
10.64.247.157
10.64.253.197
6. Add to $HADOOP_HOME/etc/hadoop/hdfs-site.xml:
<property>
<name>dfs.replication</name>
<value>3</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:/hadoop/hdfs/name</value>
<final>true</final>
</property>
<property>
<name>dfs.federation.nameservice.id</name>
<value>ns1</value>
</property>
<property>
<name>dfs.namenode.backup.address.ns1</name>
<value>10.64.245.152:50100</value>
</property>
<property>
<name>dfs.namenode.backup.http-address.ns1</name>
<value>10.64.245.152:50105</value>
</property>
<property>
<name>dfs.federation.nameservices</name>
<value>ns1</value>
</property>
<property>
<name>dfs.namenode.rpc-address.ns1</name>
<value>10.64.245.152:9000</value>
</property>
<property>
<name>dfs.namenode.rpc-address.ns2</name>
<value>10.64.245.152:9000</value>
</property>
<property>
<name>dfs.namenode.http-address.ns1</name>
<value>10.64.245.152:23001</value>
</property>
<property>
<name>dfs.namenode.http-address.ns2</name>
<value>10.64.245.152:13001</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:/hadoop/hdfs/data</value>
<final>true</final>
</property>
<property>
<name>dfs.namenode.secondary.http-address.ns1</name>
<value>10.64.245.152:23002</value>
</property>
<property>
<name>dfs.namenode.secondary.http-address.ns2</name>
<value>10.64.245.152:23003</value>
</property>
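Note that with only two datanodes, a replication factor of 3 leaves every block under-replicated; 2 would match this cluster. After editing, you can ask Hadoop which value actually took effect for any key (hdfs getconf ships with Hadoop 2.x):
$ hdfs getconf -confKey dfs.replication
3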
7. Add to $HADOOP_HOME/etc/hadoop/yarn-site.xml:
<property>
<name>yarn.resourcemanager.address</name>
<value>10.64.245.152:18040</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address</name>
<value>10.64.245.152:18030</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.address</name>
<value>10.64.245.152:50030</value>
</property>
<property>
<name>yarn.resourcemanager.resource-tracker.address</name>
<value>10.64.245.152:18025</value>
</property>
<property>
<name>yarn.resourcemanager.admin.address</name>
<value>10.64.245.152:18141</value>
</property>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.web-proxy.address</name>
<value>hadoop1-9014.lvs01.dev.ebayc3.com:54315</value>
</property>
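Because yarn.web-proxy.address is set, the standalone web application proxy must be started separately (start-yarn.sh does not start it); the daemon script ships in $HADOOP_HOME/sbin:
$ $HADOOP_HOME/sbin/yarn-daemon.sh start proxyserver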
8. Add to $HADOOP_HOME/etc/hadoop/httpfs-site.xml:
<property>
<name>hadoop.proxyuser.root.hosts</name>
<value>10.64.245.152</value>
</property>
<property>
<name>hadoop.proxyuser.root.groups</name>
<value>*</value>
</property>
9. Add to $HADOOP_HOME/etc/hadoop/mapred-site.xml (submit jobs to YARN and configure the job history server):
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
<description>Execution framework set to Hadoop YARN.</description>
</property>
<property>
<name>mapreduce.jobhistory.address</name>
<value>10.64.245.152:10020</value>
</property>
<property>
<name>mapreduce.jobhistory.webapp.address</name>
<value>10.64.245.152:19888</value>
</property>
<property>
<name>mapreduce.jobhistory.intermediate-done-dir</name>
<value>/log/tmp</value>
</property>
<property>
<name>mapreduce.jobhistory.done-dir</name>
<value>/log/history</value>
</property>
This tells MapReduce to run jobs on YARN.
10. Sync the configuration to the other datanodes:
$ scp ~/.bash_profile hadoop2:~/.bash_profile
$ scp $HADOOP_HOME/etc/hadoop/hadoop-env.sh hadoop2:$HADOOP_HOME/etc/hadoop/
$ scp $HADOOP_HOME/etc/hadoop/core-site.xml hadoop2:$HADOOP_HOME/etc/hadoop/
$ scp $HADOOP_HOME/etc/hadoop/slaves hadoop2:$HADOOP_HOME/etc/hadoop/
$ scp $HADOOP_HOME/etc/hadoop/hdfs-site.xml hadoop2:$HADOOP_HOME/etc/hadoop/
$ scp $HADOOP_HOME/etc/hadoop/yarn-site.xml hadoop2:$HADOOP_HOME/etc/hadoop/
$ scp $HADOOP_HOME/etc/hadoop/httpfs-site.xml hadoop2:$HADOOP_HOME/etc/hadoop/
$ scp $HADOOP_HOME/etc/hadoop/mapred-site.xml hadoop2:$HADOOP_HOME/etc/hadoop/
Replace hadoop2 with hadoop3 in the commands above to sync the configuration to hadoop3, as scripted below.
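A sketch of the same sync as one loop over both datanodes, covering exactly the files listed above:
for h in hadoop2 hadoop3; do
  scp ~/.bash_profile $h:~/.bash_profile
  for f in hadoop-env.sh core-site.xml slaves hdfs-site.xml yarn-site.xml httpfs-site.xml mapred-site.xml; do
    scp $HADOOP_HOME/etc/hadoop/$f $h:$HADOOP_HOME/etc/hadoop/
  done
done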
Start the Hadoop cluster
1. Format the namenode
hadoop namenode -format
2. Start HDFS
start-dfs.sh
3. Start YARN
start-yarn.sh
4. Start httpfs
httpfs.sh start
5. Create the log directories in HDFS (a note on the job history server follows this step)
hadoop fs -mkdir -p /log/tmp
hadoop fs -mkdir -p /log/history
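Two follow-ups: start-yarn.sh does not start the job history server configured in mapred-site.xml, so it needs its own daemon script, and the new HDFS directories can be verified right away:
$ $HADOOP_HOME/sbin/mr-jobhistory-daemon.sh start historyserver
$ hadoop fs -ls /log ## should list /log/tmp and /log/history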
Test the Hadoop cluster
1. On hadoop1, check whether the processes have started:
$ jps
8606 NameNode
4640 Bootstrap
17007 Jps
16077 ResourceManager
8781 SecondaryNameNode
All of these processes must be present.
2. On hadoop2, check whether the processes have started:
$ jps
5992 Jps
5422 NodeManager
3292 DataNode
All of these processes must be present.
3. Run hadoop fs -ls / to check that files can be listed.
4. Test a Hadoop job:
hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.2.0.jar wordcount /input /output7
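The example assumes an /input directory already exists in HDFS; a minimal way to populate it (any text file will do, /etc/hosts is just a convenient example) and to inspect the counts after the job finishes:
$ hadoop fs -mkdir -p /input
$ hadoop fs -put /etc/hosts /input/
$ hadoop fs -cat /output7/part-r-00000 | head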
If it runs normally, the job status is visible on the job monitor page:
http://10.64.245.152:50030/cluster/app/application_1406472756004_0002 (see yarn.resourcemanager.webapp.address in $HADOOP_HOME/etc/hadoop/yarn-site.xml)
Summary: all sorts of problems come up during installation; they are not listed one by one here to keep this from getting long-winded.