Installing and Deploying Hadoop in Cluster Mode on a Test Cluster
2014-05-28 08:45
1. Cluster Architecture
Install three CentOS virtual machines in VMware: server1, server2, and server3. server1 serves as the Hadoop cluster's NameNode and JobTracker; server2 and server3 serve as DataNodes and TaskTrackers. For simplicity, DNS and NFS are also installed on server1.
2. Installing DNS
Install BIND with yum:
[root@server1 admin]# yum install bind*
After installation, verify the packages:
[root@server1 admin]# rpm -qa | grep '^bind'
bind-dyndb-ldap-1.1.0-0.9.b1.el6_3.1.x86_64
bind-chroot-9.8.2-0.10.rc1.el6_3.6.x86_64
bind-libs-9.8.2-0.10.rc1.el6_3.6.x86_64
bind-sdb-9.8.2-0.10.rc1.el6_3.6.x86_64
bind-utils-9.8.2-0.10.rc1.el6_3.6.x86_64
bind-devel-9.8.2-0.10.rc1.el6_3.6.x86_64
bind-9.8.2-0.10.rc1.el6_3.6.x86_64
All required packages are present.
Modify the configuration files
Edit /etc/named.conf, changing 127.0.0.1 and localhost to any:
[root@server1 etc]# vim named.conf
options {
        listen-on port 53 { any; };
        listen-on-v6 port 53 { ::1; };
        directory       "/var/named";
        dump-file       "/var/named/data/cache_dump.db";
        statistics-file "/var/named/data/named_stats.txt";
        memstatistics-file "/var/named/data/named_mem_stats.txt";
        allow-query     { any; };
        recursion yes;

        dnssec-enable yes;
        dnssec-validation yes;
        dnssec-lookaside auto;

        bindkeys-file "/etc/named.iscdlv.key";

        managed-keys-directory "/var/named/dynamic";
};
Edit /etc/named.rfc1912.zones and add the following:
zone "myhadoop.com" IN {
        type master;
        file "myhadoop.com.zone";
        allow-update { none; };
};
zone "1.168.192.in-addr.arpa" IN {
        type master;
        file "1.168.192.in-addr.zone";
        allow-update { none; };
};
Create the files myhadoop.com.zone and 1.168.192.in-addr.zone in /var/named.
Edit myhadoop.com.zone to read:
$TTL 86400
@       IN SOA  server1.myhadoop.com. chizk.root.myhadoop.com. (
                                        0       ; serial (d.adams)
                                        1D      ; refresh
                                        1H      ; retry
                                        1W      ; expire
                                        3H )    ; minimum
@       IN NS   server1.myhadoop.com.
server1.myhadoop.com.   IN A    192.168.1.201
server2.myhadoop.com.   IN A    192.168.1.202
server3.myhadoop.com.   IN A    192.168.1.203
Edit 1.168.192.in-addr.zone to read (note that server3 gets the PTR record for 203, not a second 202):
$TTL 86400
@       IN SOA  server1.myhadoop.com. chizk.root.myhadoop.com. (
                                        0       ; serial
                                        1D      ; refresh
                                        1H      ; retry
                                        1W      ; expire
                                        3H )    ; minimum
@       IN NS   server1.myhadoop.com.
201     IN PTR  server1.myhadoop.com.
202     IN PTR  server2.myhadoop.com.
203     IN PTR  server3.myhadoop.com.
Change the owner of these two files:
[root@server1 named]# chown root.named myhadoop.com.zone
[root@server1 named]# chown root.named 1.168.192.in-addr.zone
Add the following line to /etc/resolv.conf:
nameserver 192.168.1.201
Edit /etc/resolv.conf on server2 and server3 in the same way.
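Before starting the service it can help to syntax-check the new configuration and zone files. Assuming the bind package provides named-checkconf and named-checkzone (it does on CentOS 6), a quick sanity check looks like:

```shell
# Check /etc/named.conf for syntax errors
named-checkconf /etc/named.conf

# Check each zone file against its zone name;
# both should print "OK" on a valid zone
named-checkzone myhadoop.com /var/named/myhadoop.com.zone
named-checkzone 1.168.192.in-addr.arpa /var/named/1.168.192.in-addr.zone
```

Catching a typo here is much faster than debugging failed lookups after named starts.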
Start the DNS service:
[root@server1 named]# service named start
Starting named:                                            [  OK  ]
Enable it at boot:
[root@server1 admin]# chkconfig named on
Test DNS lookups:
[root@server1 admin]# nslookup server1.myhadoop.com
Server:         192.168.1.201
Address:        192.168.1.201#53
Name:   server1.myhadoop.com
Address: 192.168.1.201
[root@server1 admin]# nslookup server2.myhadoop.com
Server:         192.168.1.201
Address:        192.168.1.201#53
Name:   server2.myhadoop.com
Address: 192.168.1.202
[root@server1 admin]# nslookup server3.myhadoop.com
Server:         192.168.1.201
Address:        192.168.1.201#53
Name:   server3.myhadoop.com
Address: 192.168.1.203
All lookups succeed. The same tests run on server2 and server3 also succeed.
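The reverse zone is worth testing too, since only forward lookups are checked above. Passing an IP address to nslookup performs a PTR query, so each address should resolve back to its server name:

```shell
# Reverse lookups against the PTR records in 1.168.192.in-addr.zone;
# each should report the matching serverN.myhadoop.com name
nslookup 192.168.1.201
nslookup 192.168.1.202
nslookup 192.168.1.203
```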
3. Installing NFS
Check whether the nfs and rpcbind packages are installed:
[root@server1 admin]# rpm -qa | grep nfs
nfs4-acl-tools-0.3.3-5.el6.x86_64
nfs-utils-1.2.2-7.el6.x86_64
nfs-utils-lib-1.1.5-1.el6.x86_64
[root@server1 admin]# rpm -qa | grep rpcbind
rpcbind-0.2.0-8.el6.x86_64
Everything is already installed; if it is not, install the packages with yum.
Edit /etc/exports and add the following line:
/home/admin *(sync,rw)
Start NFS:
[root@server1 admin]# service nfs start
Starting NFS services:                                     [  OK  ]
Starting NFS quotas:                                       [  OK  ]
Starting NFS daemon:                                       [  OK  ]
Starting NFS mountd:                                       [  OK  ]
Enable it at boot:
[root@server1 admin]# chkconfig nfs on
Start rpcbind:
[root@server1 admin]# service rpcbind start
Starting rpcbind:                                          [  OK  ]
Enable it at boot:
[root@server1 admin]# chkconfig rpcbind on
List the export points:
[root@server1 admin]# showmount -e localhost
Export list for localhost:
/home/admin *
Change the permissions on /home/admin; for convenience, set them to 777:
[root@server1 home]# chmod 777 /home/admin
On server2, mount /home/admin from server1:
[root@server2 home]# mount server1.myhadoop.com:/home/admin/ /home/admin_share/
Test access:
[root@server2 home]# cd admin_share/
[root@server2 admin_share]# cat test.txt
aaaa,111
bbbb,222
cccc,333
dddd,444
Access succeeds.
Edit /etc/fstab on server2 to mount the share automatically at boot, adding the following line at the end:
server1.myhadoop.com:/home/admin /home/admin_share nfs defaults 1 1
Mount /home/admin from server1 on server3 in the same way and test it.
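The new fstab entry can be exercised without rebooting: `mount -a` mounts everything listed in /etc/fstab that is not already mounted. A sketch, run as root on server2:

```shell
# Drop the manual mount, then remount from /etc/fstab
umount /home/admin_share
mount -a

# Confirm the NFS mount came back
df -h /home/admin_share
mount | grep admin_share
```

If `mount -a` fails here, the fstab line has a problem that would otherwise only surface at the next boot.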
4. Sharing the Key File
Generate a login key pair for the admin user on each of server1, server2, and server3:
[admin@server1 ~]$ ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/home/admin/.ssh/id_rsa):
/home/admin/.ssh/id_rsa already exists.
Overwrite (y/n)? y
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /home/admin/.ssh/id_rsa.
Your public key has been saved in /home/admin/.ssh/id_rsa.pub.
The key fingerprint is:
46:56:64:8f:83:13:e0:f3:17:cb:b9:7d:d5:fc:9f:52 admin@server1
The key's randomart image is:
(RSA 2048 randomart image omitted)
[admin@server2 ~]$ ssh-keygen -t rsa
[admin@server3 ~]$ ssh-keygen -t rsa
On server1, copy id_rsa.pub to authorized_keys:
[admin@server1 ~]$ cp .ssh/id_rsa.pub .ssh/authorized_keys
On server2 and server3, create symbolic links from the shared key file to the local path:
[admin@server2 ~]$ ln -s /home/admin_share/.ssh/authorized_keys ~/.ssh/authorized_keys
[admin@server3 ~]$ ln -s /home/admin_share/.ssh/authorized_keys ~/.ssh/authorized_keys
Append the server2 and server3 public keys to authorized_keys:
[admin@server2 ~]$ cat .ssh/id_rsa.pub >> .ssh/authorized_keys
[admin@server3 ~]$ cat .ssh/id_rsa.pub >> .ssh/authorized_keys
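One common pitfall when sharing keys like this: sshd refuses public-key login if ~/.ssh or authorized_keys is writable by group or others, which is easy to hit after copying files around over NFS. A minimal sketch of the expected permissions, run as admin on each server (for the symlinked copies, chmod applies to the shared target file):

```shell
# ~/.ssh must be accessible only by its owner
mkdir -p ~/.ssh
chmod 700 ~/.ssh

# authorized_keys must not be group/world writable
touch ~/.ssh/authorized_keys
chmod 600 ~/.ssh/authorized_keys
```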
Test the configuration:
[admin@server1 ~]$ ssh server1.myhadoop.com
The authenticity of host 'server1.myhadoop.com (192.168.1.201)' can't be established.
RSA key fingerprint is a9:f3:7f:55:56:3a:a7:d7:9e:23:1e:86:a5:eb:90:dc.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'server1.myhadoop.com,192.168.1.201' (RSA) to the list of known hosts.
Last login: Sun Jan 27 10:02:12 2013 from server1
Test the other machines in the same way; all logins succeed without a password.
5. Installing Hadoop
On server1, configure Hadoop's core-site.xml as follows:
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://server1.myhadoop.com:9000</value>
</property>
</configuration>
Configure mapred-site.xml:
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>mapred.job.tracker</name>
<value>server1.myhadoop.com:9001</value>
</property>
<property>
<name>mapred.tasktracker.map.tasks.maximum</name>
<value>50</value>
</property>
<property>
<name>mapred.tasktracker.reduce.tasks.maximum</name>
<value>50</value>
</property>
</configuration>
Set conf/masters to:
server1.myhadoop.com
Set conf/slaves to:
server2.myhadoop.com
server3.myhadoop.com
Create a text file serverlist.txt containing the domain names of all machines Hadoop should be distributed to, here server2 and server3:
[admin@server1 ~]$ cat serverlist.txt
server2.myhadoop.com
server3.myhadoop.com
Generate the distribution shell script:
[admin@server1 ~]$ cat serverlist.txt | awk '{print "scp -rp /home/admin/hadoop-0.20.2/ admin@"$1":/home/admin/"}' > distributeHadoop.sh
Its contents are:
[admin@server1 ~]$ cat ./distributeHadoop.sh
scp -rp /home/admin/hadoop-0.20.2/ admin@server2.myhadoop.com:/home/admin/
scp -rp /home/admin/hadoop-0.20.2/ admin@server3.myhadoop.com:/home/admin/
Make the script executable and run it:
[admin@server1 ~]$ chmod +x distributeHadoop.sh
[admin@server1 ~]$ ./distributeHadoop.sh
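After the script runs, the copy can be spot-checked over ssh rather than logging in to each machine. A sketch reusing serverlist.txt:

```shell
# Verify the Hadoop tree arrived on each slave and report its size
for host in $(cat serverlist.txt); do
    ssh admin@"$host" "ls -d /home/admin/hadoop-0.20.2 && du -sh /home/admin/hadoop-0.20.2"
done
```

This relies on the passwordless ssh configured in section 4.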
Check server2 and server3; the copy succeeded.
Format the NameNode:
[admin@server1 logs]$ hadoop namenode -format
Start Hadoop:
[admin@server1 ~]$ ./hadoop-0.20.2/bin/start-all.sh
starting namenode, logging to /home/admin/hadoop-0.20.2/bin/../logs/hadoop-admin-namenode-server1.out
server2.myhadoop.com: starting datanode, logging to /home/admin/hadoop-0.20.2/bin/../logs/hadoop-admin-datanode-server2.out
server3.myhadoop.com: starting datanode, logging to /home/admin/hadoop-0.20.2/bin/../logs/hadoop-admin-datanode-server3.out
server1.myhadoop.com: starting secondarynamenode, logging to /home/admin/hadoop-0.20.2/bin/../logs/hadoop-admin-secondarynamenode-server1.out
starting jobtracker, logging to /home/admin/hadoop-0.20.2/bin/../logs/hadoop-admin-jobtracker-server1.out
server2.myhadoop.com: starting tasktracker, logging to /home/admin/hadoop-0.20.2/bin/../logs/hadoop-admin-tasktracker-server2.out
server3.myhadoop.com: starting tasktracker, logging to /home/admin/hadoop-0.20.2/bin/../logs/hadoop-admin-tasktracker-server3.out
Check server1, server2, and server3; all daemons started successfully:
[admin@server1 logs]$ jps
6481 NameNode
6612 SecondaryNameNode
6681 JobTracker
6749 Jps
[admin@server2 logs]$ jps
14869 TaskTracker
14917 Jps
14795 DataNode
[admin@server3 logs]$ jps
16354 TaskTracker
16396 Jps
16280 DataNode
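Beyond jps, cluster health can be checked from server1 with dfsadmin, which shows whether both DataNodes have registered with the NameNode; the 0.20-era web UIs are also available on their default ports. A quick sketch:

```shell
# Summary of configured capacity and live datanodes
# (expect "Datanodes available: 2" with server2 and server3 up)
./hadoop-0.20.2/bin/hadoop dfsadmin -report

# Built-in web interfaces (default ports in Hadoop 0.20.x):
#   NameNode status:   http://server1.myhadoop.com:50070/
#   JobTracker status: http://server1.myhadoop.com:50030/
```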