Setting up a Hadoop environment on Linux
2013-02-25 22:38:52 | Posted by: 领悟书生
Hadoop has three run modes: standalone, pseudo-distributed, and fully distributed.
Standalone mode: trivial to install and needs almost no configuration, but it is only useful for debugging.
Pseudo-distributed mode: starts all five daemons (namenode, datanode, jobtracker, tasktracker, secondary namenode) on a single node, simulating the nodes of a distributed cluster. [Given my machine's specs, I chose this mode for learning.]
Fully distributed mode: a normal Hadoop cluster made up of multiple nodes, each with its own role.
Installing and configuring pseudo-distributed mode
Download and extract
Download and extract the Hadoop tarball. Most tutorials at the moment use version 0.20.2, so I chose that version as well:
http://hadoop.apache.org/releases.html
http://archive.apache.org/dist/hadoop/core/
http://archive.apache.org/dist/hadoop/core/hadoop-0.20.2/
hadoop-0.20.2.tar.gz
root@debian3:/usr/local# tar zxvf hadoop-0.20.2.tar.gz
Edit the configuration files
Go into the extracted Hadoop directory and edit conf/hadoop-env.sh to set the JDK installation path (note that the location of the configuration files changed after version 0.23):
conf/hadoop-env.sh
# The java implementation to use. Required.
export JAVA_HOME=/usr/local/jdk1.6.0_38
For installing the JDK, see: Installing and removing the JDK on Debian (Ubuntu) - VPS environment setup, part 1.
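If you prefer to script this edit, a sed one-liner can switch the JAVA_HOME line in place. The sketch below rehearses it against a mock conf/hadoop-env.sh (assuming the stock file ships with JAVA_HOME commented out, as the 0.20.2 tarball does):

```shell
# Mock stand-in for the real conf/hadoop-env.sh so the edit can be
# demonstrated end to end:
mkdir -p conf
cat > conf/hadoop-env.sh <<'EOF'
# The java implementation to use.  Required.
# export JAVA_HOME=/usr/lib/j2sdk1.5-sun
EOF
# Uncomment the line and point it at the JDK path used in this article:
sed -i 's|^# export JAVA_HOME=.*|export JAVA_HOME=/usr/local/jdk1.6.0_38|' conf/hadoop-env.sh
grep JAVA_HOME conf/hadoop-env.sh
# -> export JAVA_HOME=/usr/local/jdk1.6.0_38
```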
Then edit the three core configuration files in the conf directory: core-site.xml, hdfs-site.xml, and mapred-site.xml.
core-site.xml
<configuration>
<!--
fs.default.name - the URI (protocol, hostname, port) of the cluster's NameNode. Every machine in the cluster needs to know this address: DataNodes register with the NameNode so their data can be served, and standalone client programs use this URI to contact the NameNode and obtain the block lists of files.
-->
<property>
<name>fs.default.name</name>
<value>hdfs://192.168.1.102:9000</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/usr/local/hadoop-0.20.2/mytmp</value>
</property>
</configuration>
hdfs-site.xml
<configuration>
<!--
dfs.replication: the number of replicas kept for each file block. For a real deployment it should be set to 3 (there is no hard upper limit, but extra replicas add little benefit while consuming more space); fewer than three replicas can compromise data reliability (data may be lost when the system fails).
dfs.data.dir: the directory where a DataNode stores its blocks.
-->
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.data.dir</name>
<value>/usr/local/hadoop-0.20.2/data</value>
</property>
</configuration>
mapred-site.xml
<configuration>
<!-- mapred.job.tracker - the host (or IP) and port of the JobTracker. -->
<property>
<name>mapred.job.tracker</name>
<value>192.168.1.102:9001</value>
</property>
</configuration>
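The three listings above can also be written from the shell in one pass. This sketch recreates them in a scratch conf/ directory using heredocs (IP and paths are taken from the listings; adjust them for your host):

```shell
mkdir -p conf

cat > conf/core-site.xml <<'EOF'
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://192.168.1.102:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/usr/local/hadoop-0.20.2/mytmp</value>
  </property>
</configuration>
EOF

cat > conf/hdfs-site.xml <<'EOF'
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.data.dir</name>
    <value>/usr/local/hadoop-0.20.2/data</value>
  </property>
</configuration>
EOF

cat > conf/mapred-site.xml <<'EOF'
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>192.168.1.102:9001</value>
  </property>
</configuration>
EOF
```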
Next edit conf/masters and conf/slaves in the conf directory, adding the node IPs:
Add the master's IP to conf/masters: 192.168.1.102
Add the slaves' IP to conf/slaves: 192.168.1.102
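On a single-node setup the same IP serves as both master and slave; the files (named masters and slaves, plural, in 0.20) can be filled in with a one-liner each, run from the Hadoop install directory:

```shell
# One host plays every role in pseudo-distributed mode:
mkdir -p conf
echo 192.168.1.102 > conf/masters
echo 192.168.1.102 > conf/slaves
```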
Configure ssh
Generate an ssh key pair so that ssh can log in to 192.168.1.102 without a password:
root@debian3:/usr/local/hadoop-0.20.2# cd /root/
root@debian3:~# ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/id_rsa.pub.
The key fingerprint is:
11:25:02:a9:9b:70:2f:52:72:10:96:3e:a6:21:9f:89 root@debian3
The key's randomart image is:
+--[ RSA 2048]----+
|.o. .o. o.. |
|o. . . o |
|.. . . |
|=+= . |
|+X.* S |
|E B . |
| . . |
| |
| |
+-----------------+
root@debian3:~# cd .ssh
root@debian3:~/.ssh# ls
id_rsa id_rsa.pub known_hosts
id_rsa is the private key file; id_rsa.pub is the public key file.
root@debian3:~/.ssh# cp id_rsa.pub authorized_keys
root@debian3:~/.ssh# ls
authorized_keys id_rsa id_rsa.pub known_hosts
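One pitfall worth guarding against: sshd refuses key authentication when .ssh or authorized_keys is too permissive, which shows up as a password prompt even though the key is installed. The sketch below rehearses the permission tightening in a scratch directory standing in for /root/.ssh:

```shell
# scratch_ssh stands in for /root/.ssh so this runs anywhere:
SSH_DIR=scratch_ssh
mkdir -p "$SSH_DIR"
: > "$SSH_DIR/authorized_keys"   # real setup: cp id_rsa.pub authorized_keys
chmod 700 "$SSH_DIR"             # directory must be private
chmod 600 "$SSH_DIR/authorized_keys"
# After this, "ssh 192.168.1.102" should log in without prompting.
```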
Format HDFS
root@debian3:/usr/local/hadoop-0.20.2# bin/hadoop namenode -format
13/02/23 22:57:17 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG: host = debian3/127.0.1.1
STARTUP_MSG: args = [-format]
STARTUP_MSG: version = 0.20.2
STARTUP_MSG: build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20 -r 911707; compiled by 'chrisdo' on Fri Feb 19 08:07:34 UTC 2010
************************************************************/
13/02/23 22:57:18 INFO namenode.FSNamesystem: fsOwner=root,root
13/02/23 22:57:18 INFO namenode.FSNamesystem: supergroup=supergroup
13/02/23 22:57:18 INFO namenode.FSNamesystem: isPermissionEnabled=true
13/02/23 22:57:18 INFO common.Storage: Image file of size 94 saved in 0 seconds.
13/02/23 22:57:18 INFO common.Storage: Storage directory /tmp/hadoop-root/dfs/name has been successfully formatted.
13/02/23 22:57:18 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at debian3/127.0.1.1
************************************************************/
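Note that the log above reports the image saved under /tmp/hadoop-root/dfs/name, which is the built-in default ${hadoop.tmp.dir} of /tmp/hadoop-${user.name}, not the mytmp directory configured in core-site.xml. If the configured directory is not being picked up, the HDFS metadata lives in /tmp and may be wiped on reboot. A quick check of where the image actually landed (paths assumed from this article):

```shell
# Prefer the configured location; fall back to the built-in default.
NAME_DIR=""
for d in /usr/local/hadoop-0.20.2/mytmp/dfs/name /tmp/hadoop-root/dfs/name; do
  test -d "$d" && NAME_DIR=$d
done
echo "name dir: ${NAME_DIR:-not formatted on this machine}"
```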
Start Hadoop
Start Hadoop with bin/start-all.sh:
root@debian3:/usr/local/hadoop-0.20.2# bin/start-all.sh
starting namenode, logging to /usr/local/hadoop-0.20.2/bin/../logs/hadoop-root-namenode-debian3.out
192.168.1.102: starting datanode, logging to /usr/local/hadoop-0.20.2/bin/../logs/hadoop-root-datanode-debian3.out
192.168.1.102: starting secondarynamenode, logging to /usr/local/hadoop-0.20.2/bin/../logs/hadoop-root-secondarynamenode-debian3.out
starting jobtracker, logging to /usr/local/hadoop-0.20.2/bin/../logs/hadoop-root-jobtracker-debian3.out
192.168.1.102: starting tasktracker, logging to /usr/local/hadoop-0.20.2/bin/../logs/hadoop-root-tasktracker-debian3.out
Check that the daemons are running
root@debian3:/usr/local/hadoop-0.20.2# /usr/local/jdk1.6.0_38/bin/jps
9622 DataNode
9872 TaskTracker
9533 NameNode
9781 JobTracker
9711 SecondaryNameNode
9919 Jps
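The jps check can also be scripted so a missing daemon is flagged explicitly. The output below is mocked from the transcript so the logic runs anywhere; on the real node you would capture it with `jps_output=$(/usr/local/jdk1.6.0_38/bin/jps)`:

```shell
# Mocked jps output (from the transcript above):
jps_output='9622 DataNode
9872 TaskTracker
9533 NameNode
9781 JobTracker
9711 SecondaryNameNode'
# All five daemons of pseudo-distributed mode must be present:
missing=0
for d in NameNode DataNode SecondaryNameNode JobTracker TaskTracker; do
  echo "$jps_output" | grep -q " $d\$" || { echo "missing: $d"; missing=1; }
done
test "$missing" -eq 0 && echo "all five daemons are up"
# -> all five daemons are up
```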
Stop Hadoop
Stop Hadoop with bin/stop-all.sh:
root@debian3:/usr/local/hadoop-0.20.2# bin/stop-all.sh
stopping jobtracker
192.168.1.102: stopping tasktracker
stopping namenode
192.168.1.102: stopping datanode
192.168.1.102: stopping secondarynamenode
Original article: Setting up a Hadoop environment on Linux, by 领悟书生. Please credit the author and source when reposting: http://www.656463.com/article/377
Licensed under CC BY-NC-ND 2.5 CN.