
Hadoop 2.7.0 Study Notes: Pseudo-Distributed Setup

2016-07-29 18:04


Following the Jikexueyuan video course:

http://www.jikexueyuan.com/course/2089_3.html?ss=1

Required materials

1. RHEL 6.4 Server x64 (rhel-server-6.4-x86_64-dvd) | download

2. hadoop-2.7.0 (split archive)

1. Part 1

2. Part 2

3. Part 3

After downloading all parts, right-click Part 1 and extract it.


3. hbase-0.98.13-hadoop2-bin | download

4. jdk-7u80-linux-x64 | download

Disable the firewall

Open a terminal

and run

service iptables stop (stops iptables now; it starts again after a reboot)

chkconfig iptables off (keeps iptables off across reboots)

Disable SELinux

Run

vim /etc/sysconfig/selinux

Press i to enter insert mode.

Set: SELINUX=disabled

Press Esc to leave insert mode, then type :wq! and press Enter to save and quit (or press Shift+Z+Z).
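If you prefer not to edit the file interactively, the same change can be made with sed (a sketch assuming the stock RHEL 6 file with `SELINUX=enforcing`; `setenforce 0` applies the change to the running system without waiting for a reboot):

```shell
# Flip SELINUX=enforcing to disabled in place (a backup is kept as .bak)
sed -i.bak 's/^SELINUX=enforcing$/SELINUX=disabled/' /etc/sysconfig/selinux
# Put SELinux into permissive mode immediately; the file change
# itself takes effect after the next reboot
setenforce 0
getenforce
```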

Configure the host IP







When the settings are done, restart the network service

service network restart

Set the virtual machine's network adapter to bridged mode





Configure the hostname

vi /etc/sysconfig/network

Change the hostname to hbase02.pzr.com



Map the IP to the hostname

vi /etc/hosts

Add: 192.168.20.140 hbase02.pzr.com hbase02

Configure passwordless SSH login

Generate a key pair (note: there is no space inside ssh-keygen)

ssh-keygen -t rsa

Press Enter at every prompt to accept the defaults.



Copy the key to this machine

ssh-copy-id 192.168.20.140



Test that it works

ssh 192.168.20.140

If you log in without being asked for a password, the setup succeeded.
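The three steps above can also be condensed into a non-interactive sketch (`-N ''` supplies an empty passphrase so the repeated Enter presses are not needed; the IP is the one configured earlier):

```shell
# Generate an RSA key pair with no passphrase, skipping the prompts
ssh-keygen -t rsa -N '' -f ~/.ssh/id_rsa -q
# Append the public key to this host's authorized_keys
ssh-copy-id 192.168.20.140
# If the setup worked, this prints the hostname without a password prompt
ssh 192.168.20.140 hostname
```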



Reboot

reboot

Install the JDK

Download and install

Run the command

rpm -ivh jdk-7u80-linux-x64.rpm

The JDK is installed under /usr/java.

Add the following to /etc/profile

export JAVA_HOME=/usr/java/jdk1.7.0_80
export PATH=$JAVA_HOME/bin:$PATH
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
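After appending those lines, reload the profile in the current shell and confirm the JDK is picked up:

```shell
# Re-read /etc/profile so the new variables apply to this shell
source /etc/profile
# Both should now point at the JDK installed above
echo $JAVA_HOME
java -version   # expect: java version "1.7.0_80"
```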


Set up the Hadoop environment

Upload the files

Configure HDFS and YARN

Format

Start and test

Upload the files

Upload the prebuilt archive to the server.

Extract it

tar -zxf hadoop-2.7.0.tar.gz -C ../soft/

Configure Hadoop

Reference: http://hadoop.apache.org/docs/r2.7.2/hadoop-project-dist/hadoop-common/SingleCluster.html





Scroll down the page to find the settings for each configuration file.

Configure the Java path

File: hadoop-2.7.0/etc/hadoop/hadoop-env.sh

Find JAVA_HOME and set it to the path configured earlier

export JAVA_HOME=/usr/java/jdk1.7.0_80


Edit hadoop-2.7.0/etc/hadoop/core-site.xml

hadoop.tmp.dir: the directory Hadoop uses for temporary data

Use a fixed IP in the address; otherwise jobs may fail later. (Note that 8032 is also YARN's default ResourceManager port; a port such as 8020 or 9000 is more conventional for HDFS, but any free port works as long as every reference to the NameNode uses the same value.)

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://192.168.20.140:8032</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/usr/local/bigdata/soft/hadoop-2.7.0/data/tmp</value>
  </property>
</configuration>


Edit the replication config: hadoop-2.7.0/etc/hadoop/hdfs-site.xml (one replica is enough on a single node)

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
 http://www.apache.org/licenses/LICENSE-2.0 
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. See accompanying LICENSE file.
-->

<!-- Put site-specific property overrides in this file. -->

<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>


Rename hadoop-2.7.0/etc/hadoop/mapred-site.xml.template

to mapred-site.xml, then edit it

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
 http://www.apache.org/licenses/LICENSE-2.0 
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. See accompanying LICENSE file.
-->

<!-- Put site-specific property overrides in this file. -->

<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>


Edit hadoop-2.7.0/etc/hadoop/yarn-site.xml

This version has a few more settings than the one shown in the video.

<?xml version="1.0"?>
<!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
 http://www.apache.org/licenses/LICENSE-2.0 
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. See accompanying LICENSE file.
-->
<configuration>

  <!-- Site specific YARN configuration properties -->
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>

  <property>
    <name>yarn.resourcemanager.address</name>
    <value>127.0.0.1:8032</value>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.address</name>
    <value>127.0.0.1:8030</value>
  </property>
  <property>
    <name>yarn.resourcemanager.resource-tracker.address</name>
    <value>127.0.0.1:8031</value>
  </property>
  <property>
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>3072</value>
  </property>
  <property>
    <name>yarn.nodemanager.resource.cpu-vcores</name>
    <value>2</value>
  </property>
  <property>
    <name>yarn.scheduler.minimum-allocation-mb</name>
    <value>256</value>
  </property>

</configuration>


Start

The first time Hadoop starts, the NameNode must be formatted.

To list the available hdfs commands

bin/hdfs



Format command: bin/hdfs namenode -format



A successful format prints a line saying the storage directory "has been successfully formatted".

The following scripts live in Hadoop's sbin directory; if that directory is on your PATH you can call them directly, otherwise run them from sbin.

start-dfs.sh starts the HDFS daemons: NameNode, SecondaryNameNode, and DataNode

start-yarn.sh starts the YARN daemons: ResourceManager and NodeManager
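A minimal start-and-check sequence, run from the hadoop-2.7.0 directory (a sketch; jps ships with the JDK installed earlier):

```shell
# Start HDFS (NameNode, SecondaryNameNode, DataNode)
sbin/start-dfs.sh
# Start YARN (ResourceManager, NodeManager)
sbin/start-yarn.sh
# List running JVMs; the five daemons above should all appear
jps
```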

Test whether startup succeeded

Open the Hadoop web UI: http://<server-ip>:50070

If the page loads, the deployment succeeded.



Create a directory (-p creates intermediate directories as needed)

bin/hadoop fs -mkdir -p /user/root/my/in

Upload a file

bin/hadoop fs -put /etc/profile /user/root/my/in
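The upload can also be confirmed from the command line instead of the web UI (a sketch, run from the hadoop-2.7.0 directory):

```shell
# List the target directory; the uploaded profile file should appear
bin/hadoop fs -ls /user/root/my/in
# Print the first lines of the uploaded file straight from HDFS
bin/hadoop fs -cat /user/root/my/in/profile | head
```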

The new directory and the uploaded file are visible in the web UI.





Run a job: the bundled wordcount example counts the words in the uploaded file and writes the result to the output directory.

bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.0.jar wordcount /user/root/my/in/profile /user/root/my/out
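When the job finishes, the word counts can be read back from HDFS (wordcount writes its reducer output as part files under the output directory):

```shell
# List the job output; an empty _SUCCESS file marks a completed job
bin/hadoop fs -ls /user/root/my/out
# Print the counted words (single-reducer output lands in part-r-00000)
bin/hadoop fs -cat /user/root/my/out/part-r-00000 | head
```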



The job also appears in the web UI.





Set up the HBase environment

Download and extract

http://archive.apache.org/dist/hbase/0.98.13/hbase-0.98.13-hadoop2-bin.tar.gz

Extract it

tar -zxf hbase-0.98.13-hadoop2-bin.tar.gz -C ../soft/

Configure

Edit hbase-0.98.13-hadoop2/conf/hbase-env.sh

Find JAVA_HOME, uncomment it, and set it to the server's JAVA_HOME

export JAVA_HOME=/usr/java/jdk1.7.0_80


Edit hbase-0.98.13-hadoop2/conf/hbase-site.xml

hbase.zookeeper.property.dataDir should point to a directory you create yourself; hbase.rootdir must match the fs.defaultFS address configured in core-site.xml.

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
/**
*
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements.  See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership.  The ASF licenses this file
* to you under the Apache License, Version 2.0 (the
* "License"); you may not use this file except in compliance
* with the License.  You may obtain a copy of the License at
*
*     http://www.apache.org/licenses/LICENSE-2.0 *
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
-->
<configuration>
  <property>
    <name>hbase.cluster.distributed</name>
    <value>true</value>
  </property>
  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://192.168.20.140:8032/hbase</value>
  </property>
  <property>
    <name>hbase.zookeeper.property.dataDir</name>
    <value>/usr/local/bigdata/soft/hbase-0.98.13-hadoop2/data/zkData</value>
  </property>
</configuration>


Edit hbase-0.98.13-hadoop2/conf/regionservers

Replace localhost with the fixed IP

192.168.20.140


Start

The order matters

hbase-daemon.sh start zookeeper

hbase-daemon.sh start master

hbase-daemon.sh start regionserver
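The same ordered startup as a small loop (a sketch; run from the hbase-0.98.13-hadoop2 directory with bin on the PATH):

```shell
# Start the HBase daemons in dependency order:
# ZooKeeper first, then the master, then the region server
for daemon in zookeeper master regionserver; do
  hbase-daemon.sh start "$daemon"
done
```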

Test whether startup succeeded

Run jps and check that the expected processes (HQuorumPeer, HMaster, HRegionServer) are running.

Run hbase to see the available commands.



Stop

Reverse the startup order

hbase-daemon.sh stop regionserver

hbase-daemon.sh stop master

hbase-daemon.sh stop zookeeper

Common commands

To see the other commands, just run hbase.



Enter the shell

hbase shell

List tables

list

Create a table

create 'table_name','column_family'

create 'mytest','info'

Drop a table

It must be disabled first, then dropped:

disable 'mytest' (disable)

drop 'mytest' (drop)

Insert data

put 'table_name','rowkey','family:column','value'

put 'mytest','rk0001','info:name','myname'

Update data

Put the same rowkey and column again with the new value.

Delete data

deleteall 'mytest','rk0001' (delete the whole row)

deleteall 'mytest','rk0001','info:age' (delete a specific column)

For the other delete commands, type del and press Tab.

Query data

scan 'mytest' (scan the whole table)

scan 'mytest',{LIMIT=>10} (scan the first 10 rows)

get 'mytest','rk0001' (fetch a row by rowkey)

get 'mytest','rk0001','info:name' (fetch a specific column by rowkey)
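The commands above can also be chained into one non-interactive session by piping them into hbase shell with a heredoc (a sketch; assumes the HBase daemons are running):

```shell
# Create a table, write a row, read it back, then clean up
hbase shell <<'EOF'
create 'mytest','info'
put 'mytest','rk0001','info:name','myname'
get 'mytest','rk0001'
scan 'mytest',{LIMIT=>10}
disable 'mytest'
drop 'mytest'
EOF
```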

More command usage: http://www.cnblogs.com/nexiyi/p/hbase_shell.html

HBase web UI address

http://192.168.20.140:60010/master-status