Building a Big Data Platform with Apache Ambari and Running WordCount
2017-05-14 13:05
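Operating system: three RHEL 6.4 virtual machines, each with 2 GB of memory and a 20 GB dynamically allocated disk.
Virtualization software: VirtualBox
Notes:
1. yum may need to be reinstalled; instructions are easy to find online.
2. Make sure the three hosts can ping each other.
Configure name resolution on every host by editing the hosts file:
vi /etc/hosts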
![](https://oscdn.geek-share.com/Uploads/Images/Content/201705/54545476f12bd6f476fd5032e3252460)
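Before moving on, it can help to confirm that every node really has an entry for each of the others. The following is only a sketch: `missing_hosts` is a hypothetical helper, not part of any tool, and the hostnames in the example (host1, host2, host4) are the ones used in this post.

```shell
# Hypothetical helper: print each expected hostname that has no entry
# in the given hosts file (one name per line; no output means all are present).
missing_hosts() {
    hosts_file=$1
    shift
    for name in "$@"; do
        # Look for the name as a whole field after the IP column, skipping comment lines.
        if ! awk -v n="$name" '
            /^[[:space:]]*#/ { next }
            { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
            END { exit !found }
        ' "$hosts_file"; then
            echo "$name"
        fi
    done
}

# Example (run on each node):
#   missing_hosts /etc/hosts host1 host2 host4
```

Any name it prints still needs a line in /etc/hosts on that node.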
Install the HTTP service on host4:
yum install httpd
![](https://oscdn.geek-share.com/Uploads/Images/Content/201705/2318c138fb9333e8802f85d4c09c71e7)
Connect to host4's file system with Xftp 4.
![](https://oscdn.geek-share.com/Uploads/Images/Content/201705/bb984bc298e4a45a1ea11c8564a7d87a)
After a successful connection it looks like this:
![](https://oscdn.geek-share.com/Uploads/Images/Content/201705/9ea07a67811f174a3c1e628222d5936d)
Go to the directory /var/www/html
![](https://oscdn.geek-share.com/Uploads/Images/Content/201705/0974d7de2c79b1bcb4d847850cb3d073)
Transfer the following three archives from Windows into that directory on host4:
HDP: HDP-2.6.0.3-centos6-rpm.tar.gz
HDP-UTILS: HDP-UTILS-1.1.0.21-centos6.tar.gz
Ambari 2.5.0: ambari-2.5.0.3-centos6.tar.gz
![](https://oscdn.geek-share.com/Uploads/Images/Content/201705/487ee166e98f2a0eb48a7b81ddf86eb2)
Extract them:
tar -zxvf ambari-2.5.0.3-centos6.tar.gz
tar -zxvf HDP-2.6.0.3-centos6-rpm.tar.gz
tar -zxvf HDP-UTILS-1.1.0.21-centos6.tar.gz
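Watch out for running out of disk space while extracting. It is best to extract the largest archive first and delete each archive as soon as it has been extracted.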
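The "largest first, delete as you go" advice can be scripted. This is a minimal sketch, assuming GNU `ls` and `tar` and archive names without newlines; `extract_largest_first` is a hypothetical helper name.

```shell
# Hypothetical helper: extract every .tar.gz in a directory, largest first,
# deleting each archive only after it has been extracted successfully.
extract_largest_first() {
    (
        cd "$1" || exit 1
        # ls -S sorts by file size, largest first.
        ls -S -- *.tar.gz | while IFS= read -r archive; do
            tar -zxf "$archive" && rm -f -- "$archive"
        done
    )
}

# Example:
#   df -h /var/www/html          # check free space first
#   extract_largest_first /var/www/html
```

Because the archive is removed only when `tar` succeeds, a failed extraction leaves the original file in place for another attempt.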
Start the HTTP service (on RHEL 6 this is service httpd start):
![](https://oscdn.geek-share.com/Uploads/Images/Content/201705/8440189a7d2d1c0251a8e0355b77a7a4)
In a browser, open 10.132.102.71/ambari/
![](https://oscdn.geek-share.com/Uploads/Images/Content/201705/bae712f6504f86e5ec32ff750eac4af1)
In a browser, open http://10.132.102.71/HDP/centos6/
![](https://oscdn.geek-share.com/Uploads/Images/Content/201705/0d20ec444c5748d73e4b0528066272aa)
In a browser, open http://10.132.102.71/HDP-UTILS-1.1.0.21/
![](https://oscdn.geek-share.com/Uploads/Images/Content/201705/f6811fe1658bd564b7d176b1e86ed29e)
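The same three checks can be done from a terminal instead of a browser. A sketch, assuming `curl` is installed; `repo_urls` and `check_repo_urls` are hypothetical helper names, and the paths are the ones opened above.

```shell
# Print the three repository URLs served by the web server (paths as used in this post).
repo_urls() {
    server=$1
    echo "http://$server/ambari/"
    echo "http://$server/HDP/centos6/"
    echo "http://$server/HDP-UTILS-1.1.0.21/"
}

# Hypothetical check: request only the headers of each URL and report the result.
check_repo_urls() {
    repo_urls "$1" | while IFS= read -r url; do
        if curl -sfI -o /dev/null "$url"; then
            echo "OK   $url"
        else
            echo "FAIL $url"
        fi
    done
}

# Example:
#   check_repo_urls 10.132.102.71
```

A FAIL line usually means httpd is not running or the archive was extracted into a different directory.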
Next, set up passwordless root SSH login from host1 to the other hosts.
On every host, run:
ssh-keygen
cd .ssh
touch authorized_keys
On host1, run:
cat id_rsa.pub >> authorized_keys
Copy host1's authorized_keys into the .ssh directory of the other hosts:
scp authorized_keys root@host2:~/.ssh
scp authorized_keys root@host4:~/.ssh
Fix the directory and file permissions; run on every host:
chmod 700 ~/.ssh
chmod 600 ~/.ssh/authorized_keys
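sshd ignores the key file when these permissions are too open, so it is worth checking them before blaming the keys. A small sketch, assuming GNU `stat`; `check_ssh_perms` is a hypothetical helper name.

```shell
# Hypothetical helper: succeed only if the .ssh directory is mode 700
# and authorized_keys is mode 600 (stat -c is the GNU coreutils form).
check_ssh_perms() {
    ssh_dir=${1:-$HOME/.ssh}
    [ "$(stat -c %a "$ssh_dir")" = "700" ] &&
        [ "$(stat -c %a "$ssh_dir/authorized_keys")" = "600" ]
}

# Example, run on host1 after copying the keys:
#   check_ssh_perms && echo "permissions OK"
#   ssh -o BatchMode=yes root@host2 hostname   # must not prompt for a password
#   ssh -o BatchMode=yes root@host4 hostname
```

With `BatchMode=yes`, ssh fails instead of asking for a password, which makes a broken key setup obvious.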
Install NTP on every host:
yum install -y ntp
Enable ntpd at boot:
chkconfig ntpd on
Start ntpd:
service ntpd start
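Turn off the firewall (chkconfig only disables it at boot, so also stop the running service):
chkconfig iptables off
service iptables stop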
Disable SELinux (setenforce 0 switches it to permissive mode until the next reboot):
setenforce 0
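To keep SELinux off after a reboot, the usual approach is to set SELINUX=disabled in /etc/selinux/config. A sketch, assuming GNU `sed`; `disable_selinux_in` is a hypothetical helper name.

```shell
# Hypothetical helper: set SELINUX=disabled in an SELinux config file.
# The file path is a parameter so the edit can be tried on a copy first.
disable_selinux_in() {
    config=$1
    # Only the SELINUX= line is rewritten; SELINUXTYPE= is left alone.
    sed -i 's/^SELINUX=.*/SELINUX=disabled/' "$config"
}

# Example (as root); takes effect at the next reboot:
#   disable_selinux_in /etc/selinux/config
#   getenforce    # shows the current runtime mode
```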
Setting the umask for your current login session:
umask 0022
Checking your current umask (the command should print 0022):
umask
Permanently changing the umask for all interactive users:
echo umask 0022 >> /etc/profile
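Running the `echo` above more than once adds the same line to /etc/profile each time. A sketch of an append that is safe to repeat; `ensure_umask` is a hypothetical helper name.

```shell
# Hypothetical helper: add "umask 0022" to a profile file only if
# an identical line is not already there.
ensure_umask() {
    profile=$1
    grep -qxF 'umask 0022' "$profile" || echo 'umask 0022' >> "$profile"
}

# Example (as root):
#   ensure_umask /etc/profile
```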
On every host, download ambari.repo from:
http://public-repo-1.hortonworks.com/ambari/centos6/2.x/updates/2.5.0.3/ambari.repo
![](https://oscdn.geek-share.com/Uploads/Images/Content/201705/1ccb6a214e5c4b467813e64d26b8bc46)
and place the file in /etc/yum.repos.d/:
cp ambari.repo /etc/yum.repos.d/
Run:
cat /sys/kernel/mm/transparent_hugepage/enabled
![](https://oscdn.geek-share.com/Uploads/Images/Content/201705/8aa77935e2211eed2de3ffef1c29de5b)
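If the square brackets are around always, Transparent Huge Pages is enabled; look up how to disable Transparent Huge Pages and do so before continuing.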
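Reading the bracketed value can be scripted as well. This is a sketch under assumptions: the sysfs path is the one used in this post (some RHEL 6 kernels may expose it under a different directory), and `thp_mode` is a hypothetical helper name.

```shell
# Hypothetical helper: print the active THP mode, i.e. the word inside the brackets.
thp_mode() {
    sed -n 's/.*\[\([a-z]*\)\].*/\1/p' "${1:-/sys/kernel/mm/transparent_hugepage/enabled}"
}

# Example (as root): switch THP off on the running system if it is not already off.
#   [ "$(thp_mode)" = "never" ] || echo never > /sys/kernel/mm/transparent_hugepage/enabled
# The setting is lost at reboot; one common approach is to repeat the echo from /etc/rc.local.
```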
Install the Ambari server on host1:
yum install ambari-server
![](https://oscdn.geek-share.com/Uploads/Images/Content/201705/c8952fa03616cf13ea22cfa5c0e3f28a)
Place jdk-8u112-linux-x64.tar.gz in /var/lib/ambari-server/resources/ beforehand; otherwise the automatic Java download during setup is very slow.
ambari-server setup
![](https://oscdn.geek-share.com/Uploads/Images/Content/201705/f9d6defbc787ecc1bd985199216c8ca2)
The first run of setup may fail; it succeeded for me on the third attempt.
Start the Ambari server:
ambari-server start
![](https://oscdn.geek-share.com/Uploads/Images/Content/201705/9aee3fc937ee6197e9d962405e333e5a)
In a browser, open 10.132.102.61:8080
![](https://oscdn.geek-share.com/Uploads/Images/Content/201705/d91ec5a678244e263da321eeb314792a)
The username and password are both admin.
![](https://oscdn.geek-share.com/Uploads/Images/Content/201705/0a3939ab1b3ead54ec4c00879ca89e52)
Click Launch Install Wizard and give the cluster any name.
![](https://oscdn.geek-share.com/Uploads/Images/Content/201705/bf6751d25d097d413470310da1efbad5)
![](https://oscdn.geek-share.com/Uploads/Images/Content/201705/84d9732128bfdcfdab796342b8c8edf2)
![](https://oscdn.geek-share.com/Uploads/Images/Content/201705/8cab2cc8802637fac6b09ff1bd361668)
![](https://oscdn.geek-share.com/Uploads/Images/Content/201705/bbbd57e7f687378121a2db507822676b)
Paste host1's private key (id_rsa) into the text box below.
![](https://oscdn.geek-share.com/Uploads/Images/Content/201705/a1b669cf460be0684222a11d1b209687)
Register and confirm the hosts.
![](https://oscdn.geek-share.com/Uploads/Images/Content/201705/3a559189b73115983cbe5a5447f57762)
Because memory is limited, only two hosts are shown here.
![](https://oscdn.geek-share.com/Uploads/Images/Content/201705/e83b8f516ee13cf5f90886e83728fa86)
![](https://oscdn.geek-share.com/Uploads/Images/Content/201705/6dea89971965f7f8b1358c068b030304)
Check the warnings in the red box.
If there are none, click Next.
![](https://oscdn.geek-share.com/Uploads/Images/Content/201705/86f1ee27e082061a04133bf2573a6a94)
Select only the HDFS and MapReduce services here, then click Next.
![](https://oscdn.geek-share.com/Uploads/Images/Content/201705/5a1bf069329e8748e5657cd844b8c94f)
Next
![](https://oscdn.geek-share.com/Uploads/Images/Content/201705/bf6f3b14fb222d81b9032ccd9c08e560)
Tick DataNode for every host, then click Next.
![](https://oscdn.geek-share.com/Uploads/Images/Content/201705/0679d1872488bb0e144d9b517708987b)
The red numbers mark settings that require a password; set them, then click Next.
![](https://oscdn.geek-share.com/Uploads/Images/Content/201705/b1f1f4166b5d03565967b35d509a6138)
Deploy
![](https://oscdn.geek-share.com/Uploads/Images/Content/201705/bde05f5f919fc3fa567bd40a32a5aadd)
![](https://oscdn.geek-share.com/Uploads/Images/Content/201705/3d1d91906cd6febf6e5b3f0a55907ff4)
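At this point the deployment has succeeded; nodes and services can be added or removed from here.
Next, run the bundled WordCount example.
Create word_test.txt locally and copy it into HDFS with copyFromLocal.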
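The upload step is not shown as a command, so here is one way it might look. This is a sketch, not the exact commands used for this post: `upload_input` is a hypothetical helper, the sample file contents are made up, and it assumes the current user may write to /tmp in HDFS.

```shell
# Hypothetical helper: create the HDFS input directory and upload a local file into it.
upload_input() {
    local_file=$1
    hdfs_dir=$2
    hadoop fs -mkdir -p "$hdfs_dir" &&
        hadoop fs -copyFromLocal "$local_file" "$hdfs_dir/"
}

# Example (sample text is an assumption, any text file works):
#   printf 'hello world\nhello hadoop\n' > word_test.txt
#   upload_input word_test.txt /tmp/input
```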
hadoop fs -cat /tmp/input/word_test.txt
![](https://oscdn.geek-share.com/Uploads/Images/Content/201705/1bb2c886433dfa92ceb3b8b897f58661)
sudo -u hdfs hadoop jar /usr/hdp/2.6.0.3-8/hadoop-mapreduce/hadoop-mapreduce-examples-2.7.3.2.6.0.3-8.jar wordcount /tmp/input/word_test.txt /tmp/output
![](https://oscdn.geek-share.com/Uploads/Images/Content/201705/a5227dbe6bf50f081934cef273ab23b9)
![](https://oscdn.geek-share.com/Uploads/Images/Content/201705/28677851c49d01f5f489752a11261954)
hadoop fs -ls /tmp/output
![](https://oscdn.geek-share.com/Uploads/Images/Content/201705/9f3348f2860d2989e66a59ba9abbb667)
hadoop fs -cat /tmp/output/part-r-00000
![](https://oscdn.geek-share.com/Uploads/Images/Content/201705/fb9854a9ccaba530634e224b81a6a084)
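For a small input file, the job output can be cross-checked locally. The sketch below assumes the example job splits words on whitespace and writes one `word<TAB>count` line per word in byte order, which matches the output shown above as far as I can tell; `local_wordcount` is a hypothetical helper name.

```shell
# Hypothetical helper: count whitespace-separated words in a local file and
# print "word<TAB>count" lines sorted in byte order.
local_wordcount() {
    tr -s '[:space:]' '\n' < "$1" |
        grep -v '^$' |
        LC_ALL=C sort |
        uniq -c |
        awk '{ print $2 "\t" $1 }'
}

# Example:
#   local_wordcount word_test.txt > expected.txt
#   hadoop fs -cat /tmp/output/part-r-00000 | diff - expected.txt
```

No output from `diff` means the cluster result and the local count agree.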
Done.