
Learning Hadoop (1): Installing Hadoop on Ubuntu

2016-06-21 13:45

1. Install SSH

$ sudo apt-get install openssh-client
$ sudo apt-get install openssh-server


2. Check the value of the JAVA_HOME variable

/opt/jdk1.8.0_91
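
If you are not sure where your JDK lives, the path can be recovered from the `java` binary on the PATH. This is a sketch; the `/opt/jdk1.8.0_91` location is just this post's install directory, and yours may differ:

```shell
# Print JAVA_HOME if it is already set:
echo "$JAVA_HOME"

# Otherwise derive it by resolving the java binary's symlinks
# and stripping the trailing /bin/java from the resolved path:
readlink -f "$(which java)" | sed 's:/bin/java$::'
```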


3. Install hadoop-2.7.2

Download it from an Apache mirror (http://mirror.bit.edu.cn/apache/hadoop/common/hadoop-2.7.2/) and extract it into a hadoop-2.7.2 directory.

4. Edit etc/hadoop/hadoop-env.sh under hadoop-2.7.2 and set JAVA_HOME

export JAVA_HOME=/opt/jdk1.8.0_91


Run the following command; if it prints Hadoop's usage message, the configuration is working:

$ bin/hadoop


Hadoop supports three modes: standalone, pseudo-distributed, and fully distributed. This post covers the first two:

5. Standalone Operation (single-machine mode)

Start the SSH service:

$ sudo /etc/init.d/ssh start


Set up passwordless login:

# Generate a key pair on the client (press Enter at each prompt):
$ ssh-keygen -t rsa
# On the server, append the public key to the authorized list (run inside ~/.ssh);
# appending instead of copying avoids clobbering any keys already there:
$ cat id_rsa.pub >> authorized_keys
$ chmod 600 authorized_keys
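
After copying the key you can sanity-check the file permissions, since sshd refuses key authentication when authorized_keys is readable by others. A sketch, assuming GNU `stat` (the `-c %a` option prints the octal mode):

```shell
# Should print 600: readable and writable only by the owner.
stat -c %a ~/.ssh/authorized_keys
```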


Test: The following example copies the unpacked conf directory to use as input, then finds and displays every match of the given regular expression. Output is written to the given output directory.

$ mkdir input
$ cp etc/hadoop/*.xml input
$ bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.2.jar grep input output 'dfs[a-z.]+'
$ cat output/*


6. Pseudo-Distributed Operation (single-machine pseudo-distributed mode)

Edit two configuration files:

etc/hadoop/core-site.xml:

<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://localhost:9000</value>
    </property>
</configuration>


etc/hadoop/hdfs-site.xml:

<configuration>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
</configuration>


Log in via SSH to confirm it works without a password:

$ ssh localhost


Run:

# Format the filesystem:
$ bin/hdfs namenode -format

# Start the NameNode and DataNode daemons:
$ sbin/start-dfs.sh


Browse the NameNode's web interface; by default it is available at http://localhost:50070/.

# Make the HDFS directories required to execute MapReduce jobs:
$ bin/hdfs dfs -mkdir /user
$ bin/hdfs dfs -mkdir /user/<username>

# Copy the input files into the distributed filesystem:
$ bin/hdfs dfs -put etc/hadoop input

# Run some of the examples provided:
$ bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.2.jar grep input output 'dfs[a-z.]+'

# Examine the output files: copy them from the distributed filesystem
# to the local filesystem and inspect them:
$ bin/hdfs dfs -get output output
$ cat output/*
# Or view the output files directly on the distributed filesystem:
$ bin/hdfs dfs -cat output/*

# When you're done, stop the daemons with:
$ sbin/stop-dfs.sh
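
The two mkdir calls can also be collapsed into one, using `-p` to create parent directories and `$(whoami)` to fill in the current user. A sketch, run from the hadoop-2.7.2 directory:

```shell
# Create the user's HDFS home directory in one step;
# -p also creates /user if it does not exist yet:
bin/hdfs dfs -mkdir -p /user/$(whoami)
```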
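
After start-dfs.sh you can confirm the daemons actually came up with `jps`, which ships with the JDK and lists running JVM processes; the grep below is a sketch:

```shell
# Keep only the HDFS daemons from the JVM process list;
# a healthy pseudo-distributed node shows a NameNode, a DataNode
# and a SecondaryNameNode:
jps | grep -E 'NameNode|DataNode'
```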