Hadoop Study Notes: Installing Hadoop
2017-03-10 22:44
sudo mv /home/common/下载/hadoop-2.7.2.tar.gz /usr/local
cd /usr/local
sudo tar -xzvf hadoop-2.7.2.tar.gz
sudo mv hadoop-2.7.2 hadoop    # rename it
Add the following to /etc/profile:
export HADOOP_HOME=/usr/local/hadoop
export PATH=.:$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin
1. Edit /usr/local/hadoop/etc/hadoop/hadoop-env.sh:
export JAVA_HOME=/usr/lib/jvm/jdk1.8.0_121
2. Edit /usr/local/hadoop/etc/hadoop/core-site.xml:
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://master:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/usr/local/hadoop/tmp</value>
  </property>
  <property>
    <name>hadoop.native.lib</name>
    <value>false</value>
  </property>
</configuration>
3. Edit /usr/local/hadoop/etc/hadoop/hdfs-site.xml:
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.name.dir</name>
    <value>file:/usr/local/hadoop/tmp/dfs/name</value>
  </property>
  <property>
    <name>dfs.data.dir</name>
    <value>file:/usr/local/hadoop/tmp/dfs/data</value>
  </property>
  <property>
    <name>dfs.permissions</name>
    <value>false</value>
  </property>
</configuration>
4. Edit /usr/local/hadoop/etc/hadoop/mapred-site.xml (this file does not exist by default; copy mapred-site.xml.template to mapred-site.xml first):
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
5. Edit /usr/local/hadoop/etc/hadoop/yarn-site.xml:
<configuration>
  <!-- Site specific YARN configuration properties -->
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
</configuration>
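A side note: several property names used in the configuration above (fs.default.name, dfs.name.dir, dfs.data.dir, dfs.permissions) are the old Hadoop 1.x spellings. Hadoop 2.x still honors them but logs deprecation warnings; the current equivalents, with the same values as this post's setup, are:

```xml
<!-- Current (non-deprecated) names for the properties used above -->
<property>
  <name>fs.defaultFS</name>                <!-- was fs.default.name -->
  <value>hdfs://master:9000</value>
</property>
<property>
  <name>dfs.namenode.name.dir</name>      <!-- was dfs.name.dir -->
  <value>file:/usr/local/hadoop/tmp/dfs/name</value>
</property>
<property>
  <name>dfs.datanode.data.dir</name>      <!-- was dfs.data.dir -->
  <value>file:/usr/local/hadoop/tmp/dfs/data</value>
</property>
<property>
  <name>dfs.permissions.enabled</name>    <!-- was dfs.permissions -->
  <value>false</value>
</property>
```

Either spelling works on 2.7.2; using the new names just keeps the logs quiet.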
6. Make /etc/profile take effect (note: source is a shell builtin, so running it via sudo as "sudo source /etc/profile" fails; run it without sudo):
source /etc/profile
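To confirm the variables actually took effect in the current shell, a quick check is useful. This is just a sketch, re-creating the exports from this post's /etc/profile (paths assume the install locations used above):

```shell
# Re-create the relevant exports and verify them; values assume the
# install paths used in this post (/usr/local/hadoop).
export HADOOP_HOME=/usr/local/hadoop
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin

# Print the variable to confirm it is set
echo "HADOOP_HOME=$HADOOP_HOME"
```

If HADOOP_HOME prints empty, the profile was not sourced in this shell.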
Contents of /etc/profile:
export JAVA_HOME=/usr/lib/jvm/jdk1.8.0_121
export JRE_HOME=${JAVA_HOME}/jre
export CLASSPATH=.:${JAVA_HOME}/lib:${JRE_HOME}/lib
export PATH=${JAVA_HOME}/bin:$PATH
export PATH=/usr/local/texlive/2015/bin/x86_64-linux:$PATH
export MANPATH=/usr/local/texlive/2015/texmf-dist/doc/man:$MANPATH
export INFOPATH=/usr/local/texlive/2015/texmf-dist/doc/info:$INFOPATH
export HADOOP_HOME=/usr/local/hadoop
export PATH=.:$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin
export M2_HOME=/opt/apache-maven-3.3.9
export M2=$M2_HOME/bin
export PATH=$M2:$PATH
export GRADLE_HOME=/opt/gradle/gradle-3.4.1
export PATH=$GRADLE_HOME/bin:$PATH
Contents of ~/.bashrc:
export HADOOP_INSTALL=/usr/local/hadoop
export PATH=$PATH:$HADOOP_INSTALL/bin
export PATH=$PATH:$HADOOP_INSTALL/sbin
export HADOOP_MAPRED_HOME=$HADOOP_INSTALL
export HADOOP_COMMON_HOME=$HADOOP_INSTALL
export HADOOP_HDFS_HOME=$HADOOP_INSTALL
export YARN_HOME=$HADOOP_INSTALL
For SSH and Hadoop user setup, see:
http://www.cnblogs.com/CheeseZH/p/5051135.html
http://www.powerxing.com/install-hadoop/
If the DataNode fails to start, see:
http://www.aboutyun.com/thread-12803-1-1.html
Check the log files under the Hadoop logs directory, then edit the VERSION file under /usr/local/hadoop/tmp/dfs/data/current (typically making the DataNode's clusterID match the NameNode's).
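The VERSION file is a plain key=value properties file. As an illustration (all field values below are made up, not from a real cluster), this is the kind of content to look at; on a real node you would compare the clusterID in .../dfs/name/current/VERSION against .../dfs/data/current/VERSION and make the DataNode's value match:

```shell
# Illustrative only: write a fake DataNode VERSION file and extract its
# clusterID. On a real node, grep the actual files under
# /usr/local/hadoop/tmp/dfs/{name,data}/current/VERSION instead.
cat > /tmp/VERSION.example <<'EOF'
storageID=DS-example
clusterID=CID-11111111-2222-3333-4444-555555555555
cTime=0
storageType=DATA_NODE
layoutVersion=-56
EOF

# The line that must agree between NameNode and DataNode:
grep '^clusterID=' /tmp/VERSION.example
```

A mismatch usually appears after reformatting the NameNode without clearing the old DataNode data directory.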
Also check the permissions under the Hadoop directory: the user running Hadoop must be able to write to /usr/local/hadoop.
Format a new distributed filesystem:
hdfs namenode -format
Running Hadoop
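The start commands themselves are not listed above; as a minimal sketch (assuming HADOOP_HOME=/usr/local/hadoop as configured in this post, and passwordless SSH to localhost already set up), the daemons are started with the scripts under sbin:

```shell
# Sketch: start the HDFS and YARN daemons, then verify with jps.
# Wrapped in a function so it can be reused; assumes HADOOP_HOME is set.
start_hadoop() {
    "$HADOOP_HOME/sbin/start-dfs.sh"     # NameNode, DataNode, SecondaryNameNode
    "$HADOOP_HOME/sbin/start-yarn.sh"    # ResourceManager, NodeManager
    jps                                  # list running Java daemons to verify
}
```

After calling start_hadoop, jps should show NameNode, DataNode, SecondaryNameNode, ResourceManager, and NodeManager.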
Run a Hadoop example:
./bin/hadoop jar ./share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.2.jar pi 2 5
Output:
Number of Maps  = 2
Samples per Map = 5
Wrote input for Map #0
Wrote input for Map #1
Starting Job
17/03/26 11:49:47 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
17/03/26 11:49:47 INFO input.FileInputFormat: Total input paths to process : 2
17/03/26 11:49:47 INFO mapreduce.JobSubmitter: number of splits:2
17/03/26 11:49:48 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1490497943530_0002
17/03/26 11:49:48 INFO impl.YarnClientImpl: Submitted application application_1490497943530_0002
17/03/26 11:49:48 INFO mapreduce.Job: The url to track the job: http://master:8088/proxy/application_1490497943530_0002/
17/03/26 11:49:48 INFO mapreduce.Job: Running job: job_1490497943530_0002
17/03/26 11:49:55 INFO mapreduce.Job: Job job_1490497943530_0002 running in uber mode : false
17/03/26 11:49:55 INFO mapreduce.Job:  map 0% reduce 0%
17/03/26 11:50:02 INFO mapreduce.Job:  map 100% reduce 0%
17/03/26 11:50:08 INFO mapreduce.Job:  map 100% reduce 100%
17/03/26 11:50:08 INFO mapreduce.Job: Job job_1490497943530_0002 completed successfully
17/03/26 11:50:08 INFO mapreduce.Job: Counters: 49
    File System Counters
        FILE: Number of bytes read=50
        FILE: Number of bytes written=353898
        FILE: Number of read operations=0
        FILE: Number of large read operations=0
        FILE: Number of write operations=0
        HDFS: Number of bytes read=524
        HDFS: Number of bytes written=215
        HDFS: Number of read operations=11
        HDFS: Number of large read operations=0
        HDFS: Number of write operations=3
    Job Counters
        Launched map tasks=2
        Launched reduce tasks=1
        Data-local map tasks=2
        Total time spent by all maps in occupied slots (ms)=9536
        Total time spent by all reduces in occupied slots (ms)=3259
        Total time spent by all map tasks (ms)=9536
        Total time spent by all reduce tasks (ms)=3259
        Total vcore-milliseconds taken by all map tasks=9536
        Total vcore-milliseconds taken by all reduce tasks=3259
        Total megabyte-milliseconds taken by all map tasks=9764864
        Total megabyte-milliseconds taken by all reduce tasks=3337216
    Map-Reduce Framework
        Map input records=2
        Map output records=4
        Map output bytes=36
        Map output materialized bytes=56
        Input split bytes=288
        Combine input records=0
        Combine output records=0
        Reduce input groups=2
        Reduce shuffle bytes=56
        Reduce input records=4
        Reduce output records=0
        Spilled Records=8
        Shuffled Maps =2
        Failed Shuffles=0
        Merged Map outputs=2
        GC time elapsed (ms)=319
        CPU time spent (ms)=2570
        Physical memory (bytes) snapshot=719585280
        Virtual memory (bytes) snapshot=5746872320
        Total committed heap usage (bytes)=513802240
    Shuffle Errors
        BAD_ID=0
        CONNECTION=0
        IO_ERROR=0
        WRONG_LENGTH=0
        WRONG_MAP=0
        WRONG_REDUCE=0
    File Input Format Counters
        Bytes Read=236
    File Output Format Counters
        Bytes Written=97
Job Finished in 21.472 seconds
Estimated value of Pi is 3.60000000000000000000
You can visit the web UI at http://localhost:50070 to view NameNode and DataNode information, and also browse the files in HDFS online.
After starting YARN, examples are run exactly the same way; only the resource management and task scheduling differ. Looking at the logs, you can see that without YARN, jobs are run by "mapred.LocalJobRunner", while with YARN they are run by "mapred.YARNRunner". One benefit of starting YARN is that you can watch job status in the web UI: http://localhost:8088/cluster
Click History to inspect each job. If master:19888 is not reachable, start the JobHistory server (the script is in the sbin directory):
mr-jobhistory-daemon.sh start historyserver
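Port 19888 is the JobHistory web UI. If you want its addresses pinned down explicitly rather than left at the defaults, they can also be set in mapred-site.xml; the property names below are the standard ones, with the host "master" taken from this post's setup:

```xml
<property>
  <name>mapreduce.jobhistory.address</name>
  <value>master:10020</value>
</property>
<property>
  <name>mapreduce.jobhistory.webapp.address</name>
  <value>master:19888</value>
</property>
```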
For Hadoop's architecture, see this post:
Hadoop HDFS Concepts Series: A First Look at HDFS Architecture and Principles (Part 1)
For the HDFS read path, see this post:
Hadoop HDFS Concepts Series: A First Look at HDFS Architecture and Principles (Part 2)
For the HDFS write path, see this post:
Hadoop HDFS Concepts Series: A First Look at HDFS Architecture and Principles (Part 3)
For the role of the SecondaryNameNode (SNN), see:
http://blog.csdn.net/xh16319/article/details/31375197