Hadoop Single Node
2016-07-15 15:45
Hadoop: Setting up a Single Node Cluster
[First time] Install ssh and rsync.
Note: set JAVA_HOME in $HADOOP_HOME/etc/hadoop/hadoop-env.sh. This step is important; if it is skipped, startup fails with an error.
Unpack the downloaded Hadoop distribution. In the distribution, edit the file etc/hadoop/hadoop-env.sh to define some parameters as follows:
# set to the root of your Java installation
export JAVA_HOME=/usr/java/latest
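Since a wrong JAVA_HOME only shows up as an error at startup, it can help to check the value first. The following is a minimal sketch, not part of the Hadoop distribution: check_java_home is a hypothetical helper, and it assumes that a valid Java installation root contains an executable bin/java.

```shell
# Hypothetical helper (not part of Hadoop): succeeds only if the given
# directory looks like the root of a Java installation.
# Assumption: a valid root contains an executable bin/java.
check_java_home() {
    dir="$1"
    if [ -z "$dir" ]; then
        echo "JAVA_HOME is not set" >&2
        return 1
    fi
    if [ ! -x "$dir/bin/java" ]; then
        echo "no executable found at $dir/bin/java" >&2
        return 1
    fi
    echo "JAVA_HOME looks valid: $dir"
}

# Check the value from the current environment; "|| true" keeps the
# script going so the message is informational only.
check_java_home "${JAVA_HOME:-}" || true
```

The same function can be pointed at the path written into hadoop-env.sh, for example check_java_home /usr/java/latest.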
Standalone operation
By default, Hadoop is configured to run in a non-distributed mode, as a single Java process. This is useful for debugging.
Execution
The following instructions are to run a MapReduce job locally. If you want to execute a job on YARN, see YARN on Single Node.
Format the filesystem:
$ bin/hdfs namenode -format
Start NameNode daemon and DataNode daemon:
$ sbin/start-dfs.sh
The hadoop daemon log output is written to the $HADOOP_LOG_DIR directory (defaults to $HADOOP_HOME/logs).
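When a daemon does not come up, the logs are the first place to look. The sketch below resolves the log directory by the rule just stated and prints the tail of each log file. Both function names are hypothetical, and falling back to the current directory when HADOOP_HOME is unset is an assumption that matches running the commands from the distribution root.

```shell
# Resolve the log directory: $HADOOP_LOG_DIR if set, otherwise
# $HADOOP_HOME/logs. Assumption: if HADOOP_HOME is also unset, the
# current directory is the distribution root.
hadoop_log_dir() {
    echo "${HADOOP_LOG_DIR:-${HADOOP_HOME:-.}/logs}"
}

# Print the last N lines (default 20) of every *.log file found there.
# Prints nothing if the directory is missing or empty.
tail_hadoop_logs() {
    lines="${1:-20}"
    dir=$(hadoop_log_dir)
    for f in "$dir"/*.log; do
        [ -e "$f" ] || continue
        echo "== $f"
        tail -n "$lines" "$f"
    done
}

tail_hadoop_logs 20
```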
Browse the web interface for the NameNode; by default it is available at:
NameNode - http://localhost:50070/
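Besides the web interface, the running daemons can be checked from the command line with jps, the JDK tool that lists Java processes. The following is a sketch under the assumption that sbin/start-dfs.sh in this default single-node setup starts a NameNode, a DataNode and a SecondaryNameNode; missing_daemons is a hypothetical helper that reads jps output and names whichever of the three is absent.

```shell
# Hypothetical helper: reads `jps` output on stdin and prints each
# expected HDFS daemon that is NOT listed. Prints nothing if all three
# are present.
missing_daemons() {
    jps_output=$(cat)
    for daemon in NameNode DataNode SecondaryNameNode; do
        # Compare the whole second field, so that SecondaryNameNode is
        # not mistaken for NameNode.
        if ! printf '%s\n' "$jps_output" |
            awk -v d="$daemon" '$2 == d { found = 1 } END { exit !found }'; then
            echo "$daemon"
        fi
    done
}

# Run the check only where jps is installed.
if command -v jps >/dev/null 2>&1; then
    jps | missing_daemons
fi
```

An empty result means all three daemons are listed; any name printed is a daemon whose log is worth reading.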
Make the HDFS directories required to execute MapReduce jobs:
$ bin/hdfs dfs -mkdir /user
$ bin/hdfs dfs -mkdir /user/<username>
Copy the input files into the distributed filesystem:
$ bin/hdfs dfs -put etc/hadoop input
Run some of the examples provided:
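Note: before running this example, make sure there is no output directory under the current directory; otherwise the job fails with an error, because Hadoop will not overwrite this directory.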
$ bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.2.jar grep input output 'dfs[a-z.]+'
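The caveat about an existing output directory can be handled with a small guard before each re-run. This is a sketch only: remove_if_exists is a hypothetical helper, the HDFS variable is an illustration device for overriding the launcher path, and the code assumes the output directory lives in HDFS, as it does once the daemons above are running. It relies on the dfs -test -d and dfs -rm -r subcommands of the HDFS shell.

```shell
# Path to the hdfs launcher. The default assumes the current directory
# is the distribution root, as in the commands above.
HDFS="${HDFS:-bin/hdfs}"

# Hypothetical helper: remove an HDFS directory if it exists.
remove_if_exists() {
    dir="$1"
    # `dfs -test -d` exits 0 only when the path exists and is a directory.
    if "$HDFS" dfs -test -d "$dir"; then
        "$HDFS" dfs -rm -r "$dir"
    else
        echo "$dir does not exist, nothing to remove"
    fi
}

# Usage, before re-running the grep example:
#   remove_if_exists output
```

Deleting the directory discards the previous results, so copy them out first if they are still needed.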
Examine the output files: Copy the output files from the distributed filesystem to the local filesystem and examine them:
$ bin/hdfs dfs -get output output
$ cat output/*
or
View the output files on the distributed filesystem:
$ bin/hdfs dfs -cat output/*
When you’re done, stop the daemons with:
$ sbin/stop-dfs.sh
Reference: http://www.powerxing.com/install-hadoop/
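Common problem: the cached files must be deleted before running hadoop namenode -format; otherwise an error occurs and the DataNode cannot be found.
By default the cache is under the /tmp directory.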
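A sketch of that cleanup follows. clear_hadoop_tmp is a hypothetical helper, and its default path assumes hadoop.tmp.dir was left at its stock value of /tmp/hadoop-<user>; pass a different path if the setting was changed. Removing this directory deletes everything stored in HDFS on this node, so it only suits a throwaway single-node setup.

```shell
# Hypothetical helper: delete Hadoop's temporary data directory.
# Assumption: hadoop.tmp.dir is at its default, /tmp/hadoop-<user>.
clear_hadoop_tmp() {
    dir="${1:-/tmp/hadoop-$(id -un)}"
    # Refuse obviously dangerous targets.
    case "$dir" in
        "/"|"/tmp"|"/tmp/")
            echo "refusing to remove '$dir'" >&2
            return 1
            ;;
    esac
    if [ -d "$dir" ]; then
        rm -rf -- "$dir"
        echo "removed $dir"
    else
        echo "nothing to remove at $dir"
    fi
}

# Typical order: stop HDFS, clear the old data, format, start again.
#   sbin/stop-dfs.sh
#   clear_hadoop_tmp
#   bin/hdfs namenode -format
#   sbin/start-dfs.sh
```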