hadoop-2.2.0 single-node installation

2016-07-13 17:25
Extract hadoop-2.2.0.tar.gz

Directory layout:

drwxr-xr-x  2 qiulp qiulp  4096 Oct 22 11:37 bin/    ......hadoop and yarn commands

drwxr-xr-x  3 qiulp qiulp  4096 Oct  7 14:38 etc/    ......site XML configuration files

drwxr-xr-x  2 qiulp qiulp  4096 Oct  7 14:38 include/

drwxr-xr-x  2 qiulp qiulp  4096 Oct 22 11:40 sbin/   ......startup scripts

drwxr-xr-x  4 qiulp qiulp  4096 Oct  7 14:38 share/  ......jars and sources (example jar)
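The extraction step and the listing above can be reproduced roughly as follows. This is only a sketch: the helper name extract_hadoop and the /usr/local destination in the usage line are assumptions, not something the original specifies.

```shell
# Sketch: unpack a Hadoop tarball into a target directory and list its top level.
# extract_hadoop is a hypothetical helper name.
extract_hadoop() {
    tarball="$1"   # e.g. hadoop-2.2.0.tar.gz
    dest="$2"      # assumed install prefix, e.g. /usr/local
    mkdir -p "$dest" || return 1
    tar -xzf "$tarball" -C "$dest" || return 1
    # Assumes the archive unpacks into a directory named after the tarball.
    ls -l "$dest/$(basename "$tarball" .tar.gz)"
}

# Usage (not run automatically):
#   extract_hadoop hadoop-2.2.0.tar.gz /usr/local
```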

Configure the JDK environment variable for Hadoop

Set JAVA_HOME in etc/hadoop/hadoop-env.sh and etc/hadoop/yarn-env.sh, for example: export JAVA_HOME=/usr/local/jrockit-jdk1.6.0_29
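One way to apply the same JAVA_HOME to both scripts without editing them by hand is sketched below. The helper name set_java_home is hypothetical, and putting the export on the first line of each file is a design choice of this sketch rather than anything the original prescribes.

```shell
# Sketch: put a JAVA_HOME export at the top of both env scripts in a Hadoop conf dir.
# set_java_home is a hypothetical helper name.
set_java_home() {
    conf_dir="$1"    # e.g. etc/hadoop inside the extracted tree
    java_home="$2"   # e.g. /usr/local/jrockit-jdk1.6.0_29
    for f in hadoop-env.sh yarn-env.sh; do
        target="$conf_dir/$f"
        # Write the export first so it is set before the rest of the script runs.
        {
            printf 'export JAVA_HOME=%s\n' "$java_home"
            cat "$target"
        } > "$target.new" && mv "$target.new" "$target" || return 1
    done
}

# Usage (not run automatically):
#   set_java_home etc/hadoop /usr/local/jrockit-jdk1.6.0_29
```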

Edit etc/hadoop/slaves; for a single node, simply put this machine's hostname in it.
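For a single node the slaves file ends up with exactly one line. A minimal sketch, assuming the hostname command reports the name the node should be reached by (write_single_slave is a hypothetical helper name):

```shell
# Sketch: make this machine the only worker listed in the slaves file.
# write_single_slave is a hypothetical helper; it overwrites the existing file.
write_single_slave() {
    conf_dir="$1"   # e.g. etc/hadoop
    hostname > "$conf_dir/slaves"
}

# Usage (not run automatically):
#   write_single_slave etc/hadoop
```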

Set up passwordless SSH login to the local machine
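The original does not spell out how the passwordless login is set up. A common approach is to generate a key without a passphrase and add it to the same account's authorized_keys; the sketch below assumes OpenSSH's ssh-keygen is installed, and setup_local_ssh_key is a hypothetical helper name.

```shell
# Sketch: create a passphrase-less RSA key and authorize it for logins to this machine.
# setup_local_ssh_key is a hypothetical helper; pass "$HOME/.ssh" for a real setup.
setup_local_ssh_key() {
    ssh_dir="$1"
    mkdir -p "$ssh_dir" && chmod 700 "$ssh_dir" || return 1
    # -N '' sets an empty passphrase; generation is skipped if a key already exists.
    [ -f "$ssh_dir/id_rsa" ] || ssh-keygen -q -t rsa -N '' -f "$ssh_dir/id_rsa" || return 1
    cat "$ssh_dir/id_rsa.pub" >> "$ssh_dir/authorized_keys" || return 1
    # Keep the file private; sshd may reject keys files with loose permissions.
    chmod 600 "$ssh_dir/authorized_keys"
}

# Usage (not run automatically):
#   setup_local_ssh_key "$HOME/.ssh"
#   ssh localhost    # should log in without asking for a password
```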

Edit the XML configuration files

core-site.xml

<property>

<name>fs.default.name</name>

<value>hdfs://qiulp:9010</value>

</property>

<property>

<name>hadoop.tmp.dir</name>

<value>/usr/local/hadoop/tmp</value>

</property>

.....................
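In the actual file, these properties sit inside a single <configuration> root element. The sketch below writes a complete core-site.xml containing just the two properties shown above; write_core_site is a hypothetical helper name, and it overwrites any existing file.

```shell
# Sketch: write a complete core-site.xml with the two properties shown above.
# write_core_site is a hypothetical helper; it overwrites any existing file.
write_core_site() {
    conf_dir="$1"   # e.g. etc/hadoop
    fs_uri="$2"     # e.g. hdfs://qiulp:9010
    tmp_dir="$3"    # e.g. /usr/local/hadoop/tmp
    cat > "$conf_dir/core-site.xml" <<EOF
<?xml version="1.0"?>
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>$fs_uri</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>$tmp_dir</value>
  </property>
</configuration>
EOF
}

# Usage (not run automatically):
#   write_core_site etc/hadoop hdfs://qiulp:9010 /usr/local/hadoop/tmp
```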

hdfs-site.xml

<property>

<name>dfs.name.dir</name>

<value>/usr/local/hadoop/name</value>

</property>

<property>

<name>dfs.data.dir</name>

<value>/usr/local/hadoop/data</value>

</property>

<property>

<name>dfs.replication</name>

<value>1</value>

</property>

....................

mapred-site.xml

<property>

    <name>mapreduce.framework.name</name>

    <value>yarn</value> <!-- framework used to run MapReduce jobs: yarn, local (the default when unset), or classic (the JobTracker-based MRv1) -->

</property>

<property>

    <name>mapreduce.cluster.temp.dir</name>

    <value>/usr/local/hadoop/ctmp/</value>

    <description>No description</description>

    <final>true</final>

  </property>

  <property>

    <name>mapreduce.cluster.local.dir</name>

    <value>/usr/local/hadoop/clocal</value>

    <description>No description</description>

    <final>true</final>

  </property>

........................

yarn-site.xml

<property>

    <name>yarn.resourcemanager.resource-tracker.address</name>

    <value>qiulp:8031</value>

    <description>host is the hostname of the resource manager and

    port is the port on which the NodeManagers contact the Resource Manager.

    </description>

  </property>

  <property>

    <name>yarn.resourcemanager.scheduler.address</name>

    <value>qiulp:8030</value>

    <description>host is the hostname of the resourcemanager and port is the port

    on which the Applications in the cluster talk to the Resource Manager.

    </description>

  </property>

  <property>

    <name>yarn.resourcemanager.scheduler.class</name>

    <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler</value>

    <description>In case you do not want to use the default scheduler</description>

  </property>

  <property>

    <name>yarn.resourcemanager.address</name>

    <value>qiulp:8032</value>

    <description>the host is the hostname of the ResourceManager and the port is the port on

    which the clients can talk to the Resource Manager. </description>

  </property>

  <property>

    <name>yarn.nodemanager.local-dirs</name>

    <value></value>

    <description>the local directories used by the nodemanager</description>

  </property>

  <property>

    <name>yarn.nodemanager.address</name>

    <value>qiulp:0</value>

    <description>the nodemanagers bind to this port</description>

  </property>

  <property>

    <name>yarn.nodemanager.resource.memory-mb</name>

    <value>10240</value>

    <description>the amount of memory on the NodeManager in MB</description>

  </property>

  <property>

    <name>yarn.nodemanager.remote-app-log-dir</name>

    <value>/app-logs</value>

    <description>directory on hdfs where the application logs are moved to </description>

  </property>

   <property>

    <name>yarn.nodemanager.log-dirs</name>

    <value></value>

    <description>the directories used by Nodemanagers as log directories</description>

  </property>

  <property>

    <name>yarn.nodemanager.aux-services</name>

    <value>mapreduce_shuffle</value>

    <description>shuffle service that needs to be set for Map Reduce to run </description>

  </property>

  <property>

      <name>yarn.web-proxy.address</name>

      <value>qiulp:8038</value>

  </property>

................................
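Note that yarn.nodemanager.resource.memory-mb is expressed in megabytes, as the -mb suffix indicates, so the value 10240 above corresponds to 10 GB. A quick arithmetic check (mb_to_gb is a hypothetical helper name):

```shell
# Sketch: convert a memory-mb setting to whole gigabytes.
mb_to_gb() {
    echo $(( $1 / 1024 ))
}

mb_to_gb 10240   # prints 10
```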

capacity-scheduler.xml

The defaults are fine.

Run the commands

hadoop namenode -format

(Normally this succeeds right away without prompting for y or n. If it fails, delete the related files, for example by emptying /usr/local/hadoop.)

Start:
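The cleanup before a retry might look like the sketch below. Instead of emptying the whole of /usr/local/hadoop, it resets only the tmp, name, and data subdirectories named in the configuration above; that narrowing, and the helper name reset_hdfs_dirs, are assumptions of this sketch. It permanently deletes any HDFS data stored there.

```shell
# Sketch: reset the local directories used by HDFS so the NameNode can be reformatted.
# WARNING: deletes all data in <base>/tmp, <base>/name and <base>/data.
# reset_hdfs_dirs is a hypothetical helper name.
reset_hdfs_dirs() {
    base="$1"   # e.g. /usr/local/hadoop
    # Refuse an empty or root base path.
    [ -n "$base" ] && [ "$base" != "/" ] || return 1
    for d in tmp name data; do
        rm -rf "$base/$d" && mkdir -p "$base/$d" || return 1
    done
}

# Usage (not run automatically):
#   reset_hdfs_dirs /usr/local/hadoop && hadoop namenode -format
```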

sbin/start-all.sh

5451 NodeManager

5033 SecondaryNameNode

5226 ResourceManager

4516 NameNode

4735 DataNode
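The listing above has the shape of jps output (process ID followed by the main class name). Assuming that is where it comes from, a small filter can report which of the five daemons failed to start; missing_daemons is a hypothetical helper name.

```shell
# Sketch: report which of the five expected daemons are absent from jps-style output.
# Reads lines of "<pid> <name>" on stdin, as in the listing above.
missing_daemons() {
    seen=$(awk '{print $2}')
    for d in NameNode DataNode SecondaryNameNode ResourceManager NodeManager; do
        # -x matches whole lines, so NameNode does not match SecondaryNameNode.
        printf '%s\n' "$seen" | grep -qxF "$d" || echo "$d"
    done
}

# Usage (not run automatically); no output means all five are running:
#   jps | missing_daemons
```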

Start a standalone WebAppProxy server. If multiple servers are used with load balancing it should be run on each of them:

$ $HADOOP_YARN_HOME/sbin/yarn-daemon.sh --config $HADOOP_CONF_DIR start proxyserver

Start the MapReduce JobHistory Server with the following command, run on the designated server:

$ $HADOOP_PREFIX/sbin/mr-jobhistory-daemon.sh --config $HADOOP_CONF_DIR start historyserver

7540 WebAppProxyServer

7628 JobHistoryServer

Once JobHistoryServer is running, the logs of past jobs can be viewed at http://qiulp:19888/jobhistory

Related web interfaces

NameNode http://nn_host:port/ Default HTTP port is 50070.

ResourceManager http://rm_host:port/ Default HTTP port is 8088.

MapReduce JobHistory Server http://jhs_host:port/ Default HTTP port is 19888.
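To check the three interfaces in one go, the default ports listed above can be turned into URLs and probed. This is a sketch under the assumption that the ports have not been overridden in the configuration and that curl is installed; web_ui_urls is a hypothetical helper name.

```shell
# Sketch: print the default web UI URLs for a given host.
# Ports are the defaults listed above; adjust them if the configuration overrides them.
web_ui_urls() {
    host="$1"
    echo "NameNode http://$host:50070/"
    echo "ResourceManager http://$host:8088/"
    echo "JobHistoryServer http://$host:19888/"
}

# Usage (not run automatically): print the HTTP status of each UI.
#   web_ui_urls qiulp | while read -r name url; do
#       curl -s -o /dev/null -w "$name %{http_code}\n" "$url"
#   done
```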