
Hadoop Source Code Analysis: Modifying DistributedShell So That Each Container Runs on a Different Node

2017-01-21 02:18

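1. The Problem

Modify the DistributedShell program so that each Container runs on a different node. (Currently placement is effectively arbitrary: a Container may end up on any node.)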

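2. Analysis

In the YARN processing flow:

1. The AM requests Containers from the RM over the RPC protocol ApplicationMasterProtocol.

2. The AM asks the NM to start or stop Containers over the RPC protocol ContainerManagementProtocol.

To make each Container run on a different node, the AM only needs to request one Container on each node.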
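The idea of "one request per node" can be sketched without a cluster. The snippet below is only an illustrative model: PlannedAsk and OnePerNodePlan are hypothetical names, not YARN classes, and the host names are made up. It assumes that when more Containers are requested than there are nodes, the extra requests fall back to "any host".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical model of a container request; this is not a YARN class.
final class PlannedAsk {
    final String host;           // null means "any host"
    final boolean relaxLocality; // false means "only this host will do"

    PlannedAsk(String host, boolean relaxLocality) {
        this.host = host;
        this.relaxLocality = relaxLocality;
    }
}

class OnePerNodePlan {
    // Pins the i-th request to the i-th host; requests beyond the
    // number of hosts stay unconstrained (assumption of this sketch).
    static List<PlannedAsk> plan(List<String> hosts, int numContainers) {
        List<PlannedAsk> asks = new ArrayList<>();
        for (int i = 0; i < numContainers; i++) {
            if (i < hosts.size()) {
                asks.add(new PlannedAsk(hosts.get(i), false));
            } else {
                asks.add(new PlannedAsk(null, true));
            }
        }
        return asks;
    }

    public static void main(String[] args) {
        for (PlannedAsk ask : plan(Arrays.asList("node1", "node2"), 3)) {
            String target = (ask.host == null) ? "*" : ask.host;
            System.out.println(target + " relaxLocality=" + ask.relaxLocality);
        }
    }
}
```

With two hosts and three Containers, the first two requests are pinned to distinct hosts and the third is left to the scheduler.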

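3. Code Changes

From the analysis in step 2, only the AM side needs to change: it must request one Container on each node.

3.1 Getting the list of compute nodes

Define nodeList to hold the list of compute nodes, and add the code that initializes it to the init() function of ApplicationMaster. After initialization, nodeList holds the hostnames of the compute nodes, that is, of the NodeManagers in RUNNING state (the RM node is not included unless it also runs a NodeManager).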

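import java.util.ArrayList;
import java.util.List;
import org.apache.hadoop.yarn.api.records.NodeReport;
import org.apache.hadoop.yarn.api.records.NodeState;
import org.apache.hadoop.yarn.client.api.YarnClient;
import org.apache.hadoop.yarn.exceptions.YarnException;

public class ApplicationMaster {

  // Hostnames of all compute nodes
  private static List<String> nodeList = new ArrayList<String>();

  public boolean init(String[] args) throws ParseException, IOException {

    // Add the following code at the end of this function to get the compute node list

    …….

    YarnClient yarnClient = YarnClient.createYarnClient();
    yarnClient.init(conf);
    yarnClient.start();
    try {
      // Only NodeManagers that are currently RUNNING are returned
      List<NodeReport> clusterNodeReports = yarnClient.getNodeReports(NodeState.RUNNING);
      for (NodeReport node : clusterNodeReports) {
        // nodeList is static, so it is accessed without "this"
        nodeList.add(node.getNodeId().getHost());
      }
    } catch (YarnException e) {
      e.printStackTrace();
    } finally {
      // Release the client once the node list has been read
      yarnClient.stop();
    }
    return true;
  }
}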
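What the loop above produces can be modelled in plain Java. This is a sketch only: NodeInfo is a hypothetical stand-in for the host and state carried by a NodeReport, and the state filter that getNodeReports(NodeState.RUNNING) performs on the RM side is done locally here. Dropping duplicate hostnames is an addition of this sketch, for setups where several NodeManagers report the same host.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class RunningHosts {
    // Hypothetical stand-in for the fields of a node report used here.
    static final class NodeInfo {
        final String host;
        final String state;

        NodeInfo(String host, String state) {
            this.host = host;
            this.state = state;
        }
    }

    // Keeps the hosts of RUNNING nodes, in report order, without duplicates.
    static List<String> runningHosts(List<NodeInfo> reports) {
        Set<String> hosts = new LinkedHashSet<>();
        for (NodeInfo report : reports) {
            if ("RUNNING".equals(report.state)) {
                hosts.add(report.host);
            }
        }
        return new ArrayList<>(hosts);
    }

    public static void main(String[] args) {
        List<NodeInfo> reports = Arrays.asList(
                new NodeInfo("node1", "RUNNING"),
                new NodeInfo("node2", "LOST"),
                new NodeInfo("node3", "RUNNING"));
        System.out.println(runningHosts(reports));
    }
}
```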

3.2 Requesting resources from the RM

When the AM requests resources, it calls the function setupContainerAskForRM.

private ContainerRequest setupContainerAskForRM() {
    // setup requirements for hosts
    // using * as any host will do for the distributed shell app
    // set the priority for the request
    Priority pri = Records.newRecord(Priority.class);
    // TODO - what is the range for priority? how to decide?
    pri.setPriority(requestPriority);

    // Set up resource type requirements
    // For now, memory and CPU are supported so we set memory and cpu requirements
    Resource capability = Records.newRecord(Resource.class);
    capability.setMemory(containerMemory);
    capability.setVirtualCores(containerVirtualCores);

    // nodes is null by default, which means "any host"
    String[] nodes = null;
    if (!nodeList.isEmpty()) {
        // Take the next unused host so that each request targets a different node
        nodes = new String[] { (String) nodeList.remove(0) };
    }

    // With locality relaxation enabled, a request for node 1 could be satisfied on
    // another node when node 1 has no suitable Container available. To prevent that,
    // relaxation is turned off (false) for requests pinned to a node.
    // A request with no node or rack constraint must keep relaxation on, because
    // ContainerRequest rejects relaxLocality=false without a location constraint.
    boolean relaxLocality = (nodes == null);
    ContainerRequest request = new ContainerRequest(capability, nodes, null, pri, relaxLocality);
    LOG.info("Requested container ask: " + request.toString());
    return request;
}
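One design point worth considering: setupContainerAskForRM may be called from more than one thread (in DistributedShell, Containers that fail can be re-requested from the AMRMClientAsync callback), and a host that has been removed from the list is never offered again. The following is a minimal sketch of a pool that addresses both points. HostPool and its methods are hypothetical names introduced for illustration and are not part of Hadoop.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collection;
import java.util.Deque;

// Hypothetical helper: hands out each host once and lets a host be returned.
class HostPool {
    private final Deque<String> free = new ArrayDeque<>();

    HostPool(Collection<String> hosts) {
        free.addAll(hosts);
    }

    // Returns the next unused host, or null when every host has been handed out.
    synchronized String take() {
        return free.pollFirst();
    }

    // Puts a host back, for example after the Container pinned to it failed.
    synchronized void release(String host) {
        free.addLast(host);
    }

    synchronized int remaining() {
        return free.size();
    }

    public static void main(String[] args) {
        HostPool pool = new HostPool(Arrays.asList("node1", "node2"));
        String first = pool.take();
        System.out.println("first ask goes to " + first);
        pool.release(first); // the Container on that host failed; offer the host again
        System.out.println("hosts still available: " + pool.remaining());
    }
}
```

In the AM, take() would replace the isEmpty()/get(0)/remove(0) sequence, and a null result would mean "send an unconstrained request".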

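4. Summary

The key points in solving this problem are:

1. The ApplicationMaster can obtain the list of compute nodes from the RM through yarnClient.

2. Resource requests are built in the function setupContainerAskForRM.

3. Locality relaxation is turned off (relaxLocality is passed as false), so that a request pinned to one node is not satisfied on another node.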
