Compiling the Spark Source, Version by Version: A Detailed Walkthrough
2017-05-12 17:28
A note before we start
During compilation, a dependency download will sometimes stall for a very long time because the connection to the repository hangs. When that happens, press Ctrl+C and re-run the build command; retrying several times is normal.
If the build then complains about a missing file, clean Maven first (mvn clean) and recompile.
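The retry-by-hand advice above can also be wrapped in a small helper. This is only an illustrative sketch: the `retry` function and the `flaky_download` stand-in are made up for demonstration and are not part of any build tooling.

```shell
#!/bin/sh
# Hypothetical helper: run a command, retrying up to $1 times on failure.
retry() {
  max=$1; shift
  i=1
  while [ "$i" -le "$max" ]; do
    "$@" && return 0
    echo "attempt $i failed, retrying..." >&2
    i=$((i + 1))
  done
  return 1
}

# Simulated flaky download: fails twice, then succeeds on the third try.
attempts=0
flaky_download() {
  attempts=$((attempts + 1))
  [ "$attempts" -ge 3 ]
}

retry 5 flaky_download && echo "succeeded after $attempts attempts"
```

In real use you would invoke it as, say, `retry 3 mvn ... clean package` instead of re-typing the command after each Ctrl+C.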
The three ways to build Spark from source
1. Maven
2. SBT (not covered here)
3. The make-distribution.sh packaging script
Foreword
Spark can be built with either SBT or Maven, and a deployable package can then be produced with the make-distribution.sh script. The SBT route requires the git tool, the Maven route requires Maven itself, and both need network access.
Although Maven is the build method the Spark site recommends, SBT compiles noticeably faster, so SBT may be the better choice for Spark developers. Since the SBT build is also driven by the Maven POM files, the build parameters are the same for both.
Some advice
If you have the time, do build the source yourself at least once; the early pain is the price of becoming genuinely good in the big-data field.
Whether it is the Spark source or the Hadoop source, a first-time build will run into many problems and can easily take a day or more. That is normal, so keep a level head. Errors are actually welcome: working through them is the best training there is.
Do not look down on these exercises. Hitting problems is luck; understanding them deeply enough to know not just what works but why is a blessing.
The major distributions
1. Apache: build it yourself, or use the pre-built binaries
2. CDH: no need to build yourself
3. HDP: no need to build yourself
These three are the mainstream choices, though there are in fact nine distributions in all. Note that Cloudera Manager (CM) in CDH is a paid product, but the pre-built CDH packages themselves are free.
Where to get the hadoop/spark source:
1. The official site
2. GitHub (source code only)
CDH downloads: http://archive-primary.cloudera.com/cdh5/cdh/5/
HDP downloads: http://zh.hortonworks.com/products/
This walkthrough uses the GitHub source.
Prepare a Linux environment (e.g. CentOS 6.5)
********************************************************************************
* Plan of attack:
* Step 1: install git online
* Step 2: create a directory and clone the spark source (mkdir -p /root/projects/opensource)
* Step 3: switch to the target branch/tag
* Step 4: install JDK 1.7+
* Step 5: install Maven
* Step 6: follow the official build docs
* Step 7: let mvn download the required artifacts
********************************************************************************
You can also follow along with the official documentation:
http://spark.apache.org/docs/1.6.1/building-spark.html
Step 1: install git online
yum install git          # as root
or
sudo yum install git     # as a regular user with sudo
[root@Compiler ~]# yum install git
.......
Total download size: 4.7 M
Installed size: 15 M
Is this ok [y/N]: y
Downloading Packages:
(1/3): git-1.7.1-4.el6_7.1.x86_64.rpm        | 4.6 MB     00:01
.........
Complete!
[root@Compiler ~]#
Step 2: create a directory and clone the spark source
mkdir -p /root/projects/opensource
cd /root/projects/opensource
git clone https://github.com/apache/spark.git
[root@Compiler ~]# pwd
/root
[root@Compiler ~]# mkdir -p /root/projects/opensource
[root@Compiler ~]# cd projects/opensource/
[root@Compiler opensource]# pwd
/root/projects/opensource
[root@Compiler opensource]# ls
[root@Compiler opensource]#
[root@Compiler opensource]# pwd
/root/projects/opensource
[root@Compiler opensource]# git clone https://github.com/apache/spark.git
Initialized empty Git repository in /root/projects/opensource/spark/.git/
remote: Counting objects: 403059, done.
remote: Compressing objects: 100% (13/13), done.
remote: Total 403059 (delta 4), reused 1 (delta 1), pack-reused 403045
Receiving objects: 100% (403059/403059), 182.79 MiB | 896 KiB/s, done.
Resolving deltas: 100% (157557/157557), done.
[root@Compiler opensource]# ls
spark
[root@Compiler opensource]# cd spark/
[root@Compiler spark]#
This matches what the GitHub web UI shows for the repository root:
[root@Compiler spark]# pwd
/root/projects/opensource/spark
[root@Compiler spark]# ll
total 280
-rw-r--r--.  1 root root  1804 Sep  2 03:53 appveyor.yml
drwxr-xr-x.  3 root root  4096 Sep  2 03:53 assembly
drwxr-xr-x.  2 root root  4096 Sep  2 03:53 bin
drwxr-xr-x.  2 root root  4096 Sep  2 03:53 build
drwxr-xr-x.  8 root root  4096 Sep  2 03:53 common
drwxr-xr-x.  2 root root  4096 Sep  2 03:53 conf
-rw-r--r--.  1 root root   988 Sep  2 03:53 CONTRIBUTING.md
drwxr-xr-x.  3 root root  4096 Sep  2 03:53 core
drwxr-xr-x.  5 root root  4096 Sep  2 03:53 data
drwxr-xr-x.  6 root root  4096 Sep  2 03:53 dev
drwxr-xr-x.  9 root root  4096 Sep  2 03:53 docs
drwxr-xr-x.  3 root root  4096 Sep  2 03:53 examples
drwxr-xr-x. 15 root root  4096 Sep  2 03:53 external
drwxr-xr-x.  3 root root  4096 Sep  2 03:53 graphx
drwxr-xr-x.  3 root root  4096 Sep  2 03:53 launcher
-rw-r--r--.  1 root root 17811 Sep  2 03:53 LICENSE
drwxr-xr-x.  2 root root  4096 Sep  2 03:53 licenses
drwxr-xr-x.  3 root root  4096 Sep  2 03:53 mesos
drwxr-xr-x.  3 root root  4096 Sep  2 03:53 mllib
drwxr-xr-x.  3 root root  4096 Sep  2 03:53 mllib-local
-rw-r--r--.  1 root root 24749 Sep  2 03:53 NOTICE
-rw-r--r--.  1 root root 97324 Sep  2 03:53 pom.xml
drwxr-xr-x.  2 root root  4096 Sep  2 03:53 project
drwxr-xr-x.  6 root root  4096 Sep  2 03:53 python
drwxr-xr-x.  3 root root  4096 Sep  2 03:53 R
-rw-r--r--.  1 root root  3828 Sep  2 03:53 README.md
drwxr-xr-x.  5 root root  4096 Sep  2 03:53 repl
drwxr-xr-x.  2 root root  4096 Sep  2 03:53 sbin
-rw-r--r--.  1 root root 16952 Sep  2 03:53 scalastyle-config.xml
drwxr-xr-x.  6 root root  4096 Sep  2 03:53 sql
drwxr-xr-x.  3 root root  4096 Sep  2 03:53 streaming
drwxr-xr-x.  3 root root  4096 Sep  2 03:53 tools
drwxr-xr-x.  3 root root  4096 Sep  2 03:53 yarn
[root@Compiler spark]#
Step 3: switch to the release tag
git checkout v1.6.1    # run inside the spark directory
[root@Compiler spark]# pwd
/root/projects/opensource/spark
[root@Compiler spark]# git branch -a
* master
  remotes/origin/HEAD -> origin/master
  remotes/origin/branch-0.5
  ...
  remotes/origin/branch-1.6
  remotes/origin/branch-2.0
  remotes/origin/master
[root@Compiler spark]# git checkout v1.6.1
Note: checking out 'v1.6.1'.
You are in 'detached HEAD' state. You can look around, make experimental
...
HEAD is now at 15de51c... Preparing Spark release v1.6.1-rc1
[root@Compiler spark]#
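The "detached HEAD" notice above is expected: checking out a tag like v1.6.1 always detaches HEAD, and that is harmless for building. A throwaway-repo sketch of the same behaviour (the temp-repo setup and names here are purely illustrative):

```shell
#!/bin/sh
# Demonstrate that checking out a tag detaches HEAD, as in the transcript above.
set -e
dir=$(mktemp -d)
cd "$dir"
git init -q
git -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "initial commit"
git tag v1.6.1
git checkout -q v1.6.1
# symbolic-ref fails when HEAD is detached (i.e. not on any branch):
if ! git symbolic-ref -q HEAD >/dev/null; then
  echo "detached HEAD at $(git describe --tags)"
fi
```

If you want to work on a branch instead, `git checkout -b my-build v1.6.1` creates one at the tag.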
After the checkout, make-distribution.sh is now present in the tree:
[root@Compiler spark]# pwd
/root/projects/opensource/spark
[root@Compiler spark]# ll
total 1636
drwxr-xr-x.  3 root root    4096 Sep  2 03:57 assembly
drwxr-xr-x.  3 root root    4096 Sep  2 03:57 bagel
drwxr-xr-x.  2 root root    4096 Sep  2 03:57 bin
drwxr-xr-x.  2 root root    4096 Sep  2 03:57 build
-rw-r--r--.  1 root root 1343562 Sep  2 03:57 CHANGES.txt
drwxr-xr-x.  2 root root    4096 Sep  2 03:57 conf
-rw-r--r--.  1 root root     988 Sep  2 03:53 CONTRIBUTING.md
drwxr-xr-x.  3 root root    4096 Sep  2 03:57 core
drwxr-xr-x.  3 root root    4096 Sep  2 03:57 data
drwxr-xr-x.  7 root root    4096 Sep  2 03:57 dev
drwxr-xr-x.  4 root root    4096 Sep  2 03:57 docker
drwxr-xr-x.  3 root root    4096 Sep  2 03:57 docker-integration-tests
drwxr-xr-x.  9 root root    4096 Sep  2 03:57 docs
drwxr-xr-x.  3 root root    4096 Sep  2 03:57 ec2
drwxr-xr-x.  3 root root    4096 Sep  2 03:57 examples
drwxr-xr-x. 11 root root    4096 Sep  2 03:57 external
drwxr-xr-x.  6 root root    4096 Sep  2 03:57 extras
drwxr-xr-x.  4 root root    4096 Sep  2 03:57 graphx
drwxr-xr-x.  3 root root    4096 Sep  2 03:57 launcher
-rw-r--r--.  1 root root   17352 Sep  2 03:57 LICENSE
drwxr-xr-x.  2 root root    4096 Sep  2 03:57 licenses
-rwxr-xr-x.  1 root root    8557 Sep  2 03:57 make-distribution.sh
drwxr-xr-x.  3 root root    4096 Sep  2 03:57 mllib
drwxr-xr-x.  5 root root    4096 Sep  2 03:57 network
-rw-r--r--.  1 root root   23529 Sep  2 03:57 NOTICE
-rw-r--r--.  1 root root   91106 Sep  2 03:57 pom.xml
drwxr-xr-x.  3 root root    4096 Sep  2 03:57 project
-rw-r--r--.  1 root root   13991 Sep  2 03:57 pylintrc
drwxr-xr-x.  6 root root    4096 Sep  2 03:57 python
drwxr-xr-x.  3 root root    4096 Sep  2 03:57 R
-rw-r--r--.  1 root root    3359 Sep  2 03:57 README.md
drwxr-xr-x.  5 root root    4096 Sep  2 03:57 repl
drwxr-xr-x.  2 root root    4096 Sep  2 03:57 sbin
drwxr-xr-x.  2 root root    4096 Sep  2 03:57 sbt
-rw-r--r--.  1 root root   13191 Sep  2 03:57 scalastyle-config.xml
drwxr-xr-x.  6 root root    4096 Sep  2 03:57 sql
drwxr-xr-x.  3 root root    4096 Sep  2 03:57 streaming
drwxr-xr-x.  3 root root    4096 Sep  2 03:57 tags
drwxr-xr-x.  3 root root    4096 Sep  2 03:57 tools
-rw-r--r--.  1 root root     848 Sep  2 03:57 tox.ini
drwxr-xr-x.  3 root root    4096 Sep  2 03:57 unsafe
drwxr-xr-x.  3 root root    4096 Sep  2 03:57 yarn
[root@Compiler spark]#
Next, edit make-distribution.sh so it uses your locally installed Maven:
[root@Compiler spark]# pwd
/root/projects/opensource/spark
[root@Compiler spark]# vim make-distribution.sh
My Maven is installed at MAVEN_HOME=/usr/local/apache-maven-3.3.3, so I change the MVN line to point at it:
MAKE_TGZ=false
NAME=none
#MVN="$SPARK_HOME/build/mvn"
MVN="/usr/local/apache-maven-3.3.3/bin/mvn"    # or MVN="$MAVEN_HOME/bin/mvn"
Step 4: install JDK 7+
First, check whether the stock OpenJDK that ships with CentOS 6.5 is installed.
<1> Check the existing OpenJDK version:
# java -version
<2> List the installed JDK packages:
# rpm -qa | grep java
Typical output:
tzdata-java-2013g-1.el6.noarch
java-1.7.0-openjdk-1.7.0.45-2.4.3.3.el6.x86_64
java-1.6.0-openjdk-1.6.0.0-1.66.1.13.0.el6.x86_64
<3> Uninstall the stock OpenJDK:
rpm -e --nodeps tzdata-java-2013g-1.el6.noarch
rpm -e --nodeps java-1.7.0-openjdk-1.7.0.45-2.4.3.3.el6.x86_64
rpm -e --nodeps java-1.6.0-openjdk-1.6.0.0-1.66.1.13.0.el6.x86_64
The stock JDK is now gone.
As root, install jdk-7u79-linux-x64.tar.gz:
Upload the archive to /usr/local.
Extract it: tar -zxvf jdk-7u79-linux-x64.tar.gz
Remove the archive: rm -f jdk-7u79-linux-x64.tar.gz
Configure the environment variables: vim /etc/profile
#java
export JAVA_HOME=/usr/local/jdk1.7.0_79
export CLASSPATH=.:$JAVA_HOME/jre/lib/rt.jar:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
export PATH=$PATH:$JAVA_HOME/bin
Reload the file: source /etc/profile
Verify the install: java -version
Step 5: install Maven
Download apache-maven-3.3.3-bin.tar.gz and upload it to /usr/local/.
Extract it: tar -zxvf apache-maven-3.3.3-bin.tar.gz
Remove the archive: rm -f apache-maven-3.3.3-bin.tar.gz
Configure the environment variables: vim /etc/profile
#maven
export MAVEN_HOME=/usr/local/apache-maven-3.3.3
export PATH=$PATH:$MAVEN_HOME/bin
Reload the file: source /etc/profile
Verify the install: mvn -v
Step 6: follow the official docs for an initial orientation
http://spark.apache.org/docs/1.6.1/building-spark.html
[root@Compiler spark]# vim pom.xml
Take a first look at the top-level pom.xml. The -P flags in the build commands select Maven profiles defined in this file, and several profiles can be activated at once. That is all we need from it for now.
With that basic picture of pom.xml in hand, the usual next step, as experienced builders will tell you from production practice, is to adjust $MAVEN_HOME/conf/settings.xml.
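As the comments in the default settings file explain, Maven actually consults two settings files, and edits can go in either. A quick reminder sketch (the MAVEN_HOME path is the install location assumed throughout this walkthrough):

```shell
#!/bin/sh
# Maven settings locations: the user-level file overrides the global one,
# and -s / -gs on the mvn command line override both.
MAVEN_HOME=/usr/local/apache-maven-3.3.3   # install path used in this walkthrough
echo "global settings: $MAVEN_HOME/conf/settings.xml"
echo "user settings:   $HOME/.m2/settings.xml"
```

Editing the user-level ~/.m2/settings.xml keeps the Maven installation itself pristine, which matters if several users share one machine.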
A graphical file-transfer tool is handy for uploading the archives to the server (the screenshots are omitted here). One pitfall: if the tool's local-site pane has "Computer" selected rather than a specific drive, the transfer will not work; select a concrete drive first.
For reference, the default $MAVEN_HOME/conf/settings.xml that ships with Maven is almost entirely comments. It documents the two settings locations (user-level ${user.home}/.m2/settings.xml and global ${maven.home}/conf/settings.xml, overridable with the -s and -gs CLI options) and contains empty or commented-out <pluginGroups>, <proxies>, <servers>, <mirrors>, <profiles>, and <activeProfiles> sections, each with an example entry showing its syntax.
Change it to:
<?xml version="1.0" encoding="UTF-8"?>
<settings xmlns="http://maven.apache.org/SETTINGS/1.0.0"
          xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
          xsi:schemaLocation="http://maven.apache.org/SETTINGS/1.0.0
                              http://maven.apache.org/xsd/settings-1.0.0.xsd">
  <pluginGroups></pluginGroups>
  <proxies></proxies>
  <servers></servers>
  <mirrors>
    <mirror>
      <id>nexus-osc</id>
      <mirrorOf>*</mirrorOf>
      <name>Nexus osc</name>
      <url>http://nexus.rc.dataengine.com/nexus/content/groups/public</url>
    </mirror>
    <mirror>
      <id>nexus-osc-central</id>
      <mirrorOf>central</mirrorOf>
      <name>Nexus osc</name>
      <url>http://maven.oschina.net/content/groups/public</url>
    </mirror>
    <mirror>
      <id>nexus-osc-thirdparty</id>
      <mirrorOf>thirdparty</mirrorOf>
      <name>Nexus osc thirdparty</name>
      <url>http://maven.oschina.net/content/repositories/thirdparty</url>
    </mirror>
    <mirror>
      <id>central</id>
      <mirrorOf>central</mirrorOf>
      <name>central</name>
      <url>http://central.maven.org/maven2</url>
    </mirror>
    <mirror>
      <id>repo1</id>
      <mirrorOf>central</mirrorOf>
      <name>repo1</name>
      <url>http://repo1.maven.org/maven2</url>
    </mirror>
  </mirrors>
  <profiles>
    <profile>
      <id>jdk-1.4</id>
      <activation><jdk>1.4</jdk></activation>
      <repositories>
        <repository>
          <id>rc</id>
          <name>rc nexus</name>
          <url>http://nexus.rc.dataengine.com/nexus/content/groups/public</url>
          <releases><enabled>true</enabled></releases>
          <snapshots><enabled>false</enabled></snapshots>
        </repository>
        <repository>
          <id>nexus</id>
          <name>local private nexus</name>
          <url>http://maven.oschina.net/content/groups/public</url>
          <releases><enabled>true</enabled></releases>
          <snapshots><enabled>false</enabled></snapshots>
        </repository>
        <repository>
          <id>central</id>
          <name>central</name>
          <url>http://central.maven.org/maven2/</url>
          <releases><enabled>true</enabled></releases>
          <snapshots><enabled>false</enabled></snapshots>
        </repository>
        <repository>
          <id>repo1</id>
          <name>repo1</name>
          <url>http://repo1.maven.org/maven2/</url>
          <releases><enabled>true</enabled></releases>
          <snapshots><enabled>false</enabled></snapshots>
        </repository>
      </repositories>
      <pluginRepositories>
        <pluginRepository>
          <id>rc</id>
          <name>rc nexus</name>
          <url>http://nexus.rc.dataengine.com/nexus/content/groups/public</url>
          <releases><enabled>true</enabled></releases>
          <snapshots><enabled>false</enabled></snapshots>
        </pluginRepository>
        <pluginRepository>
          <id>nexus</id>
          <name>local private nexus</name>
          <url>http://maven.oschina.net/content/groups/public</url>
          <releases><enabled>true</enabled></releases>
          <snapshots><enabled>false</enabled></snapshots>
        </pluginRepository>
        <pluginRepository>
          <id>central</id>
          <name>central</name>
          <url>http://central.maven.org/maven2/</url>
          <releases><enabled>true</enabled></releases>
          <snapshots><enabled>false</enabled></snapshots>
        </pluginRepository>
        <pluginRepository>
          <id>repo1</id>
          <name>repo1</name>
          <url>http://repo1.maven.org/maven2/</url>
          <releases><enabled>true</enabled></releases>
          <snapshots><enabled>false</enabled></snapshots>
        </pluginRepository>
      </pluginRepositories>
    </profile>
  </profiles>
  <activeProfiles>
    <activeProfile>jdk-1.4</activeProfile>
  </activeProfiles>
</settings>
That covers the initial orientation. It is also worth reading through the Spark source root at https://github.com/apache/spark/tree/v1.6.1 to get an inside-out feel for the directory layout; deeper study of it can come later.
Step 7: let mvn download the required jars first
mvn -Pyarn -Phadoop-2.6 -Dhadoop.version=2.7.1 -Phive -Phive-thriftserver -Psparkr -DskipTests clean package    # run in the spark source root
[root@Compiler spark]# pwd
/root/projects/opensource/spark
[root@Compiler spark]# mvn -Pyarn -Phadoop-2.6 -Dhadoop.version=2.7.1 -Phive -Phive-thriftserver -Psparkr -DskipTests clean package
If downloads stall or fail, you may need to run the same command again:
[root@Compiler spark]# mvn -Pyarn -Phadoop-2.6 -Dhadoop.version=2.7.1 -Phive -Phive-thriftserver -Psparkr -DskipTests clean package
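One thing worth doing before launching this build: the official building-spark page for the 1.x line recommends raising Maven's memory limits, otherwise the compile can die with OutOfMemoryError or PermGen errors partway through:

```shell
#!/bin/sh
# Memory settings recommended by the Spark 1.6 build docs for a JDK 7 build
# (on JDK 8+, the MaxPermSize flag is obsolete and can be dropped).
export MAVEN_OPTS="-Xmx2g -XX:MaxPermSize=512M -XX:ReservedCodeCacheSize=512m"
echo "MAVEN_OPTS=$MAVEN_OPTS"
```

Export this in the same shell you run mvn from, or add it to /etc/profile alongside the other variables.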
Step 8: build the Spark distribution
./make-distribution.sh --name custom-spark --tgz -Psparkr -Phadoop-2.6 -Dhadoop.version=2.7.1 -Phive -Phive-thriftserver -Pyarn    # run in the spark source root
[root@Compiler spark]# ./make-distribution.sh --name custom-spark --tgz -Psparkr -Phadoop-2.6 -Dhadoop.version=2.7.1 -Phive -Phive-thriftserver -Pyarn
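For orientation, make-distribution.sh writes its archive into the source root as spark-&lt;version&gt;-bin-&lt;name&gt;.tgz, where &lt;name&gt; comes from --name and the version from the checked-out source. A tiny sketch of that naming with the values used above:

```shell
#!/bin/sh
# Sketch of make-distribution.sh's output naming: version from the v1.6.1
# checkout, NAME from the --name flag on the command above.
VERSION=1.6.1
NAME=custom-spark
echo "spark-$VERSION-bin-$NAME.tgz"
```

So after a successful run you should find spark-1.6.1-bin-custom-spark.tgz in /root/projects/opensource/spark.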
During the build you may again hit missing-artifact errors; repeating steps 7 and 8 usually resolves them. Failing that, use the error message to locate the artifact in the Maven repository, download it, and install it into the local repository by hand. For example, I had to install this one manually:
mvn install:install-file -DgroupId=org.scalatest -DartifactId=scalatest-maven-plugin -Dversion=1.0 -Dpackaging=jar -Dfile=/home/neoway/scalatest-maven-plugin-1.0.jar
Reference: http://www.cnblogs.com/zlslch/p/5865707.html
Official documentation: http://spark.apache.org/docs/latest/building-spark.html