Compiling the Spark 2.x Source Code: Build Parameter Notes
2018-01-23 16:36
Compiling the Spark 2.x source code
Here we use the make-distribution.sh script shipped with the source package to build a distribution. You may of course modify some of the source code before compiling. Run the following from the Spark source directory:
./dev/make-distribution.sh --name "hadoop2-without-hive" --tgz "-Pyarn,hadoop-provided,hadoop-2.7,parquet-provided" -Dscala-2.11 -rf :spark-repl_2.11
./dev/make-distribution.sh --name "hadoop2-hive" --tgz "-Pyarn,hive,hadoop-provided,hadoop-2.7,parquet-provided" -Dscala-2.11 -rf :spark-repl_2.11
Note that `-D` system properties such as `-Dscala-2.11` must be passed as separate arguments, not appended inside the comma-separated `-P` profile list, or Maven will treat them as profile names; likewise, additional profiles such as hive go into the `-P` list by name. The `-rf :spark-repl_2.11` flag resumes a previously failed build from that module and can be dropped for a clean build.
Parameter reference:
-DskipTests: skip running the test suites, but still compile the test classes into target/test-classes.
-Dhadoop.version and -Phadoop-x.y: select the Hadoop version to build against; if omitted, the build falls back to the POM's default Hadoop version (1.0.4 in early Spark 1.x releases; Spark 2.x defaults to a newer 2.x line).
-Pyarn: enable Hadoop YARN support; omitted, YARN is not supported.
-Phive and -Phive-thriftserver: enable Hive support in Spark SQL; omitted, Hive is not supported.
--with-tachyon: enable the in-memory file system Tachyon; omitted, Tachyon is not supported.
--tgz: generate spark-$VERSION-bin.tgz in the root directory; omitted, no tgz file is produced, only the dist/ directory.
--name: combined with --tgz, produces a distribution package named spark-$VERSION-bin-$NAME.tgz; omitted, NAME defaults to the Hadoop version number.
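Putting the flags together, here is a minimal dry-run sketch that composes a make-distribution.sh invocation and prints it for review before the long build starts. The NAME and PROFILES values are example choices, not part of the original commands:

```shell
#!/bin/sh
# Sketch: compose a make-distribution.sh invocation from the flags above.
# NAME and PROFILES are example values -- adjust for your environment.
NAME="hadoop2-with-hive"
PROFILES="-Pyarn,hive,hadoop-provided,hadoop-2.7,parquet-provided"

CMD="./dev/make-distribution.sh --name \"$NAME\" --tgz \"$PROFILES\""

# Print the command first so it can be reviewed or logged before the
# build actually runs; replace the echo with eval "$CMD" to execute it.
echo "$CMD"
```

Echoing the command first is a cheap safeguard: a typo in the profile list otherwise only surfaces twenty minutes into the build.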
The build takes anywhere from about twenty minutes to over an hour, depending mostly on your network, since Maven has to download a number of dependencies. When it finishes you have a compiled Spark package; unpack it and it is ready to deploy to your machines.
Running the following commands produces spark-2.0.2-bin-hadoop2-with-hive.tgz under the spark-2.0.2 directory:
[root@master spark-2.0.2]# ./dev/change-scala-version.sh 2.11
[root@master spark-2.0.2]# ./dev/make-distribution.sh --name "hadoop2-with-hive" --tgz "-Pyarn,-Phive,hadoop-provided,hadoop-2.7,parquet-provided"
main:
[INFO] Executed tasks
[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary:
[INFO]
[INFO] Spark Project Parent POM ........................... SUCCESS [ 19.119 s]
[INFO] Spark Project Tags ................................. SUCCESS [  7.630 s]
[INFO] Spark Project Sketch ............................... SUCCESS [  6.463 s]
[INFO] Spark Project Networking ........................... SUCCESS [ 19.845 s]
[INFO] Spark Project Shuffle Streaming Service ............ SUCCESS [ 13.890 s]
[INFO] Spark Project Unsafe ............................... SUCCESS [ 13.337 s]
[INFO] Spark Project Launcher ............................. SUCCESS [ 23.115 s]
[INFO] Spark Project Core ................................. SUCCESS [03:42 min]
[INFO] Spark Project GraphX ............................... SUCCESS [ 26.100 s]
[INFO] Spark Project Streaming ............................ SUCCESS [01:07 min]
[INFO] Spark Project Catalyst ............................. SUCCESS [02:36 min]
[INFO] Spark Project SQL .................................. SUCCESS [03:26 min]
[INFO] Spark Project ML Local Library ..................... SUCCESS [ 14.402 s]
[INFO] Spark Project ML Library ........................... SUCCESS [02:54 min]
[INFO] Spark Project Tools ................................ SUCCESS [  3.691 s]
[INFO] Spark Project Hive ................................. SUCCESS [01:32 min]
[INFO] Spark Project REPL ................................. SUCCESS [ 11.372 s]
[INFO] Spark Project YARN Shuffle Service ................. SUCCESS [ 16.772 s]
[INFO] Spark Project YARN ................................. SUCCESS [ 27.160 s]
[INFO] Spark Project Assembly ............................. SUCCESS [  5.484 s]
[INFO] Spark Project External Flume Sink .................. SUCCESS [ 22.666 s]
[INFO] Spark Project External Flume ....................... SUCCESS [ 22.288 s]
[INFO] Spark Project External Flume Assembly .............. SUCCESS [  5.101 s]
[INFO] Spark Integration for Kafka 0.8 .................... SUCCESS [ 21.637 s]
[INFO] Spark Project Examples ............................. SUCCESS [ 42.329 s]
[INFO] Spark Project External Kafka Assembly .............. SUCCESS [  8.713 s]
[INFO] Spark Integration for Kafka 0.10 ................... SUCCESS [ 22.547 s]
[INFO] Spark Integration for Kafka 0.10 Assembly .......... SUCCESS [  7.028 s]
[INFO] Kafka 0.10 Source for Structured Streaming ......... SUCCESS [ 18.807 s]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 21:43 min
[INFO] Finished at: 2018-01-25T11:13:06+08:00
[INFO] Final Memory: 76M/327M
[INFO] ------------------------------------------------------------------------
+ rm -rf /opt/spark-2.0.2/dist
+ mkdir -p /opt/spark-2.0.2/dist/jars
+ echo 'Spark 2.0.2 built for Hadoop 2.7.3'
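The resulting tarball name follows the spark-$VERSION-bin-$NAME.tgz convention described in the parameter list. A small sketch makes the rule explicit; dist_name is a hypothetical helper for illustration, not part of the Spark build scripts:

```shell
#!/bin/sh
# Hypothetical helper illustrating the naming convention produced by
# make-distribution.sh when --tgz and --name are given together:
#   spark-$VERSION-bin-$NAME.tgz
dist_name() {
    version="$1"   # Spark version, e.g. 2.0.2
    name="$2"      # value passed via --name, e.g. hadoop2-with-hive
    echo "spark-${version}-bin-${name}.tgz"
}

dist_name 2.0.2 hadoop2-with-hive   # prints spark-2.0.2-bin-hadoop2-with-hive.tgz
```

This matches the spark-2.0.2-bin-hadoop2-with-hive.tgz file produced by the build above; omitting --name would substitute the Hadoop version number in its place.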