spark 源代码分析 (二)spark启动过程
2017-08-15 20:36
274 查看
下载spark后,调用shell脚本启动:
接下来将分析这个一个过程。spark-shell
会启动--class
org.apache.spark.repl.Main,
参数
--name
"Spark shell"
"$@" ,然后会调用spark-submit
./bin/spark-shell --master local[2]
接下来将分析这个一个过程。spark-shell
会启动--class
org.apache.spark.repl.Main,
参数
--name
"Spark shell"
"$@" ,然后会调用spark-submit
cygwin=false
case "$(uname)" in
CYGWIN*) cygwin=true;;
esac
# Enter posix mode for bash
set -o posix
if [ -z "${SPARK_HOME}" ]; then
source "$(dirname "$0")"/find-spark-home
fi
export _SPARK_CMD_USAGE="Usage: ./bin/spark-shell [options]"
# SPARK-4161: scala does not assume use of the java classpath,
# so we need to add the "-Dscala.usejavacp=true" flag manually. We
# do this specifically for the Spark shell because the scala REPL
# has its own class loader, and any additional classpath specified
# through spark.driver.extraClassPath is not automatically propagated.
SPARK_SUBMIT_OPTS="$SPARK_SUBMIT_OPTS -Dscala.usejavacp=true"
function main() {
if $cygwin; then
# Workaround for issue involving JLine and Cygwin
# (see http://sourceforge.net/p/jline/bugs/40/).[/code]# If you're using the Mintty terminal emulator in Cygwin, may need to set the# "Backspace sends ^H" setting in "Keys" section of the Mintty options# (see https://github.com/sbt/sbt/issues/562).[/code]stty -icanon min 1 -echo > /dev/null 2>&1export SPARK_SUBMIT_OPTS="$SPARK_SUBMIT_OPTS -Djline.terminal=unix"## 这里自定了启动"${SPARK_HOME}"/bin/spark-submit --class org.apache.spark.repl.Main --name "Spark shell" "$@"stty icanon echo > /dev/null 2>&1elseexport SPARK_SUBMIT_OPTS"${SPARK_HOME}"/bin/spark-submit --class org.apache.spark.repl.Main --name "Spark shell" "$@"fi}# Copy restore-TTY-on-exit functions from Scala script so spark-shell exits properly even in# binary distribution of Spark where Scala is not installedexit_status=127saved_stty=""# restore stty settings (echo in particular)function restoreSttySettings() {stty $saved_sttysaved_stty=""}function onExit() {if [[ "$saved_stty" != "" ]]; thenrestoreSttySettingsfiexit $exit_status}# to reenable echo if we are interrupted before completing.trap onExit INT# save terminal settingssaved_stty=$(stty -g 2>/dev/null)# clear on error so we don't later try to restore themif [[ ! $? ]]; thensaved_stty=""fimain "$@"# record the exit status lest it be overwritten:# then reenable echo and propagate the code.exit_status=$?onExit
spark-submit,这个简单,调用了spark-class
参数 :org.apache.spark.deploy.SparkSubmitif [ -z "${SPARK_HOME}" ]; thensource "$(dirname "$0")"/find-spark-homefi# disable randomized hash for string in Python 3.3+export PYTHONHASHSEED=0exec "${SPARK_HOME}"/bin/spark-class org.apache.spark.deploy.SparkSubmit "$@"
spark-class如下:这里主要加载java class 和jar包if [ -z "${SPARK_HOME}" ]; thensource "$(dirname "$0")"/find-spark-homefi. "${SPARK_HOME}"/bin/load-spark-env.sh# Find the java binaryif [ -n "${JAVA_HOME}" ]; thenRUNNER="${JAVA_HOME}/bin/java"elseif [ "$(command -v java)" ]; thenRUNNER="java"elseecho "JAVA_HOME is not set" >&2exit 1fifi# Find Spark jars.if [ -d "${SPARK_HOME}/jars" ]; thenSPARK_JARS_DIR="${SPARK_HOME}/jars"elseSPARK_JARS_DIR="${SPARK_HOME}/assembly/target/scala-$SPARK_SCALA_VERSION/jars"fiif [ ! -d "$SPARK_JARS_DIR" ] && [ -z "$SPARK_TESTING$SPARK_SQL_TESTING" ]; thenecho "Failed to find Spark jars directory ($SPARK_JARS_DIR)." 1>&2echo "You need to build Spark with the target \"package\" before running this program." 1>&2exit 1elseLAUNCH_CLASSPATH="$SPARK_JARS_DIR/*"fi# Add the launcher build dir to the classpath if requested.if [ -n "$SPARK_PREPEND_CLASSES" ]; thenLAUNCH_CLASSPATH="${SPARK_HOME}/launcher/target/scala-$SPARK_SCALA_VERSION/classes:$LAUNCH_CLASSPATH"fi# For testsif [[ -n "$SPARK_TESTING" ]]; thenunset YARN_CONF_DIRunset HADOOP_CONF_DIRfi# The launcher library will print arguments separated by a NULL character, to allow arguments with# characters that would be otherwise interpreted by the shell. Read that in a while loop, populating# an array that will be used to exec the final command.## The exit code of the launcher is appended to the output, so the parent shell removes it from the# command array and checks the value to see if the launcher succeeded.build_command() {"$RUNNER" -Xmx128m -cp "$LAUNCH_CLASSPATH" org.apache.spark.launcher.Main "$@"printf "%d\0" $?}# Turn off posix mode since it does not allow process substitutionset +o posixCMD=()while IFS= read -d '' -r ARG; doCMD+=("$ARG")done < <(build_command "$@")COUNT=${#CMD[@]}LAST=$((COUNT - 1))LAUNCHER_EXIT_CODE=${CMD[$LAST]}# Certain JVM failures result in errors being printed to stdout (instead of stderr), which causes# the code that parses the output of the launcher to get confused. In those cases, check if the# exit code is an integer, and if it's not, handle it as a special error case.if ! [[ $LAUNCHER_EXIT_CODE =~ ^[0-9]+$ ]]; thenecho "${CMD[@]}" | head -n-1 1>&2exit 1fiif [ $LAUNCHER_EXIT_CODE != 0 ]; thenexit $LAUNCHER_EXIT_CODEfiCMD=("${CMD[@]:0:$LAST}")exec "${CMD[@]}"
所以最终启动了
org.apache.spark.repl.Main
org.apache.spark.deploy.SparkSubmit
org.apache.spark.launcher.Main
相关文章推荐
- Android系统默认Home应用程序(Launcher)的启动过程源代码分析
- Android系统进程间通信(IPC)机制Binder中的Server启动过程源代码分析
- Android系统默认Home应用程序(Launcher)的启动过程源代码分析
- Android应用程序(app)进程启动过程的源代码分析
- Android系统默认Home应用程序(Launcher)的启动过程源代码分析
- Android应用程序组件Content Provider的启动过程源代码分析
- Spark分析之SparkContext启动过程分析
- Android系统进程Zygote启动过程的源代码分析
- Android系统默认Home应用程序(Launcher)的启动过程源代码分析
- Android系统进程间通信(IPC)机制Binder中的Server启动过程源代码分析
- Android应用程序启动过程源代码分析(3)
- Android应用程序组件Content Provider的启动过程源代码分析(1)
- Android系统进程Zygote启动过程的源代码分析
- Content Provider的启动过程源代码分析
- Android系统默认Home应用程序(Launcher)的启动过程源代码分析(2)
- Activity启动过程源代码分析
- Android系统进程间通信(IPC)机制Binder中的Server启动过程源代码分析
- Android系统进程间通信(IPC)机制Binder中的Server启动过程源代码分析
- Android系统进程Zygote启动过程的源代码分析
- Android应用程序内部启动Activity过程(startActivity)的源代码分析