
Spark SQL on Hive in Practice

2016-12-05 11:17 · 666 views
After copying hive-site.xml into Spark's conf directory, test it with the following program:

import org.apache.spark._

object exptest {
  def main(args: Array[String]) {
    // Needed only when running locally on Windows
    System.setProperty("hadoop.home.dir", "C:\\winutils\\")

    // Default to a local master; allow it to be overridden from the command line
    var masterUrl = "local[1]"
    if (args.length == 1) {
      masterUrl = args(0)
    }

    val conf = new SparkConf().setMaster(masterUrl).setAppName("Spark-Hive")
    val sc = new SparkContext(conf)

    // HiveContext picks up the metastore settings from hive-site.xml in Spark's conf directory
    val sqlContext = new org.apache.spark.sql.hive.HiveContext(sc)

    sqlContext.sql("show tables").collect().foreach(println)

    sc.stop()
  }
}


Package the program into a jar, upload it to the cluster, and run it:

$SPARK_HOME/bin/spark-submit \
  --master yarn-cluster \
  --class com.spark.basic.exptest \
  /home/ubuntu/ispecexp.jar \
  yarn-cluster

The error log is as follows:

 16/12/05 10:15:09 INFO yarn.ApplicationMaster: Final app status: FAILED, exitCode: 15, (reason: User class threw exception: java.lang.RuntimeException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient)

16/12/05 10:15:09 INFO spark.SparkContext: Invoking stop() from shutdown hook

16/12/05 10:15:09 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/static/sql,null}

16/12/05 10:15:09 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/SQL/execution/json,null}

16/12/05 10:15:09 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/SQL/execution,null}

16/12/05 10:15:09 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/SQL/json,null}

This is probably because Hive's Metastore Server process is not running. Start it with:

hive --service metastore &
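In yarn-cluster mode the driver runs on a cluster node, so hive-site.xml must point HiveContext at the running metastore through the hive.metastore.uris property. A quick sanity check is to grep that property out of the file. The sketch below writes a throwaway copy under /tmp so it is self-contained; both the /tmp path and the thrift://localhost:9083 value are placeholders, not this cluster's actual config:

```shell
# Write a throwaway hive-site.xml so the check is self-contained
# (in practice, inspect $SPARK_HOME/conf/hive-site.xml instead).
cat > /tmp/hive-site-demo.xml <<'EOF'
<configuration>
  <property>
    <name>hive.metastore.uris</name>
    <value>thrift://localhost:9083</value>
  </property>
</configuration>
EOF

# Extract the metastore URI that HiveContext would connect to
URI=$(grep -o 'thrift://[^<]*' /tmp/hive-site-demo.xml)
echo "$URI"
```

If the property is present but the metastore process is down, you get exactly the SessionHiveMetaStoreClient instantiation failure shown above.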

Re-running the program still fails, this time with:

javax.jdo.JDOFatalUserException: Class org.datanucleus.api.jdo.JDOPersistenceManagerFactory was not found.

This is likely because datanucleus-api-jdo-3.2.6.jar, datanucleus-core-3.2.10.jar, and datanucleus-rdbms-3.2.9.jar cannot be found on the application's classpath. Modify the submit command to ship them (along with hive-site.xml and the MySQL JDBC driver) explicitly:

$SPARK_HOME/bin/spark-submit \
  --master yarn-cluster \
  --class com.spark.basic.exptest \
  --files /usr/local/spark/conf/hive-site.xml \
  --jars /usr/local/spark/lib/datanucleus-api-jdo-3.2.6.jar,/usr/local/spark/lib/datanucleus-core-3.2.10.jar,/usr/local/spark/lib/datanucleus-rdbms-3.2.9.jar,/usr/local/spark/lib/mysql-connector-java-5.1.37.jar \
  /home/ubuntu/ispecexp.jar \
  yarn-cluster

The program now runs correctly.
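As an aside, the long comma-separated --jars argument can be assembled from the lib directory instead of typed by hand. The sketch below uses a throwaway demo directory under /tmp with empty stand-in jars named after the ones in this post; on a real cluster the directory would be /usr/local/spark/lib:

```shell
# Create a demo lib directory with empty stand-in jars
# (stand-in for /usr/local/spark/lib on a real cluster).
LIB=/tmp/spark-lib-demo
mkdir -p "$LIB"
touch "$LIB/datanucleus-api-jdo-3.2.6.jar" \
      "$LIB/datanucleus-core-3.2.10.jar" \
      "$LIB/datanucleus-rdbms-3.2.9.jar"

# Join the matching jars with commas, the format spark-submit's --jars expects
JARS=$(ls "$LIB"/datanucleus-*.jar | paste -sd, -)
echo "$JARS"
```

The resulting $JARS string can then be passed as --jars "$JARS" in the submit command.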
Tags: hive spark