Integrating Spark SQL with Hive
2015-09-14 18:33
Hive configuration
Edit $HIVE_HOME/conf/hive-site.xml and add the following property, which points metastore clients at the remote metastore service:

<property>
  <name>hive.metastore.uris</name>
  <value>thrift://master:9083</value>
  <description>Thrift uri for the remote metastore. Used by metastore client to connect to remote metastore.</description>
</property>
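Before going further it can help to confirm which metastore URI a given hive-site.xml actually declares. The following is only a sketch: the function name metastore_uri is made up for illustration, and it assumes the <name> and <value> elements of the property sit next to each other, as in the snippet above. It is plain text matching, not a real XML parser.

```shell
# Print the <value> paired with hive.metastore.uris in a hive-site.xml file.
# Hypothetical helper: assumes <value> directly follows <name> (whitespace allowed).
metastore_uri() {
  # Join the file into a single line so the pattern can span line breaks.
  tr -d '\n' < "$1" |
    sed -n 's|.*<name>hive\.metastore\.uris</name>[[:space:]]*<value>\([^<]*\)</value>.*|\1|p'
}

# Example call (not executed here):
#   metastore_uri "$HIVE_HOME/conf/hive-site.xml"
```

If the function prints nothing, the property is either missing or laid out differently from what this sketch expects.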
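Starting the Hive metastore
Start the metastore as a background job:

    $ hive --service metastore &

Check that it is running:

    $ jobs
    [1]+ Running hive --service metastore &

Stop the metastore:

    $ kill %1

The general form is kill %jobid; here 1 is the job id shown by jobs.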
Spark configuration
Copy or symlink $HIVE_HOME/conf/hive-site.xml into $SPARK_HOME/conf/.
Copy or symlink $HIVE_HOME/lib/mysql-connector-java-5.1.12.jar into $SPARK_HOME/lib/.
Placing the driver jar under $SPARK_HOME/lib/ makes it convenient to use in Spark standalone mode.
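The symlink variant of this step can be sketched as a small shell function. The function name link_hive_into_spark and its argument order are hypothetical, and the sketch assumes absolute paths are passed in so the links do not dangle.

```shell
# Symlink hive-site.xml and a JDBC driver jar into a Spark installation.
# Hypothetical helper. Arguments: <hive_home> <spark_home> <driver_jar_path>
# All three paths are assumed to be absolute.
link_hive_into_spark() {
  hive_home=$1
  spark_home=$2
  driver_jar=$3

  # Refuse to create links to files that do not exist.
  [ -f "$hive_home/conf/hive-site.xml" ] || { echo "missing hive-site.xml" >&2; return 1; }
  [ -f "$driver_jar" ] || { echo "missing driver jar: $driver_jar" >&2; return 1; }

  mkdir -p "$spark_home/conf" "$spark_home/lib"
  # -f replaces an existing link so the function can be re-run safely.
  ln -sf "$hive_home/conf/hive-site.xml" "$spark_home/conf/hive-site.xml"
  ln -sf "$driver_jar" "$spark_home/lib/$(basename "$driver_jar")"
}

# Example call (not executed here):
#   link_hive_into_spark "$HIVE_HOME" "$SPARK_HOME" \
#     "$HIVE_HOME/lib/mysql-connector-java-5.1.12.jar"
```

Symlinks keep a single copy of hive-site.xml, so later changes on the Hive side are picked up by Spark without copying the file again.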
Starting spark-sql
Standalone mode:

    ./bin/spark-sql --master spark://master:7077 --jars /home/stark_summer/spark/spark-1.4/spark-1.4.1/lib/mysql-connector-java-5.1.12.jar
yarn-client mode:

    $ ./bin/spark-sql --master yarn-client --jars /home/stark_summer/spark/spark-1.4/spark-1.4.1/lib/mysql-connector-java-5.1.12.jar

Run a query:

    select count(*) from o2o_app;

Result, followed by the per-stage statistics that StatsReportListener writes to the log:

    302
    Time taken: 0.828 seconds, Fetched 1 row(s)
    2015-09-14 18:27:43,158 INFO [main] CliDriver (SessionState.java:printInfo(536)) - Time taken: 0.828 seconds, Fetched 1 row(s)
    spark-sql> 2015-09-14 18:27:43,160 INFO [SparkListenerBus] scheduler.StatsReportListener (Logging.scala:logInfo(59)) - Finished stage: org.apache.spark.scheduler.StageInfo@5939ed30
    2015-09-14 18:27:43,161 INFO [SparkListenerBus] scheduler.StatsReportListener (Logging.scala:logInfo(59)) - task runtime:(count: 1, mean: 242.000000, stdev: 0.000000, max: 242.000000, min: 242.000000)
    2015-09-14 18:27:43,161 INFO [SparkListenerBus] scheduler.StatsReportListener (Logging.scala:logInfo(59)) - 0% 5% 10% 25% 50% 75% 90% 95% 100%
    2015-09-14 18:27:43,161 INFO [SparkListenerBus] scheduler.StatsReportListener (Logging.scala:logInfo(59)) - 242.0 ms 242.0 ms 242.0 ms 242.0 ms 242.0 ms 242.0 ms 242.0 ms 242.0 ms 242.0 ms
    2015-09-14 18:27:43,162 INFO [SparkListenerBus] scheduler.StatsReportListener (Logging.scala:logInfo(59)) - fetch wait time:(count: 1, mean: 0.000000, stdev: 0.000000, max: 0.000000, min: 0.000000)
    2015-09-14 18:27:43,162 INFO [SparkListenerBus] scheduler.StatsReportListener (Logging.scala:logInfo(59)) - 0% 5% 10% 25% 50% 75% 90% 95% 100%
    2015-09-14 18:27:43,162 INFO [SparkListenerBus] scheduler.StatsReportListener (Logging.scala:logInfo(59)) - 0.0 ms 0.0 ms 0.0 ms 0.0 ms 0.0 ms 0.0 ms 0.0 ms 0.0 ms 0.0 ms
    2015-09-14 18:27:43,163 INFO [SparkListenerBus] scheduler.StatsReportListener (Logging.scala:logInfo(59)) - remote bytes read:(count: 1, mean: 31.000000, stdev: 0.000000, max: 31.000000, min: 31.000000)
    2015-09-14 18:27:43,163 INFO [SparkListenerBus] scheduler.StatsReportListener (Logging.scala:logInfo(59)) - 0% 5% 10% 25% 50% 75% 90% 95% 100%
    2015-09-14 18:27:43,163 INFO [SparkListenerBus] scheduler.StatsReportListener (Logging.scala:logInfo(59)) - 31.0 B 31.0 B 31.0 B 31.0 B 31.0 B 31.0 B 31.0 B 31.0 B 31.0 B
    2015-09-14 18:27:43,163 INFO [SparkListenerBus] scheduler.StatsReportListener (Logging.scala:logInfo(59)) - task result size:(count: 1, mean: 1228.000000, stdev: 0.000000, max: 1228.000000, min: 1228.000000)
    2015-09-14 18:27:43,163 INFO [SparkListenerBus] scheduler.StatsReportListener (Logging.scala:logInfo(59)) - 0% 5% 10% 25% 50% 75% 90% 95% 100%
    2015-09-14 18:27:43,163 INFO [SparkListenerBus] scheduler.StatsReportListener (Logging.scala:logInfo(59)) - 1228.0 B 1228.0 B 1228.0 B 1228.0 B 1228.0 B 1228.0 B 1228.0 B 1228.0 B 1228.0 B
    2015-09-14 18:27:43,164 INFO [SparkListenerBus] scheduler.StatsReportListener (Logging.scala:logInfo(59)) - executor (non-fetch) time pct: (count: 1, mean: 69.834711, stdev: 0.000000, max: 69.834711, min: 69.834711)
    2015-09-14 18:27:43,164 INFO [SparkListenerBus] scheduler.StatsReportListener (Logging.scala:logInfo(59)) - 0% 5% 10% 25% 50% 75% 90% 95% 100%
    2015-09-14 18:27:43,164 INFO [SparkListenerBus] scheduler.StatsReportListener (Logging.scala:logInfo(59)) - 70 % 70 % 70 % 70 % 70 % 70 % 70 % 70 % 70 %
    2015-09-14 18:27:43,165 INFO [SparkListenerBus] scheduler.StatsReportListener (Logging.scala:logInfo(59)) - fetch wait time pct: (count: 1, mean: 0.000000, stdev: 0.000000, max: 0.000000, min: 0.000000)
    2015-09-14 18:27:43,165 INFO [SparkListenerBus] scheduler.StatsReportListener (Logging.scala:logInfo(59)) - 0% 5% 10% 25% 50% 75% 90% 95% 100%
    2015-09-14 18:27:43,165 INFO [SparkListenerBus] scheduler.StatsReportListener (Logging.scala:logInfo(59)) - 0 % 0 % 0 % 0 % 0 % 0 % 0 % 0 % 0 %
    2015-09-14 18:27:43,166 INFO [SparkListenerBus] scheduler.StatsReportListener (Logging.scala:logInfo(59)) - other time pct: (count: 1, mean: 30.165289, stdev: 0.000000, max: 30.165289, min: 30.165289)
    2015-09-14 18:27:43,166 INFO [SparkListenerBus] scheduler.StatsReportListener (Logging.scala:logInfo(59)) - 0% 5% 10% 25% 50% 75% 90% 95% 100%
    2015-09-14 18:27:43,166 INFO [SparkListenerBus] scheduler.StatsReportListener (Logging.scala:logInfo(59)) - 30 % 30 % 30 % 30 % 30 % 30 % 30 % 30 % 30 %
yarn-cluster mode:

    ./bin/spark-sql --master yarn-cluster --jars /home/dp/spark/spark-1.4/spark-1.4.1/lib/mysql-connector-java-5.1.12.jar

    Error: Cluster deploy mode is not applicable to Spark SQL shell.
    Run with --help for usage help or --verbose for debug output
    2015-09-14 18:28:28,291 INFO [Thread-0] util.Utils (Logging.scala:logInfo(59)) - Shutdown hook called

The spark-sql shell does not support cluster deploy mode, so it exits immediately.
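A wrapper script could reject an unusable --master value before launching the shell at all. The sketch below is an illustration only: check_sql_shell_master is a hypothetical name, and it recognizes just the master forms used in this article (yarn-client, yarn-cluster, and standalone URLs of the form spark://host:port). Other valid masters are reported as unrecognized by this sketch, which is a limitation of the example rather than of Spark.

```shell
# Return 0 if the given --master value is one this article uses successfully
# with the interactive spark-sql shell, 1 otherwise.
# Hypothetical helper; it does not cover every master URL Spark accepts.
check_sql_shell_master() {
  case "$1" in
    yarn-cluster)
      # Mirrors the error shown above: the SQL shell cannot run in cluster mode.
      echo "cluster deploy mode is not applicable to the Spark SQL shell" >&2
      return 1 ;;
    yarn-client|spark://*:*)
      return 0 ;;
    *)
      echo "unrecognized master: $1" >&2
      return 1 ;;
  esac
}

# Example call (not executed here):
#   check_sql_shell_master yarn-client && ./bin/spark-sql --master yarn-client
```

The same check also catches a standalone URL written without the // after spark:, since it only accepts the spark://host:port shape.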
Starting spark-shell
Standalone mode:

    ./bin/spark-shell --master spark://master:7077 --jars /home/stark_summer/spark/spark-1.4/spark-1.4.1/lib/mysql-connector-java-5.1.12.jar
yarn-client mode:

    ./bin/spark-shell --master yarn-client --jars /home/dp/spark/spark-1.4/spark-1.4.1/lib/mysql-connector-java-5.1.12.jar

Once the shell is up, run a query through sqlContext:

    sqlContext.sql("from o2o_app SELECT count(appkey,name1,name2)").collect().foreach(println)