Connecting Spark-SQL to Hive
2017-09-24 17:52
Step 1: Edit Hive's configuration file hive-site.xml
Add the following property to disable the local (embedded) metastore:
<property>
  <name>hive.metastore.local</name>
  <value>false</value>
</property>
Set the address and port of the Hive metastore service:
<property>
  <name>hive.metastore.uris</name>
  <value>thrift://192.168.10.10:9083</value>
  <description>Thrift URI for the remote metastore. Used by metastore client to connect to remote metastore.</description>
</property>
Then copy hive-site.xml into Spark's conf directory.
Step 2: If the Hive metastore database is MySQL, copy mysql-connector-java-5.1.41-bin.jar into Spark's jars directory.
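The two copy steps above, plus starting the remote metastore service that hive.metastore.uris points at, can be sketched as a few shell commands. This is a sketch: `$HIVE_HOME`, `$SPARK_HOME`, and the exact connector jar version are placeholders to adapt to your own installation.

```shell
# Sketch of the setup steps; $HIVE_HOME, $SPARK_HOME and the connector
# version are placeholders for your own installation paths.
cp "$HIVE_HOME/conf/hive-site.xml" "$SPARK_HOME/conf/"

# Only needed when the metastore database is MySQL:
cp mysql-connector-java-5.1.41-bin.jar "$SPARK_HOME/jars/"

# hive.metastore.uris points at a *remote* metastore, so the service
# must actually be running on that host/port (9083 by default):
nohup hive --service metastore > metastore.log 2>&1 &
```

Note that the jars/ directory exists in Spark 2.x; on older Spark releases the layout differs and the JDBC driver is usually passed with --driver-class-path instead.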
At this point you can already query the Hive databases from the Scala shell (spark-shell).
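A quick way to run that check non-interactively is to pipe a Scala statement into spark-shell. This is a sketch for Spark 2.x, where the SparkSession is pre-bound to the name `spark`:

```shell
# Runs a single Hive query through spark-shell (Spark 2.x) and exits.
spark-shell <<'EOF'
spark.sql("show databases").show()
EOF
```

If this lists your Hive databases, the metastore wiring is correct and any remaining failure is specific to the spark-sql CLI.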
But the original requirement was to query Hive through the spark-sql CLI.
So I launched spark-sql, and it spent the whole day failing with the error below:
Exception in thread "main" java.lang.RuntimeException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
    at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:522)
    at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver$.main(SparkSQLCLIDriver.scala:114)
    at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.main(SparkSQLCLIDriver.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:738)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:187)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:212)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:126)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
    at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1523)
    at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.<init>(RetryingMetaStoreClient.java:86)
    at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:132)
    at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:104)
    at org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:3005)
    at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:3024)
    at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:503)
    ... 11 more
Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
    at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1521)
    ... 17 more
Caused by: MetaException(message:Version information not found in metastore. )
    at org.apache.hadoop.hive.metastore.ObjectStore.checkSchema(ObjectStore.java:6664)
    at org.apache.hadoop.hive.metastore.ObjectStore.verifySchema(ObjectStore.java:6645)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:114)
    at com.sun.proxy.$Proxy6.verifySchema(Unknown Source)
    at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.getMS(HiveMetaStore.java:572)
    at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.createDefaultDB(HiveMetaStore.java:620)
    at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.init(HiveMetaStore.java:461)
    at org.apache.hadoop.hive.metastore.RetryingHMSHandler.<init>(RetryingHMSHandler.java:66)
    at org.apache.hadoop.hive.metastore.RetryingHMSHandler.getProxy(RetryingHMSHandler.java:72)
    at org.apache.hadoop.hive.metastore.HiveMetaStore.newRetryingHMSHandler(HiveMetaStore.java:5762)
    at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:199)
    at org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.<init>(SessionHiveMetaStoreClient.java:74)
    ... 22 more
At first I kept searching for this bug using the first line of the error message and got nowhere; later I searched for the last error message instead:
message:Version information not found in metastore
and that finally turned up the fix: change the value of hive.metastore.schema.verification in hive-site.xml to false.
<property>
  <name>hive.metastore.schema.verification</name>
  <value>false</value>
  <description>
    Enforce metastore schema version consistency.
    True: Verify that version information stored in metastore is compatible with one from Hive jars. Also disable automatic schema migration attempt. Users are required to manually migrate schema after Hive upgrade which ensures proper metastore schema migration. (Default)
    False: Warn if the version information stored in metastore doesn't match with one from in Hive jars.
  </description>
</property>
The cause is most likely that the Hive jar version doesn't match the schema version recorded in the metastore; simply disabling the verification works around it.
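Note that setting the flag to false only skips the check. If you would rather record or repair the schema version itself, Hive ships a schematool utility for this; a sketch (-dbType must match your metastore database, mysql here):

```shell
# Inspect the metastore schema state, then fix it.
schematool -dbType mysql -info          # show current schema version
schematool -dbType mysql -initSchema    # fresh (empty) metastore only
schematool -dbType mysql -upgradeSchema # after upgrading Hive
```

Either way, spark-sql should then start cleanly; `spark-sql -e "show databases;"` is a quick smoke test.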
Reference blogs: http://www.cnblogs.com/rocky-AGE-24/p/7345417.html
http://blog.csdn.net/jyl1798/article/details/41087533
http://dblab.xmu.edu.cn/blog/1086-2/
http://blog.csdn.net/youngqj/article/details/19987727