
Problems encountered while using Hive, part 2

2016-04-12 15:44
Ran into a bizarre problem, as follows:

select m9.serial_id, m9.max_trade_time
  from (select m0.serial_id,
               m0.round_id,
               m0.max_trade_time,
               m0.bet_change_times
          from base_game_analyze_lhdb m0
         where m0.dt = '20151001') m9,
       (select c.serial_id,
               c.round_id,
               c.bet_change_times,
               max(c.max_trade_time) as max_time
          from base_game_analyze_lhdb c
         where c.dt = '20151001'
           and c.serial_id = '5009664253594'
           and c.round_id = '4'
         group by c.serial_id, c.round_id, c.bet_change_times) n5
 where m9.serial_id = n5.serial_id
   and m9.round_id = n5.round_id
   and m9.max_trade_time = n5.max_time;
The query fails with:

Diagnostic Messages for this Task:
Error: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"_col0":"5009664253594","_col1":"4","_col3":""}
at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:172)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"_col0":"5009664253594","_col1":"4","_col3":""}
at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:518)
at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:163)
... 8 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.ArrayIndexOutOfBoundsException
at org.apache.hadoop.hive.ql.exec.ReduceSinkOperator.process(ReduceSinkOperator.java:403)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837)
at org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:97)
at org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:162)
at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:508)
... 9 more
Caused by: java.lang.ArrayIndexOutOfBoundsException
at java.lang.System.arraycopy(Native Method)
at org.apache.hadoop.io.Text.set(Text.java:225)
at org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryString.init(LazyBinaryString.java:48)
at org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryStruct.uncheckedGetField(LazyBinaryStruct.java:267)
at org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryStruct.getField(LazyBinaryStruct.java:204)
at org.apache.hadoop.hive.serde2.lazybinary.objectinspector.LazyBinaryStructObjectInspector.getStructFieldData(LazyBinaryStructObjectInspector.java:64)
at org.apache.hadoop.hive.ql.exec.ExprNodeColumnEvaluator._evaluate(ExprNodeColumnEvaluator.java:94)
at org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:77)
at org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:65)
at org.apache.hadoop.hive.ql.exec.ReduceSinkOperator.populateCachedDistributionKeys(ReduceSinkOperator.java:433)
at org.apache.hadoop.hive.ql.exec.ReduceSinkOperator.process(ReduceSinkOperator.java:342)
... 13 more

But if the HQL is tweaked slightly, as follows:
select m9.serial_id, m9.max_trade_time
  from (select m0.serial_id,
               m0.round_id,
               m0.max_trade_time,
               m0.bet_change_times
          from base_game_analyze_lhdb m0
         where m0.dt = '20151001') m9,
       (select c.serial_id,
               c.round_id,
               c.bet_change_times,
               max(c.max_trade_time) as max_time
          from base_game_analyze_lhdb c
         where c.dt = '20151001'
           and c.serial_id = '5009664253594'
           and c.round_id = '4'
         group by c.serial_id, c.round_id, c.bet_change_times) n5
 where m9.serial_id = n5.serial_id
   and m9.round_id = n5.round_id
   and m9.max_trade_time = n5.max_time
   and n5.bet_change_times = m9.bet_change_times;  -- the added condition
With nothing more than one extra join condition, it executes successfully.

After some digging, I found that for SQL of this shape, every GROUP BY column of n5 must be joined to the corresponding column of m9 before the query returns any data; otherwise the result set is empty. And when n5 contains many rows, the query fails outright with the error above (ArrayIndexOutOfBoundsException).
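
For reference, here is the same fix restated with an explicit JOIN ... ON clause instead of the comma-style join. This is only a rewrite of the working query above, using the table and columns from this post; I have not verified that the explicit syntax behaves any differently, but it does make it obvious that the ON clause covers all three of n5's GROUP BY keys:

select m9.serial_id, m9.max_trade_time
  from (select m0.serial_id,
               m0.round_id,
               m0.max_trade_time,
               m0.bet_change_times
          from base_game_analyze_lhdb m0
         where m0.dt = '20151001') m9
  join (select c.serial_id,
               c.round_id,
               c.bet_change_times,
               max(c.max_trade_time) as max_time
          from base_game_analyze_lhdb c
         where c.dt = '20151001'
           and c.serial_id = '5009664253594'
           and c.round_id = '4'
         group by c.serial_id, c.round_id, c.bet_change_times) n5
    on m9.serial_id = n5.serial_id                 -- GROUP BY key 1
   and m9.round_id = n5.round_id                   -- GROUP BY key 2
   and m9.bet_change_times = n5.bet_change_times   -- GROUP BY key 3
 where m9.max_trade_time = n5.max_time;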

I still don't understand the root cause of the error. Judging from the data in the table and from the business logic, the added condition should be optional; and even if it were needed, omitting it should at worst produce a Cartesian product, yet the query returns no results at all.
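
If the goal is simply "rows whose max_trade_time equals the group maximum", a window function sidesteps the self-join, and with it this whole problem. A minimal sketch, assuming Hive 0.11 or later (where windowing functions are available); the PARTITION BY list mirrors the GROUP BY keys of the working query, and I have not run this against the actual table:

select serial_id, max_trade_time
  from (select t.serial_id,
               t.max_trade_time,
               max(t.max_trade_time) over
                 (partition by t.serial_id, t.round_id, t.bet_change_times) as max_time
          from base_game_analyze_lhdb t
         where t.dt = '20151001'
           and t.serial_id = '5009664253594'
           and t.round_id = '4') w
 where max_trade_time = max_time;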

Noting this down here for future reference!
Tags: technology, big data, hive