[Hive - LanguageManual] LateralView
2015-01-26 11:38
Lateral View Syntax
Description
Example
Multiple Lateral Views
Outer Lateral Views
Lateral View Syntax
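As a sketch, the clause has the following general form, shown here as comments and followed by a minimal instance. The names some_table, t and n are placeholders chosen for illustration and do not come from the original page.

```sql
-- General form:
--   fromClause:  FROM baseTable (lateralView)*
--   lateralView: LATERAL VIEW udtf(expression) tableAlias AS columnAlias (',' columnAlias)*

-- Minimal instance (some_table, t and n are placeholder names):
SELECT n
FROM some_table
LATERAL VIEW explode(array(1, 2, 3)) t AS n;
```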
Description
Lateral view is used in conjunction with user-defined table-generating functions (UDTFs) such as explode(). As mentioned in Built-in Table-Generating Functions, a UDTF generates zero or more output rows for each input row. A lateral view first applies the UDTF to each row of the base table and then joins the resulting output rows to the input rows to form a virtual table having the supplied table alias.
Version
Prior to Hive 0.6.0, lateral view did not support the predicate push-down optimization. In Hive 0.5.0 and earlier, a query that used a WHERE clause might not compile. A workaround was to add set hive.optimize.ppd=false; before the query. The fix was made in Hive 0.6.0; see https://issues.apache.org/jira/browse/HIVE-1056 (Predicate push down does not work with UDTF's).
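A sketch of that workaround on Hive 0.5.0 and earlier: the setting is issued in the same session before the query. The query and its WHERE predicate are hypothetical, built on the pageAds table from the Example section of this page, and adTable is an assumed alias.

```sql
-- Only needed on Hive 0.5.0 and earlier: disable predicate push-down
SET hive.optimize.ppd=false;

-- Hypothetical filtered lateral view query (the predicate is illustrative)
SELECT pageid, adid
FROM pageAds
LATERAL VIEW explode(adid_list) adTable AS adid
WHERE adid > 2;
```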
Version
From Hive 0.12.0, column aliases can be omitted. In this case, aliases are inherited from the field names of the StructObjectInspector returned by the UDTF.
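A minimal sketch of a query with the column alias omitted, using the pageAds table from the Example section; adTable is an assumed alias. For explode() over an array the inherited column name is expected to be col, so the sketch selects adTable.* rather than relying on that name.

```sql
-- Hive 0.12.0 and later: no "AS columnAlias" after the table alias
SELECT pageid, adTable.*
FROM pageAds
LATERAL VIEW explode(adid_list) adTable;
```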
Example
Consider the following base table named pageAds. It has two columns: pageid (name of the page) and adid_list (an array of ads appearing on the page):
Column name | Column type |
---|---|
pageid | STRING |
adid_list | Array<int> |
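An example table with two rows:
pageid | adid_list |
---|---|
front_page | [1, 2, 3] |
contact_page | [3, 4, 5] |
and the user would like to count the total number of times an ad appears across all pages.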
A lateral view with explode() can be used to convert adid_list into separate rows using the query:
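The query listing is missing from this copy. A version consistent with the pageid and adid result columns that follow, with adTable as an assumed table alias, would be:

```sql
-- One output row per element of adid_list, paired with its pageid
SELECT pageid, adid
FROM pageAds
LATERAL VIEW explode(adid_list) adTable AS adid;
```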
The resulting output will be:
pageid (string) | adid (int) |
---|---|
"front_page" | 1 |
"front_page" | 2 |
"front_page" | 3 |
"contact_page" | 3 |
"contact_page" | 4 |
"contact_page" | 5 |
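Then, to count the number of times a particular ad appears, count/group by can be used, giving:
int adid | count(1) |
---|---|
1 | 1 |
2 | 1 |
3 | 2 |
4 | 1 |
5 | 1 |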
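The counts above can be obtained by grouping the exploded rows. A sketch, again assuming adTable as the table alias:

```sql
-- Count how many pages each ad appears on
SELECT adid, count(1)
FROM pageAds
LATERAL VIEW explode(adid_list) adTable AS adid
GROUP BY adid;
```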
Multiple Lateral Views
A FROM clause can have multiple LATERAL VIEW clauses. Subsequent LATERAL VIEW clauses can reference columns from any of the tables appearing to the left of the LATERAL VIEW. For example, the following could be a valid query:
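A sketch of such a query, assuming a hypothetical table exampleTable whose col1 is an array of arrays, so that the second LATERAL VIEW can explode the myCol1 column produced by the first. The aliases myTable1, myTable2, myCol1 and myCol2 are illustrative.

```sql
-- The second LATERAL VIEW references myCol1, which the first one defines
SELECT *
FROM exampleTable
LATERAL VIEW explode(col1) myTable1 AS myCol1
LATERAL VIEW explode(myCol1) myTable2 AS myCol2;
```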
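LATERAL VIEW clauses are applied in the order that they appear. For example, with the following base table:
Array<int> col1 | Array<string> col2 |
---|---|
[1, 2] | ["a", "b", "c"] |
[3, 4] | ["d", "e", "f"] |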
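The query for the first step is missing from this copy. Given the myCol1 and col2 columns in the result that follows, it was presumably of this form; baseTable and myTable1 are assumed names.

```sql
-- Explode col1 only; col2 is carried along unchanged on each generated row
SELECT myCol1, col2
FROM baseTable
LATERAL VIEW explode(col1) myTable1 AS myCol1;
```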
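Exploding col1 into myCol1 with a single LATERAL VIEW and selecting myCol1 and col2 will produce:
int myCol1 | Array<string> col2 |
---|---|
1 | ["a", "b", "c"] |
2 | ["a", "b", "c"] |
3 | ["d", "e", "f"] |
4 | ["d", "e", "f"] |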
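A query that adds an additional LATERAL VIEW can be sketched as follows. The listing is missing from this copy, so baseTable, myTable1 and myTable2 are assumed names; the second clause is applied to each row generated by the first.

```sql
-- Explode col1 first, then explode col2 for every row that step generated
SELECT myCol1, myCol2
FROM baseTable
LATERAL VIEW explode(col1) myTable1 AS myCol1
LATERAL VIEW explode(col2) myTable2 AS myCol2;
```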
Adding a second LATERAL VIEW that explodes col2 into myCol2 will produce:
int myCol1 | string myCol2 |
---|---|
1 | "a" |
1 | "b" |
1 | "c" |
2 | "a" |
2 | "b" |
2 | "c" |
3 | "d" |
3 | "e" |
3 | "f" |
4 | "d" |
4 | "e" |
4 | "f" |
Outer Lateral Views
Version
Introduced in Hive version 0.12.0
The user can specify the optional OUTER keyword to generate rows even when a LATERAL VIEW usually would not generate a row. This happens when the UDTF used does not generate any rows, which happens easily with explode when the column to explode is empty. In this case the source row would never appear in the results. OUTER can be used to prevent that, and rows will be generated with NULL values in the columns coming from the UDTF.
For example, the following query returns an empty result:
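The query listing is missing from this copy. A sketch of a query with this behavior, assuming a table named src with key and value columns (matching the output rows shown in this section) and the illustrative aliases C and a:

```sql
-- explode() over an empty array generates no rows, so every source row is dropped
SELECT * FROM src LATERAL VIEW explode(array()) C AS a LIMIT 10;
```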
But with the OUTER keyword it will produce:
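The OUTER variant of the query is also missing from this copy. Under the same assumptions (a table named src, illustrative aliases C and a), it would differ only by the added keyword:

```sql
-- OUTER keeps each source row and fills the UDTF column with NULL
SELECT * FROM src LATERAL VIEW OUTER explode(array()) C AS a LIMIT 10;
```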
238 val_238 NULL
86 val_86 NULL
311 val_311 NULL
27 val_27 NULL
165 val_165 NULL
409 val_409 NULL
255 val_255 NULL
278 val_278 NULL
98 val_98 NULL