[Hive - LanguageManual] Windowing and Analytics Functions (pending)

2015-01-26 12:18

LanguageManual WindowingAndAnalytics

Added by Lefty Leverenz, last edited by Lefty Leverenz on Aug 01, 2014

Windowing and Analytics Functions

Enhancements to Hive QL
Examples
  PARTITION BY with one partitioning column, no ORDER BY or window specification
  PARTITION BY with two partitioning columns, no ORDER BY or window specification
  PARTITION BY with one partitioning column, one ORDER BY column, and no window specification
  PARTITION BY with two partitioning columns, two ORDER BY columns, and no window specification
  PARTITION BY with partitioning, ORDER BY, and window specification
  WINDOW clause
  LEAD using default 1 row lead and not specifying default value
  LAG specifying a lag of 3 rows and default value of 0

Enhancements to Hive QL

Version

Introduced in Hive version 0.11.

This section introduces the Hive QL enhancements for windowing and analytics functions. See "Windowing Specifications in HQL" (attached to HIVE-4197) for details. HIVE-896 has more information, including links to earlier documentation in the initial comments.

All of the windowing and analytics functions operate as per the SQL standard.

The current release supports the following functions for windowing and analytics:

Windowing functions

  LEAD
    The number of rows to lead can optionally be specified. If the number of rows to lead is not specified, the lead is one row.
    Returns null when the lead for the current row extends beyond the end of the window.

  LAG
    The number of rows to lag can optionally be specified. If the number of rows to lag is not specified, the lag is one row.
    Returns null when the lag for the current row extends before the beginning of the window.

  FIRST_VALUE

  LAST_VALUE

The OVER clause

  OVER with standard aggregates:
    COUNT
    SUM
    MIN
    MAX
    AVG

  OVER with a PARTITION BY clause with one or more partitioning columns of any primitive datatype.

  OVER with PARTITION BY and ORDER BY with one or more partitioning and/or ordering columns of any datatype.

  OVER with a window specification. Windows can be defined separately in a WINDOW clause. Window specifications support these standard options:
    ROWS BETWEEN ((CURRENT ROW) | (UNBOUNDED | [num]) PRECEDING) AND (UNBOUNDED | [num]) FOLLOWING

  Note: The OVER clause supports the following functions, but it does not support a window with them (see HIVE-4797):
    Ranking functions: Rank, NTile, DenseRank, CumeDist, PercentRank.
    Lead and Lag functions.

Analytics functions

  RANK
  ROW_NUMBER
  DENSE_RANK
  CUME_DIST
  PERCENT_RANK
  NTILE
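
The analytics functions listed above are called with an OVER clause but without a window specification. A minimal sketch, assuming a hypothetical table T with columns a, b, and c, and illustrative alias names:

```sql
-- T, a, b, c and the aliases are hypothetical names used only for illustration.
-- Each function ranks rows within a partition of c, ordered by b.
SELECT a,
       RANK()       OVER (PARTITION BY c ORDER BY b) AS rnk,
       DENSE_RANK() OVER (PARTITION BY c ORDER BY b) AS dense_rnk,
       ROW_NUMBER() OVER (PARTITION BY c ORDER BY b) AS row_num,
       NTILE(4)     OVER (PARTITION BY c ORDER BY b) AS quartile
FROM T;
```

For rows that tie on b, RANK leaves gaps after the tie, DENSE_RANK does not, and ROW_NUMBER still assigns distinct numbers.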

Examples

This section provides examples of how to use the Hive QL windowing and analytics functions in SELECT statements. See HIVE-896 for additional examples.

PARTITION BY with one partitioning column, no ORDER BY or window specification
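
A minimal sketch of this case, assuming a hypothetical table T with columns a, b, and c. COUNT(b) is computed once for each distinct value of c and repeated on every row of that partition:

```sql
-- T, a, b, c are hypothetical names used only for illustration.
SELECT a, COUNT(b) OVER (PARTITION BY c)
FROM T;
```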

PARTITION BY with two partitioning columns, no ORDER BY or window specification
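
One way this could look, assuming a hypothetical table T with columns a, b, c, and d. Listing two columns in PARTITION BY makes each distinct (c, d) pair its own partition:

```sql
-- T, a, b, c, d are hypothetical names used only for illustration.
SELECT a, COUNT(b) OVER (PARTITION BY c, d)
FROM T;
```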

PARTITION BY with one partitioning column, one ORDER BY column, and no window specification
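
An illustrative query for this case, assuming a hypothetical table T with columns a, b, c, and d:

```sql
-- T, a, b, c, d are hypothetical names used only for illustration.
-- Rows are grouped by c and ordered by d inside each group.
SELECT a, SUM(b) OVER (PARTITION BY c ORDER BY d)
FROM T;
```

Under the SQL-standard default frame (RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW), adding ORDER BY turns SUM(b) into a running total within each partition rather than a single per-partition total.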

PARTITION BY with two partitioning columns, two ORDER BY columns, and no window specification
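
A sketch under the assumption of a hypothetical table T with columns a through f. Partitions are formed per (c, d) pair, and rows inside each partition are ordered by e, with f breaking ties:

```sql
-- T and columns a..f are hypothetical names used only for illustration.
SELECT a, SUM(b) OVER (PARTITION BY c, d ORDER BY e, f)
FROM T;
```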

PARTITION BY with partitioning, ORDER BY, and window specification
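
Two hedged sketches of an explicit window, assuming a hypothetical table T with columns a, b, c, and d. The first spells out a running total; the second averages over a sliding window of up to seven rows:

```sql
-- T, a, b, c, d are hypothetical names used only for illustration.

-- Running total: every row from the start of the partition up to the current row.
SELECT a, SUM(b) OVER (PARTITION BY c ORDER BY d
                       ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
FROM T;

-- Sliding average: the current row plus up to 3 rows on either side.
SELECT a, AVG(b) OVER (PARTITION BY c ORDER BY d
                       ROWS BETWEEN 3 PRECEDING AND 3 FOLLOWING)
FROM T;
```

Near the edges of a partition the sliding window simply contains fewer rows.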

There can be multiple OVER clauses in a single query. A single OVER clause only applies to the immediately preceding function call. In this example, the first OVER clause applies to COUNT(b) and the second OVER clause applies to SUM(b):
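
A query of the shape described, sketched against a hypothetical table T with columns a, b, and c:

```sql
-- T, a, b, c are hypothetical names used only for illustration.
SELECT a,
       COUNT(b) OVER (PARTITION BY c),  -- this OVER belongs to COUNT(b)
       SUM(b)   OVER (PARTITION BY c)   -- this OVER belongs to SUM(b)
FROM T;
```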

Aliases can be used as well, with or without the keyword AS:
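
Both alias forms might look like this, assuming a hypothetical table T with columns a, b, and c. The alias names b_count and b_sum are likewise made up for illustration:

```sql
-- T, a, b, c, b_count, b_sum are hypothetical names used only for illustration.
SELECT a,
       COUNT(b) OVER (PARTITION BY c) AS b_count,  -- alias with AS
       SUM(b)   OVER (PARTITION BY c) b_sum        -- alias without AS
FROM T;
```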

WINDOW clause
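
A minimal sketch of a named window, assuming a hypothetical table T with columns a, b, c, and d. The window name w is illustrative. The window is defined once in the WINDOW clause and referenced by name after OVER:

```sql
-- T, a, b, c, d and the window name w are hypothetical, used only for illustration.
SELECT a, SUM(b) OVER w
FROM T
WINDOW w AS (PARTITION BY c ORDER BY d
             ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW);
```

Naming the window is mostly useful when several function calls in the same query share one definition.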

LEAD using default 1 row lead and not specifying default value
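
An illustrative query, assuming a hypothetical table T with columns a, b, and c. With no offset argument, LEAD looks one row ahead; with no default argument, it returns null for the last row of each partition:

```sql
-- T, a, b, c are hypothetical names used only for illustration.
SELECT a, LEAD(a) OVER (PARTITION BY b ORDER BY c)
FROM T;
```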

LAG specifying a lag of 3 rows and default value of 0
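
A sketch of this call, assuming a hypothetical table T with columns a, b, and c. The second argument is the lag in rows and the third is the value returned when no such row exists:

```sql
-- T, a, b, c are hypothetical names used only for illustration.
-- The first three rows of each partition have no row 3 positions back, so they get 0.
SELECT a, LAG(a, 3, 0) OVER (PARTITION BY b ORDER BY c)
FROM T;
```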
