Hive (Part 4): E-commerce Transaction Project Case Study

2016-09-24 16:43

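The sdate table defines the date dimension: it assigns each day the month, day of week, quarter, and other attributes it belongs to.

Its fields are date, year-month, year, month, day, day of week, week number, quarter, ten-day period, and half-month.

The stock table defines the order header. Its fields are order number, transaction location, and transaction date.

The stockdetail file defines the order line items. This table is joined to stock on the order number (ordernumber).

Its fields are order number, line number, item, quantity, price, and amount.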

CREATE TABLE sdate(
dateID string,       -- date
theyearmonth string, -- year-month
theyear string,      -- year
themonth string,     -- month
thedate string,      -- day
theweek string,      -- day of week
theweeks string,     -- week number
thequot string,      -- quarter
thetenday string,    -- ten-day period
thehalfmonth string  -- half-month
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n' ;

CREATE TABLE stock(
ordernumber string, -- order number
locationid string,  -- transaction location
dateID string       -- transaction date
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n' ;

CREATE TABLE stockdetail(
ordernumber string, -- order number
rownum int,         -- line number
itemid string,      -- item
qty int,            -- quantity
price int,          -- price
amount int          -- total amount
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n' ;

After creating the tables, load the data:

load data local inpath '/home/zkpk/tbdata/sdate.txt'
overwrite into table sdate;

load data local inpath '/home/zkpk/tbdata/stock.txt'
overwrite into table stock;

load data local inpath '/home/zkpk/tbdata/stockdetail.txt'
overwrite into table stockdetail;
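
Hive's LOAD DATA moves the files into place without validating them against the table schema, and a field that cannot be parsed as its declared type is read back as NULL. A few checks like the following can catch a bad load before running the analysis queries. This is only a sketch: the aliases row_count, null_amounts, and orphan_rows are illustrative names, and the original post does not show these checks or their output.

```sql
-- Row counts: each table should be non-empty after the load.
SELECT COUNT(*) AS row_count FROM sdate;
SELECT COUNT(*) AS row_count FROM stock;
SELECT COUNT(*) AS row_count FROM stockdetail;

-- Detail rows whose amount could not be parsed as int are read as NULL.
SELECT COUNT(*) AS null_amounts
FROM stockdetail
WHERE amount IS NULL;

-- Detail rows with no matching order header would be silently dropped
-- by the inner joins used in the analysis queries.
SELECT COUNT(*) AS orphan_rows
FROM stockdetail b
LEFT OUTER JOIN stock a ON a.ordernumber = b.ordernumber
WHERE a.ordernumber IS NULL;
```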

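1. Compute the total amount of all orders per year
Analysis:
To compute the total amount of all orders per year, first obtain the order number, order date, and order amount of every order,
then join this information with the date table to get the year,
and finally group these four columns by year and sum the amounts to get the yearly total.

Three tables are involved: stock a, stockdetail b, sdate c

select c.theyear,sum(b.amount)
from stock a,stockdetail b,sdate c
where a.ordernumber=b.ordernumber and a.dateid=c.dateid
group by c.theyear order by c.theyear;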

Result:

2004 3265696

2005 13247234

2006 13670416

2007 16711974

2008 14670698

2009 6322137

2010 210924
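
The same aggregation can also be written with explicit JOIN ... ON clauses, which keeps the join conditions apart from any row filters. The following is a sketch of an equivalent formulation, not the query that produced the result above; the alias total_amount is an illustrative name.

```sql
-- Equivalent to the comma-join query, written with explicit inner joins.
-- total_amount is an illustrative alias, not a column of the source tables.
SELECT c.theyear,
       SUM(b.amount) AS total_amount
FROM stock a
JOIN stockdetail b ON a.ordernumber = b.ordernumber
JOIN sdate c       ON a.dateid = c.dateid
GROUP BY c.theyear
ORDER BY c.theyear;
```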

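2. Compute the sales amount of the largest order in each year
Analysis:

The algorithm has two steps:

Step 1: Group by date and order number
to get the sales total of every order on every day.
Tables involved: stock a, stockdetail b

select a.dateid, a.ordernumber,sum(b.amount) as sumofamount
from stock a,stockdetail b
where a.ordernumber=b.ordernumber
group by a.dateid,a.ordernumber;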

Step 2: Join the result of step 1 with the date table to get the year,
then group by year and apply the max function to get the sales amount of the largest order in each year.

Tables involved: sdate c, and the step 1 result as d

select c.theyear,max(d.sumofamount) from sdate c,
(select a.dateid, a.ordernumber,sum(b.amount) as sumofamount
from stock a,stockdetail b
where a.ordernumber=b.ordernumber
group by a.dateid,a.ordernumber)d
where c.dateid=d.dateid
group by c.theyear order by c.theyear;

Result:

2004 23612

2005 38180

2006 36124

2007 159126

2008 55828

2009 25810

2010 13063
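
The two steps can also be expressed with a common table expression, so that the per-order totals are named once and then reused. This is a sketch that assumes Hive 0.13 or later, where the WITH clause is available; order_totals and max_order_amount are illustrative names that do not appear in the original queries.

```sql
-- Step 1 as a named CTE: one row per (date, order) with its total amount.
WITH order_totals AS (
  SELECT a.dateid,
         a.ordernumber,
         SUM(b.amount) AS sumofamount
  FROM stock a
  JOIN stockdetail b ON a.ordernumber = b.ordernumber
  GROUP BY a.dateid, a.ordernumber
)
-- Step 2: attach the year and keep the largest order total per year.
SELECT c.theyear,
       MAX(d.sumofamount) AS max_order_amount
FROM sdate c
JOIN order_totals d ON c.dateid = d.dateid
GROUP BY c.theyear
ORDER BY c.theyear;
```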

3. Find the top 10 quarters by sales amount across all orders

Tables involved: stock a, stockdetail b, sdate c

select c.theyear,c.thequot,sum(b.amount) as sumofamount
from stock a,stockdetail b,sdate c
where a.ordernumber=b.ordernumber and a.dateid=c.dateid
group by c.theyear,c.thequot
order by sumofamount desc limit 10;

Result:

2008 1 5252819

2007 4 4613093

2007 1 4446088

2006 1 3916638

2008 2 3886470

2007 3 3870558

2007 2 3782235

2006 4 3691314

2005 1 3592007

2005 3 3304243

4. List the orders (order numbers) with a sales amount above 100000

Tables involved: stock a, stockdetail b

select a.ordernumber,sum(b.amount) as sumofamount
from stock a,stockdetail b
where a.ordernumber=b.ordernumber
group by a.ordernumber
having sumofamount>100000;

Result:

HMJSL00009024 119058

HMJSL00009958 159126
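
The query above refers to the select-list alias sumofamount inside HAVING. Standard SQL and some other engines do not accept an alias there, so repeating the aggregate expression is a more portable spelling. The following is a sketch of that variant and is expected to select the same orders:

```sql
-- Same filter, with the aggregate repeated in HAVING instead of the alias.
SELECT a.ordernumber,
       SUM(b.amount) AS sumofamount
FROM stock a
JOIN stockdetail b ON a.ordernumber = b.ordernumber
GROUP BY a.ordernumber
HAVING SUM(b.amount) > 100000;
```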

5. Find the best-selling item of each year across all orders

Step 1:

Compute the total sales amount of each item in each year.

Tables involved: stock a, stockdetail b, sdate c

select c.theyear,b.itemid,sum(b.amount) as sumofamount
from stock a,stockdetail b,sdate c
where a.ordernumber=b.ordernumber and a.dateid=c.dateid
group by c.theyear,b.itemid;

Result:

.........

2010 ZX219365210101 299

2010 ZX219373110101 269

2010 ZX219373810101 -269

2010 ZX219373812201 657

2010 ZX219381020101 1196

2010 ZX219392110101 299

2010 ZX219392112201 598

2010 ZX219392212201 598

2010 yl427465200101 398

Step 2:

From the step 1 data, compute the largest per-item sales total in each year.

The step 1 result set is given the alias d.

select d.theyear,max(sumofamount) as maxofamount from
(select c.theyear,b.itemid,sum(b.amount) as sumofamount
from stock a,stockdetail b,sdate c
where a.ordernumber=b.ordernumber and a.dateid=c.dateid
group by c.theyear,b.itemid) d
group by d.theyear;

Result:

2004 53374

2005 56569

2006 113684

2007 70226

2008 97981

2009 30029

2010 4494

Step 3: Find the best-selling item of each year across all orders

e: total sales amount of each item in each year (step 1)

f: largest per-item sales total in each year (step 2)

Joining e and f on the year and on the amount keeps only the items whose yearly total equals that year's maximum.

select distinct e.theyear,e.itemid,f.maxofamount from
(select c.theyear,b.itemid,
sum(b.amount) as sumofamount from stock a,stockdetail b,sdate c
where a.ordernumber=b.ordernumber and a.dateid=c.dateid
group by c.theyear,b.itemid) e,
(select d.theyear,max(d.sumofamount) as maxofamount from
(select c.theyear,b.itemid,sum(b.amount) as sumofamount
from stock a,stockdetail b,sdate c
where a.ordernumber=b.ordernumber and a.dateid=c.dateid
group by c.theyear,b.itemid) d
group by d.theyear) f
where e.theyear=f.theyear and e.sumofamount=f.maxofamount
order by e.theyear;

Result:

2004 JY424420810101 53374

2005 24124118880102 56569

2006 JY425468460101 113684

2007 JY425468460101 70226

2008 E2628204040101 97981

2009 YL327439080102 30029

2010 SQ429425090101 4494
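
The three steps compute the per-year, per-item totals twice. Where windowing functions are available (assumed here: Hive 0.11 or later), the same question can be answered in one pass by ranking items within each year. This is a sketch of an alternative, not the method used above; rnk and ranked are illustrative names. RANK() is chosen so that items tied for first place in a year are all returned, as they are by the equality join in step 3.

```sql
SELECT theyear, itemid, sumofamount
FROM (
  -- Rank each item inside its year by total sales, largest first.
  SELECT e.theyear,
         e.itemid,
         e.sumofamount,
         RANK() OVER (PARTITION BY e.theyear
                      ORDER BY e.sumofamount DESC) AS rnk
  FROM (
    -- Step 1: total sales amount of each item in each year.
    SELECT c.theyear,
           b.itemid,
           SUM(b.amount) AS sumofamount
    FROM stock a
    JOIN stockdetail b ON a.ordernumber = b.ordernumber
    JOIN sdate c       ON a.dateid = c.dateid
    GROUP BY c.theyear, b.itemid
  ) e
) ranked
WHERE rnk = 1
ORDER BY theyear;
```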