Learning Hadoop, HIVE (3.3): HiveQL Statements in Detail (DML)

2016-12-04 17:23
This tutorial uses three tables to walk through common HiveQL usage.

I, The data for the three tables:

1,student.txt

95001,李勇,男,20,CS
95002,刘晨,女,19,IS
95003,王敏,女,22,MA
95004,张立,男,19,IS
95005,刘刚,男,18,MA
95006,孙庆,男,23,CS
95007,易思玲,女,19,MA
95008,李娜,女,18,CS
95009,梦圆圆,女,18,MA
95010,孔小涛,男,19,CS
95011,包小柏,男,18,MA
95012,孙花,女,20,CS
95013,冯伟,男,21,CS
95014,王小丽,女,19,CS
95015,王君,男,18,MA
95016,钱国,男,21,MA
95017,王风娟,女,18,IS
95018,王一,女,19,IS
95019,邢小丽,女,19,IS
95020,赵钱,男,21,IS
95021,周二,男,17,MA
95022,郑明,男,20,MA


2,score.txt

95001,1,81
95001,2,85
95001,3,88
95001,4,70
95002,2,90
95002,3,80
95002,4,71
95002,5,60
95003,1,82
95003,3,90
95003,5,100
95004,1,80
95004,2,92
95004,4,91
95004,5,70
95005,1,70
95005,2,92
95005,3,99
95005,6,87
95006,1,72
95006,2,62
95006,3,100
95006,4,59
95006,5,60
95006,6,98
95007,3,68
95007,4,91
95007,5,94
95007,6,78
95008,1,98
95008,3,89
95008,6,91
95009,2,81
95009,4,89
95009,6,100
95010,2,98
95010,5,90
95010,6,80
95011,1,81
95011,2,91
95011,3,81
95011,4,86
95012,1,81
95012,3,78
95012,4,85
95012,6,98
95013,1,98
95013,2,58
95013,4,88
95013,5,93
95014,1,91
95014,2,100
95014,4,98
95015,1,91
95015,3,59
95015,4,100
95015,6,95
95016,1,92
95016,2,99
95016,4,82
95017,4,82
95017,5,100
95017,6,58
95018,1,95
95018,2,100
95018,3,67
95018,4,78
95019,1,77
95019,2,90
95019,3,91
95019,4,67
95019,5,87
95020,1,66
95020,2,99
95020,5,93
95021,2,93
95021,5,91
95021,6,99
95022,3,69
95022,4,93
95022,5,82
95022,6,100


3,course.txt

1,数据库
2,数学
3,信息系统
4,操作系统
5,数据结构
6,数据处理


II, Load the tables into the data warehouse (Hive tables)

1, Create a table for student.txt

create table student(
id int,
name string,
gender string,
age int,
master string)
row format delimited
fields terminated by ','
stored as textfile;
load data local inpath '.../student.txt' overwrite into table student;


2, Create a table for score.txt

create table score(
id int,
courseId int,
score int)
row format delimited
fields terminated by ','
stored as textfile;
load data local inpath '.../score.txt' overwrite into table score;


3, Create a table for course.txt

create table course(
courseId int,
courseName string)
row format delimited
fields terminated by ','
stored as textfile;
load data local inpath '.../course.txt' overwrite into table course;
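
After loading, a quick sanity check can catch a wrong delimiter or path early. The statements below are only a sketch and assume the three load statements above succeeded. The listings in section I contain 22, 82, and 6 rows, so the counts should match those numbers unless the files on disk differ (for example, a trailing blank line would add a row of NULLs).

```sql
-- Show the column names and types Hive recorded for the table
describe student;

-- Row counts should match the number of lines in each source file
select count(*) from student;
select count(*) from score;
select count(*) from course;

-- Fields that failed to parse are stored as NULL, so this count should be 0
select count(*) from score
where id is null or courseId is null or score is null;
```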


III, HiveQL statement examples:

1, Query the student ID and name of all students
select id, name from student;
2, Query the names of students who have enrolled in at least one course
select student.name from student left semi join score on (student.id = score.id);

############################################
----Hive group by and aggregate functions

3, Query the total number of students
select count(1) from student;
4, Compute the average score for course 1
select avg(score) from score where courseId = 1;
5, Query the average score of each course
select courseId, avg(score) from score group by courseId;
6, Query the highest score among students enrolled in course 1
select max(score) from score where courseId = 1;
7, List each course ID with the number of students enrolled in it
select courseId, count(1) from score group by courseId;
8, Query the IDs of students enrolled in more than 3 courses
select id from score group by id having count(1)>3;
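
The queries in this group differ mainly in where the filter sits: where removes rows before grouping, while having removes whole groups after aggregation. The following sketch illustrates both on the score table; the aliases course_cnt, avg_score, and high_scores are made up for illustration.

```sql
-- having: per-student course count and average score, keeping only
-- students with more than 3 courses (same filter as query 8)
select id,
       count(1)   as course_cnt,
       avg(score) as avg_score
from score
group by id
having count(1) > 3;

-- where: drop rows below 90 first, then count what is left per course
select courseId, count(1) as high_scores
from score
where score >= 90
group by courseId;
```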

########################################################

----Hive Order By/Sort By/Distribute By
9, Query student information, with the result globally ordered by student ID
select * from student order by id;
10, Query student information, partitioned by gender and then sorted by age
select * from student distribute by gender sort by age;
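When using distribute by, first run set mapred.reduce.tasks = NUM; where NUM >= the number of distinct values in the column after "distribute by". Rows are assigned to reducers by hashing that column, so rows with equal values always land on the same reducer, and sort by orders rows only within each reducer, not globally.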
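
The distribute by and sort by steps for query 10 can be sketched as a short session. The reducer count of 2 is an assumption based on the two gender values in student.txt. The last two statements show cluster by, which is shorthand for distribute by and sort by on the same column.

```sql
-- Two reducers, since gender has two distinct values in student.txt
set mapred.reduce.tasks = 2;

-- Rows with the same gender go to the same reducer;
-- each reducer's output is sorted by age
select * from student distribute by gender sort by age;

-- cluster by on one column ...
select * from student cluster by age;
-- ... is equivalent to distribute by plus sort by on that column
select * from student distribute by age sort by age;
```

Unlike order by in query 9, none of these produce a single globally sorted result when more than one reducer is used.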
########################################################
----Join queries
11, Query each student together with their course enrollments
select student.*,score.* from student join score on (student.id = score.id);
12, Query each student's scores by course name
select student.name,course.coursename,score.score from student join score on student.id=score.id join course on score.courseid=course.courseid;
13, Query all students enrolled in course 2 with a score above 90
select stu.id,stu.name,score.courseid,score.score from student stu join score on stu.id=score.id where score.courseid=2 and score.score>90;
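
Joins combine naturally with the aggregates shown earlier, and an outer join keeps rows that have no match. The two statements below are a sketch on the same three tables; the aliases s, c, stu, and avg_score are introduced here only for illustration.

```sql
-- Average score per course, labelled with the course name
select c.courseName, avg(s.score) as avg_score
from score s
join course c on s.courseId = c.courseId
group by c.courseName;

-- left outer join keeps students that have no rows in score;
-- for them, courseId and score come back as NULL
select stu.id, stu.name, s.courseId, s.score
from student stu
left outer join score s on stu.id = s.id;
```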