您的位置:首页 > 数据库

Hive常用SQL命令操作

2015-10-22 16:55 281 查看


创建表 hive>
CREATE TABLE pokes (foo INT, bar STRING); 创建表并创建索引字段ds hive>
CREATE TABLE invites (foo INT, bar STRING) PARTITIONED BY (ds STRING); 显示所有表 hive>
SHOW TABLES; 按正条件(正则表达式)显示表, hive>
SHOW TABLES '.*s'; 表添加一列 hive>
ALTER TABLE pokes ADD COLUMNS (new_col INT); 添加一列并增加列字段注释 hive>
ALTER TABLE invites ADD COLUMNS (new_col2 INT COMMENT 'a comment'); 更改表名 hive>
ALTER TABLE events RENAME TO 3koobecaf; 删除列 hive>
DROP TABLE pokes;元数据存储 将文件中的数据加载到表中
hive> LOAD DATA LOCAL INPATH './examples/files/kv1.txt' OVERWRITE INTO TABLE pokes; 加载本地数据,同时给定分区信息 hive> LOAD DATA LOCAL INPATH './examples/files/kv2.txt' OVERWRITE INTO TABLE invites PARTITION (ds='2008-08-15'); 加载DFS数据 ,同时给定分区信息 hive> LOAD DATA INPATH '/user/myname/kv2.txt'
OVERWRITE INTO TABLE invites PARTITION (ds='2008-08-15'); The above command will load data from an HDFS file/directory to the table. Note that loading data from HDFS will result in moving the file/directory. As a result, the operation is almost instantaneous. SQL
操作 按先件查询 hive> SELECT a.foo FROM invites a WHERE a.ds='<DATE>'; 将查询数据输出至目录 hive> INSERT OVERWRITE DIRECTORY
'/tmp/hdfs_out' SELECT a.* FROM invites a WHERE a.ds='<DATE>'; 将查询结果输出至本地目录 hive> INSERT OVERWRITE LOCAL DIRECTORY '/tmp/local_out' SELECT a.* FROM pokes a; 选择所有列到本地目录 hive> INSERT OVERWRITE TABLE events SELECT a.* FROM profiles a; hive> INSERT OVERWRITE TABLE
events SELECT a.* FROM profiles a WHERE a.key < 100; hive> INSERT OVERWRITE LOCAL DIRECTORY '/tmp/reg_3' SELECT a.* FROM events a; hive> INSERT OVERWRITE DIRECTORY '/tmp/reg_4' select a.invites, a.pokes FROM profiles a; hive> INSERT OVERWRITE DIRECTORY '/tmp/reg_5'
SELECT COUNT(1) FROM invites a WHERE a.ds='<DATE>'; hive> INSERT OVERWRITE DIRECTORY '/tmp/reg_5' SELECT a.foo, a.bar FROM invites a; hive> INSERT OVERWRITE LOCAL DIRECTORY '/tmp/sum' SELECT SUM(a.pc) FROM pc1 a; 将一个表的统计结果插入另一个表中 hive> FROM invites a INSERT
OVERWRITE TABLE events SELECT a.bar, count(1) WHERE a.foo > 0 GROUP BY a.bar; hive> INSERT OVERWRITE TABLE events SELECT a.bar, count(1) FROM invites a WHERE a.foo > 0 GROUP BY a.bar; JOIN hive> FROM pokes t1 JOIN invites t2 ON (t1.bar = t2.bar) INSERT OVERWRITE
TABLE events SELECT t1.bar, t1.foo, t2.foo; 将多表数据插入到同一表中 FROM src INSERT OVERWRITE TABLE dest1 SELECT src.* WHERE src.key < 100 INSERT OVERWRITE TABLE dest2 SELECT src.key, src.value WHERE src.key >= 100 and src.key < 200 INSERT OVERWRITE TABLE dest3 PARTITION(ds='2008-04-08',
hr='12') SELECT src.key WHERE src.key >= 200 and src.key < 300 INSERT OVERWRITE LOCAL DIRECTORY '/tmp/dest4.out' SELECT src.value WHERE src.key >= 300; 将文件流直接插入文件 hive> FROM invites a INSERT OVERWRITE TABLE events SELECT TRANSFORM(a.foo, a.bar) AS (oof, rab)
USING '/bin/cat' WHERE a.ds > '2008-08-09'; This streams the data in the map phase through the script /bin/cat (like hadoop streaming). Similarly - streaming can be used on the reduce side (please see the Hive Tutorial or examples)

内容来自用户分享和网络整理,不保证内容的准确性,如有侵权内容,可联系管理员处理 点击这里给我发消息
标签:  hive sql hive命令