
Tables: Partitions

2016-04-12 15:26

Reposted from: /article/3865122.html

3. Tables: Partitions

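Unlike in a relational database, the partition columns in Hive are not stored as fields in the table's data, yet they can be used in queries just like ordinary table columns. Partitioning speeds up queries that filter on a partition column, because only the files under the matching directory need to be read.
A table can be partitioned on a single column or on several columns. With a single partition column, each partition is one directory under the table directory. With several partition columns, the directories are nested, one level per partition column.
Example: partition the employees table by country and state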
CREATE TABLE employees (
name STRING,
salary FLOAT,
subordinates ARRAY<STRING>,
deductions MAP<STRING, FLOAT>,
address STRUCT<street:STRING, city:STRING, state:STRING, zip:INT>
)
PARTITIONED BY (country STRING, state STRING);
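To illustrate the nested directory layout described above, here is a rough sketch. The warehouse root /user/hive/warehouse is an assumption (the real root depends on hive.metastore.warehouse.dir), and the two partitions listed are only examples that would exist after being added.

```sql
-- Assumed on-disk layout (illustrative only):
--   /user/hive/warehouse/employees/country=US/state=dallas/
--   /user/hive/warehouse/employees/country=US/state=ca/

-- Partition columns can be selected and filtered like ordinary columns.
-- With this filter, only the country=US/state=dallas directory is read.
SELECT name, salary, country, state
FROM employees
WHERE country = 'US' AND state = 'dallas';
```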
View partitions
show partitions employees;
SHOW PARTITIONS employees PARTITION(country='US');
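Add a partition (keywords and column names are not case-sensitive, but partition values such as 'US' are)
-- The three statements below are alternative forms, not a script to run in sequence.
-- 1) Default location under the table directory
alter table employees add partition(country='US',state='dallas');
-- 2) Explicit location
alter table employees add partition(country='US',state='dallas') location '/home/hadoop/us-dallas';
-- 3) Several partitions in one statement
alter table employees add partition(country='US',state='dallas') location '/home/hadoop/us-dallas' partition(country='US',state='ca') location '/home/hadoop/us-dallas';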
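Adding a partition that already exists normally raises an error. A minimal sketch of a re-runnable variant uses the IF NOT EXISTS clause; the path is reused from the statements above.

```sql
-- Does nothing if the partition is already registered in the metastore.
ALTER TABLE employees ADD IF NOT EXISTS
  PARTITION (country='US', state='dallas')
  LOCATION '/home/hadoop/us-dallas';
```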
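Drop a partition. For a managed table, both the partition's data and its metadata are deleted (for an external table, only the metadata is removed).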
alter table employees drop partition(country='US',state='dallas');
Load data into a partition
load data inpath '/home/hadoop/resource/dallas' into table employees partition(country='US',state='dallas');
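The LOAD DATA statement also accepts the LOCAL and OVERWRITE keywords. The sketch below assumes the source directory is on the client's local filesystem rather than HDFS, which the original does not state.

```sql
-- LOCAL: the path is on the local filesystem and the files are copied.
-- Without LOCAL, the path is on HDFS and the files are moved into the partition.
-- OVERWRITE: existing files in the target partition are replaced.
LOAD DATA LOCAL INPATH '/home/hadoop/resource/dallas'
OVERWRITE INTO TABLE employees
PARTITION (country='US', state='dallas');
```

The partition values come from the PARTITION clause, not from the contents of the loaded files.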

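Partition-related properties
set hive.mapred.mode=strict;
In strict mode, a query against a partitioned table is rejected unless its WHERE clause filters on a partition column. This prevents unrestricted scans of very large tables.
set hive.mapred.mode=nonstrict;
Nonstrict mode lifts this restriction.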
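A short sketch of how strict mode affects queries on the employees table. The rejected query is left commented out so the snippet runs as a whole.

```sql
SET hive.mapred.mode=strict;

-- Rejected in strict mode: no filter on a partition column.
-- SELECT name FROM employees;

-- Accepted: the WHERE clause restricts the scan to a single partition.
SELECT name FROM employees WHERE country = 'US' AND state = 'dallas';

SET hive.mapred.mode=nonstrict;
```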
