How to tell apart primary key, partition key, cluster key, and clustering key in Cassandra
2015-09-10 12:43
This post comes from Stack Overflow. I sometimes mix these terms up myself, so I am bookmarking it here:
There is a lot of confusion around this, so I will try to make it as simple as possible.
The primary key is a general concept to indicate one or more columns used to retrieve data from a Table.
The primary key may be SIMPLE
create table stackoverflow ( key text PRIMARY KEY, data text );
That means it is made up of a single column.
But the primary key can also be COMPOSITE (aka COMPOUND), made up of several columns.
create table stackoverflow (
    key_part_one text,
    key_part_two int,
    data text,
    PRIMARY KEY(key_part_one, key_part_two)
);
In the case of a COMPOSITE primary key, the "first part" of the key is called the PARTITION KEY (in this example key_part_one is the partition key) and the second part of the key is the CLUSTERING KEY (key_part_two).
Please note that both the partition key and the clustering key can be made up of several columns:
create table stackoverflow (
    k_part_one text,
    k_part_two int,
    k_clust_one text,
    k_clust_two int,
    k_clust_three uuid,
    data text,
    PRIMARY KEY((k_part_one, k_part_two), k_clust_one, k_clust_two, k_clust_three)
);
Behind these names ...
The Partition Key is responsible for data distribution across your nodes. (Internally, Cassandra hashes the partition key into a token and uses that token to find the node that owns the matching token range, so a request goes straight to the nodes that hold the data.)
The Clustering Key is responsible for data sorting within the partition.
The Primary Key is equivalent to the Partition Key in a single-field-key table.
The Composite/Compound Key is just a multiple-column key.
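To make the "data distribution" role of the partition key more concrete, here is a minimal Python sketch of a token ring. It is a toy model, not Cassandra's implementation: the names token_for, Ring, TOKEN_SPACE and the node names are made up for illustration, MD5 is only a stand-in hash so the code runs with the standard library, and replication is ignored.

```python
import bisect
import hashlib

TOKEN_SPACE = 2 ** 64  # size of the toy token space (assumption for this sketch)


def token_for(partition_key):
    """Hash every partition key column into one integer token.

    MD5 is only a stand-in so the sketch runs with the standard library.
    """
    raw = "\x00".join(str(part) for part in partition_key).encode("utf-8")
    return int.from_bytes(hashlib.md5(raw).digest()[:8], "big")


class Ring:
    """Toy token ring: each node owns the tokens up to and including its own."""

    def __init__(self, node_names):
        step = TOKEN_SPACE // len(node_names)
        # Give each node an evenly spaced token, then keep them sorted.
        self._entries = sorted((i * step, name) for i, name in enumerate(node_names))
        self._tokens = [token for token, _ in self._entries]

    def owner(self, partition_key):
        # First node whose token is >= the key's token, wrapping around the ring.
        idx = bisect.bisect_left(self._tokens, token_for(partition_key))
        return self._entries[idx % len(self._entries)][1]


ring = Ring(["node-a", "node-b", "node-c"])
# Only the partition key is hashed, so rows ('ronaldo', 9) and ('ronaldo', 10)
# land on the same node: key_part_two never enters the calculation.
print(ring.owner(("ronaldo",)))
# A composite partition key hashes all of its columns together.
print(ring.owner(("some-part-one", 42)))
```

The point of the sketch is that the clustering key plays no part in choosing a node; it only matters once you are inside the partition.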
Further usage information: DATASTAX DOCUMENTATION
EDIT due to further requests
Small usage and content examples
SIMPLE KEY:
insert into stackoverflow (key, data) VALUES ('han', 'solo');
select * from stackoverflow where key='han';
table content
 key | data
-----+------
 han | solo
COMPOSITE/COMPOUND KEY can retrieve "wide rows"
insert into stackoverflow (key_part_one, key_part_two, data) VALUES ('ronaldo', 9, 'football player');
insert into stackoverflow (key_part_one, key_part_two, data) VALUES ('ronaldo', 10, 'ex-football player');
select * from stackoverflow where key_part_one = 'ronaldo';
table content
 key_part_one | key_part_two | data
--------------+--------------+--------------------
      ronaldo |            9 |    football player
      ronaldo |           10 | ex-football player
But you can also query with all the key columns ...
select * from stackoverflow where key_part_one = 'ronaldo' and key_part_two = 10;
query output
 key_part_one | key_part_two | data
--------------+--------------+--------------------
      ronaldo |           10 | ex-football player
Note: the select examples above are run against the following table:
create table stackoverflow (
    key_part_one text,
    key_part_two int,
    data text,
    PRIMARY KEY(key_part_one, key_part_two)
);
and not against:
create table stackoverflow (
    k_part_one text,
    k_part_two int,
    k_clust_one text,
    k_clust_two int,
    k_clust_three uuid,
    data text,
    PRIMARY KEY((k_part_one, k_part_two), k_clust_one, k_clust_two, k_clust_three)
);
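If it helps to picture what a "wide row" looks like, the table with PRIMARY KEY(key_part_one, key_part_two) can be modelled roughly as a map from partition key to a list of rows kept sorted by clustering key. The Python sketch below is only a mental model under that assumption; WideRowTable and its methods are hypothetical names, not a Cassandra API.

```python
import bisect
from collections import defaultdict


class WideRowTable:
    """Toy model of a table with PRIMARY KEY(key_part_one, key_part_two)."""

    def __init__(self):
        # partition key -> list of (clustering key, data), kept sorted
        self._partitions = defaultdict(list)

    def insert(self, key_part_one, key_part_two, data):
        rows = self._partitions[key_part_one]
        clustering_keys = [k for k, _ in rows]
        idx = bisect.bisect_left(clustering_keys, key_part_two)
        if idx < len(rows) and rows[idx][0] == key_part_two:
            # Same full primary key: the new values replace the old row.
            rows[idx] = (key_part_two, data)
        else:
            # New clustering key: insert at the position that keeps the order.
            rows.insert(idx, (key_part_two, data))

    def select(self, key_part_one, key_part_two=None):
        # The partition key is mandatory; the clustering key narrows the result.
        rows = self._partitions.get(key_part_one, [])
        return [
            (key_part_one, k, d)
            for k, d in rows
            if key_part_two is None or k == key_part_two
        ]


table = WideRowTable()
table.insert("ronaldo", 10, "ex-football player")
table.insert("ronaldo", 9, "football player")

# Partition key only: every row of the partition, ordered by key_part_two.
print(table.select("ronaldo"))
# Full primary key: exactly one row.
print(table.select("ronaldo", 10))
```

Even though the row with key_part_two = 10 is inserted first here, the partition-only select returns 9 before 10, which mirrors the sorting role of the clustering key.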
Important note: the partition key is the minimum specifier needed to perform a query using a WHERE clause. (Note in particular: when you use WHERE, the partition key is the least that Cassandra will accept.) If you have a composite partition key, like the following:
e.g.:
PRIMARY KEY((col1, col2), col10, col4)
You can perform a query only by passing at least both col1 and col2, the two columns that define the partition key. The general rule is that you have to pass at least all the partition key columns, then you can add the clustering key columns in the order in which they are declared.
So the valid queries are (excluding secondary indexes):
col1 and col2
col1 and col2 and col10
col1 and col2 and col10 and col4
invalid:
col1 and col2 and col4
anything that does not contain both col1 and col2
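The rule behind these lists can be written down as a small check. The Python function below is a sketch of the rule as stated here, for equality restrictions only and leaving out secondary indexes and ALLOW FILTERING; is_valid_where is a hypothetical helper, not something Cassandra or its drivers provide.

```python
def is_valid_where(partition_cols, clustering_cols, restricted_cols):
    """Return True if the restricted columns follow the rule described above."""
    restricted = set(restricted_cols)

    # Only key columns are considered in this sketch.
    if not restricted <= set(partition_cols) | set(clustering_cols):
        return False

    # Every partition key column must be present.
    if not set(partition_cols) <= restricted:
        return False

    # Clustering columns must be used in declaration order, without gaps.
    skipped = False
    for col in clustering_cols:
        if col not in restricted:
            skipped = True
        elif skipped:
            return False
    return True


partition_cols = ["col1", "col2"]
clustering_cols = ["col10", "col4"]

for cols in (
    ["col1", "col2"],
    ["col1", "col2", "col10"],
    ["col1", "col2", "col10", "col4"],
    ["col1", "col2", "col4"],
    ["col1", "col10"],
):
    print(cols, is_valid_where(partition_cols, clustering_cols, cols))
```

The first three combinations pass; the fourth fails because col4 is used while col10 is skipped, and the last fails because col2 is missing from the partition key.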