
HBase scan query code analysis

2014-01-15 10:05
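The scan query process
Step 1. HTable.getScanner()

Close any scanner previously opened on the server, so that the server does not hold on to more resources than necessary

Client side: ScannerCallable.call() -> close(scannerId)
Server side: HRegionServer.close(scannerId)

Open a scanner on the target region, based on localStartKey

Client side: ScannerCallable.call() -> openScanner(regionName, scan)
Server side:

Create a RegionScanner
Add the scanner to the server's map of open scanners
Create a Lease for the newly created scanner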
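
The server-side bookkeeping in step 1 (create a scanner, register it in a map, attach a lease) can be pictured with a small standalone model. This is only a sketch under stated assumptions: ScannerRegistry, openScanner, expireLeases and the millisecond lease period are hypothetical names invented for illustration, not HBase classes or methods, and time is passed in explicitly instead of being read from a clock.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

// Hypothetical registry for illustration; none of these names are HBase classes.
class ScannerRegistry {
    // scannerId -> name of the region the scanner was opened on
    private final Map<Long, String> scanners = new HashMap<>();
    // scannerId -> time (ms) at which the scanner's lease expires
    private final Map<Long, Long> leaseExpiry = new HashMap<>();
    private final long leasePeriodMs;
    private long nextScannerId = 1;

    ScannerRegistry(long leasePeriodMs) {
        this.leasePeriodMs = leasePeriodMs;
    }

    // Open: create the scanner, add it to the map, create its lease.
    long openScanner(String regionName, long nowMs) {
        long id = nextScannerId++;
        scanners.put(id, regionName);
        leaseExpiry.put(id, nowMs + leasePeriodMs);
        return id;
    }

    // Close: drop the scanner and its lease; returns false for an unknown id.
    boolean close(long scannerId) {
        leaseExpiry.remove(scannerId);
        return scanners.remove(scannerId) != null;
    }

    // Remove every scanner whose lease has lapsed; returns how many were removed.
    int expireLeases(long nowMs) {
        int removed = 0;
        Iterator<Map.Entry<Long, Long>> it = leaseExpiry.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<Long, Long> e = it.next();
            if (e.getValue() <= nowMs) {
                scanners.remove(e.getKey());
                it.remove();
                removed++;
            }
        }
        return removed;
    }

    int openCount() {
        return scanners.size();
    }

    public static void main(String[] args) {
        ScannerRegistry registry = new ScannerRegistry(60_000);
        long id = registry.openScanner("region-a", 0);
        System.out.println("open scanners: " + registry.openCount());
        registry.close(id);
        System.out.println("open scanners: " + registry.openCount());
    }
}
```

The lease is what lets the server reclaim a scanner that a client opened but never closed, which is the same resource concern that motivates the explicit close call at the start of step 1.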

Step 2. ResultScanner.next()

Get KeyValues from the client-side cache, or from the server when the cache is empty

Client side: cache.poll(), or next(scannerId, caching)
Server side: HRegionServer.next(scannerId, nbRows)

RegionScannerImpl.nextRaw(List outResults, int limit, String metric)
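
The interplay between the client cache and the server call in step 2 can be sketched without any HBase dependency. The classes below (FakeServerScanner, CachingClientScanner) are hypothetical stand-ins, rows are plain strings, and the "RPC" is an ordinary method call; the only point is to show how the caching value sets the number of round trips.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Queue;

// Hypothetical stand-in for the server side of an open scanner.
class FakeServerScanner {
    private final List<String> rows;
    private int pos = 0;
    int rpcCount = 0; // number of simulated next(scannerId, nbRows) round trips

    FakeServerScanner(List<String> rows) {
        this.rows = rows;
    }

    // Return up to nbRows rows; an empty list means the scan is finished.
    List<String> next(int nbRows) {
        rpcCount++;
        int end = Math.min(pos + nbRows, rows.size());
        List<String> batch = new ArrayList<>(rows.subList(pos, end));
        pos = end;
        return batch;
    }
}

// Hypothetical client scanner: serve from the local cache, refill it when empty.
class CachingClientScanner {
    private final FakeServerScanner server;
    private final int caching;
    private final Queue<String> cache = new ArrayDeque<>();
    private boolean exhausted = false;

    CachingClientScanner(FakeServerScanner server, int caching) {
        this.server = server;
        this.caching = caching;
    }

    // Returns the next row, or null once the scan is finished.
    String next() {
        if (cache.isEmpty() && !exhausted) {
            List<String> batch = server.next(caching);
            if (batch.isEmpty()) {
                exhausted = true;
            }
            cache.addAll(batch);
        }
        return cache.poll();
    }

    // Convenience helper: read every remaining row.
    static List<String> drain(CachingClientScanner scanner) {
        List<String> out = new ArrayList<>();
        for (String row = scanner.next(); row != null; row = scanner.next()) {
            out.add(row);
        }
        return out;
    }

    public static void main(String[] args) {
        FakeServerScanner server =
            new FakeServerScanner(Arrays.asList("r1", "r2", "r3", "r4", "r5"));
        CachingClientScanner client = new CachingClientScanner(server, 2);
        System.out.println(drain(client) + " fetched in " + server.rpcCount + " calls");
    }
}
```

In this model five rows with a caching value of 2 take four server calls: three that return data and one that returns the empty batch signalling the end of the scan. A larger caching value reduces the round trips at the cost of more memory held on the client.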

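Types of scanner

Server side: InternalScanner & KeyValueScanner
Client side: ResultScanner
Others (HFileScanner, MetaScanner)

1. InternalScanner

A relatively high-level scanner abstraction internal to the server. Implementing classes:

RegionScannerImpl
StoreScanner
KeyValueHeap

Its interface includes:

next(): returns a List of KeyValues
close(): closes the scanner and releases server-side resources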

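2. KeyValueScanner

The low-level scanner used to fetch KeyValues. Implementing classes include:

StoreScanner
StoreFileScanner
KeyValueHeap
NonLazyKeyValueScanner: always performs a real seek, i.e. doRealSeek() evaluates forward ? reseek(kv) : seek(kv). Its subclasses:

MemStoreScanner
StoreScanner
KeyValueHeap

Commonly used methods:

peek()
next()
seek(): positions the scanner at or after the specified KeyValue
reseek(): positions the scanner at or after the specified KeyValue, searching only forward from the scanner's current position
requestSeek()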
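
The difference between seek() and reseek() is easiest to see on a toy scanner over a sorted list. The sketch below is an illustration only: SortedListScanner is a hypothetical class, keys are unique strings rather than KeyValues, and the binary search in seek() merely stands in for whatever index lookup a real scanner performs.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical scanner over a sorted list of unique keys.
class SortedListScanner {
    private final List<String> keys; // must be sorted ascending, no duplicates
    private int pos = 0;

    SortedListScanner(List<String> sortedUniqueKeys) {
        this.keys = sortedUniqueKeys;
    }

    // Current key without advancing, or null when the scanner is exhausted.
    String peek() {
        return pos < keys.size() ? keys.get(pos) : null;
    }

    // Current key, then advance by one.
    String next() {
        String key = peek();
        if (key != null) {
            pos++;
        }
        return key;
    }

    // seek: position at the first key >= target, searching the whole list,
    // so the scanner may move backwards.
    boolean seek(String target) {
        int idx = Collections.binarySearch(keys, target);
        pos = idx >= 0 ? idx : -idx - 1;
        return pos < keys.size();
    }

    // reseek: position at the first key >= target, but only by moving forward
    // from the current position; a target behind the scanner leaves it in place.
    boolean reseek(String target) {
        while (pos < keys.size() && keys.get(pos).compareTo(target) < 0) {
            pos++;
        }
        return pos < keys.size();
    }

    public static void main(String[] args) {
        SortedListScanner s = new SortedListScanner(Arrays.asList("a", "c", "e", "g"));
        s.seek("d");
        System.out.println(s.peek()); // e
        s.reseek("b");
        System.out.println(s.peek()); // still e: reseek never moves backwards
    }
}
```

Because reseek() can assume the target lies ahead, it can skip the full lookup and is the cheaper call when a scan only ever moves forward.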

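KeyValueHeap

At the Region level it combines access to multiple stores; at the Store level it combines access to the memstore and the store files
A PriorityQueue holds the scanners, ordered by KVScannerComparator, which first compares the KeyValue each scanner returns from peek() and then compares the scanners' SequenceIDs

MemStoreScanner = Long.MAX_VALUE
StoreFileScanner = SequenceID
StoreScanner = 0

pollRealKV() looks in the PriorityQueue for a scanner on which a real seek can be done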
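
The ordering rule described above (smallest peeked key first, ties broken in favour of the larger sequence ID) can be modelled with java.util.PriorityQueue. This is a minimal sketch, not the HBase implementation: SimpleScanner and SimpleKeyValueHeap are hypothetical classes, keys are plain strings, and the lazy-seek logic behind pollRealKV() is left out entirely.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.Iterator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical scanner over one sorted source (memstore or store file).
class SimpleScanner {
    final String name;
    final long sequenceId;
    private final Iterator<String> it;
    private String current;

    SimpleScanner(String name, long sequenceId, List<String> sortedKeys) {
        this.name = name;
        this.sequenceId = sequenceId;
        this.it = sortedKeys.iterator();
        this.current = it.hasNext() ? it.next() : null;
    }

    String peek() {
        return current;
    }

    String next() {
        String result = current;
        current = it.hasNext() ? it.next() : null;
        return result;
    }
}

// Hypothetical heap that merges several sorted scanners into one sorted stream.
class SimpleKeyValueHeap {
    private final PriorityQueue<SimpleScanner> heap;

    SimpleKeyValueHeap(List<SimpleScanner> scanners) {
        Comparator<SimpleScanner> byKeyThenNewest = (a, b) -> {
            int c = a.peek().compareTo(b.peek());
            if (c != 0) {
                return c;
            }
            // Same key: the scanner with the larger sequence ID comes first.
            return Long.compare(b.sequenceId, a.sequenceId);
        };
        heap = new PriorityQueue<>(byKeyThenNewest);
        for (SimpleScanner s : scanners) {
            if (s.peek() != null) {
                heap.add(s);
            }
        }
    }

    // Returns "key@scannerName" for the next entry in merged order, or null at the end.
    String next() {
        SimpleScanner top = heap.poll();
        if (top == null) {
            return null;
        }
        String out = top.next() + "@" + top.name;
        if (top.peek() != null) {
            heap.add(top); // re-insert so the scanner is ordered by its new peek()
        }
        return out;
    }

    public static void main(String[] args) {
        List<SimpleScanner> scanners = new ArrayList<>();
        scanners.add(new SimpleScanner("memstore", Long.MAX_VALUE, Arrays.asList("row2")));
        scanners.add(new SimpleScanner("fileA", 5, Arrays.asList("row1", "row3")));
        scanners.add(new SimpleScanner("fileB", 9, Arrays.asList("row1", "row2")));
        SimpleKeyValueHeap heap = new SimpleKeyValueHeap(scanners);
        for (String kv = heap.next(); kv != null; kv = heap.next()) {
            System.out.println(kv);
        }
    }
}
```

Giving the memstore scanner Long.MAX_VALUE means it wins every tie, so for the same key the most recently written data is returned ahead of anything already flushed to a store file.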

ScanQueryMatcher

While looking up KeyValues, decides whether the current KeyValue should be included and what to do next
StoreScanner.getScanners(matcher) -> StoreFileScanner
The ten MatchCode states

INCLUDE
INCLUDE_AND_SEEK_NEXT_ROW : moreRowsMayExistAfter(),getKeyForNextRow()
INCLUDE_AND_SEEK_NEXT_COL : getKeyForNextColumn()
DONE
DONE_SCAN
SEEK_NEXT_ROW : moreRowsMayExistAfter()
SEEK_NEXT_COL : getKeyForNextColumn()
SKIP : heap.next()
SEEK_NEXT_USING_HINT : getNextKeyHint()
NEXT (unused): Do not include, jump to next StoreFile or memstore (in time order)

public MatchCode match(KeyValue kv)

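Check whether the KeyValue belongs to the same row
Check whether the version has expired
Check whether it has been deleted
Check whether it falls within the time range
Apply the Filters
Check against the ColumnTracker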
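
The order of these checks can be sketched as a chain of early returns. Everything below is a simplified model with hypothetical names (SimpleMatchCode, SimpleCell, SimpleQueryMatcher): the deleted flag replaces the DeleteTracker, a set of wanted columns replaces the ColumnTracker, a Predicate replaces the Filter, and the code returned at each step is an assumption chosen for illustration rather than a transcription of ScanQueryMatcher.

```java
import java.util.Collections;
import java.util.Set;
import java.util.function.Predicate;

// Reduced, hypothetical subset of the match codes.
enum SimpleMatchCode { INCLUDE, SKIP, SEEK_NEXT_COL, SEEK_NEXT_ROW, DONE }

// Hypothetical cell: the deleted flag stands in for a DeleteTracker lookup.
class SimpleCell {
    final String row;
    final String column;
    final long timestamp;
    final boolean deleted;

    SimpleCell(String row, String column, long timestamp, boolean deleted) {
        this.row = row;
        this.column = column;
        this.timestamp = timestamp;
        this.deleted = deleted;
    }
}

class SimpleQueryMatcher {
    private final String currentRow;
    private final long oldestUnexpiredTs;      // anything older has expired
    private final long minTs;                  // time range is [minTs, maxTs)
    private final long maxTs;
    private final Predicate<SimpleCell> filter; // true means "keep"
    private final Set<String> wantedColumns;    // stand-in for a ColumnTracker

    SimpleQueryMatcher(String currentRow, long oldestUnexpiredTs, long minTs, long maxTs,
                       Predicate<SimpleCell> filter, Set<String> wantedColumns) {
        this.currentRow = currentRow;
        this.oldestUnexpiredTs = oldestUnexpiredTs;
        this.minTs = minTs;
        this.maxTs = maxTs;
        this.filter = filter;
        this.wantedColumns = wantedColumns;
    }

    SimpleMatchCode match(SimpleCell kv) {
        // 1. Same row?
        int cmp = currentRow.compareTo(kv.row);
        if (cmp < 0) return SimpleMatchCode.DONE;          // kv is in a later row
        if (cmp > 0) return SimpleMatchCode.SEEK_NEXT_ROW; // kv is in an earlier row
        // 2. Version expired?
        if (kv.timestamp < oldestUnexpiredTs) return SimpleMatchCode.SEEK_NEXT_COL;
        // 3. Deleted?
        if (kv.deleted) return SimpleMatchCode.SKIP;
        // 4. Inside the time range?
        if (kv.timestamp < minTs || kv.timestamp >= maxTs) return SimpleMatchCode.SKIP;
        // 5. Filters
        if (!filter.test(kv)) return SimpleMatchCode.SKIP;
        // 6. Column check
        if (!wantedColumns.contains(kv.column)) return SimpleMatchCode.SEEK_NEXT_COL;
        return SimpleMatchCode.INCLUDE;
    }

    public static void main(String[] args) {
        SimpleQueryMatcher matcher = new SimpleQueryMatcher(
            "r2", 100, 0, 1000, kv -> kv.timestamp != 500, Collections.singleton("cf:a"));
        System.out.println(matcher.match(new SimpleCell("r2", "cf:a", 200, false)));
        System.out.println(matcher.match(new SimpleCell("r3", "cf:a", 200, false)));
    }
}
```

Putting the cheap row comparison first lets the matcher stop at a row boundary before any of the per-cell checks run.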

ColumnTracker

ScanWildcardColumnTracker
ExplicitColumnTracker

DeleteTracker

ScanDeleteTracker

Query policies for deletes

retainDeletesInOutput
keepDeletedCells=true: no further delete checks are performed
seePastDeleteMarkers