HBase MVCC design principles, and a scan problem caused by MVCC
2014-06-09 17:16
While using HBase 0.94 recently, I occasionally saw the warning "HRegionInfo was null or empty in Meta":
java.io.IOException: HRegionInfo was null or empty in Meta for writetest, row=lot_let,9399239430349923234234,99999999999999
at org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:170)
In the client-side implementation of MetaScanner.metaScan:
metaTable = new HTable(configuration, HConstants.META_TABLE_NAME);
Result startRowResult = metaTable.getRowOrBefore(searchRow,HConstants.CATALOG_FAMILY);
if (startRowResult == null) { throw new TableNotFoundException("Cannot find row in .META. for table: " + Bytes.toString(tableName) + ", row=" + Bytes.toStringBinary(searchRow)); }
byte[] value = startRowResult.getValue(HConstants.CATALOG_FAMILY,
HConstants.REGIONINFO_QUALIFIER);
if (value == null || value.length == 0) { throw new IOException("HRegionInfo was null or empty in Meta for " + Bytes.toString(tableName) + ", row=" + Bytes.toStringBinary(searchRow)); }
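So during the meta scan, the row that should cover this rowkey came back without region info, as if that range did not exist in the Meta table. Following the RPC leads to the server-side implementation
in HRegion: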
public Result getClosestRowBefore(final byte [] row, final byte [] family)
throws IOException {
if (coprocessorHost != null) {
Result result = new Result();
if (coprocessorHost.preGetClosestRowBefore(row, family, result)) {
return result;
}
}
// look across all the HStores for this region and determine what the
// closest key is across all column families, since the data may be sparse
checkRow(row, "getClosestRowBefore");
startRegionOperation();
this.readRequestsCount.increment();
try {
Store store = getStore(family);
// get the closest key. (HStore.getRowKeyAtOrBefore can return null)
KeyValue key = store.getRowKeyAtOrBefore(row);
Result result = null;
if (key != null) {
Get get = new Get(key.getRow());
get.addFamily(family);
result = get(get, null);
}
if (coprocessorHost != null) {
coprocessorHost.postGetClosestRowBefore(row, family, result);
}
return result;
} finally {
closeRegionOperation();
}
}
The call KeyValue key = store.getRowKeyAtOrBefore(row); did return a rowkey from the Meta table, but the code that follows,
if (key != null) {
Get get = new Get(key.getRow());
get.addFamily(family);
result = get(get, null);
}
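returned an empty result, which is what caused the problem.
Why does this happen?
First, a few words on how MVCC works in HBase.
MVCC is a mechanism for keeping data consistent. A write in HBase goes through several stages: write the HLog, write the memstore, and update the MVCC.
A memstore write only counts as successful once the MVCC has been updated. Transaction isolation is controlled by the MVCC: for example, a read must not see data that another thread has not yet committed.
1. put and delete both call applyFamilyMapToMemstore
In HRegion: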
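To make the read point idea concrete, here is a minimal sketch of the bookkeeping described above. It is a simplified model, not the HBase class: SimpleMvcc, its single lock, and memstoreReadPoint() are assumptions made for illustration, and unlike the real completeMemstoreInsert this version does not block until the read point catches up. It only shows that the read point advances over a contiguous prefix of completed writes.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Simplified, single-lock model of a region's MVCC state (illustrative, not the HBase class). */
final class SimpleMvcc {
    /** One in-flight write; a hypothetical stand-in for WriteEntry. */
    static final class WriteEntry {
        final long writeNumber;
        boolean completed;
        WriteEntry(long writeNumber) { this.writeNumber = writeNumber; }
    }

    private long memstoreWrite = 0; // last write number handed out
    private long memstoreRead = 0;  // highest write number visible to readers
    private final Deque<WriteEntry> writeQueue = new ArrayDeque<>();

    /** Starts a write: takes the next write number and queues the entry as pending. */
    synchronized WriteEntry beginMemstoreInsert() {
        WriteEntry e = new WriteEntry(++memstoreWrite);
        writeQueue.addLast(e);
        return e;
    }

    /** Marks the write done and moves the read point past every completed entry at the head. */
    synchronized void completeMemstoreInsert(WriteEntry e) {
        e.completed = true;
        while (!writeQueue.isEmpty() && writeQueue.peekFirst().completed) {
            memstoreRead = writeQueue.pollFirst().writeNumber;
        }
        notifyAll(); // wake anything waiting for the read point to move
    }

    synchronized long memstoreReadPoint() { return memstoreRead; }

    public static void main(String[] args) {
        SimpleMvcc mvcc = new SimpleMvcc();
        WriteEntry w1 = mvcc.beginMemstoreInsert();
        WriteEntry w2 = mvcc.beginMemstoreInsert();
        mvcc.completeMemstoreInsert(w2);              // w1 is still pending
        System.out.println(mvcc.memstoreReadPoint()); // prints 0
        mvcc.completeMemstoreInsert(w1);
        System.out.println(mvcc.memstoreReadPoint()); // prints 2
    }
}
```

Completing the second write first leaves the read point at 0, because an earlier write is still pending; readers therefore never see a later write ahead of an earlier uncommitted one.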
private long applyFamilyMapToMemstore(Map<byte[], List<KeyValue>> familyMap,
MultiVersionConsistencyControl.WriteEntry localizedWriteEntry) {
long size = 0;
boolean freemvcc = false;
try {
if (localizedWriteEntry == null) {
//Begin a memstore write: increments memstoreWrite in mvcc and adds the entry to the pending write queue
localizedWriteEntry = mvcc.beginMemstoreInsert();
freemvcc = true;
}
for (Map.Entry<byte[], List<KeyValue>> e : familyMap.entrySet()) {
byte[] family = e.getKey();
List<KeyValue> edits = e.getValue();
Store store = getStore(family);
for (KeyValue kv: edits) {
kv.setMemstoreTS(localizedWriteEntry.getWriteNumber());
size += store.add(kv);
}
}
} finally {
if (freemvcc) {
mvcc.completeMemstoreInsert(localizedWriteEntry);
}
}
return size;
}
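mvcc.completeMemstoreInsert updates the MVCC's memstoreRead, i.e. the position up to which data can be read, and calls readWaiters.notifyAll() to release any thread blocked in waitForRead, such as one running flushcache.
waitForRead looks like this: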
public void waitForRead(WriteEntry e) {
boolean interrupted = false;
synchronized (readWaiters) {
//Less than: some writes are not yet committed
while (memstoreRead < e.getWriteNumber()) {
try {
readWaiters.wait(0);
} catch (InterruptedException ie) {
// We were interrupted... finish the loop -- i.e. cleanup --and then
// on our way out, reset the interrupt flag.
interrupted = true;
}
}
}
if (interrupted) Thread.currentThread().interrupt();
}
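2. During flushcache, after taking the KeyValues in the memstore, mvcc.waitForRead(w) is called. The memstore holds all KeyValues, including those not yet really committed, so the flush has to wait for the other transactions to commit before it continues, which keeps transactions consistent.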
w = mvcc.beginMemstoreInsert();
mvcc.advanceMemstore(w);
mvcc.waitForRead(w);
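The three calls above act as a barrier: the flush takes its own write number, marks it complete, and then waits until the read point reaches it, which can only happen after every earlier write has committed. The following sketch models that behavior with plain wait/notifyAll. FlushWaitSketch, readPoint(), and demo() are hypothetical names, the 50 ms sleep is only there to let the flusher block first, and interrupt handling is simpler than in the waitForRead shown earlier.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Illustrative model of the flush barrier; not the HBase implementation. */
final class FlushWaitSketch {
    static final class WriteEntry {
        final long writeNumber;
        boolean completed;
        WriteEntry(long writeNumber) { this.writeNumber = writeNumber; }
    }

    private long memstoreWrite = 0;
    private long memstoreRead = 0;
    private final Deque<WriteEntry> writeQueue = new ArrayDeque<>();

    synchronized WriteEntry beginMemstoreInsert() {
        WriteEntry e = new WriteEntry(++memstoreWrite);
        writeQueue.addLast(e);
        return e;
    }

    /** Marks e done and advances the read point over the completed prefix of the queue. */
    synchronized void advanceMemstore(WriteEntry e) {
        e.completed = true;
        while (!writeQueue.isEmpty() && writeQueue.peekFirst().completed) {
            memstoreRead = writeQueue.pollFirst().writeNumber;
        }
        notifyAll();
    }

    /** Blocks until every write up to and including e is visible. */
    synchronized void waitForRead(WriteEntry e) throws InterruptedException {
        while (memstoreRead < e.writeNumber) {
            wait();
        }
    }

    synchronized long readPoint() { return memstoreRead; }

    /** Runs one pending write against one flush barrier; returns the read point the flusher saw. */
    static long demo() {
        final FlushWaitSketch mvcc = new FlushWaitSketch();
        final WriteEntry pending = mvcc.beginMemstoreInsert(); // write 1, not yet committed
        final long[] seen = new long[1];
        Thread flusher = new Thread(() -> {
            try {
                WriteEntry w = mvcc.beginMemstoreInsert(); // write 2, the flush marker
                mvcc.advanceMemstore(w);
                mvcc.waitForRead(w);                       // blocks while write 1 is pending
                seen[0] = mvcc.readPoint();
            } catch (InterruptedException ie) {
                Thread.currentThread().interrupt();
            }
        });
        flusher.start();
        try {
            Thread.sleep(50);              // illustrative only: let the flusher reach the barrier
            mvcc.advanceMemstore(pending); // commit write 1, which releases the flusher
            flusher.join();
        } catch (InterruptedException ie) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(ie);
        }
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints 2
    }
}
```

Whichever thread gets there first, the flusher only proceeds once the read point has reached its own write number, so everything it flushes is committed.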
3. Scanning data
In the implementation of RegionScannerImpl.next:
public synchronized boolean next(List<KeyValue> outResults, int limit)
throws IOException {
if (this.filterClosed) {
throw new UnknownScannerException("Scanner was closed (timed out?) " +
"after we renewed it. Could be caused by a very slow scanner " +
"or a lengthy garbage collection");
}
startRegionOperation();
readRequestsCount.increment();
try {
// This could be a new thread from the last time we called next().
//this.readPt is initialized at construction (to memstoreRead in the current HRegion's mvcc, i.e. the current readable point) and is bound to the current thread here
MultiVersionConsistencyControl.setThreadReadPoint(this.readPt);
The MemStore then filters out transactions that are not yet committed (a new KeyValue carries the newest write point):
protected KeyValue getNext(Iterator<KeyValue> it) {
long readPoint = MultiVersionConsistencyControl.getThreadReadPoint();
while (it.hasNext()) {
KeyValue v = it.next();
//Skip KeyValues whose memstoreTS is greater than the current thread's readPoint
if (v.getMemstoreTS() <= readPoint) {
return v;
}
}
return null;
}
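As a small worked example of this filter, the sketch below applies the same memstoreTS <= readPoint test to a hand-built list. Cell is a hypothetical stand-in for KeyValue, and the read point is passed as a parameter instead of being read from a thread-local, purely to keep the example self-contained.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

/** Illustrative read point filter; Cell is a hypothetical stand-in for KeyValue. */
final class ReadPointFilterSketch {
    static final class Cell {
        final String row;
        final long memstoreTS; // write number stamped on the cell when it was added
        Cell(String row, long memstoreTS) {
            this.row = row;
            this.memstoreTS = memstoreTS;
        }
    }

    /** Returns the next cell visible at readPoint, or null when none is left. */
    static Cell getNext(Iterator<Cell> it, long readPoint) {
        while (it.hasNext()) {
            Cell c = it.next();
            if (c.memstoreTS <= readPoint) {
                return c;
            }
        }
        return null;
    }

    /** Collects the rows of all cells visible at readPoint, in iteration order. */
    static List<String> visibleRows(List<Cell> cells, long readPoint) {
        List<String> rows = new ArrayList<>();
        Iterator<Cell> it = cells.iterator();
        for (Cell c = getNext(it, readPoint); c != null; c = getNext(it, readPoint)) {
            rows.add(c.row);
        }
        return rows;
    }

    public static void main(String[] args) {
        List<Cell> cells = Arrays.asList(
            new Cell("r1", 1), new Cell("r2", 3), new Cell("r3", 2));
        System.out.println(visibleRows(cells, 2)); // prints [r1, r3]
    }
}
```

With a read point of 2, the cell stamped 3 is skipped even though it is physically present in the list, which is exactly how an uncommitted write stays invisible to a scanner.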
With the whole MVCC process in view, look again at the getClosestRowBefore implementation in HRegion:
KeyValue key = store.getRowKeyAtOrBefore(row);
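This call is not subject to MVCC control, so it can read everything in the memstore.
The get method, on the other hand, is subject to MVCC control. So one possible scenario is that, at the time get is called, the key read by store.getRowKeyAtOrBefore(row) has not been committed yet,
so every KeyValue is filtered out and the query returns an empty result.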
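The suspected race can be reproduced in a toy model. The sketch below is an assumption-laden illustration, not HBase code: ClosestRowRaceSketch, put, and commit are hypothetical, the memstore is a TreeMap from row key to write number, and each row holds a single cell. It shows a key lookup that ignores the read point followed by a get that honors it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

/** Toy model of the race between an unfiltered key lookup and an MVCC-filtered get. */
final class ClosestRowRaceSketch {
    // row key -> memstoreTS of the single cell stored in that row
    private final TreeMap<String, Long> memstore = new TreeMap<>();
    private long readPoint = 0;
    private long nextWriteNumber = 0;

    /** Adds the row to the memstore but does not make it visible yet. */
    long put(String row) {
        long writeNumber = ++nextWriteNumber;
        memstore.put(row, writeNumber);
        return writeNumber;
    }

    /** Makes the write visible (simplified: assumes writes commit in order). */
    void commit(long writeNumber) {
        readPoint = Math.max(readPoint, writeNumber);
    }

    /** No MVCC check: sees every key in the memstore, committed or not. */
    String getRowKeyAtOrBefore(String row) {
        return memstore.floorKey(row);
    }

    /** MVCC-checked read: returns the row's cells that are visible at the read point. */
    List<String> get(String row) {
        List<String> cells = new ArrayList<>();
        Long ts = memstore.get(row);
        if (ts != null && ts <= readPoint) {
            cells.add(row + "/info:regioninfo");
        }
        return cells;
    }

    /** Same two-step shape as getClosestRowBefore: find the key, then get the row. */
    List<String> getClosestRowBefore(String row) {
        String key = getRowKeyAtOrBefore(row);
        return key == null ? null : get(key);
    }

    public static void main(String[] args) {
        ClosestRowRaceSketch region = new ClosestRowRaceSketch();
        long w = region.put("writetest,b,1");
        // The key is found, but its cell is filtered out: an empty, non-null result.
        System.out.println(region.getClosestRowBefore("writetest,c,9")); // prints []
        region.commit(w);
        System.out.println(region.getClosestRowBefore("writetest,c,9").size()); // prints 1
    }
}
```

The empty, non-null result in the first call corresponds to the client finding no REGIONINFO_QUALIFIER value and throwing the IOException shown at the top.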