
Performance testing HBase using YCSB

2011-01-17 19:53
There are many new serving databases available, including:

PNUTS

BigTable

HBase

Hypertable

Azure

Cassandra

CouchDB

Voldemort

MongoDB

Dynomite

…and many others

It is difficult to decide which system is right for your
application, partially because the features differ between systems, and
partially because there is not an easy way to compare the performance of
one system versus another.

The goal of the YCSB project is to develop a framework and common set
of workloads for evaluating the performance of different “key-value” and
“cloud” serving stores. The project comprises two things:

The YCSB Client, an extensible workload generator

The Core workloads, a set of workload scenarios to be executed by the generator

Although the core workloads provide a well-rounded picture of a
system’s performance, the Client is extensible so that you can define
new and different workloads to examine system aspects, or application
scenarios, not adequately covered by the core workloads. Similarly, the
Client is extensible to support benchmarking different databases.
Although we include sample code for benchmarking HBase, Cassandra and
MongoDB, it is straightforward to write a new interface layer to
benchmark your favorite database.

A common use of the tool is to benchmark multiple systems and compare
them. For example, you can install multiple systems on the same hardware
configuration, and run the same workloads against each system. Then you
can plot the performance of each system (for example, as latency versus
throughput curves) to see when one system does better than another.

Source: http://blog.lars-francke.de/2010/08/16/performance-testing-hbase-using-ycsb/

I assume most of you know what HBase is, but just in case, here is a snippet from Wikipedia:

HBase is an open source, non-relational, distributed database modeled after Google’s BigTable and is written in Java.

Yahoo has published a paper, Benchmarking Cloud Serving Systems with
YCSB, along with the accompanying tool (YCSB).
At the moment I am not interested in comparing different database
systems against each other; I only want to benchmark HBase. This is
useful for testing custom patches and their performance impact, or for
trying out different configuration options.

No matter which kind of workload you choose, however, keep in mind that
this is an artificial benchmark and it can’t replace a test with your
real data and load.

In this short blog post I’m going to outline how to get YCSB running
against a current version of HBase. I’m going to show this on a single
machine. In a real test setup you should of course be running YCSB on a
different machine (or multiple machines) than your HBase cluster. A YCSB
benchmark consists of two phases: a load phase and a transaction phase.
The load phase measures various statistics while importing data into the
database, while the transaction phase does just that, i.e. it runs
transactions on the data. There are multiple predefined workloads that
mimic typical database usage scenarios, and you can also define your own.

Requirements/Setup

I am using a clean Ubuntu 10.04 installation, but this should work on other distributions just as well.

While you’ll probably run it against an already set-up cluster, I will
be using HBase in standalone mode here, in its second development
release of 0.89 (0.89.20100726).

For YCSB I’ve used the latest version checked out from GitHub, but the latest released version (0.1.2
at the time of this writing) should work equally well. So do this:

$ sudo apt-get -y install ant openjdk-6-jdk git-core
$ export JAVA_HOME=/usr/lib/jvm/java-6-openjdk/
$ wget http://apache.easy-webs.de/hbase/hbase-0.89.20100726/hbase-0.89.20100726-bin.tar.gz
$ tar xvzf hbase-0.89.20100726-bin.tar.gz
$ hbase-0.89.20100726/bin/start-hbase.sh
$ hbase-0.89.20100726/bin/hbase shell
create 'usertable', 'family'
exit
$ git clone http://github.com/brianfrankcooper/YCSB.git
$ cp hbase-0.89.20100726/lib/* YCSB/db/hbase/lib
$ cd YCSB
$ ant
$ ant dbcompile-hbase


As you can see, YCSB requires a table called usertable in HBase, and it
has to contain one column family with an arbitrary name (family in my
case). YCSB also needs all the libraries (jars) that the HBase client
needs to run. The easiest way is to just copy everything from HBase’s
lib directory to the appropriate directory in YCSB.
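
Before loading data it can be worth confirming that the table really exists. A minimal sketch, assuming the tarball was unpacked into the current directory as in the setup steps (HBASE_HOME is a variable introduced here for illustration) and relying on the HBase shell reading commands from standard input:

```shell
# Assumed location of the unpacked HBase release; override HBASE_HOME if yours differs.
HBASE_HOME="${HBASE_HOME:-hbase-0.89.20100726}"

if [ -x "$HBASE_HOME/bin/hbase" ]; then
  # Piping a command into the shell runs it non-interactively.
  echo "describe 'usertable'" | "$HBASE_HOME/bin/hbase" shell
else
  echo "HBase not found at $HBASE_HOME; set HBASE_HOME first" >&2
fi
```

The output should list the usertable table with its single column family.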

Running YCSB

At this point we should have HBase running somewhere and YCSB and its HBase driver compiled. Time to load some data into HBase.
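
The load command itself seems to have been lost from this copy of the post. A minimal sketch of what it could look like, assuming the class names used by YCSB 0.1.x (com.yahoo.ycsb.Client and com.yahoo.ycsb.db.HBaseClient), the column family created above, and a run from inside the YCSB checkout; check the documentation for your version:

```shell
# Run from the YCSB checkout. Class names are assumed from YCSB 0.1.x.
if [ -f build/ycsb.jar ]; then
  # -load selects the load phase; statistics go to load.dat, status messages to stderr.
  java -cp "build/ycsb.jar:db/hbase/lib/*" com.yahoo.ycsb.Client -load \
    -db com.yahoo.ycsb.db.HBaseClient \
    -P workloads/workloada \
    -p columnfamily=family \
    -p recordcount=1000 \
    -s > load.dat
else
  echo "build/ycsb.jar not found; run 'ant' in the YCSB checkout first" >&2
fi
```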

A few things to note here:

This loads only 1000 records into HBase. You will want to increase the number to 100 million or more in a real test.

The documentation is pretty good, so make sure to read it should you have problems.

The documentation suggests not specifying properties (like
recordcount) on the command line but in a property file instead. You’ll
find instructions on how to do this on the aforementioned page.

The -s parameter causes YCSB to print status messages to System.err every ten seconds; remove it if you don’t want them.

After the load operation has finished you can find statistics in the load.dat file.
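
The property-file approach mentioned in the notes above could look roughly like this. It is only a sketch: the file name hbase-test.properties is made up for illustration, while columnfamily, recordcount and operationcount are the properties this post relies on.

```shell
# Hypothetical file name; put the settings you would otherwise pass with -p in here.
cat > hbase-test.properties <<'EOF'
columnfamily=family
recordcount=1000
operationcount=1000
EOF

# The file is then passed with an additional -P argument, for example:
#   java -cp "build/ycsb.jar:db/hbase/lib/*" com.yahoo.ycsb.Client -load \
#     -db com.yahoo.ycsb.db.HBaseClient \
#     -P workloads/workloada -P hbase-test.properties -s > load.dat
```

Keeping these values in a file makes it easier to reuse exactly the same settings for the load and the transaction phase.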

Now we’ll run the transactions part of the workload (again, for explanations see the documentation of YCSB):

or
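
Both commands appear to be missing from this copy of the post. A sketch of two plausible variants, assuming the YCSB 0.1.x class names and the options described in the YCSB documentation (-threads, -target, measurementtype, timeseries.granularity). MODE is a switch introduced here for illustration, and the numbers are only examples:

```shell
# Hypothetical switch: "histogram" (YCSB's default measurement type) or "timeseries".
MODE="${MODE:-histogram}"

EXTRA=""
if [ "$MODE" = "timeseries" ]; then
  # Report latencies as a time series, averaged over 2000 ms intervals.
  EXTRA="-p measurementtype=timeseries -p timeseries.granularity=2000"
fi

if [ -f build/ycsb.jar ]; then
  # -t selects the transaction phase; 10 client threads, throttled to 100 ops/sec.
  # $EXTRA is left unquoted on purpose so that it splits into separate arguments.
  java -cp "build/ycsb.jar:db/hbase/lib/*" com.yahoo.ycsb.Client -t \
    -db com.yahoo.ycsb.db.HBaseClient \
    -P workloads/workloada \
    -p columnfamily=family \
    -p operationcount=1000 \
    -s -threads 10 -target 100 $EXTRA > transactions.dat
else
  echo "build/ycsb.jar not found; run 'ant' in the YCSB checkout first" >&2
fi
```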

After each run you should inspect the transactions.dat file. For
explanations I’ll once again refer to the documentation. We’ve used
workloada in these examples, but there are in fact multiple predefined
workloads (which are listed and explained in the documentation).
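
To get a quick overview of a result file without reading all of it, something like the following can help. It is a sketch that assumes the output format of YCSB 0.1.x, where summary lines start with [OVERALL] and each operation type reports an AverageLatency line; summarize is a helper name made up here.

```shell
# summarize FILE: print the overall runtime/throughput lines and the
# per-operation average latency lines from a YCSB result file.
summarize() {
  grep -E '^\[OVERALL\]|AverageLatency' "$1" || echo "no summary lines in $1"
}

for f in load.dat transactions.dat; do
  if [ -f "$f" ]; then
    echo "== $f"
    summarize "$f"
  fi
done
```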

That’s it. As you can see YCSB is pretty easy to set up. I still hope
this guide was helpful in getting started with it. Let me know if you
have any questions.

So you have an HBase cluster running somewhere and now you’re trying
to run YCSB from another machine, but it doesn’t work because it can’t
connect to ZooKeeper?

If so, try copying the hbase-site.xml config from your cluster into the classpath of YCSB and try again.

Copy your hbase-site.xml with all the configuration options to the
db/hbase/conf directory and add it to your classpath like this:
java -cp "build/ycsb.jar:db/hbase/lib/*:db/hbase/conf/" ...
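
The copy step could be scripted as follows. This is a sketch only: the default path in HBASE_SITE is an assumption about where your cluster keeps its configuration, so adjust it to your setup. The check at the end looks for hbase.zookeeper.quorum, the HBase property that tells the client which ZooKeeper hosts to contact.

```shell
# Assumed location of the cluster's config; override HBASE_SITE to match your setup.
HBASE_SITE="${HBASE_SITE:-/etc/hbase/conf/hbase-site.xml}"

# Run from the YCSB checkout.
mkdir -p db/hbase/conf

if [ -f "$HBASE_SITE" ]; then
  cp "$HBASE_SITE" db/hbase/conf/
  # Show the quorum setting so a missing or wrong value is noticed early.
  grep -A 1 'hbase.zookeeper.quorum' db/hbase/conf/hbase-site.xml \
    || echo "hbase.zookeeper.quorum is not set in the copied file" >&2
else
  echo "no hbase-site.xml found at $HBASE_SITE" >&2
fi
```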


Further information:

Getting Started:

https://github.com/brianfrankcooper/YCSB/wiki/Getting-Started

Running a Workload:

https://github.com/brianfrankcooper/YCSB/wiki/Running-a-Workload