
Trafodion: SQL Transaction Support for HBase

2014-10-31 13:49
Introduction

Trafodion is an open source initiative from HP, incubated at HP Labs and HP-IT, to develop an enterprise-class SQL-on-HBase solution targeted at big data transactional or operational workloads. HP has developed its transactional SQL technologies through more than 20 years of investment in database technology and solutions. Trafodion brings this core technology to the Hadoop ecosystem. The name 'Trafodion' (the Welsh word for transactions, pronounced 'Tra-vod-eee-on') was chosen specifically to emphasize the differentiation that Trafodion provides in closing a critical gap in the Hadoop ecosystem. To find out more about the origin and the name of the project, please visit www.hp.com/go/trafodion.

Target workloads

Hadoop workloads range from long-running batch jobs to low-latency operational workloads, as shown in the figure below. The three categories on the right side are analytic workloads; they are regarded as well suited to Hadoop and have therefore garnered the most attention. In contrast, the leftmost category, labeled “Operational”, is a new class of workloads that encompasses OLTP workloads as well as transactions that include social and mobile data interactions and observations using a mixture of structured and semi-structured data.



Traditionally, these workloads have been handled by relational databases. However, relational databases have scalability issues and do not provide the schema flexibility required in certain cases. Hadoop addresses these limitations. Combined with Hadoop’s perceived benefit of significantly reduced costs, this is creating growing interest and pressure to embrace these workloads in the Hadoop ecosystem.

As operational workloads represent business needs, they typically consist of a constant flow of transactions requiring low-latency response times for read/write access. Additionally, these workloads are characterized by:

• Data integrity with ACID-compliant protection

• High availability, concurrency and scalability

• Multi-structured data

• Rapidly evolving data requirements

Features

Currently, no open source SQL-on-HBase solution adequately meets these requirements. Trafodion provides the following functionality to support transactional workloads in Hadoop:

• ACID-compliant distributed transaction protection over multiple SQL statements, tables and rows

• Rich, full-functioned ANSI SQL language support using ODBC/JDBC connectivity interfaces

• Performance improvements for transactional workloads by leveraging compile-time and run-time optimizations

• Support for large data sets using a parallel-aware query optimizer

Trafodion intends to leverage the full capabilities of the Hadoop ecosystem:

• Schema flexibility provided by HBase column family structures

• Snapshot capability with versioning support in Hadoop

• High Availability and Disaster Recovery support with replication and snapshotting capabilities

Benefits

Trafodion delivers a full-featured and optimized transactional SQL-on-HBase DBMS solution with full transactional data protection. These capabilities help overcome Hadoop’s weaknesses in terms of supporting transactional workloads.



With Trafodion, customers gain the following benefits:

• Ability to leverage SQL expertise versus complex MapReduce programming

• Seamless support for existing transactional applications

• Ability to develop next generation highly scalable, real-time transaction processing applications

• Reduction in data latency for downstream analytic workloads

They also gain the following benefits inherent in the Hadoop ecosystem:

• Reduced infrastructure costs

• Massive scalability and granular elasticity

• Improved data availability and disaster recovery protection


Architecture

The Trafodion software architecture consists of three distinct layers: the client layer, the SQL database services layer, and the storage engine layer, as shown in the figure below.



The first layer is the Client Services layer, where the application resides and accesses the Trafodion database via a standard ODBC/JDBC interface using a Trafodion-supplied Windows or Linux client driver.

The second layer is the SQL layer, where Trafodion provides a relational schema abstraction on top of HBase, encapsulating all of the services required for managing Trafodion database objects. These services include connection management, transaction management, optimized plan generation, and execution against Trafodion database objects. Trafodion features a mature query optimizer that can generate parallel query plans, eliminating the need for complex MapReduce programming.

The third layer is the Storage Engine layer, which consists of standard Hadoop services including HBase, HDFS, and ZooKeeper. Trafodion database objects are stored in native Hadoop (HBase/HDFS) database structures. Trafodion handles the mapping of SQL requests into native HBase calls transparently on behalf of the application.
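
To make the idea of mapping relational data onto HBase structures more concrete, here is a minimal sketch of how one relational row could be flattened into HBase-style cells (row key, column family, qualifier, value). The function name row_to_cells, the '#'-joined row key, and the single column family "cf" are assumptions made for illustration only; they do not describe Trafodion's actual storage encoding.

```python
from typing import Any, Dict, List, Sequence, Tuple

# An HBase-style cell: (row key, column family, qualifier, value), all as bytes.
Cell = Tuple[bytes, bytes, bytes, bytes]


def row_to_cells(
    row: Dict[str, Any],
    primary_key: Sequence[str],
    column_family: str = "cf",  # hypothetical single column family
) -> List[Cell]:
    """Flatten one relational row into HBase-style cells.

    Illustrative only: the '#'-joined row key and the string encoding are
    assumptions for this sketch, not Trafodion's storage format.
    """
    # Build the row key from the primary-key columns, in declared order.
    row_key = "#".join(str(row[col]) for col in primary_key).encode("utf-8")
    family = column_family.encode("utf-8")

    cells: List[Cell] = []
    for column, value in row.items():
        if column in primary_key or value is None:
            # Key columns live in the row key; NULLs are simply not stored.
            continue
        cells.append(
            (row_key, family, column.encode("utf-8"), str(value).encode("utf-8"))
        )
    return cells


order = {"order_id": 1001, "customer": "acme", "total": 59.9, "note": None}
cells = row_to_cells(order, primary_key=["order_id"])
for cell in cells:
    print(cell)
```

In this sketch the NULL column produces no cell at all, which is one way a sparse key-value store can give a SQL layer schema flexibility without reserving space for absent values.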

Key innovations

Trafodion’s Distributed Transaction Management (DTM) component provides protection for transactions spanning multiple SQL statements, multiple tables, or multiple rows of a single table. Additionally, Trafodion DTM provides protection in a distributed cluster configuration across multiple HBase regions using an inherent two-phase commit protocol. DTM supports both implicit (auto-commit) and explicit (BEGIN, COMMIT, ROLLBACK WORK) transaction control.
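
The two-phase commit idea mentioned above can be illustrated with a small, generic sketch. It is a toy in-memory model, not Trafodion's DTM: the Participant class, its method names, and the two_phase_commit function are hypothetical, and a real implementation would also need durable logging, timeouts, and crash recovery, which are omitted here.

```python
from typing import Dict, List


class Participant:
    """Hypothetical stand-in for one region taking part in a transaction."""

    def __init__(self, name: str, can_commit: bool = True) -> None:
        self.name = name
        self.can_commit = can_commit
        self.committed: Dict[str, str] = {}  # state visible after commit
        self.pending: Dict[str, str] = {}    # writes staged by the open transaction

    def write(self, key: str, value: str) -> None:
        # Stage a write; nothing becomes visible until commit().
        self.pending[key] = value

    def prepare(self) -> bool:
        # Phase 1: vote on whether the staged writes can be committed.
        return self.can_commit

    def commit(self) -> None:
        # Phase 2 (all voted yes): apply the staged writes.
        self.committed.update(self.pending)
        self.pending.clear()

    def rollback(self) -> None:
        # Phase 2 (someone voted no): discard the staged writes.
        self.pending.clear()


def two_phase_commit(participants: List[Participant]) -> bool:
    """Commit on every participant or on none; return True if committed."""
    # Phase 1: collect a vote from every participant.
    votes = [p.prepare() for p in participants]
    if all(votes):
        # Phase 2: unanimous yes, so commit everywhere.
        for p in participants:
            p.commit()
        return True
    # At least one participant voted no, so roll back everywhere.
    for p in participants:
        p.rollback()
    return False


# Both participants vote yes: the transaction commits on both.
orders, stock = Participant("orders"), Participant("stock")
orders.write("order:1001", "acme")
stock.write("item:42", "reserved")
ok = two_phase_commit([orders, stock])

# One participant votes no: the transaction is rolled back on both.
a, b = Participant("a"), Participant("b", can_commit=False)
a.write("k", "v")
b.write("k", "v")
failed = two_phase_commit([a, b])
print(ok, failed)
```

The point of the vote in phase 1 is that no participant makes its writes visible until every participant has agreed, which is what lets a transaction spanning several regions be applied atomically.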

Trafodion provides many compile-time and run-time optimizations for varying transactional workloads ranging from singleton row accesses for OLTP-like transactions to highly complex SQL statements used for operational reporting purposes.