Archive

Archive for March, 2014

Big Data with Apache Accumulo Preserving Security with Open Source

March 13, 2014 Leave a comment

Episode 19 of the podcast was a talk with Adam Fuchs.

Adam talked about Apache Accumulo which is a system built for doing random i/o with peta bytes of data.

Distributing the computation to the data with cell level security is where Accumulo really shines.

Accumulo provides a richer data model than simple key-value stores, but is not a fully relational database. Data is represented as key-value pairs, where the key and value are comprised of the following elements:

Key Value
Row ID Column Timestamp
Family Qualifier Visibility

All elements of the Key and the Value are represented as byte arrays except for Timestamp, which is a Long. Accumulo sorts keys by element and lexicographically in ascending order. Timestamps are sorted in descending order so that later versions of the same Key appear first in a sequential scan. Tables consist of a set of sorted key-value pairs.

Accumulo stores data in tables, which are partitioned into tablets. Tablets are partitioned on row boundaries so that all of the columns and values for a particular row are found together within the same tablet. The Master assigns Tablets to one TabletServer at a time. This enables row-level transactions to take place without using distributed locking or some other complicated synchronization mechanism. As clients insert and query data, and as machines are added and removed from the cluster, the Master migrates tablets to ensure they remain available and that the ingest and query load is balanced across the cluster.

images/data_distribution.png
images/failure_handling.png
Subscribe to the Podcast and here all of what Adam had to say.
You can get started using Apache Accumulo with our development environment https://github.com/stealthly/hdp-accumulo
/*******************************************
 Joe Stein
 Founder, Principal Consultant
 Big Data Open Source Security LLC
 Twitter: @allthingshadoop
********************************************/