Archive

Archive for October, 2013

Using Apache Drill for Large Scale, Interactive, Real-Time Analytic Queries

October 29, 2013 Leave a comment

Episode #17 of the podcast is a talk with Jacques Nadeau  available also on iTunes

Apache Drill http://incubator.apache.org/drill/, a modern interactive query engine that runs on top of Hadoop.

Jacques talked about how Apache Drill is a modern query engine that is meant to be a query layer on top of all big data open source systems. Apache Drill is being designed to make the storage engine as plug-able so it could be the interface for any big data storage engine. The first release came out recently to allow developers to understand the data pipeline.

Leveraging an efficient columnar storage format, an optimistic execution engine and a cache-conscious memory layout, Apache Drill is blazing fast. Coordination, query planning, optimization, scheduling, and execution are all distributed throughout nodes in a system to maximize parallelization.

drill_runtime

Perform interactive analysis on all of your data, including nested and schema-less. Drill supports querying against many different schema-less data sources including HBase, Cassandra and MongoDB. Naturally flat records are included as a special case of nested data.

json

Strongly defined tiers and APIs for straightforward integration with a wide array of technologies.

arch

Subscribe to the podcast and listen to what Jacques had to say.  Available also on iTunes

/*********************************
Joe Stein
Founder, Principal Consultant
Big Data Open Source Security LLC
http://www.stealth.ly
Twitter: @allthingshadoop
**********************************/

Advertisements