Distributed Trace for Distributed Systems

March 23, 2016


Originally posted on

For a while now, folks have known about distributed trace and have been using it within their infrastructure systems. If you aren’t familiar with the concept, take a look at Google’s paper about it. Basically, it gives you macro-level insight into how a distributed system behaves and lets you optimize your “Data Center Apps” much as Chrome DevTools lets you optimize your web apps.

Ok, so what do you need?

1) A Mesos Cluster. If you don’t have one of these yet, you should. Apache Mesos and Mesosphere’s DCOS have revolutionized how businesses utilize their compute resources. Instead of managing each resource individually, the technology unifies the compute in your data center into one centrally managed resource. Don’t worry if you are new to Mesos; hang tight and we will come back to it after discussing the other required elements.

2) Stack Deploy. This software is one of the open source components of Elodina’s Sawfly platform-as-a-service product. Stack Deploy was developed to run and manage schedulers (Kafka, Cassandra, HDFS, etc.) in orchestrated ways with other applications. These can be stateful or stateless (yes, with Docker too, but it is not required) and are intended to be deployed in a multi-tenant, micro-segmented, software-defined data center model.

3) Run This Stack!

The stack will deploy:

  • 1x Exhibitor-Mesos
  • 1x Exhibitor
  • 1x DSE-Mesos
  • 1x Cassandra node
  • 1x Kafka-Mesos
  • 1x Kafka 0.8 broker
  • 1x Zipkin-Mesos
  • 1x Zipkin Collector
  • 1x Zipkin Query
  • 1x Zipkin Web

Should you want to customize the number of Kafka brokers, the number of Cassandra nodes, or the Zipkin topic name, modify the corresponding fields (node count, cpu, mem, etc.) in the stack file.

In order to deploy this stack, the only thing that needs to be specified is the ZooKeeper URL:

export SD_API=<stackdeploy_url>
# stackdeploy_user would most probably be "admin"
export SD_USER=<stackdeploy_user>
export SD_KEY=<stackdeploy_key>

./stack-deploy add --file zipkin-full.stack
# for "Mesos Consul"-enabled clusters zk_url would probably be zookeeper.service:2181
./stack-deploy run zipkin-full --var "zk_url=<zk_url>"
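
Since the commands above depend on SD_API, SD_USER and SD_KEY being exported, a small pre-flight check can catch a forgotten variable before anything is submitted. The following is only a sketch: the helper names missing_sd_vars and sd_preflight are made up for illustration and are not part of Stack Deploy.

```shell
# Hypothetical helper (not part of stack-deploy): print the names of the
# required variables that are unset or empty, separated by spaces.
missing_sd_vars() {
  sd_missing=""
  for sd_name in SD_API SD_USER SD_KEY; do
    # Look up the value of the variable whose name is stored in sd_name.
    eval "sd_value=\${$sd_name:-}"
    if [ -z "$sd_value" ]; then
      sd_missing="$sd_missing $sd_name"
    fi
  done
  # Strip the leading space; prints an empty line when nothing is missing.
  echo "${sd_missing# }"
}

# Hypothetical wrapper: return non-zero and explain why if anything is missing.
sd_preflight() {
  sd_report="$(missing_sd_vars)"
  if [ -n "$sd_report" ]; then
    echo "missing environment variables: $sd_report" >&2
    return 1
  fi
  return 0
}
```

With these functions defined in your shell, something like `sd_preflight && ./stack-deploy add --file zipkin-full.stack` would stop early instead of sending a request with empty credentials.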

Note: if you want to restart your stack for some reason, make sure to clean up the Exhibitor Mesos framework znode storage in ZooKeeper if the Exhibitor phase was already completed. Skipping this step may cause subsequent deployments of the stack to fail.

Launching from existing infrastructure

If you want to include Zipkin in already running infrastructure, use the following file: zipkin-standalone.stack

In order to run Zipkin standalone, you will need a Cassandra cluster and Kafka 0.8 brokers accessible from your cluster for Zipkin to use. The ones you get from the full stack work, and so does a setup you already have.

# Presuming you have SD_API, SD_USER and SD_KEY already set

./stack-deploy add --file stacks/zipkin-standalone.stack
./stack-deploy run zipkin-standalone --var "kafka_zk_url=<kafka_zk_url>" --var "cassandra_url=<cassandra_contact_points>"
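
Because the standalone stack is pointed at infrastructure you already run, it can help to sanity check the ZooKeeper address before passing it in. The sketch below assumes kafka_zk_url is an ordinary ZooKeeper connection string (comma-separated host:port pairs with an optional chroot path), which matches the zookeeper.service:2181 example earlier. The function name looks_like_zk_url is hypothetical, and the check covers the format only, not whether the hosts are reachable.

```shell
# Hypothetical helper: succeed when $1 has the shape
#   host:port[,host:port...][/chroot]
# This checks the format only; it does not contact ZooKeeper.
looks_like_zk_url() {
  printf '%s\n' "$1" |
    grep -Eq '^[A-Za-z0-9._-]+:[0-9]+(,[A-Za-z0-9._-]+:[0-9]+)*(/[^[:space:]]*)?$'
}
```

For example, `looks_like_zk_url "$KAFKA_ZK_URL" || echo "check kafka_zk_url" >&2` (with KAFKA_ZK_URL being whatever variable holds your value) would flag a placeholder that was left unfilled.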

Is it working?

The easiest way to verify that the Zipkin stack has launched successfully is to use the trace generator provided by the Zipkin Mesos framework. Please refer to the corresponding documentation section for details.

After the traces are sent, you can check them out by opening the Zipkin web UI in your browser. If you plan on using the stack file on Consul-enabled clusters, the link will look like this:


And you are good to go!

If you need a client example of how to generate these trace events, take a look at our Go Zipkin sample.

To learn more about Elodina or about getting started, please contact us or come meet us in person at the Apache Mesos NYC or Apache Kafka NYC Meetups.

~ Joestein


Hadoop isn’t dead but you might be doing it wrong!

March 18, 2016

I haven’t blogged (or podcasted, for that matter) in a while. There are lots of different reasons for that, and I am always happy to chat and grab tea if folks are interested, but after attending this year’s HIMSS conference I just couldn’t hold it in anymore.

I went to HIMSS so excited; it was supposed to be the year of Big Data! Everything was about transformation and interoperability and OMGZ the excitement.

The first keynote Monday evening was OFF THE HOOK. The rest of the time, two of my colleagues and I were at the expo. It is basically CES for Healthcare (if you don’t know what CES is, then think DEFCON for Healthcare… or something). It’s big.

But where was the Big Data?

Not really anywhere … There were 3 recognizable “big data companies” and one of them was in the booth as a partner for cloud services. It was weird. What happened?

One of the engineers from Cerner has a lightning talk at the Kafka Summit, go Cerner!!

Didn’t everyone get the memo? We need to help reduce costs of patient care!

Here are two ways to help reduce costs of patient care!

  1. (Paraphrasing Michael Dell from his keynote) Innovation funding for Healthcare IT will come from optimizing your data center resources.
  2. (This one is from me but inspired by Bruce Schneier) Through Open Source we can enable better systems by sharing in the R&D costs and also make them more secure.

Totally agree with #1. I have seen it firsthand: people saving 82% of their data center bill, and not even using spot (or, as they say, “preemptive”) instances yet. Amazing!

As for #2, you have to realize that different people are good at different things. One person can write anything, but sometimes 2 or 3 or 45 of them can write it better… or at least make sure the tests always keep passing and evolving properly, etc, etc, etc, stewardship, etc.

Besides all of that, the conference was great. There were a lot of companies and people I recognized and bumped into and it was great to catch up.

I was also really REALLY excited to see how far physician signatures and form signing have (finally) come in healthcare, removing all that paper. Fax is almost dead, but there are still a couple of companies kicking.

One last thing: the cyber security part of the expo was also disappointing. I know it was during the RSA Conference, but Healthcare needs good solutions too. There was a good set of solutions, in some cases legit and known (thanks for showing up!), but the “pavilion” was downstairs in the back left corner. Maybe if HIMSS coincided with Strata it would have been different; hard to say.

There was one tweet about it (at least); not sure if there were more.

So, Big Data, Healthcare, Security, OH MY! I am in!

I will be talking more in New York City on 03/29/2016 about problems and solutions with using the open source, interoperable, XML-based FHIR standard in healthcare, which removes the need to integrate HL7 systems and make them interoperable, and getting into realtime stream processing on Mesos.

I will also be conducting a training on SMACK Stack 1.0 (Streaming Mesos Analytics Cassandra Kafka), using telephone systems and APIs to start streams of events and the interactions with different systems that result from them. Yes, I bought phones, and yes, you get to keep yours.

What has attracted me (for almost 2 years now) to running Hadoop systems and ecosystem components on Mesos is the ease it brings for the developers, systems engineers, data scientists, analysts and the users of the software systems that run those components (often as a service). There are lots of things to research and read; in those cases I would

1) scour my blog

2) read this

3) and this

4) your own thing

Hadoop! Mesos!


~ Joestein

p.s. if you have something good to say about Hadoop and want to talk about it, and it is gripping and good and gets back to the history and continued efforts, let me know. Thanks!