I haven’t blogged (or podcasted for that matter) in a while. There are lots of different reasons for that and I am always happy to chat and grab tea if folks are interested but after attending this year’s HIMSS conference I just couldn’t hold it in anymore.
I went to HIMSS so excited it was supposed to be the year of Big Data! Everything was about transformation and interoperability and OMGZ the excitement.
The first keynote Monday evening was OFF THE HOOK http://www.himssconference.org/education/sessions/keynote-speakers. The rest of the time myself and two of my colleagues where at the expo. It is basically CES for Healthcare (if you don’t know what CES is then think DEFCON for Healthcare… or something). Its big.
But where was the Big Data?
Not really anywhere … There were 3 recognizable “big data companies” and one of them was in the booth as a partner for cloud services. It was weird. What happened?
One of the engineers from Cerner has a lightening talk at the Kafka Summit, go Cerner!!
Didn’t everyone get the memo? We need to help reduce costs of patient care!
Here are two ways to help reduce costs of patient care!
- (Paraphrasing Michael Dell from his keynote) Innovation funding for Healthcare IT will come from optimizing your data center resources.
- (This one is from me but inspired by Bruce Schneier) Through Open Source we can enable better systems by sharing in the R&D costs and also make them more secure.
As for #2, you have to realize that different people are good at different things. One person can write anything but sometimes 2 or 3 or 45 of them can write it better…. at least make sure the tests always keep passing and evolving properly, etc, etc, etc, stewardship, etc.
Besides all of that, the conference was great. There were a lot of companies and people I recognized and bumped into and it was great to catch up.
I was also really REALLY excited to see how far physician signatures and form signing has (finally) come in healthcare removing all that paper. Fax is almost dead but there are still a couple of companies kicking.
One last thing, the cyber security part of the expo was also disappointing. I know it was during the RSA Conference but Healthcare needs good solutions too. For that there were a good set of solutions not bad in some cases legit and known (thanks for showing up!) but the “pavilion” was downstairs in the back left corner. Maybe if HIMSS coincided with Strata it would have been different, hard to say.
There was one tweet about it https://twitter.com/elodinadotnet/status/705176912964923393 (at least) not sure if there were more.
So, Big Data, Healthcare, Security, OH MY! I am in!
I will be talking more about problems and solutions with using the Open Source Interoperable XML based FHIR standard in Healthcare removing the need to integrate and make interoperable HL7 systems in New York City on 03/29/2016 http://www.meetup.com/Apache-Mesos-NYC-Meetup/events/229503464/ and getting into realtime stream processing on Mesos.
I will also be conducting a training on SMACK Stack 1.0 (Streaming Mesos Analytics Cassandra Kafka) using telephone systems and API to start stream events and interactions with different systems because of them. Yes, I bought phones and yes you get to keep yours.
What has attracted me (for almost 2 years now) to running on Mesos Hadoop systems and eco-system components is the ease it brings for the developers, systems engineers, data scientists, analysts and the users of the software systems that run (as a service often) those components. There are lots of things to research and read in those cases I would
1) scour my blog
4) your own thing
p.s. if you have something good to say about Hadoop and want to talk about it and it is gripping and good and gets back to the history and continued efforts. Let me know. Thanks!
When I spoke with Arun a year or so a go YARN was NextGen Hadoop and there have been a lot of updates, work done and production experience since.
Besides Yahoo! other multi thousand node clusters have been and are running in production with YARN. These clusters have shown 2x capacity throughput which resulted in reduced cost for hardware (and in some cases being able to shut down co-los) while still gaining performance improvements overall to previous clusters of Hadoop 1.X.
I got to hear about some of what is in 2.4 and coming in 2.5 of Hadoop:
- Application timeline server repository and api for application specific metrics (Tez, Spark, Whatever).
- web service API to put and get with some aggregation.
- plugable nosql store (hbase, accumulo) to scale it.
- Preemption capacity scheduler.
- Multiple resource support (CPU, RAM and Disk).
- Labels tag nodes with labels can be labeled however so some windows and some linux and ask for resources with only those labels with ACLS.
- Hypervisor support as a key part of the topology.
- Hoya generalize for YARN (game changer) and now proposed as Slider to the Apache incubator.
We talked about Tez which provides complex DAGs of queries to translate what you want to-do on Hadoop without the work arounds for making it have to run in MapReduce. MapReduce was not designed to be re-workable out side of the parts of the Job it gave you for Map, Split, Shuffle, Combine, Reduce, Etc and Tez is more expressible exposing a DAG API.
Now becomes with Tez:
There were also some updates on Hive v13 coming out with sub queries, low latency queries (through Tez), high precision decimal points and more!
Subscribe to the podcast and listen to all of what Bikas and Arun had to say.