Resource scheduling and task launching with Apache Mesos and Apache Aurora at Twitter
Bill explained how Twitter uses Apache Mesos and Apache Aurora to get more out of its hardware spend and to save engineering time, in both development and operations, through fine-grained resource scheduling across its infrastructure. He described how what he saw and experienced with Borg at Google shaped the way they wanted to run things at Twitter, and how Aurora was built to that end. After years of running in production at Twitter, Aurora is now open source, part of the Apache Software Foundation, and available for anyone to use. Bill also went into detail about the many use cases the team did not see coming that have proven very powerful for Twitter’s teams.
Bill also talked about the instrumentation and features added to Aurora that got Twitter to the point where all new systems and almost all legacy systems now run on top of Aurora. He explained how that works for Twitter’s cache and how Aurora’s SLA features make it possible. Aurora gives end users, everyone from engineers to analysts, full access to the potential resources of their hardware clusters. Features such as quotas and preemption mean that any user can be given access to the compute resources of the entire hardware infrastructure without the risk of someone hogging resources, while production always keeps priority.
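To make the interplay of quotas and preemption concrete, here is a minimal sketch in Python. It is a toy model, not Aurora's scheduler or API: the `Task` and `Cluster` classes, the CPU-only accounting, and the rule that only production tasks count against a role's quota are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    name: str
    role: str
    cpus: float
    production: bool = False


@dataclass
class Cluster:
    """Hypothetical cluster model: CPU-only, one quota per role."""
    total_cpus: float
    quotas: dict  # role -> CPUs that role may use for production tasks
    running: list = field(default_factory=list)

    def free_cpus(self):
        return self.total_cpus - sum(t.cpus for t in self.running)

    def production_usage(self, role):
        return sum(t.cpus for t in self.running
                   if t.production and t.role == role)

    def submit(self, task):
        """Start a task and return the list of tasks preempted to make room."""
        reclaimable = []
        if task.production:
            # Assumption: production work is capped by the role's quota.
            quota = self.quotas.get(task.role, 0.0)
            if self.production_usage(task.role) + task.cpus > quota:
                raise ValueError(f"role {task.role!r} would exceed its quota")
            # Only non-production tasks may be evicted, newest first.
            reclaimable = [t for t in reversed(self.running)
                           if not t.production]

        available = self.free_cpus()
        victims = []
        for candidate in reclaimable:
            if available >= task.cpus:
                break
            victims.append(candidate)
            available += candidate.cpus

        if available < task.cpus:
            raise RuntimeError("not enough resources, even after preemption")

        for victim in victims:
            self.running.remove(victim)
        self.running.append(task)
        return victims
```

In this model an analyst can fill idle capacity with a non-production job, and a later production job from a role with quota evicts it rather than waiting.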
Apache Mesos abstracts CPU, memory, storage, and other compute resources away from machines (physical or virtual), enabling fault-tolerant and elastic distributed systems to be built easily and run effectively. Mesos is built using the same principles as the Linux kernel, only at a different level of abstraction. The Mesos kernel runs on every machine and provides applications (e.g., Hadoop, Spark, Kafka, Elasticsearch) with APIs for resource management and scheduling across entire datacenter and cloud environments.
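The idea of treating many machines as one pool of resources can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: `Machine` and `place` are hypothetical names, placement is plain first-fit, and nothing here uses the Mesos API or models its resource offer mechanism.

```python
from dataclasses import dataclass


@dataclass
class Machine:
    hostname: str
    cpus: float   # CPUs still unclaimed on this machine
    mem_mb: int   # memory still unclaimed, in megabytes


def place(machines, cpus, mem_mb):
    """First-fit placement: claim resources on the first machine with room.

    Returns the chosen hostname, or None if no single machine can fit the
    request. The caller asks for resources, not for a specific machine.
    """
    for machine in machines:
        if machine.cpus >= cpus and machine.mem_mb >= mem_mb:
            machine.cpus -= cpus
            machine.mem_mb -= mem_mb
            return machine.hostname
    return None
```

The caller never names a host; it states what the task needs and the pool decides where it lands, which is the abstraction the paragraph above describes.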
Apache Aurora is a Mesos framework. A Mesos framework is a scheduler of resources and a launcher of tasks. Aurora provides a Job abstraction consisting of a Task template and instructions for creating near-identical replicas of that Task. Typically a Task is a single Process corresponding to a single command line, such as python2.6 my_script.py. However, sometimes you must colocate separate Processes within a single Task, which runs within a single container and chroot, often referred to as a “sandbox”, for example when you run multiple cooperating agents together, such as an installer and master or slave processes. Thermos provides a Process abstraction under the Mesos Tasks.
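The Job, Task, and Process hierarchy described above can be modeled with plain Python dataclasses. The sketch below only mirrors the names; it is not the Aurora configuration API, and the `instance` field, the `replicas` method, and the example commands `install.sh` and `run_master.sh` are assumptions made for illustration.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Process:
    name: str
    cmdline: str  # a single command line, e.g. "python2.6 my_script.py"


@dataclass(frozen=True)
class Task:
    name: str
    processes: tuple  # Processes colocated in one sandbox
    instance: int = 0


@dataclass(frozen=True)
class Job:
    name: str
    task: Task      # the Task template
    instances: int  # how many near-identical replicas to create

    def replicas(self):
        """Stamp out one Task per instance, differing only in its index."""
        return [replace(self.task, instance=i) for i in range(self.instances)]


# The common case: one Task wrapping a single Process.
simple = Task("hello", (Process("run", "python2.6 my_script.py"),))

# The colocated case: cooperating Processes sharing one sandbox.
# Both command lines are hypothetical placeholders.
colocated = Task("service", (
    Process("installer", "./install.sh"),
    Process("master", "./run_master.sh"),
))

job = Job("hello_world", simple, instances=3)
```

Calling `job.replicas()` yields three Tasks that share the same Processes and differ only in their instance number, which is the sense in which the replicas are near-identical.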
To get up to speed on Aurora, read the Aurora docs in this order:
- How to deploy Aurora, or how to install Aurora on virtual machines on your private machine (the Tutorial uses the virtual machine approach).
- As a user, get started quickly with a Tutorial.
- For an overview of Aurora’s process flow under the hood, see the User Guide.
- To learn how to write a configuration file, look at the Configuration Tutorial. From there, look at the Aurora + Thermos Reference.
- Then read up on the Aurora Command Line Client.
- Find out general information and useful tips about how Aurora does Resource Isolation.
For more background on Mesos and Aurora, check out these three videos:
Datacenter Management with Apache Mesos
An intro video to Apache Aurora
Past, Present, Future of Apache Aurora
To hear everything Bill had to say, please subscribe to the podcast.