Name: SMACK Stack and Beyond - Building Fast Data Pipelines - Jörg Schad & Matt Jarvis, Mesosphere
Start: 2017-09-13T16:00:00-0700
End: 2017-09-13T16:40:00-0700

September 11-14, 2017 - Los Angeles, CA
SMACK Stack and Beyond - Building Fast Data Pipelines - Jörg Schad & Matt Jarvis, Mesosphere

Our world seems to move faster and faster, and so do our requirements for data analytics. For many use cases, such as fraud detection or reacting to sensor data, the response times of traditional batch processing are simply too slow. To react to such events in close to real time, we need to go beyond classical batch processing and use stream processing systems such as Apache Spark Streaming, Apache Flink, or Apache Storm.
But these systems are not sufficient by themselves. For an efficient and fault-tolerant setup we also need a message queue and a storage system. One common example of such a fast data pipeline is the SMACK stack, which stands for:
- Spark (Streaming) - the stream processing system
- Mesos - the cluster orchestrator
- Akka - the system for providing custom actors for reacting upon the analyses
- Cassandra - storage system
- Kafka - message queue

Setting up such a pipeline in a scalable, efficient, and fault-tolerant manner is not trivial.
This talk will first discuss several alternatives for the various parts of the stack, e.g., what the tradeoffs are between Spark Streaming and Apache Flink, and when to use ArangoDB or Apache Cassandra.
We will then discuss the challenges and best practices for setting up such pipelines.
The talk will finish with a demo of a fast data pipeline built with Apache Flink, ArangoDB, and Apache Kafka, deployed on DC/OS.

Speakers

Jörg Schad

CTO, ArangoDB

Jörg Schad is the CTO at ArangoDB. In a previous life, he worked on or built machine learning pipelines in healthcare, distributed systems (including early Kubernetes code at Mesosphere), and in-memory databases. He received his Ph.D. for research about distributed databases and...

Wednesday September 13, 2017 4:00pm - 4:40pm PDT
Gold 1

CloudOpen Tracks

Experience Level Intermediate

Open Source Summit North America 2017
