Budapest Data 2015 has ended
Thursday, June 4 • 15:40 - 16:10
Bootstrap Real Time pipeline in 30 minutes

In a world where every "Thing" produces lots of data, ingesting and processing that volume becomes a big problem. In today's dynamic world, firms have to react to changing conditions very fast, or better still, in real time. In this talk we will take on this challenge using the latest and greatest tools from the Big Data community. We will combine the awesomeness of Kafka, a resilient pub-sub messaging system, with the power of Spark Streaming for scalable, high-throughput, fault-tolerant processing of live data streams. Combining different systems to get an even more powerful system is great, but it brings its own complexity. With a demo that builds a pipeline to ingest and process real-time data using these two systems, we will explore how they can be intertwined to make the most of the combined system.
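
To give a rough feel for the shape of such a pipeline, here is a minimal in-process sketch in plain Python. It is not taken from the talk and uses neither the Kafka nor the Spark Streaming API: `ToyTopic` and `run_micro_batches` are hypothetical names standing in for a pub-sub topic (an append-only log read by offset) and a micro-batch stream processor, and the device events are made-up sample data.

```python
from collections import Counter


class ToyTopic:
    """Hypothetical in-memory stand-in for a pub-sub topic: an append-only log."""

    def __init__(self):
        self._log = []

    def publish(self, message):
        # Append the message and return its offset in the log.
        self._log.append(message)
        return len(self._log) - 1

    def read(self, offset, max_messages):
        # Return up to max_messages starting at offset (empty list at end of log).
        return self._log[offset:offset + max_messages]


def run_micro_batches(topic, batch_size):
    """Consume the topic in fixed-size micro-batches, counting events per device."""
    totals = Counter()
    offset = 0
    while True:
        batch = topic.read(offset, batch_size)
        if not batch:
            break
        totals.update(device for device, _reading in batch)
        # Advance the offset only after the batch has been processed.
        offset += len(batch)
    return totals, offset


# Made-up sample events: (device name, reading).
topic = ToyTopic()
for event in [("thermostat", 21.5), ("doorbell", 1), ("thermostat", 21.7),
              ("lamp", 0), ("thermostat", 22.0)]:
    topic.publish(event)

totals, next_offset = run_micro_batches(topic, batch_size=2)
print(totals["thermostat"], next_offset)  # prints: 3 5
```

The consumer tracks its own offset and moves it forward only once a batch is done, which is the general idea that lets a log-based messaging system and a batch-oriented stream processor recover from failures by re-reading; a real deployment would replace both toy pieces with the actual systems.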

Ashish Singh

Software Engineer, Cloudera
Ashish Singh is a Software Engineer working with Cloudera to empower the Hadoop ecosystem to answer bigger questions. He contributes to Apache Kafka, Hive, Parquet and Sentry. Prior to joining Cloudera, he worked on optimizing MPI collective communications on High Performance Computing...

Thursday June 4, 2015 15:40 - 16:10 CEST
Mátyás II.
