Budapest Data 2015 has ended
Friday, June 5 • 09:00 - 12:00
Introduction to Apache Hadoop

Originally inspired by Google's GFS and MapReduce papers, Apache Hadoop is an open source framework offering scalable, distributed, fault-tolerant data storage and processing on standard hardware. This session explains what Hadoop is and where it best fits into the modern data center. You'll learn the basics of how it offers scalable data storage and processing, some important "ecosystem" tools that complement Hadoop's capabilities, and several practical ways organizations are using these tools today. Additionally, you'll learn about the basic architecture of a Hadoop cluster and some recent developments that will further improve Hadoop's scalability and performance.

Prerequisite knowledge: None

This tutorial will use the Cloudera QuickStart VM for the demo (http://www.cloudera.com/content/cloudera/en/downloads/quickstart_vms.html). Attendees are welcome to download the VM to their laptops beforehand and follow along with the demo instructions, but this is not required.

Mark Grover

Software Engineer, Cloudera
Mark is a co-author of O'Reilly's Hadoop Application Architectures book, a committer on Apache Bigtop, and a committer and PMC member on Apache Sentry (incubating). He has contributed code to the Apache Hadoop, Apache Hive, Apache Sqoop and Apache Flume projects. He is also a section...

Friday June 5, 2015 09:00 - 12:00 CEST
Mátyás I
