More than 200 Hadoop Developers Schmooze It Up at Yahoo! | Hadoop Summit Event Registration Now Open

Thanks to the more than 200 developers who came to Yahoo! Wednesday night for our monthly Hadoop User Group meeting. It was our largest turnout ever and people stuck around until 9 pm.

The discussion centered on Yahoo!’s Hadoop Security release, a presentation of Hadoop research by a Berkeley Ph.D. student, and an overview of the FlightCaster flight delay prediction service built on Hadoop.

At the beginning of the meeting we announced Hadoop Summit 2010 (June 29th, Santa Clara), a one-day event for technology leaders and application developers featuring keynotes from Yahoo!, Amazon Web Services, Cloudera and Facebook. We are introducing multiple track sessions on Hadoop programming, case studies and cutting-edge research. Registration and paper submission are available here.

Hadoop User Group March Meeting Recap:

It was great to see many new faces. The interesting mix of experienced developers and Hadoop "newbies" led to many productive discussions.

A few interesting comments from attendees, posted at the event meetup page:

“Today's meetup was really informative and very good presentation on Hadoop Online Project.. keep it coming!!” Dave Jespersen, VP Engineering at MapR Technologies

“It was good to see security enhancements. Streaming and flight delay predictor were very cool and interesting.” Arul Ganesh, Java Developer

“The security and performance presentations were excellent. I enjoyed the climate and the chance to meet and talk with other Hadoop enthusiasts. Great job organizing and putting these events together. I sincerely appreciate it.” Sreeni Jaladanki, Engineering Manager

For those of you who were unable to attend in person, the session details and slides are posted below. We will publish the video recording soon. Stay tuned!

Owen O'Malley from the Yahoo! Hadoop Team provided an overview of the upcoming Hadoop Security release. Owen described the features and capabilities included, as well as the operational benefits. Yahoo! is very excited about adding security capabilities to Hadoop and views this as a major milestone in continuing to make Hadoop an enterprise-grade platform. Stay tuned for a detailed post on security coming to this blog soon.


Tyson Condie, a Ph.D. student at the University of California, Berkeley, presented the innovative research around the Hadoop Online efforts led by Prof. Joseph M. Hellerstein. Tyson described a modified MapReduce architecture that allows data to be pipelined between operators. This extends the MapReduce programming model beyond batch processing, and can reduce completion times and improve system utilization. Tyson included examples from the Hadoop Online Prototype (HOP) project.


Bradford Cross from FlightCaster provided an exciting overview of the FlightCaster flight delay prediction service and some cool insights into the airline industry. Bradford described how they built a scalable machine learning and data analysis platform using the Clojure dynamic programming language to wrap Cascading and Hadoop. Bradford demonstrated how the use of Hadoop makes building scalable systems much simpler.


We at Yahoo! definitely see the importance of Hadoop: the ability to process massive data sets is core to our business, and we are continuing to invest heavily in the technology and the community to make it even better. We love being at the center of discussion and debate around Hadoop.

Please join us at the Hadoop Summit to continue the conversation.

As always, we are looking for exciting technologies and experiences you want to share.
Please send presentation requests via the Hadoop Bay Area User Group Meetup page.
See you all on April 21st, 2010. Registration is available here; the agenda will be published soon.


Dekel Tankel

Director, Product Management

Cloud Computing at Yahoo!