August HUG Recap

Thanks to the roughly 175 developers who came to Yahoo! recently for our monthly Hadoop User Group meeting. The energy in the packed room was phenomenal, and conversations continued long after the formal sessions.

Photo: Hundreds of Hadoop Fans Flock to Yahoo! for the Hadoop User Group

The event started with Arun Murthy from Yahoo! describing best practices for developing MapReduce applications. Arun introduced the concept of a Grid Pattern which, similar to a Design Pattern, represents a general, reusable solution for applications running on the Grid. Finally, Arun talked about anti-patterns for applications running on Apache Hadoop clusters.

Next, Stefan Groschupf, the co-founder and CTO of Datameer, discussed the challenges in social media analytics, and how to overcome them using big data analytics built on Hadoop, in his “Social Media: What’s Really the Buzz?” talk. The demo was very helpful in visualizing the true thought leaders and influencers in social media conversations. These leaders and influencers are becoming increasingly important, because they help companies better understand who is having an impact on their customers' buying decisions. This talk gave a very good perspective on how the power of Hadoop can be used to crunch large amounts of data and then render the results visually.

Finally, Matei Zaharia from UC Berkeley gave a talk titled “Mesos: A Flexible Cluster Resource Manager.” The talk highlighted Mesos features and how organizations can consolidate multiple application workloads into a single cluster. The demo showed off the benefits of Mesos and highlighted its ability to run multiple isolated instances of Hadoop on the same cluster. The fault tolerance of Mesos was successfully demonstrated too. After the main session, Matei and team talked about Spark, a MapReduce-like framework that adds support for iterative jobs. Spark's functional programming model, similar to MapReduce, was shown to be capable of caching data between iterations, which makes it very efficient for interactive analysis of big datasets. Spark was also demonstrated running alongside Hadoop on the same cluster.

We at Yahoo! embrace Hadoop, and are looking for exciting technologies and experiences you want to share. Please contact me via the Hadoop Bay Area User Group Meetup page.