Hadoop User Group

Hadoop User Group meetings have now been held in Beijing, Berlin, London, New York, San Diego and Washington DC, in addition to the Bay Area, with one in the works in Bangalore. In the Bay Area, we typically host them on the third Wednesday of each month at the Yahoo! campus in Santa Clara.

The meeting last week featured Matei Zaharia from UC Berkeley talking about the Fair Scheduler for Hadoop. The need for a scheduler has been known for quite a while, and Matei began working on one while he was an intern at Facebook. His talk described the scheduler's goals of providing fast response times for small jobs and guaranteed SLAs for production jobs. He then discussed the concept of pools, the scheduling algorithm for assigning resource capacity, and the installation, configuration and administration of the scheduler.

This was followed by a talk from Aaron Kimball of Cloudera on Importing Data from MySQL, which discussed techniques for loading data from databases into HDFS.

Next month’s user group meeting will feature Yahoo!’s Milind Bhandarkar talking about performance enhancement techniques for Hadoop developers.