Full house @ Apache Hadoop India Summit 2011

This year's summit, held in Bangalore on February 16, was about big insights and big participation...

Hadoop isn't the next big thing. It has already arrived and is getting bigger as you read this! "100 per cent of Fortune 500 companies are either using, or planning to use Hadoop," said Todd Papaioannou, VP – Cloud Infrastructure, quoting an industry source during his keynote at the Apache Hadoop India Summit 2011.

Over 700 cloud computing enthusiasts from all over India, representing various industries and academia, converged at the J.N. Tata auditorium in IISc for their dose of deep insights on Hadoop and Cloud Computing. The auditoriums were packed and the tweets said it all: "…at #hadoopsummit and lovin it!"

 


In his welcome note, Hari Vasudev, VP – Cloud Platform Group, offered a glimpse of the scale of Hadoop at Yahoo!: "…being used on 40,000 servers, handling 170 petabytes of data, and running more than 1 million Hadoop jobs per month…" Hari spoke about Yahoo!'s commitment to Open Source and announced the new Bangalore Hadoop User Group, whose first meet is slated for March 16. (More details at http://tech.groups.yahoo.com/group/hug-blr/)

 

Todd's keynote on Hadoop and the Future of Cloud Computing discussed why Yahoo! has invested in Hadoop and Open Source, how Yahoo! uses Hadoop to power some of its most innovative technology products, and the future of big data and cloud computing.

He described the humongous amount of data being generated as "data exhaust," which "Hadoop can help handle, and process into business insights." Todd sees Hadoop maturing to handle the exponential growth in big data in the near future.

On why Hadoop was made open source, Todd said, "In terms of business, infrastructure is not a differentiator for Yahoo!. But there have been multiple benefits. You get good ideas from all over, have a talent pool to tap into and you don't have to worry about the infrastructure becoming legacy."

The morning session also saw Prof. D. Janakiram from IIT Madras speak on Programming Abstractions for Smart Apps on Clouds. Prof. Janakiram is investigating alternate programming models for large data. He described one such model, the Deformable Mesh Abstraction (DMA), which is useful for AI applications and supports need-based task creation, something that is not possible with MapReduce.

Sanjay Radia, Cloud Architect at Yahoo!, talked about upcoming features in HDFS, including HDFS federation (Jira HDFS-1052), which tries to address the scalability limitations of the namespace.

Sundara Nagarajan, Director – R&D, Storage Works Division at HP, and Milind Bhandarkar from LinkedIn were among the other speakers in the first half. There were also two lightning talks, by Amazon and Informatica.

The afternoon sessions were divided into three tracks: Platform, Application and Research.

The sessions in the Platform track focused on various future developments in Hadoop and its ecosystem. In his talk on Hadoop NextGen, Sharad Agrawal covered the new resource scheduler for Hadoop, which will provide better scalability, performance, and alternative programming models. More details in this blog post. Other talks covered new developments in Pig, Hive, GridSim and Oozie.

In the application track, well-known architects and technologists from various industries showcased how Hadoop is being used to crunch big data and extract useful business knowledge. But it isn't just industry that is benefiting from Hadoop.

The Research track saw professors and research students from IISc Bangalore, IIT-Bombay, IIIT-Hyderabad, PSG Tech Coimbatore and TCE Madurai. The talks covered how Hadoop has been used in academic research and different ways to improve the Hadoop scheduler.

There was also a talk by Dr. Rituraj Kumar from CAIR (Center for Artificial Intelligence & Robotics) DRDO Labs on "Adaptive parallel computing over distributed military computing infrastructures." He explained how Hadoop can be effectively used in the military domain too.

Toward the end of the event, Bala R. Girisaballa from Yahoo! moderated a panel discussion featuring Sanjay Radia from Yahoo!, Dr Vasudev Varma from IIIT-Hyd, Sanjay Krishnamurthy from Informatica, and Namit Jain from Facebook.

Videos will be made available shortly. Look out for them.

See photos from the event.

Twittersweep! (#hadoopsummit)

Great day at #hadoopsummit today.

at #hadoopsummit and lovin it! Waitin for platform track...

At Apache Hadoop summit 2011. Awesome so far.

Listening to some nice talks at the #hadoopsummit

YahooINNews on Twitter:

houseful at the jn tata auditorium. many from the audience have come from out of Bangalore

data becoming more important than computing: the data centre of the future will be built on a converged infrastructure

build to scale: zynga's cityville had 100 million users in 43 days

Sessions

08:30 AM - 09:00 AM
Registration and Coffee

09:00 AM - 09:15 AM
Welcome Speech
Hari Vasudev - VP, Cloud Platform Group, Yahoo!
Video: http://playlist.yahoo.com/makeplaylist.dll/Hadoop-01-a-Keynote-Hari-Vasudev-desktop.m4v?pt=rd&sdm=web&sid=121599918&ufn=Welcome

09:15 AM - 09:45 AM
Keynote Address - Hadoop and the Future of Cloud Computing
Todd Papaioannou - VP, Cloud Architecture, Yahoo!
Slides: http://www.slideshare.net/ydn/ahis2011-keynote-hadoop-the-future-of-cloud-computing
Video: http://playlist.yahoo.com/makeplaylist.dll/Hadoop-02-a-Keynote-Toddp-desktop.m4v?pt=rd&sdm=web&sid=121601019&ufn=Keynote%20Address%20-%20Hadoop%20and%20the%20Future%20of%20Cloud%20Computing

09:45 AM - 10:15 AM
Keynote Address - Programming Abstractions for Smart Apps on Clouds
Prof. D. Janakiram - Department of CSE, Indian Institute of Technology (IIT), Madras
Slides: http://www.slideshare.net/ydn/ahis2011-keynote-programming-abstractions-for-smart-apps-on-clouds
Video: http://playlist.yahoo.com/makeplaylist.dll/Hadoop-03-a-Keynote-Jankiram-desktop.m4v?pt=rd&sdm=web&sid=121600525&ufn=Programming%20Abstractions%20on%20Cloud

10:15 AM - 10:45 AM
Keynote Address - Exploring the Future IT Infrastructure, Cloud
Sundara Nagarajan - Director of R&D, Storage Works Division, HP
Slides: http://www.slideshare.net/ydn/ahis2011-keynote-exploring-the-future-it-infrastructurecloud-included
Video: http://playlist.yahoo.com/makeplaylist.dll/Hadoop-04-a-Keynote-Nagarajan-desktop.m4v?pt=rd&sdm=web&sid=121601158&ufn=Future%20IT%20infrastructure

10:45 AM - 11:00 AM
Coffee Break

11:00 AM - 11:30 AM
Keynote Address - Federated HDFS
Dr. Sanjay Radia - Cloud Architect, Yahoo!
Slides: http://www.slideshare.net/ydn/ahis2011-keynote-hdfs-federation
Video: http://playlist.yahoo.com/makeplaylist.dll/Hadoop-05-a-Keynote-Sanjay-Radia-desktop.m4v?pt=rd&sdm=web&sid=121672565&ufn=HDFS%20Federation

11:30 AM - 12:00 PM
Keynote Address - Scaling Hadoop Applications
Dr. Milind Bhandarkar - LinkedIn
Slides: http://www.slideshare.net/ydn/ahis2011-keynote-scaling-hadoop-applications
Video: http://playlist.yahoo.com/makeplaylist.dll/Hadoop-06-a-Keynote-Milind-Bhandarkar-desktop.m4v?pt=rd&sdm=web&sid=121600574&ufn=Scaling%20Hadoop%20Apps

12:00 PM - 12:15 PM
Lightning talk by Amazon
Slides: http://www.slideshare.net/ydn/ahis2011-lightening-talkamazon

12:15 PM - 12:30 PM
Lightning talk "Informatica and Big Data" by Informatica
Slides: http://www.slideshare.net/ydn/ahis2011-lightening-talkinformatica-and-big-data

12:30 PM - 01:15 PM
Lunch Break

01:15 PM - 01:45 PM
Platform Track: Hadoop NextGen
Sharad Agrawal - Yahoo!
Slides: http://www.slideshare.net/ydn/apache-hadoop-india-simmit-2011-p-the-next-generation-of-hadoop-mapreduce
Application Track: Online content optimization using Hadoop
Dr. Shail Aditya - Yahoo!
Slides: http://www.slideshare.net/ydn/ahis2011-application-online-content-optimization-using-hadoop
Research Track: Middleware Frameworks for Adaptive Executions and Visualizations of Climate and Weather Applications on grids
Dr. Sathish Vadhiyar - IISc Bangalore
Slides: http://www.slideshare.net/ydn/ahis2011-research-middleware-frameworks-for-adaptive-executions-and-visualizations-of-climate-and-weather-applications-on-grids

01:45 PM - 02:15 PM
Platform Track: Pig, Making Hadoop Easy
Alan Gates - Yahoo!
Slides: http://www.slideshare.net/ydn/ahis2011-platform-pigmaking-hadoop-easy
Application Track: Making Hadoop Enterprise ready with Amazon Elastic Map/Reduce
Simone Brunozzi - Amazon
Slides: http://www.slideshare.net/ydn/ahis2011-application-making-hadoop-enterprise-ready-with-amazon-elastic-mapreduce
Research Track: Comparison between Extension of Fairshare Scheduler and a Novel SLA based Learning Scheduler in Hadoop
Dr G Sudha Sadasivam, N Priya - PSG Tech, Coimbatore
Slides: http://www.slideshare.net/ydn/ahis2011-research-an-extension-of-fairsharescheduler-and-a-novel-sla-based-learning-scheduler-in-hadoop

02:15 PM - 02:45 PM
Platform Track: Data on Grid (GDM)
Venkatesh S - Yahoo!
Slides: http://www.slideshare.net/ydn/ahis2011-platform-data-infrastructure-on-hadoop
Application Track: Hadoop Avatar at eBay
Srinivasan Rengarajan, Mohit Soni - eBay
Slides: http://www.slideshare.net/ydn/ahis2011-application-hadoop-avatar-at-ebay
Research Track: VirtPerf: A Capacity Planning Tool for Virtual Environment
Dr. Umesh Bellur - IIT, Bombay
Slides: http://www.slideshare.net/ydn/apache-hadoop-india-simmit-2011-r-profiling-application-performance

02:45 PM - 03:15 PM
Platform Track: Hive Evolution
Namit Jain - Facebook
Slides: http://www.slideshare.net/ydn/ahis2011-platform-hive-evolution
Application Track: Feeds processing at Yahoo!: One Hadoop, one platform, 2 systems
Jean-Christophe Counio - Yahoo!
Slides: http://www.slideshare.net/ydn/ahis2011-application-feeds-processing-at-yahoo
Research Track: Scheduling in MapReduce using Machine Learning Techniques
Dr. Vasudev Varma - IIIT Hyderabad
Slides: http://www.slideshare.net/ydn/apache-hadoop-india-simmit-2011-r-scheduling-in-mapreduce-using-machine-learning-techniques

03:15 PM - 03:30 PM
Coffee Break

03:30 PM - 04:00 PM
Platform Track: Making Hadoop Secure
Devaraj Das - Yahoo!
Slides: http://www.slideshare.net/ydn/ahis2011-platform-making-apache-hadoop-secure
Application Track: Hadoop 101
Basant Verma - Yahoo!
Slides: http://www.slideshare.net/ydn/ahis2011-application-mapreduce-programming-best-practices
Research Track: Adaptive parallel computing over distributed military computing infrastructures
Dr. Rituraj Kumar - DRDO Labs
Slides: http://www.slideshare.net/ydn/ahis2011-research-adaptive-parallel-computing-over-distributed-military-computing-infrastructures

04:00 PM - 04:30 PM
Platform Track: Simulation and Performance
Ranjit Mathew - Yahoo!
Slides: http://www.slideshare.net/ydn/ahis2011-platform-hadoop-simulation-and-performance
Application Track: Searching Information Inside Hadoop Platform
Abinash - Bizosys Technologies
Slides: http://www.slideshare.net/ydn/ahis2011-application-searching-information-inside-hadoop-platform
Research Track: Provisioning Hadoop’s Map Reduce in Cloud for Effective Storage as a Service
Dr. Shalinie S.M. - TCE Madurai
Slides: http://www.slideshare.net/ydn/apache-hadoop-india-simmit-2011-r-provisioning-hadoops-mapreduce-in-cloud-for-effective-storage-as-a-service

04:30 PM - 05:00 PM
Platform Track: Oozie - Workflow for Hadoop
Andreas N - Yahoo!
Slides: http://www.slideshare.net/ydn/apache-hadoop-india-summit-2011-talk-oozie-workflow-for-hadoop-by-andreas-n
Application Track: Data Integration on Hadoop
Sanjay Kaluskar - Informatica
Slides: http://www.slideshare.net/ydn/ahis2011-application-data-integration-on-hadoop
Research Track: Framework for a suite of algorithms for predictive modelling on Hadoop
Vaijanath Rao, Rohini Uppuluri - AOL India
Slides: http://www.slideshare.net/ydn/ahis2011-research-framework-for-a-suite-of-coclustering-algorithms-for-predictive-modeling-on-hadoop

05:00 PM - 05:45 PM
Panel Discussion

05:45 PM - 06:00 PM
Closing

Apache Hadoop and Hadoop are trademarks of The Apache Software Foundation.