This year's summit, held in Bangalore on February 16, was about big insights and big participation...
Hadoop isn't the next big thing. It has already arrived and is getting bigger as you read this! "100 per cent of Fortune 500 companies are either using, or planning to use Hadoop," said Todd Papaioannou, VP Cloud Infrastructure, quoting an industry source during his keynote at the Apache Hadoop India Summit 2011.
Over 700 cloud computing enthusiasts from all over India, representing various industries and academia, converged at the J.N. Tata auditorium in IISc for their dose of deep insights on Hadoop and Cloud Computing. The auditoriums were packed and the tweets said it all: "at #hadoopsummit and lovin it!"
In his welcome note, Hari Vasudev, VP Cloud Platform Group, offered a glimpse of the Hadoop scale at Yahoo!: "...being used on 40,000 servers, handling 170 petabytes of data, and running more than 1 million Hadoop jobs per month..." Hari spoke about Yahoo!'s commitment to Open Source and announced the new Bangalore Hadoop User Group. The first meet is slated for March 16. (More details at http://tech.groups.yahoo.com/group/hug-blr/)
Todd's keynote on Hadoop and the Future of Cloud Computing discussed why Y! has invested in Hadoop & Open Source, how Y! uses Hadoop to power some of its most innovative technology products, and the future of big data and cloud computing.
He described the humongous amount of data being generated as "data exhaust," which "Hadoop can help handle, and process into business insights." Todd sees Hadoop maturing to handle the exponential growth in big data in the near future.
On why Hadoop was made open source, Todd said, "In terms of business, infrastructure is not a differentiator for Yahoo!. But there have been multiple benefits. You get good ideas from all over, have a talent pool to tap into and you don't have to worry about the infrastructure becoming legacy."
The morning session also saw Prof. D. Janakiram from IIT Madras speaking on Programming Abstractions for Smart Apps on Clouds. Prof. Janakiram is investigating alternative programming models for large data. He described one such model, the Deformable Mesh Abstraction (DMA), which is useful for AI applications and supports need-based task creation, something that is not possible with MapReduce.
Sanjay Radia, Cloud Architect, Yahoo!, talked about upcoming features in HDFS, including HDFS federation (Jira HDFS-1052), which tries to address the scalability limitations of the namespace.
Sundara Nagarajan, Director R&D, Storage Works Division at HP, and Milind Bhandarkar from LinkedIn were among the other speakers in the first half. There were also two lightning talks by Amazon and Informatica.
The afternoon sessions were divided into three tracks: Platform, Application and Research.
The sessions in the platform track focused on various future developments in Hadoop and its ecosystem. In his talk on Hadoop NextGen, Sharad Agrawal covered the new resource scheduler for Hadoop, which will provide better scalability, performance, and alternative programming models. More details in this blog post. Other talks covered new developments in Pig, Hive, GridSim and Oozie.
In the application track, well-known architects and technologists from various industries showcased how Hadoop is being used to crunch big data and extract useful business knowledge. But it isn't just industry that is benefiting from Hadoop.
The Research track saw professors and research students from IISc Bangalore, IIT-Bombay, IIIT-Hyderabad, PSG Tech Coimbatore and TCE Madurai. The talks covered how Hadoop has been used in academic research and different ways to improve the Hadoop scheduler.
There was also a talk by Dr. Rituraj Kumar from CAIR (Center for Artificial Intelligence & Robotics) DRDO Labs on "Adaptive parallel computing over distributed military computing infrastructures." He explained how Hadoop can be effectively used in the military domain too.
Toward the end of the event, Bala R. Girisaballa from Yahoo! moderated a panel discussion featuring Sanjay Radia from Yahoo!, Dr Vasudev Varma from IIIT-Hyd, Sanjay Krishnamurthy from Informatica, and Namit Jain from Facebook.
Videos will be made available shortly. Look out for that.
Great day at #hadoopsummit today.
at #hadoopsummit and lovin it! Waitin for platform track...
At Apache Hadoop summit 2011. Awesome so far.
Listening to some nice talks at the #hadoopsummit
YahooINNews on Twitter:
houseful at the jn tata auditorium. many from the audience have come from out of Bangalore
data becoming more important than computing: the data centre of the future will be built on a converged infrastructure
build to scale: zynga's cityville had 100 million users in 43 days
AM - 09:00 AM
Registration and Coffee
AM - 09:15 AM
Hari Vasudev - VP, Cloud Platform Group, Yahoo!
AM - 09:45 AM
Keynote Address - Hadoop and the Future of Cloud Computing
Todd Papaioannou - VP, Cloud Architecture, Yahoo!
AM - 10:15 AM
Keynote Address - Programming Abstractions for Smart Apps on Clouds
Prof. D. Janakiram - Department of CSE, Indian Institute of Technology (IIT), Madras
AM - 10:45 AM
Keynote Address - Exploring the Future IT
Sundara Nagarajan - Director of R&D, Storage Works Division, HP
AM - 11:00 AM
AM - 11:30 AM
Keynote Address - Federated HDFS
Dr. Sanjay Radia - Cloud Architect, Yahoo!
AM - 12:00 PM
Keynote Address - Scaling Hadoop
Dr. Milind Bhandarkar - LinkedIn
PM - 12:15 PM
Lightning talk by Amazon
PM - 12:30 PM
Lightning talk "Informatica and Big Data" by
PM - 01:15 PM
PM - 01:45 PM
Sharad Agrawal - Yahoo!
Online content optimization using Hadoop
Dr. Shail Aditya - Yahoo!
Middleware Frameworks for Adaptive Executions and Visualizations of Climate and Weather Applications on grids
Dr. Sathish Vadhiyar - IISc Bangalore
PM - 02:15 PM
Pig, Making Hadoop Easy
Alan Gates - Yahoo!
Making Hadoop Enterprise ready with Amazon Elastic Map/Reduce
Simone Brunozzi - Amazon
Comparison between Extension of Fairshare Scheduler and a Novel SLA based Learning Scheduler in Hadoop
Dr. G Sudha Sadasivam, N Priya - PSG Tech, Coimbatore
PM - 02:45 PM
Data on Grid (GDM)
Venkatesh S - Yahoo!
Hadoop Avatar at eBay
Srinivasan Rengarajan, Mohit Soni - eBay
VirtPerf: A Capacity Planning Tool for Virtual Environment
Dr. Umesh Bellur - IIT, Bombay
PM - 03:15 PM
Namit Jain - Facebook
Feeds processing at Yahoo! : One Hadoop, one platform, 2
Jean-Christophe Counio - Yahoo!
Scheduling in MapReduce using Machine Learning Techniques
Dr. Vasudev Varma - IIIT Hyderabad
PM - 03:30 PM
PM - 04:00 PM
Making Hadoop Secure
Devaraj Das - Yahoo!
Basant Verma - Yahoo!
Adaptive parallel computing over distributed military computing infrastructures
Dr. Rituraj Kumar - DRDO Labs
PM - 04:30 PM
Simulation and Performance
Ranjit Mathew - Yahoo!
Searching Information Inside Hadoop Platform
Abinash - Bizosys Technologies
Provisioning Hadoop's Map Reduce in Cloud for Effective Storage as a Service
Dr. Shalinie S.M. - TCE Madurai
PM - 05:00 PM
Oozie - Workflow for Hadoop
Andreas N - Yahoo!
Data Integration on Hadoop
Sanjay Kaluskar - Informatica
Framework for a suite of algorithms for predictive modelling
Vaijanath Rao, Rohini Uppuluri - AOL India
PM - 05:45 PM
PM - 06:00 PM