Since 2005, Yahoo! has been the most active contributor to and developer of Apache Hadoop, and we’ve been doing it because we think it’s a game changer for the Internet and the enterprise. In case you’re new to big data technologies, Hadoop is the open source technology at the epicenter of big data and cloud computing, helping companies get value from their data and manage their businesses better. More than a thousand of you have been part of the excitement at our Bay Area Hadoop User Group (HUG) meetings on our Sunnyvale campus over the past year.
Today we are hosting a group of that size in a single day: 1,000 developers, researchers, and Hadoop enthusiasts are gathering for our 3rd Annual Hadoop Summit in Santa Clara. Our new Chief Product Officer, Blake Irving, will open the Summit with a keynote about how Hadoop helps us deliver great experiences for our 600 million Yahoo! consumers. We’ll be announcing some major Yahoo! contributions to Hadoop. We’re also hosting leaders from across the Hadoop ecosystem for keynotes and technical sessions, from Amazon Web Services to Facebook, Twitter, LinkedIn, Netflix, IBM, Cloudera, Karmasphere, Datameer, and more.
Our latest contributions to Hadoop include Hadoop with Security and Oozie, Yahoo!’s workflow engine for Hadoop. The addition of authentication and a robust workflow engine will further fuel the wider adoption of Hadoop. These enhancements allow Hadoop adopters to better manage their big data. We hope they open the enterprise door even wider to cloud computing, enabling organizations of all types to realize the power of Hadoop.
Here’s a quick run-down on those updates:
Hadoop Security
This is a significant security update to Hadoop, and it is available in beta today in the Yahoo! Distribution of Hadoop. This release integrates Hadoop with Kerberos, providing secure access to and processing of business-sensitive data. Organizations can extract value from their data and their hardware investment in Hadoop across the enterprise while maintaining data security, which allows new collaborations and applications with business-critical data. Read more about the beta version details.
Oozie, Yahoo!’s workflow engine for Hadoop
Oozie is an open-source workflow solution for managing jobs running on Hadoop, including HDFS, Pig, and MapReduce. Oozie (a name for an elephant tamer) was designed for Yahoo!’s rigorous use case of managing complex workflows and data pipelines at global scale. It is integrated with Hadoop Security and is quickly becoming the de facto standard for ETL (extract, transform, load) processing at Yahoo!.
Hadoop is the technology behind every click on Yahoo!. We believe it will continue to be critical to companies with Internet-scale data-processing needs. Additionally, we are seeing Hadoop adoption in other enterprises as their data needs grow. If you want to hear more about the Hadoop Summit, follow #hadoopsummit on Twitter, or look for our photos on Flickr.