San Jose Convention Center, June 13-14, 2012. Hortonworks and Yahoo! are proud to host the 5th annual Hadoop Summit. The two-day event will feature many of the thought leaders from the Apache Hadoop community, who will showcase successful Hadoop use cases, share development and administration tips and tricks, and educate organizations about how [...]
1,650 attendees from 400 companies across 12 countries descend on Santa Clara, CA to hear the latest on Big Data issues, trends and technology. Today we kicked off the fourth annual Hadoop Summit at the Santa Clara Convention Center, bringing together some of the most influential thought leaders in the space, including Yahoo!, Facebook and [...]
It’s been six years since the Hadoop project was started. These years have been an open source fairy tale. Starting from two developers with a vision, Hadoop now has a vibrant community of hundreds of contributors, and thousands of interested developers beyond that. Hadoop has gone from a powerful solution for specific problems to a [...]
The YDN engineering team wowed attendees with YQL at last week’s Hadoop India Summit hosted by Yahoo! in Bangalore.
This year’s summit, held in Bangalore on February 16, was about big insights and big participation.
Yahoo! India R&D is proud to announce the Apache Hadoop India Summit 2011, a one-day event which will take place in Bangalore, India, on February 16th. Space is limited. Please check out session details and speaker bios.
Yahoo! India R&D is proud to announce the Apache Hadoop India Summit 2011, a one-day event which will take place in Bangalore, India, on February 16th. Space is limited. Please register now.
If you are an active user of Hadoop and would like to share your knowledge, experience and ideas at the February Hadoop India Summit, please submit your session proposal by January 12, 2011.
This initiative lets universities conduct research using Yahoo!’s supercomputing resources — approximately 4,000 processors.
Oozie v1 is a PDL workflow server engine for Hadoop that enables creating workflow jobs composed of several map-reduce jobs, Pig jobs, HDFS operations, and Java processes. Workflow jobs are monitored as a single unit via Web services, a Java API, and/or a Web console. Oozie v1 is in production at Yahoo!, [...]
Existing best practices for MapReduce graph algorithms have significant shortcomings that limit performance, especially with respect to partitioning, serializing, and distributing the graph. Jimmy Lin (working with Michael Schatz), University of Maryland, presents three design patterns that address designing scalable graph algorithms, and can be used to accelerate a large class [...]
Many Amazon Web Services customers leverage Hadoop inside Amazon Elastic MapReduce to solve problems ranging from mining clickstream data for targeted advertising to scientific applications. In this panel, Amazon Web Services customers will discuss a diverse set of use cases where Hadoop is being applied today, talk about the enterprise readiness of [...]
Worldwide spam volumes this year are forecast to rise by 30% to 40% compared with 2009. Spam recently reached a record 92% of total email. Spammers have turned their attention to social media sites as well. In 2008, there were few Facebook phishing messages; Facebook is now the second most phished [...]
A set-similarity join (SSJ) finds pairs of set-based records such that each pair is similar enough based on a similarity function and a threshold. Many applications require efficient SSJ solutions, such as record linkage and plagiarism detection. This talk studies how to efficiently perform SSJs on large data sets using Hadoop. [...]
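To make the SSJ definition above concrete, here is a minimal single-machine sketch. It is not the Hadoop approach from the talk: it assumes Jaccard similarity as the similarity function and compares every pair of records, and the names `jaccard`, `naive_ssj`, and the sample records are hypothetical, chosen only for illustration.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two sets: |a & b| / |a | b|."""
    if not a and not b:
        return 1.0  # convention chosen here: two empty sets count as identical
    return len(a & b) / len(a | b)


def naive_ssj(records, threshold):
    """Return (id1, id2, similarity) for every pair at or above the threshold.

    records: dict mapping a record id to a set of tokens.
    Every pair is compared, so the cost is quadratic in the number of
    records; this only illustrates the definition on tiny inputs.
    """
    results = []
    for id1, id2 in combinations(sorted(records), 2):
        sim = jaccard(records[id1], records[id2])
        if sim >= threshold:
            results.append((id1, id2, sim))
    return results


# Hypothetical sample data: three records represented as token sets.
records = {
    "r1": {"hadoop", "summit", "santa", "clara"},
    "r2": {"hadoop", "summit", "san", "jose"},
    "r3": {"set", "similarity", "join"},
}
pairs = naive_ssj(records, threshold=0.3)
```

With these sample records, only `r1` and `r2` share tokens (2 of 6 distinct tokens), so they are the only pair that clears a threshold of 0.3. The all-pairs comparison is the part that does not scale, which is the problem a distributed SSJ has to address.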
Hadoop is a powerful platform for data analysis and processing, but many struggle to understand how it fits in with regard to existing infrastructure and systems. A series of common integration points, technologies, and patterns are defined and illustrated in this presentation. Eric Sammer looks at job initiation, sequencing and scheduling, [...]