Posts in the Hadoop category

Hadoop Summit 2012 – Registration now open!

San Jose Convention Center, June 13-14, 2012. Hortonworks and Yahoo! are proud to host the 5th annual Hadoop Summit. The two-day event will feature many of the thought leaders from the Apache Hadoop community, who will showcase successful Hadoop use cases, share development and administration tips and tricks, and educate organizations about how [...]

Yahoo! Kicks off Fourth Annual Hadoop Summit

1,650 attendees from 400 companies across 12 countries descend on Santa Clara, CA, to hear the latest on Big Data issues, trends and technology. Today we kicked off the fourth annual Hadoop Summit at the Santa Clara Convention Center, bringing together some of the most influential thought leaders in the space, including Yahoo!, Facebook and [...]

Hadoop: The Future is Bright

It’s been six years since the Hadoop project was started. These years have been an open source fairy tale. Starting from two developers with a vision, Hadoop now has a vibrant community of hundreds of contributors, and thousands of interested developers beyond that. Hadoop has gone from a powerful solution for specific problems to a [...]

YDN at Hadoop India Summit

The YDN engineering team wowed attendees with YQL at last week’s Hadoop India Summit hosted by Yahoo! in Bangalore.

Full house @ Apache Hadoop India Summit 2011

This year’s summit, held in Bangalore on February 16, was about big insights and big participation.

Apache Hadoop India Summit 2011 – Session Details

Yahoo! India R&D is proud to announce the Apache Hadoop India Summit 2011, a one-day event which will take place in Bangalore, India, on February 16th. Space is limited. Please check out session details and speaker bios.

Apache Hadoop India Summit 2011

Yahoo! India R&D is proud to announce the Apache Hadoop India Summit 2011, a one-day event which will take place in Bangalore, India, on February 16th. Space is limited. Please register now.

Hadoop India Summit 2011 – Call for Papers Now Open

If you are an active user of Hadoop and would like to share your knowledge, experience and ideas at the February Hadoop India Summit, please submit your session proposal by January 12, 2011.

M45 Cloud Computing initiative adds 4 top universities

This initiative lets universities conduct research using Yahoo!’s supercomputing resources — approximately 4,000 processors.

Hadoop2010: Workflow on Hadoop Using Oozie

Oozie v1 is a PDL workflow server engine for Hadoop that enables creating workflow jobs composed of several map-reduce jobs, Pig jobs, HDFS operations, and Java processes. Workflow jobs are monitored as a single unit via Web services, a Java API, and/or a Web console. Oozie v1 is in production in Yahoo!, [...]

Hadoop2010: Algorithms in MapReduce

Existing best practices for MapReduce graph algorithms have significant shortcomings that limit performance, especially with respect to partitioning, serializing, and distributing the graph. Jimmy Lin (working with Michael Schatz), University of Maryland, presents three design patterns that address designing scalable graph algorithms, and can be used to accelerate a large class [...]

Hadoop2010: Amazon Elastic MapReduce Panel

Many Amazon Web Services customers leverage Hadoop inside Amazon Elastic MapReduce to solve problems ranging from mining clickstream data for targeted advertising to scientific applications. In this panel, Amazon Web Services customers will discuss a diverse set of use cases where Hadoop is being applied today and talk about the enterprise readiness of [...]

Hadoop2010: Winning the Big Data SPAM Challenge

Worldwide spam volumes this year are forecast to rise by 30% to 40% compared with 2009. Spam recently reached a record 92% of total email. Spammers have turned their attention to social media sites as well. In 2008, there were few Facebook phishing messages; Facebook is now the second most phished [...]

Hadoop2010: Efficient Parallel Set-Similarity Joins

A set-similarity join (SSJ) finds pairs of set-based records such that each pair is similar enough based on a similarity function and a threshold. Many applications require efficient SSJ solutions, such as record linkage and plagiarism detection. This talk studies how to efficiently perform SSJs on large data sets using Hadoop. [...]

Hadoop2010: Integration Patterns & Practices

Hadoop is a powerful platform for data analysis and processing, but many struggle to understand how it fits in with regard to existing infrastructure and systems. A series of common integration points, technologies, and patterns are defined and illustrated in this presentation. Eric Sammer looks at job initiation, sequencing and scheduling, [...]