
November 30, 2007
A few weeks ago, a project called Pig went into incubation at the Apache Software Foundation.
Since you're probably scratching your head about what that sentence means, let me break it down for you. Pig is a project that began in Yahoo! Research and we're building an open source community to further develop it via the Apache Software Foundation (ASF). Right now it's in the initial phases of becoming a full-fledged project under the ASF umbrella. That's commonly referred to as incubation, since it is hosted by the Apache Incubator. If you'd like more details, check out the Pig Proposal on the Incubator wiki.
The Incubator project is the entry path into The Apache Software Foundation (ASF) for projects and codebases wishing to become part of the Foundation's efforts. All code donations from external organisations and existing external projects wishing to join Apache enter through the Incubator.
Great. So what's this Pig thing all about? I asked that question of Olga Natkovich, one of the Pig developers here at Yahoo.
Pig is a high-level language (PigLatin) for data analysis programs, coupled with infrastructure for evaluating these programs. The salient property of Pig programs is that their structure is amenable to substantial parallelization, which in turn enables them to handle very large data sets.
In my mind, Pig is to Hadoop as SQL is to relational databases. It's the language and logic that'll open up access to a much wider audience of people: anyone who can write a query. Today you usually need to sit down and write code to make use of the results from processing data on a Hadoop cluster. By building a robust query layer on top of Hadoop, the barrier gets quite a bit lower.
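To make the "sit down and write code" point a bit more concrete, here's a rough Python sketch of the kind of map, group, and reduce plumbing you might write by hand for a simple count-per-key job. To be clear, this is neither Pig nor Hadoop code: the function names and the sample data are made up for illustration, and it runs on one machine. It only hints at why a query layer is appealing, since a job like this is roughly what a single grouped count would express in a query language.

```python
from collections import defaultdict


def map_phase(records):
    """Emit a (key, 1) pair per record; each record can be handled independently."""
    for user, _url in records:
        yield user, 1


def shuffle(pairs):
    """Group values by key, standing in for what a framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Sum the values for each key; each key can be handled independently."""
    return {key: sum(values) for key, values in groups.items()}


def count_visits(records):
    """Chain the three steps: map, group by key, reduce."""
    return reduce_phase(shuffle(map_phase(records)))


# Hypothetical sample data: (user, page) pairs.
visits = [("alice", "/home"), ("bob", "/home"), ("alice", "/news")]
print(count_visits(visits))  # {'alice': 2, 'bob': 1}
```

Because the map and reduce steps each work on one record or one key at a time, this shape of program is the sort of thing that can be spread across many machines, which is the property the quote above is getting at.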
See Also: Yahoo Pig and Google Sawzall (Greg Linden)
Jeremy Zawodny
Yahoo! Developer Network
Posted at November 30, 2007 8:15 AM
Hadoop is a trademark of the Apache Software Foundation.
Copyright © 2008 Yahoo! Inc. All rights reserved.