April 29, 2011
Last month the HCatalog project (formerly known as Howl) was accepted into the Apache Incubator. We have already branched for a 0.1 release, which we hope to push in the next few weeks. Given all this activity, I thought it would be a good time to write a post on the motivation behind HCatalog, what [...]
August 18, 2010
Yahoo! has begun evaluating Hive for use as part of its Hadoop stack. Since, in many people’s minds, Hive and Pig are roughly equivalent and Pig Latin is very close to SQL, this has led to some confusion. Why are we interested in using both technologies? As we have looked at our workloads and analyzed [...]
January 29, 2010
I have been asked by users who are going to construct a data pipeline whether they should use Pig Latin or SQL. For those of you who are not familiar with Pig, it is a platform for analyzing large data sets. It is built on Hadoop and provides ease of programming, optimization opportunities and extensibility. [...]