I have a new deadline to beat for this post: Wednesday, February 16, 2011, 6 pm. That’s when we hold the next HUG at the Yahoo! Sunnyvale Campus, at the URL’s Café. BTW, in case you are looking for it, here’s the campus in more detail: http://www.wikimapia.org/#lat=37.4181633&lon=-122.0250607&z=18&l=0&m=b&search=yahoo
Why is that my new deadline? Mostly because I missed my old one: a commitment to the 200 or so Hadoopers who attended the January HUG session that its presentations would soon be made available on the Hadoop blog on YDN, here: http://developer.yahoo.com/blogs/hadoop/
If I don’t beat this new deadline, I picture those same 200 Hadoopers showing up at the February HUG and surrounding me to remind me of my commitment to share the slides from January.
So, here are the Wednesday January 19, 2011 HUG presentations in order of appearance:
New features in Pig 0.8: Pig 0.8 focused on extending Pig's usability. We added the ability to write UDFs in scripting languages like Python, gave users better access to statistics, and created PigUnit to help users test their Pig Latin, to name only a few. Of course, we continued to improve performance too, by enabling compression of intermediate results and by combining small input blocks into a single mapper. We'll cover these and more in this overview of Pig 0.8's new features, and talk about what we're working on now for 0.9.
Presenter: Alan Gates, Yahoo!
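To give a flavor of the scripting-language UDF support mentioned above, here is a minimal sketch of a Python UDF of the kind Pig 0.8 can run under Jython. The function name, schema, and the stand-in decorator are illustrative, not taken from the talk; in a real Pig deployment the `outputSchema` decorator is supplied by Pig's Jython runtime.

```python
# Illustrative sketch of a Pig 0.8 Python (Jython) UDF.
# Pig's Jython runtime normally provides the @outputSchema decorator,
# which tells Pig the schema of the UDF's return value. We define a
# no-op stand-in here so the file also runs as plain Python.
def outputSchema(schema):
    def decorator(func):
        func.output_schema = schema  # record the declared schema
        return func
    return decorator

@outputSchema('len:int')
def string_length(s):
    """Return the length of the input chararray, treating null as 0."""
    return len(s) if s is not None else 0
```

From a Pig Latin script, a UDF like this would be registered and called along the lines of `register 'myudfs.py' using jython as myudfs;` followed by `lengths = foreach data generate myudfs.string_length(name);` (names hypothetical).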
Kafka: LinkedIn's Real-time Data Stream System: Kafka is a distributed, real-time, persistent messaging system developed at LinkedIn. It supports horizontally distributing message production, brokering and consumption over commodity machines. This system serves as the backbone of LinkedIn's log aggregation and activity processing system, providing data feeds for Hadoop as well as real-time consumers.
Presenter: Jay Kreps, LinkedIn
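The core idea behind a system like Kafka can be sketched as a toy model (this is not Kafka's actual API): producers append messages to a per-topic log, and each consumer keeps its own offset into that log, so real-time consumers and batch feeds into Hadoop can read the same stream independently.

```python
# Toy model of a log-based messaging stream (illustrative, not Kafka's API).
class Log:
    """An append-only message log; consumers track their own offsets."""

    def __init__(self):
        self.messages = []

    def append(self, message):
        """Producer side: append a message, return its offset."""
        self.messages.append(message)
        return len(self.messages) - 1

    def read(self, offset):
        """Consumer side: pull all messages from a given offset onward."""
        return self.messages[offset:]

log = Log()
log.append("page_view:/home")
log.append("page_view:/jobs")

# Two independent consumers of the same stream, at different offsets.
realtime_view = log.read(0)   # sees everything
batch_view = log.read(1)      # has already consumed the first message
```

Because consumers own their offsets rather than the broker tracking deliveries, adding a new consumer (say, a Hadoop import job) never disturbs existing ones.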
Howl: Table Management Service for Hadoop: Howl is a project that aims to provide a table management service, a common need for data processors using Hadoop. The goal of this service is to track data that exists in a Hadoop grid and present that data to users in a tabular format. The table management service will present data in a uniform format to tools such as MapReduce, Streaming, Pig, and Hive, by providing interfaces to each of these data processing tools.
Presenter: Devaraj Das, Yahoo!
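The table-management idea described above can be sketched with a toy registry (illustrative only; all names and paths below are hypothetical, not Howl's API): tables map to a storage location and a schema, so any tool can look data up by table name instead of hard-coding HDFS paths.

```python
# Toy sketch of a table-management registry (not Howl's actual API).
class TableRegistry:
    """Maps table names to storage locations and schemas on the grid."""

    def __init__(self):
        self.tables = {}

    def create_table(self, name, location, schema):
        """Register a table with its data location and column schema."""
        self.tables[name] = {"location": location, "schema": schema}

    def lookup(self, name):
        """Any tool (Pig, Hive, MapReduce) resolves a table by name."""
        return self.tables[name]

registry = TableRegistry()
registry.create_table(
    "clicks",                          # hypothetical table name
    "/data/clicks/2011-01",            # hypothetical HDFS location
    [("url", "chararray"), ("ts", "long")],
)
table = registry.lookup("clicks")
```

The design choice this illustrates is the one the abstract emphasizes: by centralizing the name-to-location-and-schema mapping, every data processing tool sees the same tabular view of the same underlying files.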
If you haven’t yet signed up for the February HUG, you can sign up here:
Looking forward to meeting you at the February HUG.