The Bay Area Hadoop User Group (HUG) meets this Wednesday, November 17, at 6:00 PM at the Surf Cafe, Yahoo! Building E, 701 First Avenue, in Sunnyvale, CA. We expect around 50 Hadoop fans.
We'll start off with socializing and beer. The first talk, at 6:30, is "Business Intelligence for Big Data," which will delve into the strengths and weaknesses of Hadoop for data transformation and reporting. Also on tap are examples of code-free data transformations in Hadoop and how to create a Hadoop/Hive/Datamart stack sans coding.
At 7:00, we'll go into "Using Hadoop for Indexing for Biometric Data, High Resolution Images, Voice/Audio Clips, and Video Clips" and Fuzzy Table. Fuzzy Table is a distributed, low-latency, fuzzy-matching database built on Hadoop that enables fast fuzzy searching of content that cannot easily be indexed or ordered.
At 7:30, Ramkumar Vadali and Scott Chen of Facebook will talk about how HDFS Raid saves disk space by reducing the number of replicas created for blocks. Facebook uses XOR-based HDFS Raid to save 3PB of disk space in its warehouse. Work is in progress to implement more sophisticated encoding schemes, such as Reed-Solomon, which should yield even greater disk savings.
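For readers new to the idea, here is a minimal sketch of why XOR parity lets a system keep fewer replicas: if one block in a group is lost, it can be rebuilt from the surviving blocks plus a single parity block. This is not Facebook's implementation. The function names, the three-block stripe, and the tiny 8-byte blocks are assumptions chosen purely for illustration.

```python
from functools import reduce


def xor_blocks(blocks):
    """Return the bytewise XOR of a list of equally sized byte blocks."""
    if not blocks:
        raise ValueError("need at least one block")
    size = len(blocks[0])
    if any(len(b) != size for b in blocks):
        raise ValueError("all blocks must have the same length")
    # zip(*blocks) walks the blocks column by column (one byte from each).
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


def recover_block(surviving_blocks, parity):
    """Rebuild the single missing block of a stripe from the rest plus parity."""
    return xor_blocks(list(surviving_blocks) + [parity])


# Hypothetical stripe of three 8-byte "blocks" (real HDFS blocks are far larger).
stripe = [b"hadoop__", b"hdfsraid", b"blocks!!"]
parity = xor_blocks(stripe)

# Simulate losing the second block, then rebuild it from what is left.
lost_index = 1
survivors = [b for i, b in enumerate(stripe) if i != lost_index]
rebuilt = recover_block(survivors, parity)
assert rebuilt == stripe[lost_index]
```

A single XOR parity block can repair only one missing block per stripe, which is one reason a scheme such as Reed-Solomon, able to tolerate more simultaneous losses, is attractive as a next step.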
Come join us. Register at the HUG meetup to receive information regarding this meeting.