Our own Sanjay gave a talk at OSCON on the Hadoop Distributed File System (HDFS), which provides scalable, fault-tolerant, high-performance data storage and retrieval for Internet-scale data applications. The talk presents an overview of HDFS and then dives under the hood to look at its implementation, performance characteristics, and planned enhancements.
TRT: 40:43 MM:SS
ABOUT: Sanjay leads the Hadoop Distributed File System project at Yahoo, where it is in daily use on large clusters of several thousand machines. Previously he held senior positions at Cassatt, Sun Microsystems, and INRIA, where he developed systems software for distributed systems and grid/utility computing infrastructures. He has published numerous papers and holds several patents. Sanjay has a PhD in Computer Science from the University of Waterloo, Canada.