allowFullScreen='true' src='https://s.yimg.com/m/up/ypp/default/player.swf' flashvars='vid=21232308&autoPlay=0'>
The MapReduce framework is now a de facto standard for massive dataset computations. However, many of the elementary graph algorithms are inherently sequential and appear to be hard to parallelize (often requiring number of rounds proportional to the diameter of the graph). In this talk, Sergei Vassilvitskii, Yahoo! Labs, describes a different approach, called filtering, to implementing fundamental graph algorithms, like computing connected components and minimum spanning trees. He notes that filtering can also be applied to speed up general clustering algorithms like k-means. Finally, he describes how to apply the technique to find tight-knit friend groups in a social network.
Read More »from Hadoop2010: XXL Graph Algorithms