Hadoop2010: Winning the Big Data SPAM Challenge



Worldwide spam volumes this year are forecast to rise by 30% to 40% compared with 2009, and spam recently reached a record 92% of all email. Spammers have also turned their attention to social media sites. In 2008 there were few Facebook phishing messages; Facebook is now the second most phished organization online. Although Twitter recently brought its spam rate down to as low as 1%, the absolute volume of spam is still massive given its tens of millions of users. Dealing with spam poses a number of Big Data challenges. The sheer volume of data is enormous, and spam in social media requires understanding very complex patterns of behavior as well as identifying new types of spam. This presentation discusses how data analytics built on Hadoop can help businesses keep spam from spiraling out of control.

Media Production by BAYCAT, a non-profit community media producer that educates and employs underserved youth and adults in the digital media arts.