Posts in the Video category

Yahoo! Announces Cocktails – Shaken, Not Stirred

Developers, time to geek out. Yahoo! has been working behind the scenes for the past several months on an exciting new technology that we think will deeply impact the web developer community. We call it “Cocktails” and it’s the technology powering Livestand, which we launched today at Product Runway. “Cocktails” is a mix of HTML5, [...]

Introducing Yahoo! WebPlayer

Today we’re announcing the first beta version of Yahoo! WebPlayer. At its core, Yahoo! WebPlayer is a powerful web-based media player written in HTML and JavaScript. It supports a variety of video and audio media formats and services, such as Yahoo! Video, YouTube, MP3, and WMA. Beyond being a flexible and universal media player, [...]

Screencast: Introducing the Yahoo! Application Platform

This screencast shows you how to create a simple YAP application using the example from the tutorial in the YAP Developers Guide.

Hadoop2010: Workflow on Hadoop Using Oozie

Oozie v1 is a PDL workflow server engine for Hadoop that enables creating workflow jobs composed of several map-reduce jobs, Pig jobs, HDFS operations, and Java processes. Workflow jobs are monitored as a single unit via Web services, a Java API, and/or a Web console. Oozie v1 is in production in Yahoo!, [...]

Hadoop2010: Algorithms in MapReduce

Existing best practices for MapReduce graph algorithms have significant shortcomings that limit performance, especially with respect to partitioning, serializing, and distributing the graph. Jimmy Lin (working with Michael Schatz), University of Maryland, presents three design patterns that address designing scalable graph algorithms, and can be used to accelerate a large class [...]

Hadoop2010: Amazon Elastic MapReduce Panel

Many Amazon Web Services customers leverage Hadoop inside Amazon Elastic MapReduce to solve problems ranging from mining clickstream data for targeted advertising to scientific applications. In this panel, Amazon Web Services customers will discuss a diverse set of use cases where Hadoop is being applied today, and talk about the enterprise readiness of [...]

Hadoop2010: Winning the Big Data SPAM Challenge

Worldwide spam volumes this year are forecast to rise by 30% to 40% compared with 2009. Spam recently reached a record 92% of total email. Spammers have turned their attention to social media sites as well. In 2008, there were few Facebook phishing messages; Facebook is now the second most phished [...]

Hadoop2010: Efficient Parallel Set-Similarity Joins

A set-similarity join (SSJ) finds pairs of set-based records such that each pair is similar enough based on a similarity function and a threshold. Many applications require efficient SSJ solutions, such as record linkage and plagiarism detection. This talk studies how to efficiently perform SSJs on large data sets using Hadoop. [...]
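
To make the definition concrete, here is a minimal single-machine sketch of a set-similarity join. It is not the parallel Hadoop approach the talk covers: it simply compares every pair of records. Jaccard similarity is assumed as the similarity function, and the names `jaccard`, `set_similarity_join`, and the sample records are hypothetical, chosen only for illustration.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two sets: |a & b| / |a | b| (assumed metric)."""
    if not a and not b:
        return 1.0  # convention: two empty sets are treated as identical
    return len(a & b) / len(a | b)


def set_similarity_join(records, threshold):
    """Naive all-pairs join: return (id1, id2, similarity) for each pair
    of records whose similarity is at or above the threshold."""
    results = []
    for id1, id2 in combinations(sorted(records), 2):
        sim = jaccard(records[id1], records[id2])
        if sim >= threshold:
            results.append((id1, id2, sim))
    return results


# Hypothetical sample data: record id -> set of tokens
records = {
    "r1": {"hadoop", "map", "reduce"},
    "r2": {"hadoop", "map", "reduce", "join"},
    "r3": {"spam", "filter"},
}
pairs = set_similarity_join(records, 0.7)
print(pairs)
```

The all-pairs loop is quadratic in the number of records, which is exactly why large data sets call for a more efficient, distributed strategy.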

Hadoop2010: Integration Patterns & Practices

Hadoop is a powerful platform for data analysis and processing, but many struggle to understand how it fits in with regard to existing infrastructure and systems. A series of common integration points, technologies, and patterns are defined and illustrated in this presentation. Eric Sammer looks at job initiation, sequencing and scheduling, [...]

Hadoop2010: Online Content Optimization

One of the most interesting problems we work on at Yahoo! is to provide the most relevant content to our users. This involves tracking our users' interests and mining the ever-changing content pool to see what is relevant and popular for them. There is also [...]

Hadoop2010: Cascalog Query Language

Cascalog is an interactive query language for Hadoop with a focus on simplicity, expressiveness, and flexibility, intended for use by analysts and developers alike. Cascalog eschews the SQL syntax for a simpler and more expressive syntax based on Datalog. With this added expressiveness, Cascalog can query existing data stores "out [...]

Hadoop2010: Data Apps & Infrastructure at LinkedIn

LinkedIn runs a number of large-scale Hadoop calculations to power its features, from computing similar profiles, jobs, and companies to predicting People You May Know recommendations that help users find their professional connections. This talk covers how Hadoop fits into a production data cycle for a consumer-scale social network, including [...]

Hadoop2010: Parallel Image Stacking

Keith Wiley, University of Washington, talks about parallel distributed image stacking and mosaicing with Hadoop, and reports on his experience implementing a scalable image-processing pipeline for the SDSS database using Hadoop. This multi-terabyte imaging dataset provides a good testbed for algorithm development since its scope and structure approximate future surveys. His [...]

Hadoop2010: XXL Graph Algorithms

The MapReduce framework is now a de facto standard for massive dataset computations. However, many of the elementary graph algorithms are inherently sequential and appear to be hard to parallelize (often requiring a number of rounds proportional to the diameter of the graph). In this talk, Sergei Vassilvitskii, Yahoo! Labs, describes a [...]

Hadoop2010: Hadoop for Genomics

The field of genomics is of increasing importance to research and medicine. As the physical cost of DNA sequencing continues to drop, biologists are collecting ever larger data sets, requiring more sophisticated data processing. Hadoop is an excellent platform on which to build a consistent set of tools for genomics research. [...]