Blog Posts by Yahoo! Developer Network

  • New YQL release makes select* from internet even more convenient

    We've launched some pretty big updates during the last few months: support for running server-side JS code in your Open Data Tables, and new verbs for modifying and updating data on the web. Our release today consists of a lot of features (and bug fixes) that were identified by developers on Twitter, on our forums, or by our colleagues and ourselves. We use YQL all the time, and certain things annoy us too!

    The two major new feature additions are: the SET verb and the yql.storage tables.

    SET enables you to configure static variables, such as API keys, secrets, and other required values, independently of YQL statements and web service calls. Why is this so useful? Consider the following example, which uses the Guardian content search Open Data Table:

    select * from guardian.content.search where api_key="1234567890" and q='environment'

    Currently, every time I use the Guardian tables, I need to send my API key, which rarely changes. With SET, I can just define it
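
    As a rough illustration of the idea, the Python sketch below builds the Guardian query from above in a second form, with the API key declared once up front instead of inside the select. The `set name="value" on prefix;` form is an assumption about the SET syntax, since this excerpt is cut off before showing it; `with_set` and the `guardian` prefix are hypothetical names chosen for the example.

```python
def with_set(statement, settings, prefix):
    """Prepend one SET statement per setting to a YQL statement.

    Assumption: SET takes the form `set name="value" on prefix;`.
    This helper is hypothetical and only assembles text; it sends nothing.
    """
    sets = ['set {}="{}" on {};'.format(name, value, prefix)
            for name, value in settings.items()]
    return " ".join(sets + [statement])


# Without SET: the key travels inside every statement.
inline = ("select * from guardian.content.search "
          "where api_key=\"1234567890\" and q='environment'")

# With SET: the key is declared once and the select carries only the search term.
query = with_set(
    "select * from guardian.content.search where q='environment'",
    {"api_key": "1234567890"},
    prefix="guardian",
)
print(query)
```

    The point of the split is that the static part (the key) can live apart from the statement that changes from call to call.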

  • Mahadev Konar – ZooKeeper

    Mahadev Konar from Yahoo! introduces ZooKeeper, an Apache Hadoop project. ZooKeeper is a highly available, scalable, distributed service for configuration, consensus, group membership, leader election, naming, and coordination. This is a deep dive into the features and functionality ZooKeeper provides, with examples of how it's used at Yahoo!.



  • Owen O’Malley on the Future of MapReduce

    Owen O'Malley, Hadoop architect at Yahoo!, describes the current state of affairs for MapReduce jobs in Hadoop (which hasn't changed much since 2005), and walks through a variety of approaches that are being implemented to ensure better forward and backward compatibility.



  • Alejandro Abdelnur on Workflow – Oozie

    Yahoo!'s Alejandro Abdelnur presents an in-depth look at Oozie, a server workflow engine that runs MapReduce and Pig jobs.



  • Alan Gates – Getting more out of Pig

    The Apache site describes Pig as “... a platform for analyzing large data sets that consists of a high-level language for expressing data analysis programs, coupled with infrastructure for evaluating these programs. The salient property of Pig programs is that their structure is amenable to substantial parallelization, which in turns enables them to handle very large data sets.”

    Yahoo! Research developed Pig to facilitate programming in Hadoop. As quoted in The Register, Doug Cutting describes Pig as “SQL for MapReduce.”

    In this video, Alan Gates from Yahoo!’s Grid team describes what Pig is, why you should use it, what you can do with it, and future plans for the Pig project.



  • Zheng Shao and Namit Jain – Hive

    Hive is open-source data warehouse infrastructure built on top of Hadoop, started at Facebook. In this talk, Namit Jain and Zheng Shao discuss how and why Facebook uses Hive. They present Hive's progress and roadmap and describe how the open source community can contribute to the evolution of Hive.

    Hive is a system for managing and querying structured data built on top of Hadoop: it uses MapReduce for execution, HDFS for storage, and adds metadata on raw files.

    Advanced data warehousing is a *huge* priority for Facebook: in March 2008 the service was generating about 1TB per day; by mid-2009, data production had increased to 10TB per day.



  • Yahoo! Engineers talk about the new homepage

    In case you missed this little gem when it ran on the Yahoo! homepage last week, here's a very compressed look at some of the many technologies--from frontend (YUI 3) to back (Hadoop)--that go into producing the Yahoo.com homepage. This video nano-story is told by the developers who built the new homepage, and the engineers who make sure it's readily available (and personally relevant) to the millions of people around the world who think of it as home. Enjoy!

  • Today’s News and Yahoo!’s Developer Program

    Given the Yahoo! Microsoft news today, many of you are wondering what will happen to Yahoo!’s search offerings for developers. In particular, we’ve received a number of questions about our two most popular search services: SearchMonkey, which allows developers to use structured data to enhance the usefulness of Yahoo! search results, and BOSS, our popular full-featured search API.

    For SearchMonkey and BOSS, we currently do not have anything concrete to tell you. Clearly, we’ll need to work with Microsoft to determine what makes the most sense for you and for us. For more details, please see Ashim Chhabra's post to developers on the Yahoo! Search BOSS group.

    We’ve also received questions about the future of Yahoo!'s other developer offerings, such as YUI, YQL, and Pipes. We wanted to let you know that today’s news does not affect these products. None of our non-search developer products are affected. Yahoo! remains fully committed to supporting and adding new features to these

  • Hadoop Summit: Futures Panel

    In the closing session of the Hadoop Summit Developers' Track, a panel of Hadoop leaders takes a look at what's ahead. Yahoo!'s Sanjay Radia discusses backwards compatibility and the future of HDFS; Owen O'Malley covers MapReduce and security futures; Doug Cutting, the father of Hadoop, talks about Avro, a serialization system; Cloudera's Tom White discusses tools and usability; Facebook's Joydeep Sen Sarma talks about Hive; and Yahoo!'s Alan Gates looks at Pig, SQL, and metadata. The panel session ends with a lively Q&A.



  • Thomas Sandholm & Dejan Milojicic – OpenCirrus

    Thomas Sandholm and Dejan Milojicic, from HP Labs, describe "Hadoop Scheduling in the OpenCirrus Cloud Testbed." OpenCirrus is a consortium of sites that each contribute at least 1000 cores to the testbed. The OpenCirrus Cloud Testbed is sponsored by HP, Intel, and Yahoo!, with additional support from NSF, and participation from other entities. There are currently 9 sites that share research, applications, infrastructure, and data sets. Their goal: spurring innovation in cloud computing, offering global services, gaining insight into platform and architectural features, and providing open source components like Hadoop.



