Creating YQL Open Data Tables

In this Chapter:

Before You Begin

Before reading this chapter, you should first read Introducing YQL and Using YQL and Open Data Tables. Also, when naming your open data table, you cannot use these YQL reserved words in the table name.

Open Data Tables Reference

The following reference describes the structure of an Open Data Table definition:

The following elements and sub-elements of YQL Open Data Tables are discussed in this reference:

table element

Full Path: root element


<table xmlns="">

This is the root element for the document. A table is the level at which an end-user can ‘select’ information from YQL sources. A table can have many different bindings or ways of retrieving the data. However, we advise that a single table produce a single type of result data.

Attribute Value(s) Notes
xmlns URL The XML Schema file related to this Open Data Table definition.
securityLevel enumeration, any / app / user

The authorization level required to access.

any: Anonymous access; any user can access this table.

app: 2-legged OAuth; involves authorization without access to private user data.

user: 3-legged OAuth, involves authorization of access to user data.

For more information, refer to Open Data Tables Security and Access Control.

https boolean, true or false If true, the table is only available if the user is connected via HTTPS. If missing or false, either HTTP or HTTPS connections are acceptable.

meta element

Full Path:table/meta


  <author>Yahoo Inc.</author>
  <sampleQuery>select * from {table} where has_geo="true" and text="san francisco"</sampleQuery>

Along with the table element, you are required to include the meta sub-element, which provides the following information:

Attribute Description Notes
apiKeyURL The URL to apply for an API key/token for the given table. Only one URL can be specified.
author Information regarding the author of this Web service Examples of author information include an unformatted email, name, or other related information.
documentationURL Additional information about this table or the select called by the table can be found here More then one documentationURL element may be included for each table.
description Plain text description about the table A description of the table.
sampleQuery A sample query that users can run to get output from this table. {table} ``\ should be used in place of the table name, so that the sample can run when used in different namespaces. Multiple ``sampleQuery elements may occur. Each sampleQuery may have a description attribute that contains a description about the sample.

select / insert / update / delete elements

Full Path:






 <select itemPath="" produces="XML">



Situated within each bindings element, there are one of four keywords: select, insert, update, or delete.

The select element describes the information needed for YQL to read data from an API. The insert and update elements describe the information needed to add or modify data from an API, respectively. When removing data, the delete element is used to describe the necessary bindings.

When a keyword such as select or update is repeated within the bindings array, it can be considered to be an alternative way for YQL to call a remote server to get the same type of structured data. Typically, this is needed when the service supports different sets of query parameters (YQL’s “keys”) or combinations of optional query parameters.

Attribute Value(s) Notes
itemPath URLs A dot-path that points to where the repeating data elements occur in the response format. These are the “rows” of your table.
pollingFrequencySeconds Integer The frequency in seconds that the YQL Engine should update the response from a data source. See YQL Streaming for more information.
produces enumeration, XML / JSON The type of data coming back from the Web service.


Unlike XML, JSON objects have no “root” node. To work with the dot notation, YQL creates a “pseudo” root node for JSON responses called “json”. If you need to return a sub-structure from your Open Data Table that fetches or produces JSON, you’ll need to add “json” at the root of the path.

function Element

A stored procedure to bind and run, similar to the built-in Sort and Other Functions.

. Full Path: table/bindings/function


    <function name="concat" />
Attribute Type Description
name string The name of the function.
type enumeration

The type of function. Allowed values:

  • stream - (default) This type of function gets called for each item at which point it can modify and return the modified version of the item. Example: Decorate an item with extra information.
  • reduce - This type of function gets called for each item which is collected/stored and then finally, gets one last call to return the reduced or aggregated response. Examples of YQL built-in reduce functions: sort, truncate, count.

url Element

Full Path: table/bindings/select/urls/url

This is where YQL and the table supporting the service come together. The url element describes the URL that needs to be executed to get data for this table, given the keys in the key elements.


The CDATA/TEXT for this element contains the URL itself that utilizes substitution of values at runtime based on the uri template spec (v3). The names of the values will be substituted and formatted according to the uri template spec, but the simplest method is simply to enclose a key name with curly braces (`` {} ``):

  • All {name}keys found in the URL will be replaced by the same id key value in the keys elements.
  • YQL currently supports both http and https protocols.


``{publisher}/notification/{bucket}.xml ``

YQL will look for key elements with the names publisher and bucket. If the YQL developer does not provide those keys in the WHERE clause (and they are not optional), then YQL detects the problem and will produce an error. If an optional variable is not provided, but is part of the Open Data Table definition, it will be replaced with an empty string. Otherwise, YQL will substitute the values directly into the URL before executing it.

execute Element

Full Path: table/bindings/select/execute

The execute sub-element allows you to invoke server-side JavaScript in place of a GET request. For more information on executing JavaScript, refer to Executing JavaScript in Open Data Tables.


     // Include the flickr signing library
     // GET the flickr result using a signed url
     var fs = new flickrSigner(api_key,secret);
     response.object ={method:method, format:""})).get().response();

key / value / map elements

Full Paths:````





  <key id='guid' type='xs:string' paramType='path' required="true" />
  <key id='ck' type='xs:string' paramType='variable' required="true" />
  <key id='cks' type='xs:string' paramType='variable' required="true" />
  <value id='content' type='xs:string' paramType='variable' required="true" />

There are three type of elements available within the inputs element: key, value, and map.

key element

Each key element represents a named “key” that you provide in the WHERE or INTO clause of SELECT, INSERT, UPDATE, or DELETE statements. YQL inserts these values into the URL request before it is sent to the server. YQL inserts these values into the URL request if the paramType is set to query or path or header. For a variable type, the key named as the id of the element is made available in the execute section of the Open Data Table.

value element

Use the value element to assign a new “value” or update an existing one within an Open Data Table. The value element defines a field that can only be set as an input and therefore cannot be in YQL statements to satify the “where” clause. The value element only works with the INSERT and UPDATE verbs and in different ways.

When used with the insert keyword, the value element appears in the VALUE expression of the YQL statement, indicating that a new value is being passed into the YQL statement, as seen in the following example:

``INSERT into bitly.shorten (login, apiKey, longUrl) VALUES (‘YOUR_LOGIN’, ‘YOUR_API_KEY’, ‘‘) ``

When used with the update keyword, the value element is called from the SET portion of the YQL statement. This indicates that you are “setting” a particular value, as seen in the following example:

``UPDATE table SET status=’Reading the YQL Guide’ where guid = me; ``

map element

Use the map element when you want to use dynamic keys. With this element, YQL uses the value you pass in through the YQL statement as a variable. This variable is used within the execute portion of your Open Data Table to determine what action to take. For example, you may set up a YQL Open Data Table that updates either,, or tinyurl, depending on the value you specify in the YQL statement.

For a dynamic key called type, the actual ID in a YQL query would look like the following:

``field.type = ‘Java’ ``


In the absence of the map element as a binding, all identifiers, not corresponding to a binding element and that appear in a YQL query, are treated as local filters.

The map element can be used for all the four paramTypes. Here is an example of the map element being used in a path:

<map id="field" paramType="path"/>

For a query containing the relational expression field.type='rss', only the dynamic parameter name type would be substituted in the urls element. The URI template would look like the following:{type}/topstories

Key, Value, and Map Element Support within YQL Statements

The following table shows the keywords that support the key, value, and map elements:

  select insert update delete
key yes yes yes yes
value no yes yes no
map yes yes yes yes

Attributes for Key, Value, and Map Elements

The following table provides the attributes available within key, value and map elements:

Attribute Value(s) Supported Keywords Notes
id string select, insert, update, delete The name of the key. This represents what the user needs to provide in the WHERE clause.
as string select, insert, update, delete

The alias of the key used in YQL statements.

If the Web source used in the Open Data Table uses a cryptic or poorly named query parameter, you can use as to specify an alias that developers use in the YQL statement. For example, perhaps you have an id called “q” within your Open Data Table, which actually is a search parameter.

Without aliasing, the equivalent YQL statement would look like this:

select * from where q = "pizza"

You can use the as attribute to create an alias in the following way:

<key id="q" as="query" type="xs:string" paramType="query"/>

You then can use search in your YQL statement like this:

select * from where query = "pizza"

type string select, insert, update, delete The type of data coming back from the Web service.
required boolean select, insert, update, delete A boolean that answers the question: Is this key required to be provided in the WHERE clause on the left-hand side of an equality statement? If not set, any key is optional.
paramType enumeration select, insert, update, delete

Determines how this key is represented and passed on to the Web service:

- query: Add the id and its value as a id=value query string parameter to the URL. - matrix: Add the id and its value as a id=value matrix parameter to the URL path. - header: Add the id and its value as a id: value as an HTTP header to the URL request. -`` path``: Substitute all occurrences of {id} in the url string with the value of the id. Not necessarily only in the path. -`` variable``: Use this key or field as a variable to be used within the execute sub-element instead of being used to format or form the URL.

default string select, insert, update, delete This value is used if one isn’t specified by the developer in the SELECT.
private boolean select, insert, update, delete Hide this key’s value to the user (in both “desc” and “diagnostics”). This is useful for parameters like appid and keys.
const boolean select, insert, update, delete A boolean that indicates whether the default attribute must be present and cannot be changed by the end user. Constant keys are not shown in desc [table].
batchable boolean select, update, delete

A boolean which answers the question: Does this select and URL support multiple key fetches/requests in a single request (batched fetching)?

For more information about batching requests, refer to Batching Multiple Calls in a Single Request.

maxBatchItems integer select, update, delete

How many requests should be combined in a single batch call.

For more information about batching requests, refer to Batching Multiple Calls in a Single Request.

Aliasing within Key, Value, and Map Elements

If you have an obscurely named id in your Open Data Table, you can use an alias to refer to it within YQL statements. For example, perhaps you have an id called “q” within your Open Data Table, which actually is a search parameter. You can use “as” to create an alias in the following way:

<key id="q" as=type="xs:string" paramType="query"/>

select * from where search ="pizza"

paging element

Full Path: table/bindings/select/paging


<paging model="page">
  <start id="page" default="0" />
  <pagesize id="per_page" max="250" />
  <total default="10" />
<paging model="url">
  <nextpage path="ysearchresponse.nextpage" />
<paging model="offset" matrix="true">
  <start id="page" default="0" />
  <pagesize id="per_page" max="250" />
  <total default="10" />

This element describes how YQL should “page” through the web service results, if they span multiple pages, or the service supports offset and counts.

Attribute Value(s) Supported Keywords Notes
model offset, page, url select

The type of model to use to fetch more than the initial result set from the service.

The offset refers to services that allow arbitrary index offsets into the result set.

Use the page value for services that support distinct “pages” of some number of results.

Use the url value for services that support a URL to access further data.

matrix true, false select A boolean that answers the question: Is the parameter matrix style (part of the URI path; delimited), or query parameter style? The default value is false.


When using the url paging model, you can also use the pagesize element to, if the Web service allows, adjust the number of results returns at once.

pagesize element

Full Path: table/bindings/select/paging/pagesize

This element contains Information about how the number of items per request can be specified.

Attribute Value(s) Notes
max integer The maximum size of the requested page. If the total requested is below the max pagesize, then the pagesize will be the total requested. Otherwise, the max pagesize will be the size of the page requested.
id string The name of the parameter that controls this page size.

start element

Full Path: table/bindings/select/paging/start

This element contains Information about how the “starting” item can be specified in the set of results.

Attribute Value(s) Notes
default integer The starting item number (generally 0 or 1); for paging style this value always defaults to 1.
id string The name of the parameter that controls the starting page/offset.

total element

Full Path: table/bindings/select/paging/total

This element contains Information about the total number of results available per request by default.

Attribute Value(s) Notes
default integer The number of items that come back by “default” in YQL if the()syntax is not used when querying the table.

nextpage element

Full Path: table/bindings/select/paging/nextpage

This element contains the location of the next page of results. This is an optional element that is used in conjunction with the parent url element.

Attribute Value(s) Notes
path string The path to the next page of results

YQL Streaming

In general, when a YQL statement references an Open Data Table, the YQL Web service will make an HTTP call based on the URLs listed in the bindings element or execute the JavaScript in the execute element. To get updated results from the YQL Web service, you need to execute this YQL statement again. YQL streaming allows you to get updated results in a different way.

With YQL streaming, your Open Data Table can request YQL to poll data from your resource URI at a specified interval, process the responses, and return the results to the client, without the client having to execute the same YQL statements over and over again to get the updated results; this will reduce network latency and the need to store responses.

Configuring Open Data Tables to Use Streaming

To request streaming in your Open Data Table, you use the pollingFrequencySeconds attribute of the `select binding to specify the interval that you want YQL to make requests. For example, the code snippet below configures YQL to poll the URL for Foursquare search every two seconds:

  <select itemPath="" pollingFrequencySeconds="2"/>
    <key id="oauth_token" type="xs:string" paramType="query" required="true" />
    <key id="phone" type="xs:string" paramType="query" />
    <key id="email" type="xs:string" paramType="query" />
    <key id="twitter" type="xs:string" paramType="query" />
    <key id="twitterSource" type="xs:string" paramType="query" />
   <key id="fbid" type="xs:string" paramType="query" />
   <key id="name" type="xs:string" paramType="query" />


When setting a polling frequency with pollingFrequencySeconds, try to match the update frequency of your data source. If you configure YQL to poll the source every two seconds, but the source usually updates data every minute, you will be increasing the network latency for YQL without the benefit of getting new data.

Using Open Data Tables Configured for Streaming

To use Open Data Tables that are configured for streaming, you must call the YQL Web Service URLs for streaming. You use YQL statements and invoke the Open Data Table in the same way as you would for any Open Data Table.

The following YQL Service URL is used to get public data from an Open Data Table that is configured to use YQL streaming:

The YQL Service URL below requires authorization by OAuth and allows access to both public and private data from an Open Data Table that is configured to use YQL streaming:

The following PHP code snippet shows how to get public data from the Open Data Table books_warehouse.xml that is configured to use YQL streaming:

// The YQL Service URL for streaming public data
// Form YQL query and build URI to YQL Web service    $yql_query = "use '' as bw; select * from bw where isbn-10='9781849510707'";

$yql_query_url = $YQL_STREAMING_URL . "?q=" . urlencode($yql_query) . "&format=json";

// Make call with cURL and get returned response
$session = curl_init($yql_query_url);    curl_setopt($session, CURLOPT_RETURNTRANSFER,true);
$json = curl_exec($session);

// Convert JSON to PHP object
$phpObj =  json_decode($json);

Debugging Open Data Tables and YQL Network Calls

To aid in your debugging efforts, YQL provides the option to have network-level logging. When enabled, all network requests are uncached, so you can iteratively develop Open Data Tables more easily, as well as debug network requests between YQL and the remote service. Network logs display both the request headers and the response content for each network call.


Network capture stops once YQL encounters a default (built-in) table or table fetched from, even if subsequent requests are made using an Open Data Table.

When you enable network-level logging, YQL provides a key within the diagnostics element for each network call that occurs, seen in the following YQL response snippet as id attributes:

     <cache execution-start-time="89" execution-stop-time="90" execution-time="1" method="PUT" type="MEMCACHED"><![CDATA[debug:1492ed2c-d3ba-4376-a731-1382bcdd41e6]]>
    <url execution-start-time="1" execution-stop-time="90" execution-time="89" id="1492ed2c-d3ba-4376-a731-1382bcdd41e6"><![CDATA[;start=0;count=10]]>

The id key can be used within 5 minutes of an execution to see log data.

Enabling Logging

To enable network-level logging, you simply append debug=true to the YQL console URL or API query like this:*%20from%20geo.places%20where%20text%3D%22sunnyvale%22%0A&diagnostics=true&debug=true

Viewing Logs

You can access network-level logs within 5 minutes of running a YQL statement or call. You can view the logs using the YQL Console or using the YQL log URL with the query string parameter id. We will discuss the id attribute and then show how to view logs in the following sections.

id Attribute

When you make a request to the YQL Web Service and add the query string parameters diagnostics=true&debug=true, the response will contain diagnostic information that can be used for debugging. In the returned diagnostic information (shown below), you will see a url element that has the id attribute we mentioned earlier.

  <cache execution-start-time="928" execution-stop-time="928" execution-time="0" method="PUT" type="MEMCACHED"><![CDATA[debug:bd5757d7-5568-465f-b83b-0fb48f987a9f]]></cache>
  <url execution-start-time="1" execution-stop-time="928" execution-time="927" id="bd5757d7-5568-465f-b83b-0fb48f987a9f" proxy="DEFAULT"><![CDATA[]]></url>

This id is associated with a log message and is required to view log messages. We will be showing you how to use the id in the next two sections to view logs from the YQL Console and using the YQL Log API.


For both community and built-in tables, if the env parameter is included in the YQL Web Service URL, the id attribute will not be returned in the response. When using community tables, to have the id attribute returned in the response, you must instead include the table with a USE statement.

Using the YQL Console

  1. From the YQL Console, check the Diagnostics and Debug checkboxes, enter your YQL statement, and click TEST.
  2. From the returned response in the FORMATTED tab, click on the value for the id attribute of the url element in the diagnostic information.
  3. A new tab will open showing log data that includes the HTTP request, the HTTP headers, and the raw response.

Using the YQL Log URL

  1. Append the query string debug=true&diagnostics=true to one of the YQL Web service URLs.

    For example:<your_query_statement>&debug=true&diagnostics=true

  2. Make the HTTP request to the YQL Web service URL with the debug and diagnostics query parameters.

  3. From the returned response, copy the value for the id key provided in the url element of the diagnostics information and append it to the YQL log URL as shown here:<id_value>

  4. You will be returned the dumped log data that includes the HTTP request, the HTTP headers, and the raw response.

Open Data Table Examples

This section includes a few examples of Open Data Tables that showcase the ability of YQL to gather data from external APIs.


For a much larger list of publicly available Open Data Tables, refer to

Open Data Tables Security and Access Control

The securityLevel attribute of the table element determines the type of authentication required to establish a connection. In order for a user to connect to your table, the user must be authorized at the level or higher than the level indicated in the securityLevel attribute. The following table lists whether access is available depending on the value in the securityLevel attribute.

Security of Table (access attribute) Anonymous / No Authorization 2-legged OAuth 3-legged OAuth / cookie
any yes yes yes
app no yes yes
user no no yes

For more information about each level, refer to the securityLevel attribute in the table element.

Batching Multiple Calls into a Single Request

YQL Open Data Tables support the ability to send a list of keys in a single request, batching up what would otherwise be a multiple calls.


In order to do batching in YQL Open Data Tables, the source must support it. An example of a source that supports batching is the Yahoo Social Directory call for ``profiles` <>`_)

Let’s take the example of the Social Directory API and see the URI for profile data:{guid}/profile

In YQL, the table for retrieving this data is social.profile. Here is an example that includes a sub-select:

select * from social.profile where guid in (select guid from social.connections where owner_guid = me)

When performing sub-selects, the inner sub-select returns a set of values, each of which is a call to the URI above. So if a Yahoo user has 3 connections in his profile, the sub-select makes three calls to the Social Directory API:

Fortunately, the Social Directory URI above also supports batching, so a single call can be made to get all three profiles:,2,3)/profile

Since the Social Directory API supports batching, YQL can enable this by defining the key ``guid` as batchable with an extra parameter that denotes the max number of batch items per request:

<key id="guid" type="xs:string" paramType="path" batchable="true" maxBatchItems="3"/>

We also need to modify the Open Table definition to support multiple values for the GUID. Combining the modification to the Open Table definition above with the one below results in a batch call to the Social Directory API for not more than 3 profiles:

<url env="int">{-listjoin|,|guid})/profile</url>


The following section deals with issues you may have while using YQL Open Data Tables:

  • Problem: The SELECT URL doesn’t parse correctly in my Open Data Table definition.

    Solution: Make sure you’ve escaped things correctly for XML; for example: & should be encoded as &amp;.

  • Problem: My Open Data Table definition has multiple bindings with different sets of keys. YQL keeps running the “wrong” select. How can I get YQL to choose the right one?

    Solution: Keep in mind that the order of bindings is important. Once YQL finds a select that satisfies the YQL statement, it uses that one. Try moving the more “specific” select endpoint above the others.

  • Problem: If my API requires authentication, how do I access it?

    Solution: If you want to use an API that requires its own authentication mechanism, you use the execute sub-element within an Open Data Table to manage this authentication.

  • Problem: Open Data Tables seem so complicated? What is the best way to get started?

    Solution: The best way to avoid being overwhelmed is to first look at examples. In general, when creating YQL tables, it is useful to take a bottom-up approach and analyze the result structure of the API(s) that you are encapsulating. First, group together all the services that produce the same result structure. This becomes your “table” or Open Table definition. For each API that produces the response structure, you should create a “select” under the “request” section of the Open Data Table definition. By using this mechanism, you can often consolidate multiple API’s into a single versatile YQL table that allows YQL to do the heavy lifting and keep the implementation details hidden.