Parse JSON using Python

Many of Yahoo!'s Web Service APIs provide the option of JSON as an output format in addition to XML. JSON data structures map directly to Python data types, so this is a powerful tool for directly accessing data without having to write any XML parsing code.

JSON Libraries for Python

There are several JSON libraries for Python, but the most popular is simplejson by Bob Ippolito. You can download simplejson from the Python Package Index.

Example: Searching the Web

import simplejson, urllib
APP_ID = 'YahooDemo' # Change this to your API key
SEARCH_BASE = 'http://search.yahooapis.com/ WebSearchService/V1/webSearch'

class YahooSearchError(Exception):
    pass

def search(query, results=20, start=1, **kwargs):
    kwargs.update({
        'appid': APP_ID,
        'query': query,
        'results': results,
        'start': start,
        'output': 'json'
    })
    url = SEARCH_BASE + '?' + urllib.urlencode(kwargs)
    result = simplejson.load(urllib.urlopen(url))
    if 'Error' in result:
        # An error occurred; raise an exception
        raise YahooSearchError, result['Error']
    return result['ResultSet']

Usage of the above:

>>> info = search('json python')
>>> info['totalResultsReturned']
20
>>> info['totalResultsAvailable']
u'212000'
>>> info['firstResultPosition']
1
>>> results = info['Result']
>>> for result in results:
...     print result['Title'], result['Url']
Python Stuff http://undefined.org/python/
SourceForge.net: json-py https://sourceforge.net/projects/json- py
Introducing JSON http://www.json.org/
...

Each result is a Python dictionary containing a number of keys. The full dictionary for a result looks like this:

>>> from pprint import pprint
>>> pprint(results[0])
{u'Cache': {u'Size': u'22237',
            u'Url': u'http://216.109.125.130/search/cache... '},
 u'ClickUrl': u'http://undefined.org/python/#simple_ json',
 u'DisplayUrl': u'undefined.org/python/#simple_json',
 u'MimeType': u'text/html',
 u'ModificationDate': 1150786800,
 u'Summary': u'... Python Stuff. Support the development of py2app, xattr ...',
 u'Title': u'Python Stuff',
 u'Url': u'http://undefined.org/python/#simple_ json'}

The search() function accepts additional keyword arguments, which are directly translated in to query string arguments that are passed to the API endpoint. There is a full list of optional request arguments in the documentation. The following query searches only the developer.yahoo.com site:

info = search('python', site='developer.yahoo.com')

And here's how to search only PowerPoint files (files with a .ppt extension):

info = search('python', format='ppt')

Further information about the fields returned in the response can be found in the API documentation.

Error handling

The search() function handles API errors by checking for the 'Error' key in the JSON response and, if it is found, converting it in to a YahooSearchError exception. You can check for that exception using the following code:

try:
    info = search('')
except YahooSearchError, e:
    print "An API error occurred."

e.args will contain further information about the error.

Since the search() function uses urlopen(), it is also possible that an IOError will be thrown - if the network is not available for example. Here's a more robust code example that handles that eventuality as well:

try:
    info = search('')
except YahooSearchError, e:
    print "An API error occurred."
except IOError:
    print "A network IO error occured."

Further reading

Related information on the web

Yahoo Forum Discussions