BOSS API Guide

Chapter 2. Web Search

Table of Contents

Service URL Syntax
Simple Query Example
Simple XML Response
Optional Arguments (in addition to Universal arguments described in BOSS overview)
filter
type
view=keyterms
view=searchmonkey
Delicious data
abstract
Response Fields
Recommended Region/Language Usage

The BOSS examples in this section contain reserved characters that need to be escaped. However, for legibility the examples do not display the escape values. Therefore, if you use these examples verbatim to test queries, they will not work. For a list of all reserved characters and their escape values, see “Reserved Characters and Escape Values”.

Service URL Syntax

http://boss.yahooapis.com/ysearch/web/v1/{query}?appid={yourBOSSappid}[&param1=val1&param2=val2&etc]

Simple Query Example

http://boss.yahooapis.com/ysearch/web/v1/foo?appid={yourBOSSappid}&format=xml
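Because reserved characters must be escaped before a request is sent, it can help to build request URLs programmatically. The following is a minimal sketch, assuming the escape values in this guide correspond to standard percent-encoding. The helper name `build_web_search_url` and the placeholder application ID are illustrative only and are not part of BOSS.

```python
from urllib.parse import quote, urlencode

BASE_URL = "http://boss.yahooapis.com/ysearch/web/v1/"


def build_web_search_url(query, appid, **params):
    """Return a Web Search URL with the query and argument values percent-encoded."""
    # safe="" also escapes "/" so the query stays a single path segment.
    path = quote(query, safe="")
    # quote_via=quote encodes spaces as %20 rather than "+".
    args = urlencode({"appid": appid, **params}, quote_via=quote)
    return f"{BASE_URL}{path}?{args}"


# "YOUR_APP_ID" is a placeholder, not a real application ID.
print(build_web_search_url("Car racing", "YOUR_APP_ID", format="xml"))
# prints http://boss.yahooapis.com/ysearch/web/v1/Car%20racing?appid=YOUR_APP_ID&format=xml
```

Any optional argument described in this chapter can be passed as a keyword argument, for example `filter="-porn"`.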

Simple XML Response

<ysearchresponse responsecode="200">
  <nextpage><![CDATA[/ysearch/web/v1/foo?appid={yourBOSSappid}&format=xml&start=10]]></nextpage>
  <resultset_web count="10" start="0" totalhits="29440998" deephits="881000000">
    <result>
      <abstract><![CDATA[World <b>soccer</b> coverage
                from ESPN, including Premiership, Serie A, La Liga, and Major League
                <b>Soccer</b>. Get news headlines, live scores, stats, and
                tournament information.]]></abstract>
      <date>2008/06/08</date>
      <dispurl><![CDATA[www.<b>soccernet.com</b>]]></dispurl>
      <clickurl>http://us.lrd.yahoo.com/_ylc=X3oDMTFkNXVldGJyBGFwcGlkA2Jvc3NkZW1vBHBvcwMwBHNlcnZpY2UDWVNlYXJjaARzcmNwdmlkAw--
                /SIG=10u3e8260/**http%3A//www.soccernet.com/</clickurl>
      <size>94650</size>
      <title>ESPN Soccernet</title>
      <url>http://www.soccernet.com/</url>
    </result>
  </resultset_web>
</ysearchresponse>
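A response of this shape can be read with a standard XML parser. The sketch below uses a trimmed copy of the sample above and assumes the response carries no XML namespace, as shown here; a namespaced response would need qualified tag names. The names `SAMPLE` and `parse_web_results` are illustrative.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the sample response above.
SAMPLE = """<ysearchresponse responsecode="200">
  <nextpage><![CDATA[/ysearch/web/v1/foo?appid={yourBOSSappid}&format=xml&start=10]]></nextpage>
  <resultset_web count="10" start="0" totalhits="29440998" deephits="881000000">
    <result>
      <abstract><![CDATA[World <b>soccer</b> coverage from ESPN.]]></abstract>
      <date>2008/06/08</date>
      <title>ESPN Soccernet</title>
      <url>http://www.soccernet.com/</url>
    </result>
  </resultset_web>
</ysearchresponse>"""


def parse_web_results(xml_text):
    """Collect the response code, hit count, next-page path, and result fields."""
    root = ET.fromstring(xml_text)
    resultset = root.find("resultset_web")
    # Each <result> becomes a dict keyed by child tag name (title, url, ...).
    results = [
        {child.tag: (child.text or "").strip() for child in result}
        for result in resultset.findall("result")
    ]
    return {
        "responsecode": int(root.get("responsecode")),
        "totalhits": int(resultset.get("totalhits")),
        "nextpage": root.findtext("nextpage"),
        "results": results,
    }


parsed = parse_web_results(SAMPLE)
print(parsed["totalhits"], parsed["results"][0]["title"])
# prints 29440998 ESPN Soccernet
```

The parser unwraps CDATA sections, so the abstract text still contains its `<b>` highlight tags as literal characters.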

Optional Arguments (in addition to Universal arguments described in BOSS overview)

filter

Filter out adult or hate content.

type

Specifies document formats (pdf, msoffice, etc.).

view

Syntax: view=view1,view2, etc.

  • view=keyterms retrieves related words and phrases for each search result.

  • view=searchmonkey_feed retrieves structured data markup, if available, for the search result in dataRSS format.

  • view=searchmonkey_rdf retrieves structured data markup, if available, for the search result in RDF format.

  • view=delicious_toptags retrieves the top public Delicious tags for a document and the count associated with each tag.

  • view=delicious_saves retrieves the number of times a document was saved in Delicious.

  • view=language identifies the language of the document.

See the view sections below for more detail.

abstract

abstract=long retrieves an abstract of a web document of up to 300 characters. This expanded abstract gives the requester more text to work with. The default is an abbreviated description.

When entering Web request arguments you must escape the reserved characters to use them in argument values, although they are sometimes shown unescaped for readability. See “Reserved Characters and Escape Values”.

The following arguments are specific to web search.

filter

Optional: This argument filters out results flagged as containing specific kinds of content. Yahoo! currently supports values for filtering out results flagged as containing pornographic and hate-related content. The Filter argument accepts the values -porn to exclude pornographic content and -hate to exclude hate-related content. The syntax of the filter argument is:

filter=[-porn] [-hate]

Either value or both may be specified. The filter argument applies only to documents in the following languages:

Language      Valid values
Chinese       -porn
Danish        -porn
Dutch         -porn
English       -hate, -porn
Finnish       -porn
French        -porn
German        -porn
Italian       -porn
Japanese      -porn
Korean        -porn
Norwegian     -porn
Portuguese    -porn
Spanish       -porn
Swedish       -porn

If you do not specify hate or porn filtering, the results default to unfiltered content.

In rendering a result set, Yahoo! automatically demotes adult content, so only users actively searching for adult content are likely to see pornographic results. We recommend that you use filter=-porn only in a restricted search environment intended to reduce the incidence of pornographic results. Turning on filtering may reduce result relevancy. Also note that, by their nature, the methods used to flag results as -porn or -hate cannot take into account the subjective, widely varied interpretations and categorizations of such content. You are responsible for your use of the results, regardless of their designation.

Example: http://boss.yahooapis.com/ysearch/web/v1/paris?appid={yourBOSSappid}&format=xml&count=2&filter=-porn
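Since filter values apply only to certain document languages, a client may want to check a requested value against the language list before sending it. The following is a minimal sketch that transcribes the list above into a lookup table; the names `SUPPORTED_FILTERS` and `unsupported_filters` are illustrative and not part of BOSS.

```python
# Filter values per document language, transcribed from the language list above.
PORN_ONLY = frozenset({"-porn"})
SUPPORTED_FILTERS = {
    language: PORN_ONLY
    for language in (
        "Chinese", "Danish", "Dutch", "Finnish", "French", "German", "Italian",
        "Japanese", "Korean", "Norwegian", "Portuguese", "Spanish", "Swedish",
    )
}
SUPPORTED_FILTERS["English"] = frozenset({"-porn", "-hate"})


def unsupported_filters(language, requested):
    """Return the requested filter values that are not listed for the language."""
    supported = SUPPORTED_FILTERS.get(language, frozenset())
    return sorted(set(requested) - supported)


print(unsupported_filters("German", ["-porn", "-hate"]))
# prints ['-hate']
```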

type

Capital letter A-Z attributes may appear in search results. These and other attribute categories not mentioned above are experimental and will remain undocumented for now.

Optional: This argument specifies which document types to return. The argument value is a comma-separated list of the document types or type groups to include. A type group is a logical collection of several document types, provided for convenience. The type argument currently supports the following document types:

  • html

  • text

  • pdf (Adobe Portable Document Format)

  • xl (Microsoft Excel: xls, xla, xl)

  • msword (Microsoft Word)

  • ppt (Microsoft PowerPoint)

The type argument currently supports the following type groups:

  • msoffice: xl, msword, ppt

  • nonhtml: text, pdf, xl, msword, ppt

You can also specify a type group and then exclude an item:

  • type=msoffice,-ppt

This example restricts results to the nonhtml type group (text, pdf, xl, msword, ppt):

  • type=nonhtml

You can combine inclusion, exclusion, document types, and type groups like this:

  • type=html,msoffice,-pdf

Example:

http://boss.yahooapis.com/ysearch/web/v1/moon?appid={yourBOSSappid}&format=xml&count=2&type=msoffice
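To see which document types a given type value covers, the groups and exclusions can be expanded on the client side. This sketch only illustrates the rules described above. It assumes exclusions are applied after all inclusions regardless of their order in the list, which this guide does not specify, and the function name `expand_type_value` is illustrative.

```python
DOCUMENT_TYPES = {"html", "text", "pdf", "xl", "msword", "ppt"}
TYPE_GROUPS = {
    "msoffice": {"xl", "msword", "ppt"},
    "nonhtml": {"text", "pdf", "xl", "msword", "ppt"},
}


def expand_type_value(value):
    """Expand a type argument value into the set of document types it selects."""
    included, excluded = set(), set()
    for token in value.split(","):
        token = token.strip()
        # A leading "-" marks an exclusion.
        target = excluded if token.startswith("-") else included
        name = token.lstrip("-")
        if name in TYPE_GROUPS:
            target.update(TYPE_GROUPS[name])
        elif name in DOCUMENT_TYPES:
            target.add(name)
        else:
            raise ValueError(f"unknown document type or group: {name!r}")
    # Assumption: exclusions win over inclusions, whatever their position.
    return included - excluded


print(sorted(expand_type_value("msoffice,-ppt")))
# prints ['msword', 'xl']
```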

view=keyterms

Optional: The keyterms field can be used with the view argument to retrieve related words and phrases for each search result.

Example:

http://boss.yahooapis.com/ysearch/web/v1/ireland?appid={yourBOSSappid}&format=xml&view=keyterms

Example XML Output keyterms:

<keyterms>
  <terms>
   <term>Ireland</term>
   <term>island</term>
   <term>the Irish</term>
  </terms>
</keyterms>
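The keyterms fragment above can be turned into a plain list of terms. This is a minimal sketch that parses the fragment on its own; where the keyterms element sits within a full response is not shown in this section, so the helper simply takes the element it is given. The names `KEYTERMS_XML` and `extract_keyterms` are illustrative.

```python
import xml.etree.ElementTree as ET

# Copy of the keyterms fragment above.
KEYTERMS_XML = """<keyterms>
  <terms>
   <term>Ireland</term>
   <term>island</term>
   <term>the Irish</term>
  </terms>
</keyterms>"""


def extract_keyterms(element):
    """Return the text of every <term> below the given element, in document order."""
    return [term.text for term in element.iter("term")]


print(extract_keyterms(ET.fromstring(KEYTERMS_XML)))
# prints ['Ireland', 'island', 'the Irish']
```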

view=searchmonkey

The primary way in which SearchMonkey acquires structured data is by using the Yahoo! Web Crawler to scour the web for embedded semantic markup such as microformats or RDF. Additional information can be found here:

http://developer.yahoo.com/searchmonkey/smguide/markup.html

http://developer.yahoo.com/searchmonkey/smguide/compare_microformats.html

SearchMonkey acquires structured data marked up with the following open standards:

Microformats

  • hAtom - Represents a subset of the Atom syndication format

  • hCalendar - Represents calendar dates and events, using a representation of the iCalendar standard

  • hCard - Represents people, companies, organizations, and places, using a representation of the vCard standard

  • hReview - Represents reviews of products, services, businesses, and events

  • XFN - Represents human relationships using hyperlinks

  • Geo - Represents geographic coordinates

  • rel-tag - Marks up the destination of a hyperlink as an author-designated tag

  • adr - Represents address information

RDF

  • Dublin Core - Allows developers to specify document metadata

  • FOAF - Friend of a Friend specifies personal profiles and social networks

  • SIOC - Specifies elements in blogs, forums and Q&A sites

  • Other supported vocabularies - See the SearchMonkey documentation for a list of other supported vocabularies

Optional: The searchmonkey_feed and searchmonkey_rdf values can be used with the view argument to retrieve associated structured data, if available, for a search result.

Note

SearchMonkey data is NOT currently available in BOSS JSON output.

Example:

http://boss.yahooapis.com/ysearch/web/v1/ireland?appid={yourBOSSappid}&format=xml&view=searchmonkey_feed (dataRSS format)

or

http://boss.yahooapis.com/ysearch/web/v1/ireland?appid={yourBOSSappid}&format=xml&view=searchmonkey_rdf (RDF format)

Example XML Output searchmonkey_feed:

<searchmonkey_feed>
    <feed>
        <adjunct id="com.yahoo.page.uf.adr" updated="2009-02-01T09:07:13Z" version="1.1">
        <item rel="dc:subject">
          <type typeof="vcard:Address">
          <meta property="vcard:locality">Washington D.C. Metro Area</meta>
          </type>
        </item>
        </adjunct>
        <adjunct id="com.yahoo.page.uf.hcalendar" updated="2009-02-01T09:07:13Z" version="1.1">
        <item rel="dc:subject rel:Event">
          <type typeof="vcal:Vevent">
          <meta property="vcal:dtstart" datatype="xsd:dateTime" data_quality="255">2009-01-01</meta>
          <meta property="vcal:summary">United States of America</meta>
          <meta property="vcal:duration" datatype="xsd:duration">P1M</meta>
          </type>
         </item>
...

Example XML Output searchmonkey_rdf:

<searchmonkey_rdf>
<rdf:RDF>
<rdf:Description rdf:about="http://www.linkedin.com/in/barackobama">
    <dc:subject rdf:nodeID="id1554438591"/>
</rdf:Description>
<rdf:Description rdf:nodeID="id1554438591">
<rdf:type rdf:resource="http://www.w3.org/2006/vcard/ns#Address"/>
</rdf:Description>
<rdf:Description rdf:nodeID="id1554438591">
    <vcard:locality>Washington D.C. Metro Area</vcard:locality>
</rdf:Description>
<rdf:Description rdf:about="http://www.linkedin.com/in/barackobama">
    <dc:subject rdf:nodeID="id1554440066"/>
</rdf:Description>
<rdf:Description rdf:about="http://www.linkedin.com/in/barackobama">
    <rel:Event rdf:nodeID="id1554440066"/>
</rdf:Description>
<rdf:Description rdf:nodeID="id1554440066">
<rdf:type rdf:resource="http://www.w3.org/2002/12/cal#Vevent"/>
</rdf:Description>
<rdf:Description rdf:nodeID="id1554440066">
    <vcal:dtstart rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2009-01-01</vcal:dtstart>
</rdf:Description>
<rdf:Description rdf:nodeID="id1554440066">
    <vcal:summary>United States of America</vcal:summary>
</rdf:Description>
...

Delicious data

Note

Delicious data is constantly being updated by Delicious users. Delicious URLs must have at least two saves for inclusion, can take several days to be indexed, and are not guaranteed to be available through the BOSS Web Search service. Over 99% of this data is available through the BOSS Web Search service, and we will be making tweaks over the coming months to make it fully comprehensive and fresh.

view=delicious_toptags

Optional: Retrieves the top public Delicious tags for a document and the count associated with each tag.

Example XML output:

<delicious_toptags>
  <tags>
  <tag>
    <name>mobile</name>
    <count>11</count>
  </tag>
  <tag>
    <name>sony</name>
    <count>4</count>
  </tag>
  </tags>
</delicious_toptags>

view=delicious_saves

Optional: Retrieves the number of times a document was saved in Delicious.

Example XML output:

<delicious_saves>12</delicious_saves>

view=language

Optional: Identifies the language of a document.

Example XML output:

<language>english</language>
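The Delicious and language fragments above are small enough to convert directly into native values. The following is a minimal sketch that parses each fragment on its own, using copies of the example outputs shown above; the helper names are illustrative, and the position of these elements within a full response is not shown in this section.

```python
import xml.etree.ElementTree as ET

# Copies of the example fragments above.
TOPTAGS_XML = """<delicious_toptags>
  <tags>
  <tag><name>mobile</name><count>11</count></tag>
  <tag><name>sony</name><count>4</count></tag>
  </tags>
</delicious_toptags>"""
SAVES_XML = "<delicious_saves>12</delicious_saves>"
LANGUAGE_XML = "<language>english</language>"


def parse_toptags(xml_text):
    """Map each tag name to its count."""
    root = ET.fromstring(xml_text)
    return {tag.findtext("name"): int(tag.findtext("count")) for tag in root.iter("tag")}


def parse_saves(xml_text):
    """Return the number of saves as an integer."""
    return int(ET.fromstring(xml_text).text)


def parse_language(xml_text):
    """Return the language name as given in the response."""
    return ET.fromstring(xml_text).text


print(parse_toptags(TOPTAGS_XML), parse_saves(SAVES_XML), parse_language(LANGUAGE_XML))
# prints {'mobile': 11, 'sony': 4} 12 english
```

Because Delicious data is not guaranteed to be present, real code should expect these elements to be missing for some results.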

abstract

Optional: Using abstract=long extends the abstract to as many as 300 characters, giving BOSS developers access to more keywords from the result document. This may be useful for keyword analysis and presentation.

abstract=long

Example:

http://boss.yahooapis.com/ysearch/web/v1/Car%20racing?appid={yourBOSSappid}&format=xml&abstract=long

Example XML Output abstract (default):

   <abstract>
      Official site from the National Association for Stock <b>Car</b> Auto <b>Racing</b>. <b>...</b> Still more to learn about 
      new <b>car</b> after first full year <b>...</b>
   </abstract>

Example XML Output abstract (long):

   <abstract>
      Official site from the National Association for Stock <b>Car</b> Auto <b>Racing</b>. <b>...</b> Still more to learn about 
      new <b>car</b> after first full year. Ragan among Cup drivers honored with "Loopie" award. Keselowski heads list of 2008 
      most popular drivers <b>...</b>
   </abstract>
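For keyword analysis, the highlight markup in an abstract usually needs to be removed first. The following is a minimal sketch, assuming the only markup in an abstract is the `<b>` tag seen in the examples above; the function name `plain_abstract` is illustrative. Whether the 300-character limit counts the markup is not stated in this guide.

```python
import re


def plain_abstract(abstract):
    """Strip <b> highlight tags and collapse runs of whitespace."""
    text = re.sub(r"</?b>", "", abstract)
    return " ".join(text.split())


sample = """Official site from the National Association for Stock <b>Car</b> Auto <b>Racing</b>.
      <b>...</b> Still more to learn"""
print(plain_abstract(sample))
# prints Official site from the National Association for Stock Car Auto Racing. ... Still more to learn
```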