Content Analysis Documentation for Yahoo! Search

Submitting Content Analysis Queries

Our recently released Content Analysis Web Service detects entities/concepts, categories, and relationships within unstructured content. It ranks the detected entities/concepts by their overall relevance, resolves them to Wikipedia pages where possible, and annotates tags with relevant metadata. Try the Content Analysis service to enrich your content.

Request URL

The Content Analysis service is available as a YQL table. The table name is contentanalysis.analyze. If you are not familiar with YQL, see information on constructing YQL queries.

Note: Because the content to analyze can be a lengthy string, submit Content Analysis requests with an HTTP POST request rather than GET. If you plan to use serialized PHP output, consider using cURL to post the request rather than PHP's file_get_contents(). You'll then need to remove the HTTP headers from the response before evaluating it with unserialize().
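As a rough illustration of the POST recommendation, the following Python sketch wraps a YQL statement in a form-encoded POST request without sending it. The endpoint URL and the q and format form fields are assumptions based on the public YQL service and are not stated on this page; build_post_request is a hypothetical helper name.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumption: the public YQL endpoint. This page does not state the URL.
YQL_ENDPOINT = "https://query.yahooapis.com/v1/public/yql"


def build_post_request(yql, endpoint=YQL_ENDPOINT, fmt="xml"):
    """Wrap a YQL statement in a form-encoded HTTP POST request (not sent)."""
    # Assumption: the statement goes in "q" and the output format in "format".
    body = urlencode({"q": yql, "format": fmt}).encode("utf-8")
    return Request(
        endpoint,
        data=body,  # the statement travels in the request body, not the URL
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


request = build_post_request(
    'select * from contentanalysis.analyze where text="A long document to analyze"'
)
```

Passing the resulting object to urllib.request.urlopen() would send it. Because the statement is carried in the body, its length is not constrained by URL length limits.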

Request parameters

Parameter Value Description
text string (required if url parameter is not used) The content to analyze (UTF-8 encoded).
url string (required if text parameter is not used) The URL of the web page to analyze.
related_entities boolean: true (default), false Whether or not to include related entities/concepts in the response
show_metadata boolean: true (default), false Whether or not to include entity/concept metadata in the response
enable_categorizer boolean: true (default), false Whether or not to include document category information in the response
unique boolean: true, false (default) Whether or not to detect only one occurrence of an entity or a concept that may appear multiple times
max integer: 100 (default) Maximum number of entities/concepts to detect
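
The parameters above can be combined into a single YQL statement. The sketch below is one possible way to do that in Python, not the service's own client: build_query and quote are hypothetical helpers, the key="value" clause form follows the notation used on this page (for example show_metadata="false"), and backslash-escaping of embedded quotes is an assumption about YQL string literals.

```python
ALLOWED_OPTIONS = {
    "related_entities", "show_metadata", "enable_categorizer", "unique", "max",
}


def quote(value):
    """Render a Python value as a double-quoted YQL string literal."""
    if isinstance(value, bool):
        value = "true" if value else "false"
    # Assumption: YQL accepts backslash-escaped quotes inside string literals.
    escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def build_query(text=None, url=None, **options):
    """Build a select statement against the contentanalysis.analyze table."""
    if text is None and url is None:
        raise ValueError("either text or url is required")
    unknown = set(options) - ALLOWED_OPTIONS
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")
    clauses = []
    if text is not None:
        clauses.append(f"text={quote(text)}")
    if url is not None:
        clauses.append(f"url={quote(url)}")
    # Optional parameters are appended in alphabetical order for stable output.
    clauses += [f"{name}={quote(val)}" for name, val in sorted(options.items())]
    return "select * from contentanalysis.analyze where " + " and ".join(clauses)


query = build_query(text="Hello", unique=True, max=10)
# query == 'select * from contentanalysis.analyze where text="Hello" and max="10" and unique="true"'
```

Omitted options are left out of the statement, so the service defaults listed in the table apply.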

Sample Text Request:

Response fields

Field Description
categories List of categories. Not present when enable_categorizer="false".
categories/yct_categories List of YCT categories.
categories/yct_categories/yct_category YCT category. This element has a numeric score attribute. Categories are listed in descending order of scores.
entities List of detected entities/concepts. Not present when no entities/concepts are detected.
entities/entity Each entity/concept. This element has a numeric score attribute. Entities are listed in descending order of scores.
entities/entity/text Text of the entity/concept as it appeared in the input. This element has the following attributes:
  • startchar - The start character position of the entity/concept in the input
  • endchar - The end character position of the entity/concept in the input
  • start - The start byte position of the entity/concept in the input
  • end - The end byte position of the entity/concept in the input
entities/entity/wiki_url The Wikipedia URL of the entity/concept. Not present when the entity/concept doesn't have a Wikipedia page.
entities/entity/types List of types of the entity/concept. One entity/concept can have multiple types.
entities/entity/types/type One type of the entity/concept (e.g. location, person, organization).
entities/entity/metadata_list List of metadata. Not present when the entity/concept doesn't have any metadata or show_metadata="false". Currently only location type entities/concepts have metadata.
entities/entity/metadata_list/metadata One metadata item for the entity/concept.
entities/entity/metadata_list/metadata/geo_area Area of the location.
entities/entity/metadata_list/metadata/geo_country Country of the location.
entities/entity/metadata_list/metadata/geo_isocountrycode ISO country code of the location.
entities/entity/metadata_list/metadata/geo_location Longitude and latitude of the location.
entities/entity/metadata_list/metadata/geo_name Name of the location.
entities/entity/metadata_list/metadata/geo_placetype Place type of the location.
entities/entity/metadata_list/metadata/geo_state State name of the location.
entities/entity/metadata_list/metadata/geo_statecode State abbreviation of the location.
entities/entity/metadata_list/metadata/geo_town Town of the location.
entities/entity/metadata_list/metadata/woe_id WOE id of the location.
entities/entity/related_entities List of related entities/concepts. Not present when the entity/concept doesn't have any related entities/concepts or related_entities="false".
entities/entity/related_entities/wikipedia List of related Wikipedia entities/concepts.
entities/entity/related_entities/wikipedia/wiki_url URL of the related Wikipedia entity/concept.
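
To show how the field paths above map onto code, here is a minimal Python sketch that reads entities out of an XML response with the standard library. The XML fragment is hand-written from the field table and is not real service output: its scores and values are placeholders, and it assumes the elements carry no XML namespace, which may not hold for actual responses. parse_entities is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Hand-written fragment shaped after the field table. Not real service
# output: the scores and values are placeholders.
SAMPLE = """
<entities>
  <entity score="0.9">
    <text>Paris</text>
    <wiki_url>https://en.wikipedia.org/wiki/Paris</wiki_url>
    <types><type>location</type></types>
  </entity>
  <entity score="0.4">
    <text>sculptors</text>
    <types><type>concept</type></types>
  </entity>
</entities>
""".strip()


def parse_entities(xml_text):
    """Return one dict per entities/entity element, in document order."""
    root = ET.fromstring(xml_text)
    results = []
    for entity in root.iter("entity"):
        results.append({
            "text": entity.findtext("text"),
            "score": float(entity.get("score")),
            # wiki_url is optional, so findtext returns None when it is absent.
            "wiki_url": entity.findtext("wiki_url"),
            "types": [t.text for t in entity.findall("types/type")],
        })
    return results


entities = parse_entities(SAMPLE)
```

Optional elements such as wiki_url, metadata_list, and related_entities may be missing from any given entity, so lookups that tolerate absence (findtext, findall) are a safer default than direct indexing.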

Sample response

The following is a sample response for the sample query above:


Rate Limits

The Content Analysis service is limited to 5,000 queries per IP address per day and to noncommercial use. See information on rate limiting.
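
A client can keep its own count to avoid running into the daily limit. The following Python sketch is one simple way to do that; DailyBudget is a hypothetical helper, and it assumes the limit resets on the local calendar day, which this page does not specify.

```python
import datetime


class DailyBudget:
    """Count requests per calendar day and refuse once the limit is reached."""

    def __init__(self, limit=5000):
        self.limit = limit
        self.day = None
        self.used = 0

    def try_acquire(self, today=None):
        """Return True and count one request if budget remains for today."""
        # Assumption: the quota resets at the start of the local calendar day.
        today = today or datetime.date.today()
        if today != self.day:
            self.day, self.used = today, 0  # new day: reset the counter
        if self.used >= self.limit:
            return False
        self.used += 1
        return True


budget = DailyBudget(limit=5000)
if budget.try_acquire():
    pass  # safe to send one request here
```

The counter only tracks requests made by this process, so several clients sharing one IP address would need a shared count.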

Terms of Use

Please see our Usage Policy to learn about acceptable uses and how to request additional queries.

Errors

The Content Analysis service returns the standard errors. There are no service-specific errors.

Support & Community

The Content Analysis Search service is discussed on the yws-search-general mailing list.

Yahoo Groups Discussions