Our recently released Content Analysis Web Service detects entities/concepts, categories, and relationships within unstructured content. It ranks the detected entities/concepts by overall relevance, resolves them to Wikipedia pages where possible, and annotates tags with relevant metadata. Please give our content analysis service a try to enrich your content.
The Content Analysis service is available as a YQL table. The table name is contentanalysis.analyze. If you are not familiar with YQL, see information on constructing YQL queries.
Note: Because the content to analyze can be a lengthy string, Content Analysis requests should be submitted using an HTTP POST request rather than GET. If you plan to use serialized PHP output, consider using cURL to post the request rather than PHP's file_get_contents(). You'll then need to remove the HTTP headers before evaluating the response with unserialize().
| Parameter | Value | Description |
|---|---|---|
| text | string (required if url parameter is not used) | The content to analyze (UTF-8 encoded). |
| url | string (required if text parameter is not used) | The URL of the web page to analyze. |
| related_entities | boolean: true (default), false | Whether or not to include related entities/concepts in the response |
| show_metadata | boolean: true (default), false | Whether or not to include entity/concept metadata in the response |
| enable_categorizer | boolean: true (default), false | Whether or not to include document category information in the response |
| unique | boolean: true, false (default) | Whether or not to detect only one occurrence of an entity or a concept that may appear multiple times |
| max | integer: 100 (default) | Maximum number of entities/concepts to detect |
Sample Text Request:
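A text request could be assembled along the lines of the following Python sketch. It builds a YQL statement from the parameters in the table above and wraps it in an HTTP POST, as the note recommends. The endpoint URL, the `q` form field, the `where name="value"` syntax, and the backslash escaping of quotes are assumptions that do not come from this page, and `build_yql_query` and `build_post_request` are hypothetical helper names.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumption: the public YQL endpoint. This page does not state it.
YQL_ENDPOINT = "http://query.yahooapis.com/v1/public/yql"


def build_yql_query(text, **options):
    """Build a YQL statement against the contentanalysis.analyze table.

    `options` may carry the optional parameters from the table above,
    e.g. unique=True or max=10. Passing them as quoted key="value"
    clauses is an assumption about YQL syntax.
    """
    def quote(value):
        # Assumed escaping: backslash-escape backslashes and double quotes.
        escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
        return '"' + escaped + '"'

    clauses = ["text=" + quote(text)]
    for name, value in options.items():
        if isinstance(value, bool):
            value = "true" if value else "false"
        clauses.append(name + "=" + quote(value))
    return "select * from contentanalysis.analyze where " + " and ".join(clauses)


def build_post_request(query, endpoint=YQL_ENDPOINT):
    """Wrap the query in a form-encoded POST so long text avoids GET limits."""
    body = urlencode({"q": query}).encode("utf-8")
    return Request(endpoint, data=body, method="POST")


query = build_yql_query("Italian sculptors and painters", unique=True, max=10)
request = build_post_request(query)
# Sending is left out on purpose; urllib.request.urlopen(request) would do it.
```

Keeping query construction separate from sending makes it easy to inspect the statement before any network call is made.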
| Field | Description |
|---|---|
| categories | List of categories. Not present when enable_categorizer="false". |
| categories/yct_categories | List of YCT categories. |
| categories/yct_categories/yct_category | YCT category. This element has a numeric score attribute. Categories are listed in descending order of scores. |
| entities | List of detected entities/concepts. Not present when no entities/concepts are detected. |
| entities/entity | Each entity/concept. This element has a numeric score attribute. Entities are listed in descending order of scores. |
| entities/entity/text | Text of the entity/concept as it appeared in the input. This element has the following attributes: |
| entities/entity/wiki_url | The Wikipedia URL of the entity/concept. Not present when the entity/concept doesn't have a Wikipedia page. |
| entities/entity/types | List of types of the entity/concept. One entity/concept can have multiple types. |
| entities/entity/types/type | One type of the entity/concept (e.g. location, person, organization). |
| entities/entity/metadata_list | List of metadata. Not present when the entity/concept doesn't have any metadata or show_metadata="false". Currently only location type entities/concepts have metadata. |
| entities/entity/metadata_list/metadata | One metadata of the entity/concept. |
| entities/entity/metadata_list/metadata/geo_area | Area of the location. |
| entities/entity/metadata_list/metadata/geo_country | Country of the location. |
| entities/entity/metadata_list/metadata/geo_isocountrycode | ISO country code of the location. |
| entities/entity/metadata_list/metadata/geo_location | Longitude and latitude of the location. |
| entities/entity/metadata_list/metadata/geo_name | Name of the location. |
| entities/entity/metadata_list/metadata/geo_placetype | Place type of the location. |
| entities/entity/metadata_list/metadata/geo_state | State name of the location. |
| entities/entity/metadata_list/metadata/geo_statecode | State abbreviation of the location. |
| entities/entity/metadata_list/metadata/geo_town | Town of the location. |
| entities/entity/metadata_list/metadata/woe_id | WOE id of the location. |
| entities/entity/related_entities | List of related entities/concepts. Not present when the entity/concept doesn't have any related entities/concepts or related_entities="false". |
| entities/entity/related_entities/wikipedia | List of related Wikipedia entities/concepts. |
| entities/entity/related_entities/wikipedia/wiki_url | URL of the related Wikipedia entity/concept. |
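The field paths in the table above map naturally onto an XML tree walk. The Python sketch below reads entities, scores, Wikipedia URLs, and types with the standard library. `SAMPLE_XML` is a hand-written stand-in that only mirrors the paths in the table, not real service output: the `results` wrapper element, the absence of XML namespaces, and all values in it are assumptions. `parse_entities` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Hand-written illustration only. The wrapper element, the lack of
# namespaces, and every value here are assumptions, not service output.
SAMPLE_XML = """
<results>
  <entities>
    <entity score="0.9">
      <text>Florence</text>
      <wiki_url>https://en.wikipedia.org/wiki/Florence</wiki_url>
      <types><type>location</type></types>
    </entity>
    <entity score="0.5">
      <text>sculptors</text>
    </entity>
  </entities>
</results>
"""


def parse_entities(xml_text):
    """Return one dict per entities/entity element found below the root."""
    root = ET.fromstring(xml_text.strip())
    entities = []
    for entity in root.findall(".//entities/entity"):
        entities.append({
            "text": entity.findtext("text"),
            "score": float(entity.get("score")),
            # wiki_url is optional, so findtext() yields None when absent.
            "wiki_url": entity.findtext("wiki_url"),
            # types is optional too; findall() then yields an empty list.
            "types": [t.text for t in entity.findall("types/type")],
        })
    return entities


for item in parse_entities(SAMPLE_XML):
    print(item["text"], item["score"], item["wiki_url"], item["types"])
```

Optional elements such as wiki_url, types, metadata_list, and related_entities should be treated as possibly missing, since the table notes several cases in which they are not present.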
The following is a sample response for the sample query above:
The Content Analysis service is limited to 5,000 queries per IP address per day and to noncommercial use. See information on rate limiting.
Please see our Usage Policy to learn about acceptable uses and how to request additional queries.
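A client may want to track its own usage so it stops before reaching the daily limit. The following Python sketch is one way to do that in a single process. `DailyQuota` is a hypothetical helper, and resetting the count on the local calendar date is an assumption: this page does not say when the service's day begins, and the service counts per IP address rather than per process.

```python
from datetime import date

DAILY_LIMIT = 5000  # the per-IP daily limit stated above


class DailyQuota:
    """Count requests per calendar day and refuse once the limit is hit."""

    def __init__(self, limit=DAILY_LIMIT):
        self.limit = limit
        self.day = None
        self.used = 0

    def try_acquire(self, today=None):
        """Return True and count one request if quota remains for `today`."""
        today = today or date.today()
        if today != self.day:
            # Assumption: the counter resets when the local date changes.
            self.day = today
            self.used = 0
        if self.used >= self.limit:
            return False
        self.used += 1
        return True


quota = DailyQuota()
if quota.try_acquire():
    pass  # safe to send one request here
```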
The Content Analysis service returns the standard errors. There are no service-specific errors.
The Content Analysis Search service is discussed on the yws-search-general mailing list.