PlaceSpotter helps developers make their applications location-aware by identifying places in unstructured and atomic content – feeds, web pages, news, status updates – and returning geographic metadata for geographic indexing and markup. The service is available as a REST API that accepts only POST requests, so developers can pass in any textual content (or links) and get back geographic entities. This section describes the key concepts relevant to using PlaceSpotter.
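As a rough illustration of the POST-only style described above, the following sketch builds a form-encoded POST request without sending it. The endpoint URL and the parameter names (`documentContent`, `documentType`, `appid`) are assumptions made for this example, not confirmed details of the service.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical endpoint, for illustration only; substitute the real service URL.
ENDPOINT = "https://api.example.invalid/placespotter"


def build_request(text: str, app_id: str) -> Request:
    """Build (but do not send) a POST request carrying the text to analyze."""
    # Parameter names below are assumptions, not taken from this document.
    body = urlencode({
        "documentContent": text,
        "documentType": "text/plain",
        "appid": app_id,
    }).encode("utf-8")
    return Request(
        ENDPOINT,
        data=body,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


request = build_request("Flights from NYC to Munich are delayed.", "my-app-id")
print(request.get_method(), len(request.data), "bytes")
```

Sending the content in the request body rather than the query string avoids URL length limits, which matters when whole documents are submitted.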
Many of the Key Concepts here will be familiar to users of the GeoPlanet API; GeoPlanet is a core Yahoo! web service that can be used to obtain more information about the places returned by PlaceSpotter, such as the relationship of one place to containing or adjacent places.
Yahoo! PlaceSpotter aims to capture every name by which a place is known and to resolve each place name to its canonical form. The platform identifies and disambiguates every place name to a specific place concept, referenced by its unique identifier, the Where-on-Earth ID (WOEID). WOEIDs always reference a place, not a place name. For example, "New York", "New York City", "NYC", and "the Big Apple" are all variant names for WOEID 2459115. If PlaceSpotter finds these variants in the text, it will understand them to be multiple appellations of the same place.
This approach extends to a multi-lingual environment: "München" in Germany is "Munich" to the English-speaking world and "Monaco di Baviera" to the Italians, but may also be keyed as "Muenchen" and "Munchen" if special characters, diacritical marks, and ligatures are not available to the user. All of these spatial appellations are simply multiple names for the same place, and will therefore be resolved to the same identifier (WOEID 676757).
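The idea that many names resolve to one identifier can be sketched with a small lookup table. This is a toy stand-in, not how PlaceSpotter is implemented: the two WOEIDs come from the text above, while the table, `normalize`, and `resolve` are hypothetical names introduced for illustration.

```python
import unicodedata

# Toy variant table. The WOEIDs are the ones quoted in the text above;
# a real gazetteer would hold millions of entries.
VARIANTS = {
    "new york": 2459115,
    "new york city": 2459115,
    "nyc": 2459115,
    "the big apple": 2459115,
    "m\u00fcnchen": 676757,  # "münchen" with a precomposed u-umlaut
    "munich": 676757,
    "muenchen": 676757,
    "munchen": 676757,
}


def normalize(name: str) -> str:
    """Apply NFC, trim, and case-fold so equivalent spellings compare equal."""
    return unicodedata.normalize("NFC", name).strip().casefold()


def resolve(name: str):
    """Return the WOEID for a known variant name, or None if it is unknown."""
    return VARIANTS.get(normalize(name))


print(resolve("NYC"), resolve("the Big Apple"), resolve("Muenchen"))
```

Unicode normalization is included because the same visible name can arrive as precomposed or decomposed characters, and both should land on the same entry.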
Spatial entities identified by Yahoo! PlaceSpotter are uniquely referenced by a positive 32-bit identifier: the Where On Earth ID (WOEID). WOEIDs are permanent and non-repetitive, and are assigned to all entities shared across PlaceSpotter, GeoPlanet, Upcoming, and many other Yahoo! APIs. Read more about WOEIDs in the GeoPlanet Documentation on WOEIDs.
WOEIDs reference a particular geostatic named place, and are not used to refer to businesses or individual addresses. When it encounters a structured address, PlaceSpotter will not perform street-level geocoding but will instead provide the WOEID of the smallest bounding named place known, frequently a postal code or neighborhood.
Yahoo! PlaceSpotter draws on the Yahoo! Geo-informatics database. This global database consists of six million (and growing) named places, including administrative areas, settlements, postal codes, points of interest, colloquial regions, islands, etc. Coverage varies from country to country, and Yahoo! is always working to improve and update the database.
Places are categorized to help identify the specific place you are searching for, such as a county and city of the same name. These Place Types have distinctive codes and names that are returned for each place. The complete list of Place Types may be found in the Yahoo! GeoPlanet Documentation on Place Types.
Yahoo! PlaceSpotter is UTF-8 compliant and supports location names for usage variations and in multiple languages, including English, French, German, Italian, Spanish as well as local multi-byte character set data in Japanese, Traditional Chinese, and Korean. To specify the language, set the query parameter to a code described by RFC
Places in PlaceSpotter are primarily represented by WOEIDs, but we also return a coarse representation in Longitude/Latitude using the WGS84 datum. See more on how we work with positioning, space, and place in the GeoPlanet Documentation on Positional Consistency. WOEIDs returned by PlaceSpotter can be passed on to the GeoPlanet API for further geographic exploration.
Some documents contain multiple place references within a geographic area, such as a county, state, or country. The geographic area associated with a document is called its Document Scope and is a place itself. Yahoo! PlaceSpotter uses the place references in a document along with rules to determine the Document Scope. There are two flavors of Document Scope: Geographic Scope, and Administrative Scope. Geographic Scope is the place that best describes the document and may be of any place type. Administrative Scope is the place that best describes the document and has an administrative place type. The administrative place types are:
For example, if your document contains the places "Bolinas", "San Francisco", and "Sacramento", we will return "California" (WOEID 2347563) as the Administrative Scope, and "Northern California" (WOEID 55857166) as the Geographic Scope. Sometimes the same WOEID will be returned for both.
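One way to picture the two scopes is as the smallest place that contains every reference, with and without a restriction on place type. The sketch below assumes exactly that; it does not reproduce the rules PlaceSpotter actually applies. Only the California and Northern California WOEIDs come from the example above. The country WOEID, the place type labels, the `ADMIN_TYPES` set, and the containment chains are placeholders invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Place:
    woeid: int
    name: str
    place_type: str


NORCAL = Place(55857166, "Northern California", "Colloquial")
CALIFORNIA = Place(2347563, "California", "State")
COUNTRY = Place(900001, "United States", "Country")  # made-up WOEID

# Placeholder set; consult the service documentation for the real list.
ADMIN_TYPES = {"State", "Country"}

# Containing places for each reference, ordered smallest to largest.
CONTAINERS = {
    "Bolinas": [NORCAL, CALIFORNIA, COUNTRY],
    "San Francisco": [NORCAL, CALIFORNIA, COUNTRY],
    "Sacramento": [NORCAL, CALIFORNIA, COUNTRY],
}


def smallest_common(names, admin_only=False):
    """Return the smallest place containing every named reference."""
    first, *rest = (CONTAINERS[name] for name in names)
    for place in first:
        if admin_only and place.place_type not in ADMIN_TYPES:
            continue
        if all(place in chain for chain in rest):
            return place
    return None


refs = ["Bolinas", "San Francisco", "Sacramento"]
geographic = smallest_common(refs)
administrative = smallest_common(refs, admin_only=True)
print(geographic.name, "/", administrative.name)
```

When the smallest common container already has an administrative type, both calls return the same place, which matches the note that one WOEID is sometimes returned for both scopes.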
A portion of text within a document that conveys geographic context is called a place reference. A place reference may be ambiguous, such as "Springfield" (there are 26 Springfields in the US alone), or unambiguous, such as "London, England". Yahoo! PlaceSpotter identifies these place references and returns them in its response document, along with the actual text and list of WOEIDs that match each place reference. This makes it possible to highlight text and create links to content associated with the place reference.
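The highlighting and linking use case might look like the sketch below. It assumes each place reference has already been reduced to its matched text and a list of WOEIDs; that dictionary shape, the `/places?woeids=` link format, and the WOEID values are all placeholders, not the service's response format.

```python
import re
from html import escape


def link_places(document: str, references: list[dict]) -> str:
    """Wrap each place reference found in the document in an HTML link."""
    by_text = {ref["text"]: ref["woeids"] for ref in references}
    if not by_text:
        return escape(document)
    # Try longer references first so "London, England" wins over "London".
    ordered = sorted(by_text, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(text) for text in ordered))
    parts, last = [], 0
    for match in pattern.finditer(document):
        parts.append(escape(document[last:match.start()]))
        woeids = ",".join(str(w) for w in by_text[match.group()])
        parts.append(
            f'<a href="/places?woeids={woeids}">{escape(match.group())}</a>'
        )
        last = match.end()
    parts.append(escape(document[last:]))
    return "".join(parts)


# Placeholder WOEIDs: an ambiguous reference carries several candidates.
references = [
    {"text": "Springfield", "woeids": [1001, 1002]},
    {"text": "London, England", "woeids": [2001]},
]
linked = link_places("Visit Springfield & London, England.", references)
print(linked)
```

Keeping every candidate WOEID on an ambiguous reference leaves the final choice to the application, for example by offering a disambiguation page.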
Yahoo! PlaceSpotter is not a geocoder, so addresses in requests are resolved to the smallest bounding Place, usually a town or a postal code. Also, WOEIDs are not assigned to individual house numbers or street names.
Yahoo! PlaceSpotter delivers bounding boxes and centroids of named places. This information is not definitive, and we make no claims to be the authority on the center of a place, its best routing point, nor the approximation of its extents. We provide coordinates only to assist users in finding the place on a map, and zooming to the correct extents.
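To show how a bounding box and centroid can drive a map view, the sketch below derives a center point and an approximate zoom level for a Web Mercator tile map. It is a simplification under stated assumptions: it fits longitude only, ignores boxes that cross the antimeridian, and uses made-up coordinates and function names rather than actual response fields.

```python
import math


def box_center(south: float, west: float, north: float, east: float):
    """Return the (latitude, longitude) midpoint of a bounding box."""
    return ((south + north) / 2.0, (west + east) / 2.0)


def fit_zoom(west: float, east: float, map_width_px: int = 512,
             tile_px: int = 256, max_zoom: int = 20) -> int:
    """Pick the largest integer zoom at which the box width fits the map.

    At zoom z the whole world (360 degrees) spans tile_px * 2**z pixels.
    """
    lon_span = east - west
    if lon_span <= 0:
        raise ValueError("east must be greater than west")
    zoom = math.log2(map_width_px * 360.0 / (lon_span * tile_px))
    return max(0, min(max_zoom, math.floor(zoom)))


# Made-up bounding box, roughly one degree on each side.
center = box_center(37.0, -123.0, 38.0, -122.0)
zoom = fit_zoom(-123.0, -122.0)
print(center, zoom)
```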
Postal codes, such as US zip codes or UK postcodes, are assigned by postal authorities within a country for the purpose of expediting mail delivery, and do not necessarily align with administrative areas or cities in that country. For example, in the US, a city may be served by multiple zip codes, and a zip code may serve multiple cities.
The world's geography is not static; Yahoo! PlaceSpotter acknowledges this reality. We employ a significant number of automated and editorial processes that are designed to ensure the currency and accuracy of our geographic resource. Constant administrative, postal, and geographic processes render locations obsolete: cities grow to absorb adjacent towns and villages, postal codes are created, terminated, and modified on a frequent basis in most countries that have them, and new development replaces outdated infrastructure. In Japan, for example, the "gappei" process constantly re-organises the nation’s official administrative geography by a method of merging, splitting, and redrawing geographical boundaries.
In cases where a place is stripped of its official status, Yahoo! will migrate the place to a historical category so that it can still be recognised, and its relationships to its successor places are updated so they can be discovered. Such places cease to be included in the administrative hierarchy. Their WOEIDs will remain unchanged.
In cases where a place is still current, yet has been redefined in respect of name, geometry or category, the WOEID will remain unchanged, and the attribution is updated to reflect the change.
As we continuously refine Yahoo! PlaceSpotter, there are times when we need to deprecate existing WOEIDs. There are two primary scenarios in which we will take this infrequent course of action:
For example, the place "CIA" was originally represented in PlaceSpotter as a suburb category location, but subsequently identified as a match for "Calgary International Airport"; the original WOEID for the suburb place was thus deprecated and mapped to the new WOEID for the airport. PlaceSpotter, however, continues to accept the deprecated, duplicate WOEID transparently and permanently. This approach to data management allows us to improve and refine the underlying resource without impacting offline content that has been indexed against it.
In the rare instance when this occurs, it is usually due to the integration of historical locations that have no bearing on the "real world", or situations where places are deemed to be unspecific or unverifiable.
In these situations, the WOEID of the invalid place will be deprecated and mapped to the WOEID of its parent place.
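Applications that store WOEIDs offline may want to follow such mappings on their side as well. The sketch below assumes a locally held redirect table from deprecated WOEIDs to their replacements; the table, its values, and `canonical_woeid` are hypothetical and only illustrate the idea of resolving a deprecated identifier to its current one.

```python
# Hypothetical redirect table: deprecated WOEID -> replacement WOEID.
# The replacement may be a duplicate's surviving WOEID or a parent place.
REDIRECTS = {
    1001: 2002,  # duplicate place merged into its match
    3003: 1001,  # older entry that was itself redirected later
}


def canonical_woeid(woeid: int) -> int:
    """Follow redirects until a WOEID with no further mapping is reached."""
    seen = set()
    while woeid in REDIRECTS:
        if woeid in seen:
            raise ValueError(f"redirect cycle detected at WOEID {woeid}")
        seen.add(woeid)
        woeid = REDIRECTS[woeid]
    return woeid


print(canonical_woeid(3003), canonical_woeid(42))
```

Following the chain rather than a single hop keeps old content resolvable even if a replacement is itself deprecated later.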
Yahoo! aims to capture the geography of the Earth as it is used by the world's people. To this end we are guided by various standards and sources of geographical information. For country codes and names, we rely on ISO 3166-1 but make no specific claim as to official designation or authority of disputed territories.
We appreciate that the subjective and personal nature of world geography ensures that there is no single authoritative hierarchy and we do not aim to impose one here. Rather, the PlaceSpotter hierarchy is presented to facilitate geographic discovery, and ultimately assist in disambiguating identically named places, and resolving spatial appellations to a unique, open, and permanent identifier.
Yahoo! aims to improve PlaceSpotter at every opportunity, and we could always use your help. Please post comments, questions, and bugs (outrageous!) to the Yahoo! BOSS Forum.