The SearchMonkey vocabularies are a collection of vocabularies that we recommend for use in DataRSS feeds and for annotating pages with metadata. Each vocabulary includes a set of terms and classes that are common to a particular domain. The collection includes well-established RDF vocabularies as well as RDF vocabularies for microformats such as hCard, hCalendar, and hReview. It also includes vocabularies developed specifically for SearchMonkey.
The following table lists the standard vocabularies recommended for use in SearchMonkey applications. The documentation for some of these vocabularies is reproduced in this guide for convenience.
| Vocabulary prefix | Vocabulary name | Domain | Documentation |
|---|---|---|---|
| dc | Dublin Core | Document metadata | See documentation on the web. |
| foaf | Friend-Of-A-Friend | Personal profiles and social networks | See included documentation. |
| vcard | VCard | Personal and business addresses | See included documentation. |
| vcal | VCalendar | Events and other calendar items | See included documentation. |
| review | hReview | Reviews | See included documentation. |
| sioc | SIOC | Blogs, discussion forums, Q&A sites | See included documentation. |
| gr | GoodRelations | Product price specification, delivery and payment etc. | See included documentation. |
| dbpedia | DBPedia | Generic vocabulary | See documentation on the web. |
| fb | Freebase | Generic vocabulary | See documentation on the web. |
These vocabularies are intended to help developers get started. The selection is not exclusive, however: you can provide data using other vocabularies. See Defining New Properties for more information.
The section Predefined Prefixes gives an overview of the recommended vocabularies. In DataRSS feeds, these vocabularies do not need to be explicitly declared if you place the following processing instruction at the beginning of the feed:
In pages with embedded RDF metadata (eRDF and RDFa), each vocabulary needs to be declared using the appropriate constructs, i.e. LINK elements in eRDF and XML namespace declarations in RDFa.
The Examples section shows a number of these vocabularies in use in DataRSS.
This specification is on a periodic release schedule to improve conformance to industry standard vocabularies and enable common use cases to be accomplished in a consistent manner.
The following examples show how data from different domains can be represented in DataRSS format. (The Atom headers are omitted for brevity.) Because DataRSS follows the RDFa standard, these examples can be translated directly into annotations in HTML by applying the same attributes (rel, property, typeof, resource) to HTML elements, following the same nesting as in the examples.
As an example, consider the following snippet of DataRSS:
This can be directly embedded inside HTML by applying the attributes to non-display HTML elements such as SPAN and DIV:
Note that RDFa provides additional attributes that make it easier to add markup to existing pages by reusing semantic-bearing HTML elements. For example, in the above case both the name and homepage can be provided using the HTML <a> tag:
See the RDFa Primer for more details.
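To make the role of the attributes concrete, here is a minimal Python sketch that reads the typeof, property, rel, resource, and href attributes out of an HTML fragment. The fragment itself (the person, the name, and the example.com URLs) is made up for illustration, and the collector is a deliberately simplified assumption: it is not a conforming RDFa parser and ignores nesting, subjects, and the content attribute.

```python
from html.parser import HTMLParser

# Hypothetical page fragment; the person, name, and URLs are invented for illustration.
HTML = """
<div typeof="foaf:Person" about="http://example.com/people/jane#me">
  <span property="foaf:name">Jane Doe</span>
  <a rel="foaf:homepage" href="http://example.com/jane/">home page</a>
</div>
"""


class RdfaAttributeCollector(HTMLParser):
    """Collects typeof, property, and rel annotations (not a full RDFa parser)."""

    def __init__(self):
        super().__init__()
        self.types = []       # values of typeof attributes
        self.literals = []    # (property, text) pairs
        self.links = []       # (rel, target URI) pairs
        self._pending = None  # property whose text content is expected next

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if "typeof" in attributes:
            self.types.append(attributes["typeof"])
        if "property" in attributes:
            self._pending = attributes["property"]
        if "rel" in attributes:
            # In RDFa, resource takes precedence over href as the link target.
            target = attributes.get("resource") or attributes.get("href")
            self.links.append((attributes["rel"], target))

    def handle_data(self, data):
        text = data.strip()
        if self._pending and text:
            self.literals.append((self._pending, text))
            self._pending = None

    def handle_endtag(self, tag):
        # A property without text content yields no literal in this sketch.
        self._pending = None


collector = RdfaAttributeCollector()
collector.feed(HTML)
print(collector.types)
print(collector.literals)
print(collector.links)
```

The fragment uses a SPAN for the name and an `<a>` element for the homepage, so the link target comes from the existing href attribute rather than from a separate resource attribute.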
There will probably be times when the kind of metadata you'd like to extract isn't found in the searchmonkey-profile vocabulary reference. The terms you need might belong to an existing RDF vocabulary, or you might have to define them yourself.
For example, suppose your metadata is about digital cameras and, in particular, you would like to represent the number of megapixels a digital camera can handle. Let's assume you have created an RDF or OWL ontology and defined the term DigitalCamera and the property megapixels, both in the namespace http://example.com/vocab/digicam#. You are encouraged to publish your schema at this same location so that others can consult the definitions of your newly created terms.
The declaration and use of your terms in combination with the existing class product:Product would look like this:
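At the RDF level, the digital camera example amounts to a handful of statements that mix the custom namespace with the predefined product vocabulary. The Python sketch below spells them out as plain tuples. The camera URI and the megapixel value are hypothetical, and the `expand` helper is an illustrative assumption rather than part of any SearchMonkey API.

```python
# "rdf" and "product" come from the predefined prefix table; "digicam" is the
# custom namespace from the digital camera example.
PREFIXES = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "product": "http://search.yahoo.com/searchmonkey/product/",
    "digicam": "http://example.com/vocab/digicam#",
}


def expand(curie):
    """Turn a prefix:local name into a full URI using PREFIXES."""
    prefix, local = curie.split(":", 1)
    return PREFIXES[prefix] + local


# Hypothetical camera resource and megapixel value, chosen for illustration.
camera = "http://example.com/cameras/x100"

triples = [
    (camera, expand("rdf:type"), expand("product:Product")),
    (camera, expand("rdf:type"), expand("digicam:DigitalCamera")),
    (camera, expand("digicam:megapixels"), "12.1"),
]

for subject, predicate, obj in triples:
    print(subject, predicate, obj)
```

The camera is typed both as a product:Product and as a digicam:DigitalCamera, so consumers that only understand the predefined vocabulary can still treat it as a product.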
Use the links below to download the OWL definitions of the SearchMonkey vocabularies.
The following table lists the prefixes that are predefined for DataRSS feeds. Namespaces that are not included in this table need to be explicitly defined. It is also an error to redefine any of these namespaces.
| Prefix | Name | Namespace |
|---|---|---|
| abmeta | AB Meta | http://www.abmeta.org/ns# |
| action | SearchMonkey Actions | http://search.yahoo.com/searchmonkey/action/ |
| assert | SearchMonkey Assertions (deprecated) | http://search.yahoo.com/searchmonkey/assert/ |
| cc | Creative Commons | http://creativecommons.org/ns# |
| commerce | SearchMonkey Commerce | http://search.yahoo.com/searchmonkey/commerce/ |
| context | SearchMonkey Context (deprecated) | http://search.yahoo.com/searchmonkey/context/ |
| country | SearchMonkey Country Datatypes | http://search.yahoo.com/searchmonkey-datatype/country/ |
| currency | SearchMonkey Currency Datatypes | http://search.yahoo.com/searchmonkey-datatype/currency/ |
| dbpedia | DBPedia | http://dbpedia.org/resource/ |
| dc | Dublin Core | http://purl.org/dc/terms/ |
| fb | Freebase | http://rdf.freebase.com/ |
| feed | SearchMonkey Feed | http://search.yahoo.com/searchmonkey/feed/ |
| finance | SearchMonkey Finance | http://search.yahoo.com/searchmonkey/finance/ |
| foaf | FOAF | http://xmlns.com/foaf/0.1/ |
| geo | GeoRSS | http://www.georss.org/georss# |
| gr | GoodRelations | http://purl.org/goodrelations/v1# |
| job | SearchMonkey Jobs | http://search.yahoo.com/searchmonkey/job/ |
| media | SearchMonkey Media | http://search.yahoo.com/searchmonkey/media/ |
| news | SearchMonkey News | http://search.yahoo.com/searchmonkey/news/ |
| owl | OWL ontology language | http://www.w3.org/2002/07/owl# |
| page | SearchMonkey Page (deprecated) | http://search.yahoo.com/searchmonkey/page/ |
| product | SearchMonkey Product | http://search.yahoo.com/searchmonkey/product/ |
| rdf | RDF | http://www.w3.org/1999/02/22-rdf-syntax-ns# |
| rdfs | RDF Schema | http://www.w3.org/2000/01/rdf-schema# |
| reference | SearchMonkey Reference | http://search.yahoo.com/searchmonkey/reference/ |
| rel | SearchMonkey Relations | http://search.yahoo.com/searchmonkey-relation/ |
| resume | SearchMonkey Resume | http://search.yahoo.com/searchmonkey/resume/ |
| review | Review | http://purl.org/stuff/rev# |
| sioc | SIOC | http://rdfs.org/sioc/ns# |
| social | SearchMonkey Social | http://search.yahoo.com/searchmonkey/social/ |
| stag | Semantic Tags | http://semantictagging.org/ns# |
| tagspace | SearchMonkey Tagspace (deprecated) | http://search.yahoo.com/searchmonkey/tagspace/ |
| umbel | UMBEL | http://umbel.org/umbel/sc/ |
| use | SearchMonkey Use Datatypes | http://search.yahoo.com/searchmonkey-datatype/use/ |
| vcal | VCalendar | http://www.w3.org/2002/12/cal/icaltzd# |
| vcard | VCard | http://www.w3.org/2006/vcard/ns# |
| xfn | XFN | http://gmpg.org/xfn/11# |
| xhtml | XHTML | http://www.w3.org/1999/xhtml/vocab# |
| xsd | XML Schema Datatypes | http://www.w3.org/2001/XMLSchema# |
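The two rules stated before the table (prefixes outside the table must be declared, and predefined ones must not be redefined) can be sketched in Python as follows. The `PrefixRegistry` class is a hypothetical helper written for this illustration, and it interprets "redefine" as rebinding a predefined prefix, which is an assumption. Only a subset of the table is copied into the sketch.

```python
# A subset of the predefined prefix table, copied from the rows above.
PREDEFINED = {
    "dc": "http://purl.org/dc/terms/",
    "foaf": "http://xmlns.com/foaf/0.1/",
    "vcard": "http://www.w3.org/2006/vcard/ns#",
    "xsd": "http://www.w3.org/2001/XMLSchema#",
}


class PrefixRegistry:
    """Hypothetical helper: resolves prefixed names against predefined and custom namespaces."""

    def __init__(self):
        self._custom = {}

    def declare(self, prefix, namespace):
        """Register a custom prefix; rebinding a predefined prefix is an error."""
        if prefix in PREDEFINED:
            raise ValueError(f"prefix '{prefix}' is predefined and cannot be redefined")
        self._custom[prefix] = namespace

    def expand(self, curie):
        """Expand prefix:local to a full URI, or fail if the prefix is undeclared."""
        prefix, _, local = curie.partition(":")
        namespace = PREDEFINED.get(prefix) or self._custom.get(prefix)
        if namespace is None:
            raise KeyError(f"prefix '{prefix}' must be explicitly declared")
        return namespace + local


registry = PrefixRegistry()
print(registry.expand("foaf:name"))

# A custom namespace has to be declared before it can be used.
registry.declare("digicam", "http://example.com/vocab/digicam#")
print(registry.expand("digicam:megapixels"))
```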
Datatype vocabularies are used in combination with literals to specify the type of the literal. Some properties, such as dc:identifier, can take many different kinds of values. To specify the type of the actual value provided, use the datatype attribute. In the following example, a datatype attribute specifies that the value of dc:identifier is a positive integer.
The datatype attribute can be queried using data::xpath; you cannot query it using data::get.
Datatypes help to validate the output of custom data services: a warning can be raised if the actual value does not conform to the specified datatype.
| Datatype | Description |
|---|---|
| currency:XYZ | A specific currency, where XYZ is any 3-letter currency code from ISO 4217 or revisions thereof |
| units:bytes | Information size in octets |
| units:cm | Distance in centimeters |
| units:ft | Distance in feet |
| units:g | Weight in grams |
| units:in | Distance in inches |
| units:kg | Weight in kilograms |
| units:km | Distance in kilometers |
| units:lb | Weight in pounds |
| units:m | Distance in meters |
| units:mi | Distance in miles |
| units:mm | Distance in millimeters |
| units:oz | Weight in ounces |
| use:email | A string intended for use as an email address |
| use:fax | A telephone number intended to reach a fax machine |
| use:isbn | A string intended for use as an ISBN |
| use:url | A string intended for use as a URL |
| xsd:ENTITIES | See ENTITIES |
| xsd:ENTITY | See ENTITY |
| xsd:ID | See ID |
| xsd:IDREF | See IDREF |
| xsd:IDREFS | See IDREFS |
| xsd:NCName | See NCName |
| xsd:NMTOKEN | See NMTOKEN |
| xsd:NMTOKENS | See NMTOKENS |
| xsd:NOTATION | See NOTATION |
| xsd:Name | See Name |
| xsd:QName | See QName |
| xsd:anyURI | See anyURI |
| xsd:base64Binary | See base64Binary |
| xsd:boolean | See boolean |
| xsd:byte | See byte |
| xsd:date | See date |
| xsd:dateTime | See dateTime |
| xsd:decimal | See decimal |
| xsd:double | See double |
| xsd:duration | See duration |
| xsd:float | See float |
| xsd:gDay | See gDay |
| xsd:gMonth | See gMonth |
| xsd:gMonthDay | See gMonthDay |
| xsd:gYear | See gYear |
| xsd:gYearMonth | See gYearMonth |
| xsd:hexBinary | See hexBinary |
| xsd:int | See int |
| xsd:integer | See integer |
| xsd:language | See language |
| xsd:list | See list |
| xsd:long | See long |
| xsd:negativeInteger | See negativeInteger |
| xsd:nonNegativeInteger | See nonNegativeInteger |
| xsd:nonPositiveInteger | See nonPositiveInteger |
| xsd:normalizedString | See normalizedString |
| xsd:positiveInteger | See positiveInteger |
| xsd:short | See short |
| xsd:string | See string |
| xsd:time | See time |
| xsd:token | See token |
| xsd:union | See union |
| xsd:unsignedByte | See unsignedByte |
| xsd:unsignedInt | See unsignedInt |
| xsd:unsignedLong | See unsignedLong |
| xsd:unsignedShort | See unsignedShort |
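The kind of validation that datatypes enable, where a warning is raised for a value that does not conform, can be sketched in Python as follows. The `check_literal` function and the `CHECKS` table are hypothetical and cover only three XML Schema datatypes from the table; how SearchMonkey itself performs validation is not described here, so this is an assumption-laden illustration rather than its actual behaviour.

```python
import re
import warnings

# Lexical checks for a few XML Schema datatypes from the table above.
CHECKS = {
    # Optional "+" followed by digits, with a value of at least 1.
    "xsd:positiveInteger": lambda v: re.fullmatch(r"\+?[0-9]+", v) is not None and int(v) > 0,
    # Optional sign, then digits with an optional fractional part.
    "xsd:decimal": lambda v: re.fullmatch(r"[+-]?([0-9]+(\.[0-9]*)?|\.[0-9]+)", v) is not None,
    # The four legal lexical forms of xsd:boolean.
    "xsd:boolean": lambda v: v in ("true", "false", "1", "0"),
}


def check_literal(value, datatype):
    """Return True if value conforms to datatype; warn and return False otherwise.

    Datatypes without an entry in CHECKS are not validated and are accepted as-is.
    """
    check = CHECKS.get(datatype)
    if check is None:
        return True
    if check(value):
        return True
    warnings.warn(f"value {value!r} does not conform to {datatype}")
    return False


print(check_literal("12345", "xsd:positiveInteger"))
print(check_literal("-3", "xsd:positiveInteger"))
```

Issuing a warning instead of raising an exception mirrors the text above: a non-conforming value is reported, but processing of the remaining data can continue.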