SearchMonkey Guide

A Manual for SearchMonkey Developers and Publishers

Abstract

The SearchMonkey Guide contains information for SearchMonkey developers and publishers. It includes full documentation for the online developer tool, tutorials and best practices, overviews of microformats and RDF, and reference material for APIs and XML schemas used to build SearchMonkey applications.

For developer assistance with SearchMonkey, join the SearchMonkey Yahoo! Group for Developers. For publisher assistance with SearchMonkey, join the SearchMonkey Yahoo! Group for Site Owners.


Table of Contents

1. Overview
About This Guide
Audience
Chapter Summaries
Understanding SearchMonkey
Example SearchMonkey Applications
Data Service Types
Developer Quickstart
Intel® Mash Maker with SearchMonkey
Intel Mash Maker
Installing Mash Maker
Using Mash Maker
Reviewing your Extraction
2. Developer Guide
Application Dashboard
DataRSS Primer for Developers
Creating Custom Data Services
Converting OpenSearch to DataRSS
Creating a (Page) Custom Data Service
Creating a (Web Service) Custom Data Service
Data Service Best Practices
Data Service Screens (Page)
Step 1: Basic Info
Step 2: URLs
Step 3: Data Extraction
Step 4: Confirmation
Data Service Screens (Web Service)
Step 1: Basic Info
Step 2: Inputs
Step 3: Test Data
Step 4: Endpoint
Step 5: Confirmation
Creating Presentation Applications
Presentation Templates
Infobars
Enhanced Results
Presentation Application Versioning
Presentation Application PHP Structure
Yahoo! Index Data
Creating a Presentation Application
Presentation Application Best Practices
Title Best Practices
Summary Best Practices
Image Best Practices
Link Best Practices
Key/Value Pair Best Practices
Getting Your Apps into the Search Gallery
Presentation Application Screens
Step 1: Basic Info
Step 2: URLs
Step 3: Data Services
Step 4: Appearance
Step 5: Confirmation
Step 6: Publication
Warnings and Errors
3. Site Owner Guide
Factors to Consider When Choosing a Data Delivery Method
Comparing Data Delivery Methods
XML-Based Atom Feeds
eRDF/RDFa Markup
Microformats
OpenSearch
Custom Data Services
Submitting Feeds
The Process of Creating a Feed
Selecting Content for the Feed
Selecting Pages (URLs)
General Content Guidelines
Ineligible Content
Understanding DataRSS
DataRSS Elements and Attributes
DataRSS in SearchMonkey Data Services
Creating the Feed
What are DataRSS and OPML Feeds?
Site Explorer Feed Requirements
OPML (Outline Processor Markup Language)
NewsML DataRSS
Google Base
SearchMonkey Feed Submission Overview
Leveraging the Data Web
Microformats
Supported Microformats
RDF
Enhanced Results User Agent
A. DataRSS Specification
DataRSS Feeds
Normative references:
The DataRSS elements
XML Schema syntax
B. SearchMonkey vocabularies
Overview
Intro
About the Examples
Defining New Classes and Properties
OWL Definitions
Predefined Prefixes
Datatypes
Examples
Business Addresses and Reviews
Components
Simple Local Listing Example
Advanced Local Listing Example
Personal Profiles and Social Networks
Components
Simple Social Profile Example
Advanced Social Profile Example
Friend-Of-A-Friend
Intro
Overview
Reference
foaf:Agent
foaf:Document
foaf:Group
foaf:Image
foaf:OnlineAccount
foaf:OnlineChatAccount
foaf:OnlineEcommerceAccount
foaf:OnlineGamingAccount
foaf:Organization
foaf:Person
foaf:PersonalProfileDocument
foaf:Project
foaf:accountName
foaf:accountServiceHomepage
foaf:aimChatID
foaf:based_near
foaf:birthday
foaf:currentProject
foaf:depiction
foaf:depicts
foaf:dnaChecksum
foaf:family_name
foaf:firstName
foaf:fundedBy
foaf:geekcode
foaf:gender
foaf:givenname
foaf:holdsAccount
foaf:homepage
foaf:icqChatID
foaf:img
foaf:interest
foaf:isPrimaryTopicOf
foaf:jabberID
foaf:knows
foaf:logo
foaf:made
foaf:maker
foaf:mbox
foaf:mbox_sha1sum
foaf:member
foaf:msnChatID
foaf:myersBriggs
foaf:name
foaf:nick
foaf:openid
foaf:page
foaf:pastProject
foaf:phone
foaf:plan
foaf:primaryTopic
foaf:publications
foaf:schoolHomepage
foaf:sha1
foaf:surname
foaf:theme
foaf:thumbnail
foaf:tipjar
foaf:title
foaf:topic
foaf:topic_interest
foaf:weblog
foaf:workInfoHomepage
foaf:workplaceHomepage
foaf:yahooChatID
GoodRelations
Intro
Overview
Reference
gr:AcceptedPaymentMethods
gr:ActualProductOrServiceInstance
gr:AvailableDeliveryMethods
gr:BusinessEntity
gr:BusinessEntityType
gr:BusinessFunction
gr:DayOfWeek
gr:DeliveryChargeSpecification
gr:DeliveryMethod
gr:DeliveryModeParcelService
gr:LocationOfSalesOrServiceProvisioning
gr:N-Ary-Relations
gr:Offering
gr:OpeningHoursSpecification
gr:PaymentChargeSpecification
gr:PaymentMethod
gr:PaymentMethodCreditCard
gr:PriceSpecification
gr:ProductOrService
gr:ProductOrServiceModel
gr:ProductOrServicesSomeInstancesPlaceholder
gr:QualitativeValue
gr:QuantitativeValue
gr:QuantitativeValueFloat
gr:QuantitativeValueInteger
gr:TypeAndQuantityNode
gr:UnitPriceSpecification
gr:WarrantyPromise
gr:WarrantyScope
gr:acceptedPaymentMethods
gr:amountOfThisGood
gr:appliesToDeliveryMethod
gr:appliesToPaymentMethod
gr:availableAtOrFrom
gr:availableDeliveryMethods
gr:closes
gr:datatypeProductOrServiceProperty
gr:description
gr:durationOfWarrantyInMonths
gr:eligibleCustomerTypes
gr:eligibleRegions
gr:hasBusinessFunction
gr:hasCurrency
gr:hasCurrencyValue
gr:hasDUNS
gr:hasEAN_UCC-13
gr:hasEligibleQuantity
gr:hasGTIN-14
gr:hasGlobalLocationNumber
gr:hasMakeAndModel
gr:hasMaxCurrencyValue
gr:hasMaxValue
gr:hasMaxValueFloat
gr:hasMaxValueInteger
gr:hasMinCurrencyValue
gr:hasMinValue
gr:hasMinValueFloat
gr:hasMinValueInteger
gr:hasOpeningHoursDayOfWeek
gr:hasPriceSpecification
gr:hasUnitOfMeasurement
gr:hasValueFloat
gr:hasValueInteger
gr:hasWarrantyPromise
gr:hasWarrantyScope
gr:includesObject
gr:isAccessoryOrSparePartFor
gr:isConsumableFor
gr:isListPrice
gr:isSimilarTo
gr:legalName
gr:offers
gr:opens
gr:qualitativeProductOrServiceProperty
gr:quantitativeProductOrServiceProperty
gr:typeOfGood
gr:validFrom
gr:validThrough
gr:valueAddedTaxIncluded
hReview
Intro
Overview
Reference
review:Comment
review:Feedback
review:Review
review:commenter
review:hasComment
review:hasFeedback
review:hasReview
review:positiveVotes
review:rating
review:reviewer
review:text
review:title
review:totalVotes
review:type
SearchMonkey Actions
Intro
Overview
Reference
action:addFriend
action:append
action:checkAvailability
action:compare
action:delete
action:discuss
action:edit
action:give
action:locateStore
action:map
action:notify
action:perform
action:procure
action:readDocumentation
action:reserve
action:sendEmail
action:sendToPhone
action:viewHistory
action:viewImages
SearchMonkey Commerce
Intro
Overview
Reference
commerce:Business
commerce:Hotel
commerce:Restaurant
commerce:acceptsCredit
commerce:accessibility
commerce:ambience
commerce:attire
commerce:businessCategory
commerce:corkage
commerce:cuisine
commerce:features
commerce:hoursOfOperation
commerce:mealOptions
commerce:parkingOptions
commerce:priceRange
commerce:priceRangeHighest
commerce:priceRangeLowest
commerce:seatingOptions
commerce:serviceOptions
commerce:smoking
commerce:takesReservations
SearchMonkey Feeds
Intro
Overview
Reference
feed:Entry
feed:Feed
feed:hasEntry
SearchMonkey Jobs
Intro
Overview
Reference
job:JobListing
job:degree
job:duration
job:experience
job:expires
job:function
job:hireType
job:industry
job:location
job:published
job:salaryFrom
job:salaryTo
job:salaryType
SearchMonkey Media
Intro
Overview
Reference
media:Article
media:Audio
media:Image
media:Media
media:Photo
media:Photoset
media:Text
media:Thumbnail
media:Video
media:Videoset
media:audio
media:bitrate
media:channels
media:duration
media:fileSize
media:framerate
media:height
media:image
media:region
media:samplingrate
media:thumbnail
media:type
media:video
media:views
media:width
SearchMonkey Product
Intro
Overview
Reference
product:Flight
product:Product
product:Service
product:availability
product:brand
product:category
product:color
product:condition
product:height
product:identifier
product:invoice
product:length
product:listPrice
product:manufacturer
product:maxInvoice
product:maxMSRP
product:minInvoice
product:minMSRP
product:msrp
product:priceFrom
product:priceTo
product:shippingCost
product:shippingWeight
product:weight
product:width
SearchMonkey Resume
Intro
Overview
Reference
resume:Resume
resume:contact
resume:duration
resume:education
resume:experience
resume:org
resume:summary
SIOC
Intro
Overview
Reference
sioc:Community
sioc:Container
sioc:Forum
sioc:Item
sioc:Post
sioc:Role
sioc:Site
sioc:Space
sioc:Thread
sioc:User
sioc:Usergroup
sioc:about
sioc:account_of
sioc:administrator_of
sioc:attachment
sioc:avatar
sioc:container_of
sioc:content
sioc:content_encoded
sioc:created_at
sioc:creator_of
sioc:description
sioc:email
sioc:email_sha1
sioc:feed
sioc:first_name
sioc:function_of
sioc:has_administrator
sioc:has_container
sioc:has_creator
sioc:has_function
sioc:has_host
sioc:has_member
sioc:has_moderator
sioc:has_modifier
sioc:has_owner
sioc:has_parent
sioc:has_part
sioc:has_reply
sioc:has_scope
sioc:has_space
sioc:has_subscriber
sioc:has_usergroup
sioc:host_of
sioc:id
sioc:ip_address
sioc:last_name
sioc:link
sioc:links_to
sioc:member_of
sioc:moderator_of
sioc:modified_at
sioc:modifier_of
sioc:name
sioc:next_by_date
sioc:next_version
sioc:note
sioc:num_replies
sioc:num_views
sioc:owner_of
sioc:parent_of
sioc:part_of
sioc:previous_by_date
sioc:previous_version
sioc:reference
sioc:related_to
sioc:reply_of
sioc:scope_of
sioc:sibling
sioc:space_of
sioc:subject
sioc:subscriber_of
sioc:title
sioc:topic
sioc:usergroup_of
VCalendar
Intro
Overview
Reference
vcal:Valarm
vcal:Vevent
vcal:Vfreebusy
vcal:Vjournal
vcal:Vtimezone
vcal:Vtodo
vcal:List_of_Float
vcal:Value_DURATION
vcal:Value_PERIOD
vcal:Value_CAL-ADDRESS
vcal:DomainOf_rrule
vcal:Value_RECUR
vcal:Value_DATE
vcal:Vcalendar
vcal:X-
vcal:action
vcal:altrep
vcal:attach
vcal:attendee
vcal:byday
vcal:byhour
vcal:byminute
vcal:bymonth
vcal:bysecond
vcal:bysetpos
vcal:byweekno
vcal:byyearday
vcal:calAddress
vcal:calscale
vcal:categories
vcal:class
vcal:cn
vcal:comment
vcal:completed
vcal:component
vcal:contact
vcal:count
vcal:created
vcal:cutype
vcal:daylight
vcal:delegatedFrom
vcal:delegatedTo
vcal:description
vcal:dir
vcal:dtend
vcal:dtstamp
vcal:dtstart
vcal:due
vcal:duration
vcal:encoding
vcal:exdate
vcal:exrule
vcal:fbtype
vcal:fmttype
vcal:freebusy
vcal:freq
vcal:geo
vcal:interval
vcal:language
vcal:lastModified
vcal:location
vcal:member
vcal:method
vcal:organizer
vcal:partstat
vcal:percentComplete
vcal:priority
vcal:prodid
vcal:range
vcal:rdate
vcal:recurrenceId
vcal:related
vcal:relatedTo
vcal:reltype
vcal:repeat
vcal:requestStatus
vcal:resources
vcal:role
vcal:rrule
vcal:rsvp
vcal:sentBy
vcal:sequence
vcal:standard
vcal:status
vcal:summary
vcal:transp
vcal:trigger
vcal:tzid
vcal:tzname
vcal:tzoffsetfrom
vcal:tzoffsetto
vcal:tzurl
vcal:uid
vcal:until
vcal:url
vcal:version
vcal:wkst
VCard
Intro
Overview
Reference
vcard:Address
vcard:Geo
vcard:Name
vcard:Organization
vcard:VCard
vcard:additional-name
vcard:adr
vcard:agent
vcard:bday
vcard:category
vcard:class
vcard:country-name
vcard:email
vcard:extended-address
vcard:family-name
vcard:fax
vcard:fn
vcard:geo
vcard:given-name
vcard:homeAdr
vcard:homeTel
vcard:honorific-prefix
vcard:honorific-suffix
vcard:key
vcard:label
vcard:latitude
vcard:locality
vcard:logo
vcard:longitude
vcard:mailer
vcard:mobileEmail
vcard:mobileTel
vcard:n
vcard:note
vcard:org
vcard:organization-name
vcard:organization-unit
vcard:personalEmail
vcard:photo
vcard:post-office-box
vcard:postal-code
vcard:region
vcard:rev
vcard:role
vcard:sort-string
vcard:sound
vcard:street-address
vcard:tel
vcard:title
vcard:tz
vcard:uid
vcard:unlabeledAdr
vcard:unlabeledEmail
vcard:unlabeledTel
vcard:url
vcard:workAdr
vcard:workEmail
vcard:workTel
The rel Vocabulary (Deprecated)
C. PHP Reference
SearchMonkey PHP Whitelist
DOM Class Whitelist
Other Class Whitelist
String Functions Whitelist
Array Functions Whitelist
Variable Functions Whitelist
Output Functions Whitelist
Date Functions Whitelist
Regex Functions Whitelist
JSON Functions Whitelist
Math Functions Whitelist
Reflection Functions Whitelist
Other Functions Whitelist
The Data Class
public static function get()
Parameters
Returns
public static function getImage()
Parameters
Returns
public static function getStars()
Parameters
Returns
public static function getStarsFromNum()
Parameters
Returns
public static function xpath()
Parameters
Returns
public static function xpathString()
Parameters
Returns
D. FAQ: SearchMonkey and the Semantic Web

List of Figures

1.1. From Basic to Enhanced
1.2. Structure of a SearchMonkey Application
1.3. You are Done!
2.1. Application Dashboard
2.2. Basic Info Screen
2.3. URLs Screen
2.4. Data Extraction Screen
2.5. Preview Pane: Data Extraction Screen
2.6. Confirmation Screen
2.7. Data Services Library Submission Screen
2.8. Basic Info Screen
2.9. Inputs Screen
2.10. Test Data Screen
2.11. Endpoint Screen
2.12. Preview Pane: Endpoint Screen
2.13. Confirmation Screen
2.14. Data Services Library Submission Screen
2.15. Example Infobar: Acme Movies
2.16. Example Enhanced Result: Acme Movies
2.17. Basic Info Screen
2.18. URLs Screen
2.19. Data Services Screen
2.20. Appearance Screen
2.21. Preview Pane: Appearance Screen
2.22. Confirmation Screen
2.23. Confirmation Screen
3.1. Triples Diagram
3.2. Mapping the Data Service to the Application
3.3. RDF Triples for Joe's Home Page
C.1. PHP Insert Buttons

List of Tables

C.1. SearchMonkey Icons

List of Examples

2.1. Example DataRSS
2.2. Extracting a Photo from a Page with XSLT
3.1. Example DataRSS
3.2. Contact Information in hCard Format
3.3. Joe's Home Page with RDFa Markup
3.4. Joe's Home Page with eRDF Markup