
SearchMonkey Guide

Enhanced Results User Agent

One way for third party developers to build SearchMonkey applications is to write "custom data services" that can extract structured data from HTML pages, or retrieve data by calling web service APIs.

If you are seeing a user agent in your logs that resembles:

User-Agent: Mozilla/5.0 (compatible; Yahoo! SearchMonkey 1.0; http://help.yahoo.com/l/us/yahoo/search/enhancedresults/agent.html)

this means that a third party developer has built a data service to retrieve data from your website and dispatch that data to SearchMonkey in order to provide more sophisticated search results to Yahoo! users. For example, if your site provides a web service that returns weather data for different geographical locations, someone could write a SearchMonkey data service that calls your web service and embeds weather information directly in the search result.
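To see which of your URLs are being retrieved this way, you can scan your access logs for the agent string. The following is a minimal Python sketch, not an official tool. It assumes your server writes the Apache "combined" log format; the names `is_searchmonkey` and `searchmonkey_paths` are hypothetical helpers introduced for illustration.

```python
import re
from collections import Counter

# Token taken from the user agent string shown above, lowercased for matching.
SEARCHMONKEY_TOKEN = "yahoo! searchmonkey"

# Assumption: Apache "combined" log format, in which the request line,
# the referer, and the user agent are the three quoted fields.
LOG_LINE = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+)[^"]*" \d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def is_searchmonkey(user_agent):
    """Return True if the user agent contains the SearchMonkey token."""
    return SEARCHMONKEY_TOKEN in user_agent.lower()


def searchmonkey_paths(log_lines):
    """Count the requested paths on log lines sent by the SearchMonkey agent."""
    hits = Counter()
    for line in log_lines:
        match = LOG_LINE.search(line)
        if match and is_searchmonkey(match.group("agent")):
            hits[match.group("path")] += 1
    return hits
```

Passing an open log file to `searchmonkey_paths` (for example `searchmonkey_paths(open("access.log"))`, where the file name is a placeholder) returns a counter of paths, which can help you decide whether the traffic is worth blocking.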

It is possible that SearchMonkey's data extraction will end up driving additional user traffic to your site. However, if you feel that this data extraction is abusive, there are a couple of ways to block it:

Configure your web server to block the user agent "Yahoo! SearchMonkey 1.0". For example, to block SearchMonkey in Apache, add this to your virtual host block in httpd.conf:

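# Flag any request whose User-Agent contains the SearchMonkey token.
SetEnvIfNoCase User-Agent "Yahoo! SearchMonkey 1.0" noMonkey
# Order, Allow, and Deny must sit inside a directory-type section such as <Location>.
<Location />
    <Limit GET POST>
        Order Allow,Deny
        Allow from all
        Deny from env=noMonkey
    </Limit>
</Location>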

Send an email to with a list of the URLs you want blocked, along with contact information so we can confirm that you are the owner of those feeds. It can take a few days to block your pages.
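If your site does not run behind Apache, the first option (blocking by user agent) can also be applied in application code. The following is a minimal sketch, assuming a Python WSGI application; `block_searchmonkey` is a hypothetical wrapper written for illustration and is not part of SearchMonkey or of any library.

```python
# Token taken from the user agent string that SearchMonkey sends, lowercased.
BLOCKED_TOKEN = "yahoo! searchmonkey 1.0"


def block_searchmonkey(app):
    """Wrap a WSGI app so that SearchMonkey requests receive 403 Forbidden."""

    def middleware(environ, start_response):
        agent = environ.get("HTTP_USER_AGENT", "")
        if BLOCKED_TOKEN in agent.lower():
            body = b"Forbidden\n"
            start_response(
                "403 Forbidden",
                [
                    ("Content-Type", "text/plain"),
                    ("Content-Length", str(len(body))),
                ],
            )
            return [body]
        # Every other client is passed through to the wrapped application.
        return app(environ, start_response)

    return middleware
```

Wrapping your application once at startup, for example `application = block_searchmonkey(application)`, applies the check to every request, much as the Apache rule does.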

Because SearchMonkey is not a web crawler (the service retrieves URLs only when a SearchMonkey developer or user requests them), it does not follow the robots exclusion protocol and won't check your robots.txt file.