Content or API providers can opt out of YQL access to their data, or restrict it, by following the instructions in the sections below. Please remember that it is your responsibility to obtain the necessary permissions from the content or API providers to use their content or services, separate from your use of YQL: neither Yahoo nor your use of YQL covers those permissions.
YQL uses the robots.txt file on your server to determine which Web pages on your site it may access. YQL uses the user-agent "Yahoo Pipes 2.0" when accessing the robots.txt file and checks it for any allow or disallow rules that apply to this user agent. If the robots.txt check does not prevent YQL from accessing your content, YQL then fetches the target page using a different user agent:
Therefore, to deny YQL access to your content, add "Yahoo Pipes 2.0" to the relevant parts of your robots.txt. For example:
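One way to check such a rule before deploying it is to run it through a robots.txt parser. The sketch below is an illustration only: the robots.txt content and the example.com URL are assumptions, and Python's standard-library parser is used as a stand-in, so its matching may differ in detail from the check YQL performs.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content that denies the YQL user agent the whole site.
ROBOTS_TXT = """\
User-agent: Yahoo Pipes 2.0
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "http://www.example.com/private/page.html"  # placeholder URL
    print("Yahoo Pipes 2.0 allowed:", is_allowed(ROBOTS_TXT, "Yahoo Pipes 2.0", url))
    print("ExampleBot allowed:", is_allowed(ROBOTS_TXT, "ExampleBot", url))
```

To restrict YQL to part of a site rather than all of it, the Disallow line can name a path prefix instead of "/".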
Another approach is to block YQL on your Web server. For example, in Apache, add this to
your virtual host block in
YQL fetches content from URLs when requested by a developer. Because YQL is not a Web crawler, it does not follow the robots exclusion protocol for non-HTML data, such as XML or CSV, from a site. To stop YQL from accessing any content on your site, block the YQL user-agent (Yahoo Pipes 2.0) on your Web server.
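Where the block is easier to apply in application code than in server configuration, the same idea can be sketched as a small WSGI wrapper. This is a minimal sketch, not a Yahoo-provided recipe: `block_yql` and `demo_app` are hypothetical names, and the substring match assumes that every YQL request carries "Yahoo Pipes 2.0" somewhere in its User-Agent header.

```python
BLOCKED_AGENT = "Yahoo Pipes 2.0"  # user-agent token named in this guide


def block_yql(app):
    """Wrap a WSGI app and answer 403 when the User-Agent contains the YQL token."""

    def middleware(environ, start_response):
        user_agent = environ.get("HTTP_USER_AGENT", "")
        # Assumption: a case-insensitive substring match is enough to spot YQL.
        if BLOCKED_AGENT.lower() in user_agent.lower():
            body = b"Forbidden\n"
            start_response(
                "403 Forbidden",
                [("Content-Type", "text/plain"), ("Content-Length", str(len(body)))],
            )
            return [body]
        return app(environ, start_response)

    return middleware


def demo_app(environ, start_response):
    """Hypothetical application standing in for your real site."""
    body = b"Hello\n"
    start_response(
        "200 OK",
        [("Content-Type", "text/plain"), ("Content-Length", str(len(body)))],
    )
    return [body]


if __name__ == "__main__":
    from wsgiref.simple_server import make_server

    # Serve the guarded demo application on a local port for manual testing.
    with make_server("127.0.0.1", 8000, block_yql(demo_app)) as server:
        server.serve_forever()
```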
For example, on Apache servers, add this rule to your virtual host block in
YQL lets API providers apply IP-based rate limits accurately: limits can be tracked and counted against the YQL developer's IP address, rather than the IP addresses of the shared proxy servers that YQL uses to access content on the Web.
For outgoing requests to external content and API providers, YQL determines the last
valid client IP address connecting to its Web service and then ensures this is the first IP
address in the
X-FORWARDED-FOR HTTP header.
For example, in the X-FORWARDED-FOR HTTP header below, the request arriving at YQL came from the 18.104.22.168 IP address, the first address in the list. IP-based rate limiters should use this value rather than the IP addresses of YQL proxy servers.
X-FORWARDED-FOR: 18.104.22.168, 22.214.171.124, 126.96.36.199
We also set the
CLIENT-IP HTTP header to this IP address.
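On the provider side, reading the developer's address from these headers might look like the sketch below. The function name `yql_client_ip` and the sample addresses (taken from the ranges reserved for documentation) are assumptions for illustration. The sketch only parses the headers and does not decide whether they can be trusted.

```python
from ipaddress import ip_address
from typing import Mapping, Optional


def yql_client_ip(headers: Mapping[str, str]) -> Optional[str]:
    """Return the first address in X-Forwarded-For, falling back to Client-IP."""
    # HTTP header names are case-insensitive, so normalize them first.
    lowered = {name.lower(): value for name, value in headers.items()}
    forwarded = lowered.get("x-forwarded-for", "")
    candidate = forwarded.split(",")[0].strip()
    if not candidate:
        candidate = lowered.get("client-ip", "").strip()
    if not candidate:
        return None
    try:
        # Reject values that are not well-formed IPv4 or IPv6 addresses.
        return str(ip_address(candidate))
    except ValueError:
        return None


if __name__ == "__main__":
    sample = {"X-Forwarded-For": "203.0.113.7, 198.51.100.10, 198.51.100.11"}
    print(yql_client_ip(sample))
```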
Because these headers are "unsigned," they can be spoofed. Therefore, providers
should only use these headers if the proxy setting them is trusted. The IP addresses of
the proxy hosts that should be trusted can be found at
https://developer.yahoo.com/yql/proxy.txt. This file will be updated as
our proxy hosts change.
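A rate limiter can apply this rule by honoring the forwarded address only when the connection itself comes from a trusted proxy. The sketch below is one possible shape under stated assumptions: `TRUSTED_PROXIES` holds a made-up documentation range rather than real YQL proxy addresses, `rate_limit_key` is a hypothetical helper, and loading the published proxy list is left out because its file format is not described here.

```python
from ipaddress import ip_address, ip_network

# Hypothetical trusted range for illustration only; replace it with the
# proxy addresses published by the service you actually trust.
TRUSTED_PROXIES = [ip_network("192.0.2.0/24")]


def rate_limit_key(peer_ip: str, forwarded_for: str) -> str:
    """Choose the address that rate limits should be counted against.

    peer_ip is the address of the host that opened the connection;
    forwarded_for is the raw X-Forwarded-For header value (may be empty).
    """
    peer = ip_address(peer_ip)
    if any(peer in network for network in TRUSTED_PROXIES):
        first = forwarded_for.split(",")[0].strip()
        try:
            # Trusted proxy: count against the original client address.
            return str(ip_address(first))
        except ValueError:
            pass  # Missing or malformed header: fall back to the peer.
    # Untrusted peer: ignore the header, since it could be spoofed.
    return str(peer)


if __name__ == "__main__":
    print(rate_limit_key("192.0.2.10", "203.0.113.7, 192.0.2.10"))
    print(rate_limit_key("198.51.100.5", "203.0.113.7"))
```

Falling back to the connecting address when the peer is not trusted means a spoofed header cannot move a client into someone else's rate-limit bucket.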