
YQL 'just' started reporting sites as blocked by robots.txt when they are not

Hi

Just within the last couple of hours, requests to YQL from anywhere but the console have been reporting "Requesting a robots.txt restricted URL" for several sites that are clearly not restricted by robots.txt. This appears to be a bug, as it still works from the console but not from anywhere else.

Some example sites blocked but showing no block in their robots.txt:

http://www.fotolia.com/robots.txt
http://www.bigstockphoto.com/robots.txt
http://www.sxc.hu/robots
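
One way to double-check that a robots.txt really does not block a given URL is to run its text through Python's standard `urllib.robotparser`. The sketch below is only an illustration: the `SAMPLE_ROBOTS` body and the `example.com` URLs are made up, and the `"*"` user agent is an assumption, since the user agent YQL identifies itself with is not stated in this thread.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt: str, url: str, user_agent: str = "*") -> bool:
    """Return True if `robots_txt` permits `user_agent` to fetch `url`."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical robots.txt body, NOT the real file of any site listed above.
SAMPLE_ROBOTS = """User-agent: *
Disallow: /private/
"""

if __name__ == "__main__":
    for url in ("http://www.example.com/search?k=cheese",
                "http://www.example.com/private/page"):
        print(url, is_allowed(SAMPLE_ROBOTS, url))
```

To test a real site, paste the contents of its robots.txt in place of `SAMPLE_ROBOTS` and pass the URL that YQL refuses to fetch.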

Several others are showing only: "Connect Failure" with no further explanation.
I have not reached the execution limit (not by a long shot), and all requests return a valid "HTTP/1.0 200 OK" HTTP header.

Any suggestions? Is this a bug? Any idea when it will be fixed if so?

Thanks
Bob

3 Replies
  • Hi Bob, can you provide some sample queries of what is failing?
  • QUOTE (The Josh @ Jan 27 2011, 10:52 AM)
    Hi Bob, can you provide some sample queries of what is failing?


    Query of:
    http://query.yahooapis.com/v1/public/yql/b...g_box%27%5D%2Fa

    Gives headers (output as json) of:
    ["HTTP\/1.0 200 OK","Access-Control-Allow-Origin: *","Cache-Control: no-cache","Content-Type: application\/json;charset=utf-8","Date: Thu, 27 Jan 2011 18:56:21 GMT","Server: YTS\/1.19.5","Age: 1"]

    and response of:
    {"query":{"count":0,"created":"2011-01-27T18:56:22Z","lang":"en-US","diagnostics":{"publiclyCallable":"true","forbidden":"An error caused the engine to disallow robots for this domain","url":[{"error":"Connect Failure","execution-time":"0","proxy":"DEFAULT","content":"http://www.fotolia.com/robots.txt"},{"error":"Requesting a robots.txt restricted URL: http://www.fotolia.com/search?k=cheese&filters[collection]=false&order=creation&filters[content_type%3Aphoto]=1&filters[content_type%3Aillustration]=1&filters[content_type%3Avector]=1&filters[orientation]=all&with_offensive=on&without_offensive=on&limit=40","execution-time":"1","http-status-code":"403","http-status-message":"Forbidden","proxy":"DEFAULT","content":"http://www.fotolia.com/search?k=cheese&filters[collection]=false&order=creation&filters[content_type%3Aphoto]=1&filters[content_type%3Aillustration]=1&filters[content_type%3Avector]=1&filters[orientation]=all&with_offensive=on&without_offensive=on&limit=40"}],"user-time":"12","service-time":"1","build-version":"10970"},"results":null}}

    It works if executed from the console both for the above query and for a raw query (not using alias):
    http://query.yahooapis.com/v1/public/yql?q...iagnostics=true

    But not when run from the server (174.120.153.133).
    It was previously working fine. I made no changes to the server or Yahoo settings/account, and it just stopped somewhere around noon (GMT) today.
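
Because the HTTP status is 200 even when the fetch is refused, the failure only shows up inside the `diagnostics` object of the JSON body. The following is a minimal sketch of pulling those errors out on the client side. The function names are made up for illustration, `SAMPLE_RESPONSE` is an abbreviated copy of the response quoted above (search URL shortened), and the handling of a single `url` entry arriving as an object rather than a list is a defensive assumption, not something shown in this thread.

```python
import json

ROBOTS_ERROR_PREFIX = "Requesting a robots.txt restricted URL"


def diagnostic_errors(response_text):
    """Return (url, error) pairs from the diagnostics of a YQL JSON response."""
    payload = json.loads(response_text)
    diagnostics = payload.get("query", {}).get("diagnostics") or {}
    entries = diagnostics.get("url", [])
    if isinstance(entries, dict):  # assumption: a lone entry may not be a list
        entries = [entries]
    return [(e.get("content"), e["error"])
            for e in entries
            if isinstance(e, dict) and "error" in e]


def blocked_by_robots(response_text):
    """True if any diagnostics entry reports the robots.txt restriction."""
    return any(error.startswith(ROBOTS_ERROR_PREFIX)
               for _, error in diagnostic_errors(response_text))


# Abbreviated version of the response above; the search URL is shortened.
SAMPLE_RESPONSE = json.dumps({"query": {"count": 0, "diagnostics": {
    "forbidden": "An error caused the engine to disallow robots for this domain",
    "url": [
        {"error": "Connect Failure",
         "content": "http://www.fotolia.com/robots.txt"},
        {"error": ROBOTS_ERROR_PREFIX + ": http://www.fotolia.com/search?k=cheese",
         "http-status-code": "403",
         "content": "http://www.fotolia.com/search?k=cheese"},
    ]}, "results": None}})

if __name__ == "__main__":
    for url, error in diagnostic_errors(SAMPLE_RESPONSE):
        print(url, "->", error)
    print("blocked by robots:", blocked_by_robots(SAMPLE_RESPONSE))
```

Read this way, the quoted diagnostics suggest that the robots.txt fetch itself failed with "Connect Failure" and that the engine then treated the whole domain as disallowed, which would explain the restriction message even though the file contains no matching rule.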
  • This now appears to be fixed.
    If anyone still experiencing this issue is following this thread, see this thread instead:
    http://developer.yahoo.net/forum/?showtopic=8291