YQL query stopped working

I was executing YQL queries from my local server, and after about 600 requests it stopped working.

select * from html where url="http://www.tvrtke.com/Rezultati-pretrage,PID-4,PN-178.aspx?M=0&Z=18&D=0&E=&W=&POZ=0&T=&U=&MB=&P=&F=&OIB=" AND xpath='//div[contains(@class,"LinkSearch")]'

Now I get this every time: [query] => Array ( [count] => 0 [created] => 2013-10-09T21:04:08Z [lang] => en-US [results] => )

I tried the query in the Yahoo console and it works like a charm. I even tried from another IP, but got the same empty result.

I'm not using an API key, but I don't understand how it works in the test console and not from my server. I don't think I reached the request limit, and the web page surely doesn't have a robots.txt.
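One way to narrow this down is to send the same query with diagnostics turned on and check whether the result set is really empty. The sketch below only builds the request URL and inspects a decoded JSON response. The endpoint `https://query.yahooapis.com/v1/public/yql` and the `diagnostics=true` parameter are assumptions (neither is named in this thread), and `build_yql_url` and `is_empty_result` are hypothetical helper names.

```python
from urllib.parse import urlencode

# Assumed public (keyless) endpoint; not stated anywhere in this thread.
YQL_ENDPOINT = "https://query.yahooapis.com/v1/public/yql"


def build_yql_url(query, diagnostics=True):
    """Return a request URL for a YQL statement, URL-encoding the query text."""
    params = {"q": query, "format": "json"}
    if diagnostics:
        # Assumed parameter that asks the service to report how it fetched the page.
        params["diagnostics"] = "true"
    return YQL_ENDPOINT + "?" + urlencode(params)


def is_empty_result(payload):
    """True when a decoded JSON response has count 0 or no results."""
    query = payload.get("query") or {}
    return int(query.get("count", 0)) == 0 or not query.get("results")
```

Encoding the whole statement matters here because the target URL contains `&` and `=` characters that would otherwise be read as separate request parameters.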

4 Replies
  • I'm having the same problem: sometimes it stops working for a while, then it works again for a while, and I'm sure I'm not reaching the limit.

  • Now it's working again, but there is another problem. My timeout is reasonably high, and I'm getting no results for every third or fourth request. Tested with curl on my local server, with the same timeout and the same set of queries, every single request is successful. I know YQL is simple to use, but its reliability is poor.

  • Same here: queries work fine on some occasions, while the same queries return empty results at other times. It's not an IP-based limit, as I tried multiple IPs. I'm querying against Google Maps. The YQL console works fine.

    My guess is that Google Maps limits are reached, and Google stops answering YQL requests.

    The YQL documentation for content providers instructs the content provider to look at the X-FORWARDED-FOR header and not the YQL proxy server address, to avoid blocking the YQL servers. However, I'm not certain that Google actually follows this guideline. See here: http://developer.yahoo.com/yql/guide/limit_access_content_providers.html#rate_limiting_ip_addr

    More than that, the link to the list of YQL servers to trust doesn't return any information: http://developer.yahoo.com/yql/proxy.txt, so there is no way a content provider can securely filter according to the X-FORWARDED-FOR header.

    So my guess is that YQL proxy server address gets blocked after a while, while the YQL console uses a different IP address and hence is not blocked.

    Is there a way to check if this is indeed true?

    Also, assuming there are multiple YQL proxy servers, how is the proxy server selected for each query? Is there a way to select the YQL server, auto-rotate between them, or do anything else to make sure that the YQL service doesn't 'stop working'?

  • I ran a small set of test queries 24/7 for about 30 days, with more than enough time between queries. I keep a log of every query, its results, and whether it failed. For the first 5 days or so, 90-95% returned results; after that there was a rapid drop to, I guess, a 10-20% success rate. But if I type the query into the YQL console, it works 100% of the time.

    I went back to iMacros, Greasemonkey, curl and other methods, which do need a little more work, but with them I get full control and 99% success. Retrieving "information of interest" at periodic intervals is what I need. I can't rely on chance.

    With YQL I get no feedback on whether a query was blocked, the site didn't respond, the site doesn't exist, etc. I guess if YQL were a paid service, the quality would be higher :).
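For anyone who wants to keep using YQL despite the intermittent empty results described in the replies above, a retry wrapper with a success log is one possible workaround. This is only a sketch under assumptions: `fetch` stands for whatever function performs the query and returns a falsy value when no results come back, and `query_with_retry` and `success_rate` are hypothetical names, not part of any YQL client.

```python
import time


def query_with_retry(fetch, attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call fetch() until it returns a non-empty result or attempts run out.

    Returns (result, tries_used); result is None if every attempt came back empty.
    """
    for attempt in range(attempts):
        result = fetch()
        if result:
            return result, attempt + 1
        if attempt < attempts - 1:
            # Exponential backoff: base_delay, 2 * base_delay, 4 * base_delay, ...
            sleep(base_delay * 2 ** attempt)
    return None, attempts


def success_rate(log):
    """Fraction of logged queries that returned results (0.0 for an empty log)."""
    return sum(1 for ok in log if ok) / len(log) if log else 0.0
```

Appending `result is not None` to a list after each call gives the kind of per-query log mentioned above, so a drop in the success rate shows up as a number rather than an impression. Retrying will not help if the target site is blocking the YQL servers for a long period.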

