
YQL returns a 404 response code although the page exists

Hello,
I am trying to crawl a page using YQL, but YQL returns nothing.
My YQL query is:
select * from html where url="http://cms.albawaba.com/tmp/1.html"
I tried to debug the query using
http://query.yahooapis.com/v1/public/yql?q=select%20*%20from%20html%20where%20url%3D%22http%3A%2F%2Fcms.albawaba.com%2Ftmp%2F1.html%22%20&diagnostics=true&debug=true
The debug output shows a 404 response code, although the page is there and you can check it.
Could you please tell me what is wrong?
Thanks for your attention.

by alaa a
2 Replies
  • http://query.yahooapis.com/v1/public/yql?q=select%20*%20from%20html%20where%20url%3D%22http%3A%2F%2Fcms.albawaba.com%2Ftmp%2F1.html%22%20&diagnostics=true&debug=true

    The response contains two calls. The call to "http://cms.albawaba.com/robots.txt" has HTTP status code 404, whereas the call to "http://cms.albawaba.com/tmp/1.html" has HTTP status code 200, and the diagnostics contain an error message for it. So the 404 is only for robots.txt; the page itself was fetched, but YQL failed to parse its HTML.

    <query yahoo:count="0" yahoo:created="2012-01-13T00:53:03Z" yahoo:lang="en-US">
        <diagnostics>
            <publiclyCallable>true</publiclyCallable>
            <url execution-start-time="8" execution-stop-time="116" execution-time="108" http-status-code="404" http-status-message="Not Found" id="ec630aae-3f4d-4281-b4fd-71117b48808b" proxy="DEFAULT">http://cms.albawaba.com/robots.txt</url>
            <url execution-start-time="0" execution-stop-time="501" execution-time="501" http-status-code="200" http-status-message="OK" id="44b14117-1ba8-43c8-87e3-2a2fceb34cda" proxy="DEFAULT">http://cms.albawaba.com/tmp/1.html</url>
            <error>Fail to parse html. The content of elements must consist of well-formed character data or markup.</error>
            <user-time>527</user-time><service-time>609</service-time><build-version>24402</build-version>
        </diagnostics>
        <results/>
    </query>

    QUOTE (alaa a @ 2 Nov 2011 1:44 AM): the original question above
  • Try:
    select * from html where url="http://cms.albawaba.com/tmp/1.html" and compat='html5'
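For anyone who wants to check the diagnostics programmatically, here is a minimal Python sketch of the two suggestions in the replies: building the query URL with the compat='html5' clause, and listing the status code of each call in the diagnostics so that the robots.txt 404 is not mistaken for the page's own status. The function names (build_yql_url, summarize_diagnostics) and the trimmed sample response are assumptions made for illustration; the sketch does not contact the service, it only builds the URL and parses a saved response.

```python
import xml.etree.ElementTree as ET
from urllib.parse import quote, urlencode

# Endpoint as it appears in the thread.
YQL_ENDPOINT = "http://query.yahooapis.com/v1/public/yql"

# Trimmed copy of the diagnostics from the first reply
# (yahoo:* attributes, ids and timings omitted for brevity).
SAMPLE_RESPONSE = """\
<query>
  <diagnostics>
    <publiclyCallable>true</publiclyCallable>
    <url http-status-code="404" http-status-message="Not Found">http://cms.albawaba.com/robots.txt</url>
    <url http-status-code="200" http-status-message="OK">http://cms.albawaba.com/tmp/1.html</url>
    <error>Fail to parse html.
The content of elements must consist of well-formed character data or markup.</error>
  </diagnostics>
  <results/>
</query>
"""


def build_yql_url(page_url, compat=None):
    """Build the debug URL for a 'select * from html' query (hypothetical helper)."""
    query = f'select * from html where url="{page_url}"'
    if compat is not None:
        # The second reply suggests adding compat='html5' to the where clause.
        query += f" and compat='{compat}'"
    params = {"q": query, "diagnostics": "true", "debug": "true"}
    # quote_via=quote encodes spaces as %20, matching the URL in the thread.
    return YQL_ENDPOINT + "?" + urlencode(params, safe="*", quote_via=quote)


def summarize_diagnostics(xml_text):
    """Return ([(url, status_code), ...], [error_message, ...]) from a response."""
    root = ET.fromstring(xml_text)
    calls = [
        (el.text.strip(), int(el.get("http-status-code")))
        for el in root.iter("url")
    ]
    # Collapse the line break inside the error text into a single space.
    errors = [" ".join(el.text.split()) for el in root.iter("error")]
    return calls, errors


if __name__ == "__main__":
    print(build_yql_url("http://cms.albawaba.com/tmp/1.html", compat="html5"))
    calls, errors = summarize_diagnostics(SAMPLE_RESPONSE)
    for url, status in calls:
        print(status, url)
    for message in errors:
        print("error:", message)
```

Reading the calls one by one makes it clear which request produced the 404 and which produced the parse error.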
