
Very simple query failing?

I'm very interested in YQL, and if it works as advertised I'll be able to use it in all sorts of places. However, after some very basic testing, I'm hitting a bit of a wall...

CODE
SELECT * FROM html WHERE url="http://3dtotal.com/home2/home.asp" AND xpath="//table"


Can anyone figure out why the above query is failing? It seems relatively simple, but I'm getting 'error : Failed to parse html'.

4 Replies
    • x
    • Feb 3, 2009
    QUOTE (cgenie77 @ Feb 3 2009, 01:47 AM)
    CODE
    SELECT * FROM html WHERE url="http://3dtotal.com/home2/home.asp" AND xpath="//table"

    Can anyone figure out why the above query is failing? It seems relatively simple, but I'm getting 'error : Failed to parse html'.

    Running the page through the W3C validator reports nearly a hundred parsing errors:

    http://validator.w3.org/check?verbose=1&am...ome2%2Fhome.asp

    In particular, on inspecting the page source, I found that the tags are nested incorrectly. There are, for example, table cells that don't belong to any table.
  • QUOTE (cgenie77 @ Feb 3 2009, 01:47 AM)
    I'm very interested in YQL, and if it works as advertised I'll be able to use it in all sorts of places. However, after some very basic testing, I'm hitting a bit of a wall...

    CODE
    SELECT * FROM html WHERE url="http://3dtotal.com/home2/home.asp" AND xpath="//table"


    Can anyone figure out why the above query is failing? It seems relatively simple, but I'm getting 'error : Failed to parse html'.


    As per Paul Donnelly's post on the Yahoo! Pipes suggestion board ( http://suggestions.yahoo.com/detail/?prop=...&fid=130651 ), try pointing the query at the W3C tidy service instead of the page itself, so the markup is cleaned up first:
    http://cgi.w3.org/cgi-bin/tidy?forceXML=on...ome2%2Fhome.asp
    This works, but I don't know if it returns the result you require.
  • Heh, yes, I noticed it's not exactly the greatest feat of web design! ;) So is YQL dependent on properly formed HTML? (I ask because that's part of the problem, i.e. rubbishy sites tend to need cleverer solutions to scrape the information; otherwise they'd just have an RSS/XML feed already!)
  • QUOTE (cgenie77 @ Feb 3 2009, 07:47 AM)
    heh, yes I noticed it's not exactly the greatest feat of web design! ;)

    -- Nagesh
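
For comparison, when a page is too badly formed for a strict parser, one option is to scrape it locally with a tolerant one. The sketch below is not YQL and not the W3C tidy service mentioned above; it swaps in Python's standard-library html.parser, which does not enforce tag nesting, to collect table cells even when they sit outside any table. The class name, helper function, and sample markup are all made up for illustration.

```python
from html.parser import HTMLParser


class CellCollector(HTMLParser):
    """Collect the text of every <td>/<th>, even if no <table> encloses it."""

    def __init__(self):
        super().__init__()
        self.cells = []
        self._current = None  # text fragments of the cell being read, or None

    def _close_cell(self):
        # Store the cell being read (if any) and reset the buffer.
        if self._current is not None:
            self.cells.append("".join(self._current).strip())
            self._current = None

    def handle_starttag(self, tag, attrs):
        if tag in ("td", "th"):
            self._close_cell()  # an unclosed previous cell ends here
            self._current = []
        elif tag in ("tr", "table"):
            self._close_cell()

    def handle_endtag(self, tag):
        if tag in ("td", "th", "tr", "table"):
            self._close_cell()

    def handle_data(self, data):
        if self._current is not None:
            self._current.append(data)

    def close(self):
        super().close()
        self._close_cell()  # flush a cell left open at end of input


def extract_cells(markup):
    """Hypothetical helper: return the text of all cells found in markup."""
    parser = CellCollector()
    parser.feed(markup)
    parser.close()
    return parser.cells


# Made-up sample: a stray cell outside any table, plus an unclosed cell.
SAMPLE = "<td>stray</td><table><tr><td>one<td>two</tr></table>"
print(extract_cells(SAMPLE))
```

Because the parser only reports tags as it meets them, the stray and unclosed cells are picked up rather than rejected. The trade-off is that you lose XPath and have to track structure yourself.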
