Can I run queries against an XHTML file?

Hi guys,
I have this problem: I have to make an open data table to query an XHTML file...

Can I do this, or MUST the file be plain XML?

Thanks!

2 Replies
  • QUOTE (Samuele @ Apr 20 2010, 08:23 AM)
    Hi guys,
    I have this problem: I have to make an open data table to query an XHTML file...

    Can I do this, or MUST the file be plain XML?

    Thanks!


    If you mean XHTML, then you can query that using the xml table, or even the html table.

    If you want to fetch XHTML you can do:
    var x = y.rest("http://serverwithmyfile/file.xhtml").get().response;

    or even:

    var x = y.query("select * from xml where url=@url",{url:"http://serverwithmyfile/file.xhtml"}).results;

    If it's not exactly XHTML, you could get YQL to treat it as a web page and try to repair it using tidy(). That may remove information you want, as tidy only accepts HTML4 elements.

    var x = y.rest("http://serverwithmyfile/file.xhtml").accept("text/html").get().response;

    Jonathan
  • The easiest way would be to use the html table that comes as standard.
    Include an input key in your table:
    CODE
    <inputs>
    <key id="url" type="xs:string" paramType="variable" required="true"/>
    </inputs>

    That way you can vary the URL you "crawl" (scrape is such a nasty word).
    Specify an optional XPath expression to find the data within the page. You could accept this as another key to make it more flexible.
    Then load the data:
    CODE
    // url comes from the input key above; xpath holds your XPath expression
    var query = 'select * from html where url=@url and xpath="' + xpath + '"';
    var pageData = y.query(query, {url: url}).results;

    You should have the page as an E4X object or, if you specified an XPath expression, the output of that expression as E4X.
    Read the docs; there is a great example of using these techniques in there.
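Since the query above is built by concatenating the XPath expression into the query text, an expression that itself contains a double quote would break the string literal. Here is a minimal sketch of one way around that, assuming YQL accepts either single or double quotes around string literals. The names quoteForYql and buildHtmlQuery are made up for this example and are not part of YQL.

```javascript
// Hypothetical helper: wrap a value in whichever quote style it does not contain.
function quoteForYql(value) {
  var s = String(value);
  if (s.indexOf('"') === -1) { return '"' + s + '"'; }
  if (s.indexOf("'") === -1) { return "'" + s + "'"; }
  // Both quote styles are present; this sketch gives up rather than guess at escaping rules.
  throw new Error('XPath mixes both quote styles; rewrite it to use only one');
}

// Hypothetical helper: build the query text for the built-in html table.
// The url stays a bound variable (@url); the xpath clause is optional.
function buildHtmlQuery(xpath) {
  var query = 'select * from html where url=@url';
  if (xpath) {
    query += ' and xpath=' + quoteForYql(xpath);
  }
  return query;
}

console.log(buildHtmlQuery("//div[@id='main']//a"));
```

Inside the table's execute code you would then pass the result to y.query in the same way as before, for example y.query(buildHtmlQuery(xpath), {url: url}).results.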
