XPath and attribute extraction

The YQL console has this example:

select * from html where url="http://finance.yahoo.com/q?s=yhoo" and xpath='//div[@id="yfi_headlines"]/div[2]/ul/li/a'

It returns a list of links that looks like:

CODE
<a href="http://biz.yahoo.com/ap/090313/internet_midday_glance.html?.v=1">Midday Glance: Internet companies</a>
<a href="http://biz.yahoo.com/paidcontent/090313/1_334578_id.html?.v=1">Analysts: What Armstrong's Move Means For AOL, Google</a>
...


Is there a way to have YQL return only the href attribute of each link instead? For example:

CODE
http://biz.yahoo.com/ap/090313/internet_midday_glance.html?.v=1
http://biz.yahoo.com/paidcontent/090313/1_334578_id.html?.v=1
...

1 Reply
  • Right now there is no way to return purely textual responses. The closest you can get is:

    CODE
    select href from html where url="http://finance.yahoo.com/q?s=yhoo" and xpath='//div[@id="yfi_headlines"]/div[2]/ul/li/a'
  • CODE
    select content from html where url="http://finance.yahoo.com/q?s=yhoo" and xpath='//div[@id="yfi_headlines"]/div[2]/ul/li/a'

    This gets you a bit closer than selecting *:
    CODE
    <results>
      <a>AOL names yet another head of online ad business</a>
      <a>Report: Flickr Founder To Launch Social-Gaming Venture</a>
      <a>The Leverage Isn't Where You Think</a>
      ...
    </results>
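
Since the replies say YQL returns markup rather than plain text, the last step of reducing the result to bare URLs can be done on the client. The following is a minimal sketch in Python, not part of the original thread. It assumes the `select href` query yields a `<results>` element whose `<a>` children carry only the `href` attribute; that response shape, the `SAMPLE` string, and the `extract_hrefs` helper are all illustrative assumptions.

```python
import xml.etree.ElementTree as ET

# Hypothetical response body: the exact shape of a `select href` result
# is an assumption, modeled on the <results> block shown in the thread.
SAMPLE = """<results>
  <a href="http://biz.yahoo.com/ap/090313/internet_midday_glance.html?.v=1"/>
  <a href="http://biz.yahoo.com/paidcontent/090313/1_334578_id.html?.v=1"/>
</results>"""


def extract_hrefs(results_xml):
    """Return the href value of every <a> element that has one."""
    root = ET.fromstring(results_xml)
    # iter("a") walks all descendant <a> elements in document order.
    return [a.get("href") for a in root.iter("a") if a.get("href")]


if __name__ == "__main__":
    # Print one URL per line, matching the output the question asks for.
    print("\n".join(extract_hrefs(SAMPLE)))
```

The same loop could collect `a.text` instead of `a.get("href")` to turn the `select content` result into a plain list of headlines.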