[YQL] How to format HTML into a table result

I am currently trying to build a YQL query to use with Pipes. At the moment I have the following code:
CODE
select * from html where url="http://www.profcoach.nl/DataProvider/LineUpParticipant.ashx?roundid=20101112&lineUpParticipantId=18803&do=1"


The goal is to get the team name (in the example above, the h3 'Fc Knooiers on Tour'), the h6 of the result-box (in the example above, '7,1 <span>Punten</span>'), and the sum of the em elements with class='reservepoints'.

I know I can extend the YQL with the following statement:

CODE
select * from html where url="http://www.profcoach.nl/DataProvider/LineUpParticipant.ashx?roundid=20101112&lineUpParticipantId=18803&do=1" and xpath="//div[@class='result-box']"


This only gives me the result-box. Does anyone know how to combine several XPath expressions with OR or AND?

by
  • x
  • Mar 14, 2011
3 Replies
  • You could use an Xpath expression similar to
    //div[@class='result-box']|//h3|//em[@class='reservepoints']
    This doesn't sum the reservepoints values.
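
As a rough offline illustration of what the union expression selects, here is a minimal Python sketch. The HTML fragment is a made-up, well-formed stand-in for the real page, not its actual markup. The standard-library `xml.etree.ElementTree` does not support the `|` operator, so the sketch evaluates each branch separately and concatenates the hits; a true XPath union would return the nodes in document order instead.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified stand-in for the profcoach page (assumption).
SAMPLE = """
<html><body>
  <div class="reserves-box"><h3>Fc Knooiers on Tour</h3></div>
  <div class="result-box"><h6>7,1 <span>Punten</span></h6></div>
  <em class="reservepoints">6,0</em>
  <em class="reservepoints">10,0</em>
</body></html>
""".strip()

# The three branches of the union expression, in ElementTree's relative form.
BRANCHES = [
    ".//div[@class='result-box']",
    ".//h3",
    ".//em[@class='reservepoints']",
]

def select_union(root, branches):
    """Emulate 'a|b|c' by evaluating each branch and concatenating the hits."""
    hits = []
    for path in branches:
        hits.extend(root.findall(path))
    return hits

root = ET.fromstring(SAMPLE)
selected = select_union(root, BRANCHES)
tags = [el.tag for el in selected]
print(tags)
```

Each selected element keeps its own tag and class, which is what a later step needs in order to tell the team name, the week score and the reserve points apart.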


    • x
    • Mar 15, 2011
    QUOTE (hapdaniel @ Mar 15 2011, 02:17 AM) <{POST_SNAPBACK}>
    You could use an Xpath expression similar to
    //div[@class='result-box']|//h3|//em[@class='reservepoints']
    This doesn't sum the reservepoints values.

    Thank you for your answer. What I would like to do is extract data from the following website, whose HTML is not valid. The page shows the soccer selection for a team in the game profcoach. As the website does not offer much statistical data, I thought I could use YQL and Pipes to produce a readable XML/JSON feed.

    The main goal is to retrieve the following information from the source:
    CODE
    <player>
      <teamname>FC Knooiers on Tour</teamname>
      <weekscore>7.1</weekscore>
      <bench>
        <substitute>
          <name>Waterman</name>
          <weekscore>6.0</weekscore>
        </substitute>
        <substitute>
          <name>Looms</name>
          <points>10.0</points>
        </substitute>
      </bench>
      <total-score>566.0</total-score>
    </player>

    Based upon this data I can give a user their specific information, like total-score, week-score and the number of points left on the bench. In order to achieve this I have created the following YQL:

    CODE
    select * from html where url="http://www.profcoach.nl/DataProvider/LineUpParticipant.ashx?roundid=20101112&lineUpParticipantId=18803&do=1" and xpath="//div[@class='result-box']/h6|//div[@class='result-box']/ul/li[1]/strong|//em[@class='reservepoints']|//div[@class='reserves-box']/h3"


    The YQL is used in the following Pipe document: http://pipes.yahoo.com/pcstats/userweek

    I hoped that it would be possible:
    1. to group the information from item.class = 'reservepoints' into <bench> or do a sum for the values
    2. to replace the h3 with the <teamname>
    3. to replace the h6 with <weekscore>
    4. to replace the strong with <total-score>
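
The four steps above can be sketched outside Pipes in a few lines of Python, purely to show the mapping logic. This is an illustration under assumptions, not a Pipes or YQL feature: the function names are hypothetical, the input scores are assumed to use a decimal comma as in '7,1', and storing the bench sum as a `total` attribute on `<bench>` is my own choice.

```python
import xml.etree.ElementTree as ET

def to_number(text):
    """Parse a score such as '7,1' (decimal comma) into a float."""
    return float(text.strip().replace(",", "."))

def build_player(teamname, weekscore, total_score, bench):
    """Build a <player> element; `bench` is a list of (name, points) strings."""
    player = ET.Element("player")
    ET.SubElement(player, "teamname").text = teamname            # from the h3
    ET.SubElement(player, "weekscore").text = str(to_number(weekscore))  # from the h6
    bench_el = ET.SubElement(player, "bench")
    bench_total = 0.0
    for name, points in bench:                                    # from the em values
        value = to_number(points)
        bench_total += value
        sub = ET.SubElement(bench_el, "substitute")
        ET.SubElement(sub, "name").text = name
        ET.SubElement(sub, "points").text = str(value)
    # Summed reserve points, kept as an attribute on <bench> (an assumption).
    bench_el.set("total", str(bench_total))
    ET.SubElement(player, "total-score").text = str(to_number(total_score))  # from the strong
    return player

player = build_player(
    "FC Knooiers on Tour", "7,1", "566,0",
    [("Waterman", "6,0"), ("Looms", "10,0")],
)
xml_text = ET.tostring(player, encoding="unicode")
print(xml_text)
```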


    Can I do this with YQL and Pipes, or is this not the best approach? I was inspired by this pipe, which is described in this Dutch article.
  • If you want to continue with the YQL/Pipes approach then this should help.
    http://pipes.yahoo.com/pipes/pipe.edit?_id...c831639c037104e
    You still have some work to do on this pipe.
    • x
    • Mar 16, 2011
    Thank you for the response. I see you are using 'Fetch Data' to retrieve the results and not the YQL module. May I ask why you took this approach? In my earlier-mentioned pipe I use the YQL module with the URL Builder so that the values roundid and lineUpParticipantId can be parametrized. Can I use the YQL module, or will it not help to retrieve JSON for the list of elements?
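
For readers who want to try the parametrization outside Pipes, the following Python sketch builds the same page URL and YQL statement from the two values. The function names are hypothetical, and the sketch only assembles strings; it does not call any service.

```python
from urllib.parse import urlencode

BASE_URL = "http://www.profcoach.nl/DataProvider/LineUpParticipant.ashx"
XPATH = "//div[@class='result-box']"

def build_page_url(round_id, participant_id):
    """Return the page URL for one round and one participant."""
    params = {"roundid": round_id, "lineUpParticipantId": participant_id, "do": 1}
    return BASE_URL + "?" + urlencode(params)

def build_yql(round_id, participant_id, xpath=XPATH):
    """Return the YQL statement against the html table for the given XPath."""
    url = build_page_url(round_id, participant_id)
    return f'select * from html where url="{url}" and xpath="{xpath}"'

query = build_yql(20101112, 18803)
print(query)
```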

    I have changed the regex to:
    CODE
    (${em.0.content})+(${em.1.content})+(${em.2.content})+(${em.3.content})+(${em.4.content})+(${em.5.content})+(${em.6.content})

    as I need all the content values of em.x.content. Is it possible to make this smarter?
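
One way to avoid typing the expression by hand is to generate it. The sketch below is plain Python run outside Pipes, and `sum_expression` is a hypothetical helper; it simply produces the replacement string shown above for any number of slots, which can then be pasted into the pipe.

```python
def sum_expression(count, field="em", attr="content"):
    """Build '(${em.0.content})+(${em.1.content})+...' for `count` slots."""
    return "+".join(f"(${{{field}.{i}.{attr}}})" for i in range(count))

expr = sum_expression(7)
print(expr)
```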

    Great solution, using the web service to do the calculation of the values; and when using Rename instead of Copy, the output is cleaned up. At the end I can even add a filter to remove the old <em> values. Will your web service always be available? I saw that you are referring to several other domains.
  • Using the YQL module will split the YQL query output into separate items which is really not what is wanted.
    I've changed the pipe to show how you can get parameter values in. The pipe now has a separate YQL module just to show what happens if you use that module.
    When I looked at some of your source pages it always looked like there were 7 entries for the reserve points. In my pipe I've added entries for the 12th and 14th elements of the array (which don't exist in that example), and shown how to remove the empty values. Assuming you are not looking at a huge possible number of entries this is the easiest way to go.

    I'm intending to leave the web service file where it is. You may feel more comfortable making your own copy. I've changed the URL for the json2.js file to point to one in the YQL open table directory. You may want to copy that as well.
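
The "reserve more slots than you need, then drop the empty ones" idea described above can be sketched in Python as follows. This is not the code of the web service used in the pipe; the function name, the default of 14 slots and the decimal-comma input format are assumptions made for illustration.

```python
def sum_reserve_points(values, slots=14):
    """Pad `values` to `slots` entries, drop the empty ones, and sum the rest."""
    padded = list(values[:slots]) + [""] * max(0, slots - len(values))
    # Missing or blank slots are removed before the arithmetic.
    present = [v for v in padded if v and v.strip()]
    return sum(float(v.strip().replace(",", ".")) for v in present)

total = sum_reserve_points(["6,0", "10,0", "", "0,5"])
print(total)
```

Because empty slots are filtered out before summing, the same call works whether a page has 7 reserve entries or fewer.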

