Field information

I made a YUI DataSource class to access YQL easily from several YUI widgets. The problem is that the reply from YQL does not say which tag names the element containing each record, nor which fields ended up in the reply.

I can work them out, which is what I do right now. I assume that the tag name of the first element in results is the tag name for all returned records, and that the fields returned in the first record are the ones I should expect throughout. I just assume that both hold true for every record.
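
As a rough illustration of that workaround, here is a minimal sketch and not the code from the linked example. `guessSchema` and the sample reply are hypothetical names and data made up for this illustration; the only thing taken from the thread is the idea of trusting the first record.

```javascript
// Hypothetical helper (not part of YUI or YQL): derive a schema by
// inspecting the first record of an already-parsed YQL JSON reply.
function guessSchema(reply) {
    const results = reply.query.results;
    if (!results) {
        return null; // nothing to inspect
    }
    // Assumption 1: the first key under "results" names every record.
    const recordName = Object.keys(results)[0];
    if (recordName === undefined) {
        return null;
    }
    // Assumption: a lone record may arrive as an object instead of an array.
    const records = [].concat(results[recordName]);
    // Assumption 2: the first record carries every field of the result set.
    return {
        recordName: recordName,
        fields: Object.keys(records[0]),
        records: records
    };
}

// Made-up sample shaped like a flickr.photos.search reply.
const sampleReply = { query: { count: "2", results: { photo: [
    { farm: "3", id: "3679272771", isfamily: "0" },
    { farm: "3", id: "3679272772", isfamily: "1" }
] } } };

const schema = guessSchema(sampleReply);
console.log(schema.recordName, schema.fields);
```

Both assumptions are exactly the weak points: a later record with an extra or missing field would go unnoticed.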

It would be far better if I could request an optional header that tells me what to expect, instead of trying to figure it out from the first record.

The example, including the subclassed DataSource, is at:

http://satyam.com.ar/yui/2.7.0/yql.html

9 Replies
  • I should add that datatype information would also be nice to have. I am reading everything as strings, but the DataSource should parse numbers, dates and booleans into native JavaScript types.
  • Is there anybody here? I've made a feature request and I would have expected the courtesy of a reply or an acknowledgment from someone in charge, or is this project dead?
  • QUOTE (Deva Satyam @ Jun 29 2009, 08:22 AM)
    Is there anybody here? I've made a feature request and I would have expected the courtesy of a reply or an acknowledgment from someone in charge, or is this project dead?


    Hi Deva,

    On the contrary, we've actually been heads down building the feature requests from our users for the upcoming release. That's the reason we've been away from the forums for the past few weeks.

    I should add that I did give this a quick read but couldn't really understand what you're trying to achieve. Can you provide an example of exactly what you mean? I could go through the link that you provided and work through the code, but unfortunately I am busy with the upcoming YQL release.


    -- Nagesh
  • I would like to be able to request, in the same reply as the data itself and with a suitable extra URL argument, information about the element name of each record, the field names to expect from the reply and their data type, so that if I ask for the following URL (notice the last two arguments):

    CODE
    http://query.yahooapis.com/v1/public/yql?q=desc%20flickr.photos.search&format=json&callback=cbfunc&diagnostics=false&fieldinfo=true


    I get something like:
    CODE
    cbfunc({
      "query":{
        "count":"10",
        "created":"2009-07-02T08:05:15Z",
        "lang":"en-US",
        "updated":"2009-07-02T08:05:15Z",
        "uri":"http://query.yahooapis.com/v1/yql?q=select+*+from+flickr.photos.search+where+text+%3D+%22YDN%22",
        "fieldinfo":{
          "recordName":"photo",
          "fields":[
            {
              "name":"farm",
              "type":"xs:string"
            },
            {
              "name":"id",
              "type":"xs:string"
            },
            {
              "name":"isfamily",
              "type":"xs:boolean"
            },
            ......
          ]
        },

        "results":{
          "photo":[{
            "farm":"3",
            "id":"3679272771",
            "isfamily":"0",
            "isfriend":"0",
            "ispublic":"1",
            "owner":"28401989@N04",
            "secret":"6ace55fce0",
            "server":"2643",
            "title":"ConvergeSC Speakers"
          },
          ....


    so that instead of inferring the data structure from whatever arrives, I can know it for sure and, most importantly, know when a field is something other than a string, which I have no way of guessing. Numbers sort differently from strings of numeric characters, but the received data gives no way to tell one from the other. There is also no way to tell a boolean apart from a string containing a single digit 0 or 1. It would also be good to get this information in the same reply as the data itself, to avoid doubling the latency by issuing a DESC command first, or more than doubling it for complex queries. It would also show which fields actually ended up in the reply, in case any failed.
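
A minimal sketch of how a client could consume such a header if it existed. The `fieldinfo` object is the proposal from this post and is not something YQL returns; `parsers`, `applyFieldInfo`, and the `xs:integer` and `xs:decimal` entries (including typing `server` as an integer) are assumptions added purely for illustration.

```javascript
// Converters keyed by the XML Schema type names used in the proposal.
// Only xs:string and xs:boolean appear in the post; the rest are assumed.
const parsers = {
    "xs:string": (v) => v,
    "xs:boolean": (v) => v === "1" || v === "true",
    "xs:integer": (v) => parseInt(v, 10),
    "xs:decimal": (v) => parseFloat(v)
};

// Hypothetical helper: turn string-only records into typed records.
// Fields missing from the header are left as strings.
function applyFieldInfo(fieldinfo, records) {
    const typeOf = {};
    fieldinfo.fields.forEach((f) => { typeOf[f.name] = f.type; });
    return records.map((record) => {
        const typed = {};
        Object.keys(record).forEach((name) => {
            const parse = parsers[typeOf[name]] || parsers["xs:string"];
            typed[name] = parse(record[name]);
        });
        return typed;
    });
}

const fieldinfo = { recordName: "photo", fields: [
    { name: "farm", type: "xs:string" },
    { name: "id", type: "xs:string" },
    { name: "isfamily", type: "xs:boolean" },
    { name: "server", type: "xs:integer" } // assumed type, not from the post
] };
const rawPhotos = [{ farm: "3", id: "3679272771", isfamily: "0",
                     server: "2643", title: "ConvergeSC Speakers" }];

const photos = applyFieldInfo(fieldinfo, rawPhotos);
console.log(photos[0]);
```

With a header like this, the decision about what is a boolean or a number would rest with the data source rather than with guesses made in the client.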
  • QUOTE (Deva Satyam @ Jul 2 2009, 12:19 PM)
    [...] element name of each record, the field names to expect from the reply and their data type [...]


    Currently YQL doesn't offer any mechanism to inspect the type of the data coming back before the query is executed, mostly because it's dealing with so many heterogeneous sources that it knows little about beyond the information in the Open Data Table XML binding. While we could have required each table to fully describe the shape and type of its data (and we actually tried something like that in an earlier testing environment), we decided that forcing developers to add this information is laborious and produces quite a few errors, and the benefit of such information is low for many uses. Clearly, for your example, it would be valuable.

    When YQL deals with XML data sources we do preserve their DTD information, which you could use to determine the types. An alternative is simply to cast the fields up and down a type hierarchy to determine each field's type from the example values it holds.

    Jonathan
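
One way the casting idea could look on the client, as a sketch only: every name here (`candidates`, `inferType`, `inferColumnTypes`, the `taken` field and the sample rows) is hypothetical, and this is not a description of what YQL does internally. A field gets a narrower type only if every non-empty value in the whole result set passes that type's test; otherwise it stays a string.

```javascript
// Candidate types, tried in order; anything that matches none is a string.
const candidates = [
    { type: "number",
      test: (v) => v.trim() !== "" && isFinite(Number(v)) },
    { type: "date", // ISO 8601 style timestamps only
      test: (v) => /^\d{4}-\d{2}-\d{2}(T[\d:.]+(Z|[+-]\d{2}:\d{2})?)?$/.test(v) &&
                   !isNaN(Date.parse(v)) }
];

// Infer one type from all the string values seen for a field.
function inferType(values) {
    const present = values.filter((v) => v !== "");
    if (present.length === 0) {
        return "string"; // no evidence either way
    }
    const match = candidates.find((c) => present.every((v) => c.test(v)));
    return match ? match.type : "string";
}

// Infer a type for every field that occurs in any record.
function inferColumnTypes(records) {
    const types = {};
    records.forEach((r) => Object.keys(r).forEach((name) => {
        types[name] = null;
    }));
    Object.keys(types).forEach((name) => {
        const values = records.map((r) => r[name])
                              .filter((v) => typeof v === "string");
        types[name] = inferType(values);
    });
    return types;
}

// Made-up rows; "taken" is an invented field used to show date detection.
const rows = [
    { farm: "3", owner: "28401989@N04", taken: "2009-07-02T08:05:15Z", title: "ConvergeSC Speakers" },
    { farm: "4", owner: "12345678@N00", taken: "2009-07-01T10:00:00Z", title: "" }
];
const columnTypes = inferColumnTypes(rows);
console.log(columnTypes);
```

The sketch cannot tell a boolean encoded as 0 or 1 from a number, and a purely numeric identifier would be reported as a number, so the result is a guess that depends on the sample.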
  • Greetings. So does this mean I can't do accurate numeric greater/less than queries in YQL? Since values are all seen as strings? I have, for example, results of the form:

    "game": [
      {
        "umpire": "",
        "game": "1.0",
        "temp": "73.0",
        "runs_home": "4.0",
        "home": "nya",
        "type": "R",
        "runs_away": "2.0",
        "away": "tor",
        "game_id": "honkers",
        "local_time": "13:05",
        "date": "",
        "wind": "5.0",
        "wind_dir": "Out to RF"
      }

    And I'm looking to query for one-run games. Is this not possible (or a rather complicated endeavor utilizing IN and arrays of strings) with YQL? Or am I missing something simple here?
  • This limitation precludes the development of ANY general-purpose library based on YQL, leaving only end-user applications where the final developer knows the data types of the fields in the query being made. For strongly typed languages it places a further burden on developers, as no library would be able to provide typed results as would normally be expected.

    This is a serious limitation and it is really a pity.
  • QUOTE (Zachary @ Jul 11 2009, 10:51 AM)
    Greetings. So does this mean I can't do accurate numeric greater/less than queries in YQL? Since values are all seen as strings? [...]


    In the YQL query you can still use greater than and less than. Since we know very little about the data, we use coercion, i.e. we try to convert the left-hand and right-hand values into numeric types and then compare them.
    Since YQL acts over a large number of data sources, we use the coercion technique to introspect the API. I suspect that this can also be done on the client side in JS, where you would coerce the values to find out about the data.

    Hope that helps
    Nagesh
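
A client-side sketch of the same coercion idea, applied to the one-run-games question. This is an illustration under assumptions, not YQL's implementation: `toNumber` and `coercedCompare` are hypothetical helpers, and the second game row is made up so that the filter has something to match.

```javascript
// Return a finite number for numeric-looking input, otherwise null.
function toNumber(value) {
    const text = String(value).trim();
    const n = Number(text);
    return text !== "" && isFinite(n) ? n : null;
}

// Compare numerically when both sides coerce to numbers,
// otherwise fall back to plain string comparison.
function coercedCompare(left, right) {
    const l = toNumber(left);
    const r = toNumber(right);
    if (l !== null && r !== null) {
        return l < r ? -1 : l > r ? 1 : 0;
    }
    const ls = String(left);
    const rs = String(right);
    return ls < rs ? -1 : ls > rs ? 1 : 0;
}

const games = [
    { home: "nya", away: "tor", runs_home: "4.0", runs_away: "2.0" },
    { home: "bos", away: "bal", runs_home: "10.0", runs_away: "9.0" } // made-up row
];

// One-run games: both scores must coerce, and differ by exactly one.
const oneRunGames = games.filter((g) => {
    const home = toNumber(g.runs_home);
    const away = toNumber(g.runs_away);
    return home !== null && away !== null && Math.abs(home - away) === 1;
});

console.log(coercedCompare("10.0", "9.0")); // numeric comparison
console.log("10.0" < "9.0");                // raw string comparison
console.log(oneRunGames.map((g) => g.home + " vs " + g.away));
```

The raw string comparison ranks "10.0" below "9.0", which is why the values have to be coerced before any greater/less than test.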
  • QUOTE (Nagesh Susarla @ Jul 20 2009, 10:15 AM)
    In the YQL query you can still use greater than and less than. [...] I suspect that this can also be done on the client side in JS [...]


    Sounds good if I were to operate on the data myself, but it is not so good if I have to pass the data type information to upper levels of software. So far I guess the field names from the first row. Guessing the data type as well would force me either to gamble that the first row is representative of the types in the rest of the result set, or to loop through the whole of it to ensure that no individual field fails the numeric test.

    Then, what about dates? Are they predictable? Booleans?