I'm not a developer, so I may be doing nothing more here than wishful thinking. But from what I think I understand, the Fantasy Sports API does seem like it has the power to enable league members (even just, say, the commissioner) to download and/or archive their league's data, including and especially whatever data is stored in the "all-time" data of long-running leagues.
What I'm envisioning is some interface that grabs whatever data Yahoo! has stored on a league's history (team stats? for H2H leagues, matchup results? etc.), and then spits out that data in some sort of moderately user-friendly format. Even a .csv would be great. Is anyone working on such a thing? And/or does anyone have an idea how difficult such an undertaking would be?
I ask because one of the most fun aspects of participating in a long-running league is the added sense of "history" in the competition. It's all the more satisfying to see Team A beat Team B if you know that Team B has won every other matchup the two teams have ever had, including the nail-biter in last year's playoffs where Team B edged out Team A by a point. You know what I mean? It would be amazing to be able to build an archive of your league's stat history, to dig as deep as possible into this silly little history.
Some of this data is already available through a league's "all-time" records as provided by Yahoo!, but accessing that data requires a lot of clicking around. And, while I know Yahoo! is kind to us, it's hard to believe they will store our league history indefinitely. It would be great if we were able to download all the available data onto our local machines, ideally in some format that's easily sortable. It seems like the Fantasy Sports API gives us the ability to do that. Am I right?
It'd be nice to know which information isn't archived forever. For example, only the latest 100 forum posts are archived. Now I wonder whether other data that we can't see via the Y! site exists, such as matchup data (e.g., weekly lineups). I've not yet checked, but this information is probably one of the keys to your question.
I archive forum posts now but that's it. The football season has gone by fast, so we'd better hurry to archive the rest! I assumed I'd have done this already, so maybe a team effort is needed! :)
I'm not sure if you've seen the feature, but we actually do try to enable this concept of tracking all-time data for long-running leagues through the site itself. If you're a commissioner, you should check out the "Edit League History" tool in the Commissioner Tools for your league. It lets you choose previous leagues that you've been in to add to your league history. Once you have that in place, you'll have the ability to navigate back to view all of these leagues from a dropdown on your league homepage. You can also get to all of your past leagues from your fantasy profile. We're definitely working on adding more features to persistent leagues (which is what we call these things), but we at least have a lot of the basics in place.
Okay, enough plugging features that hopefully you guys have already seen. :)
Regarding your point about getting data about past seasons in some structured format... yeah, that's what these APIs are supposed to do, but if you think there ought to be another layer on top that, like, turns it into CSV, that seems okay, too. I would say that *technically*, I think there's something in our TOS that discourages you from storing data from our APIs indefinitely? But I think that mostly relates to effectively user-personal data, and so I'm sure there's an argument to be made about what is and isn't covered by that when it comes to league/team data from past seasons.
I'm hopeful most understand how to review league data from years past via the site, but it's not all there. For example, you can't look up matchup data (e.g., team rosters) for when teams Foo and Bar played each other in week #4 of 2006. The site only has final scores. So at some point we'll need to determine which data is (and is not) available, and how the API differs from the site in this regard.
Thanks for the thoughtful comments, guys. Phil, I think you are on the right track: it would probably be most helpful at the start to figure out what, exactly, is archived and what isn't. Like you point out, matchup results are stored (Team A beat Team B 6-5 in Week 4, 2008), but not much deeper than that (e.g., for baseball, how many strikeouts Team A recorded that week, or even what players were on Team B).
Here's another example: Right now, if I think, "hmm, what's the biggest margin of victory that Team A has ever had in a matchup with Team B?" I can (as Sean rightly points out) click through our persistent league's history, but that means accessing each year separately, making note of each Team A v. Team B matchup, and then sorting that data on my own. If, instead, I could just dump all that data into a spreadsheet, I could sort and pick out data points much faster (and, I am assuming, with more insight into trends).
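For anyone who wants to try the "biggest margin of victory" question once the data is in hand, here is a minimal Python sketch. It assumes the matchup results have already been pulled into plain records; the field names (`team_a`, `score_a`, and so on) and the sample scores are made up for illustration and are not the API's actual schema.

```python
import csv
import io

# Hypothetical matchup records. In practice these would come from the API;
# the field names and scores below are illustrative assumptions only.
matchups = [
    {"season": 2006, "week": 4, "team_a": "Foo", "score_a": 6, "team_b": "Bar", "score_b": 5},
    {"season": 2007, "week": 9, "team_a": "Bar", "score_a": 8, "team_b": "Foo", "score_b": 2},
    {"season": 2008, "week": 4, "team_a": "Foo", "score_a": 9, "team_b": "Bar", "score_b": 1},
    {"season": 2008, "week": 5, "team_a": "Foo", "score_a": 7, "team_b": "Baz", "score_b": 4},
]


def head_to_head(records, team, opponent):
    """Return one row per meeting, with the margin from `team`'s point of view."""
    rows = []
    for m in records:
        # Skip matchups that do not involve exactly these two teams.
        if {m["team_a"], m["team_b"]} != {team, opponent}:
            continue
        if m["team_a"] == team:
            margin = m["score_a"] - m["score_b"]
        else:
            margin = m["score_b"] - m["score_a"]
        rows.append({"season": m["season"], "week": m["week"], "margin": margin})
    return rows


def to_csv(rows):
    """Serialize the rows as CSV text that a spreadsheet can open."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=["season", "week", "margin"])
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


rows = head_to_head(matchups, "Foo", "Bar")
biggest = max(rows, key=lambda r: r["margin"])
print(to_csv(rows))
print("Biggest Foo win over Bar:", biggest)
```

A negative margin means the other team won that meeting, so sorting the CSV by the `margin` column in a spreadsheet shows the whole rivalry at a glance.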
I can confirm that old data exists within the API that is not available on the Yahoo site. And as someone who is planning to answer similar questions for our league, I'll let you know once I create a generic data archiver and put it up on GitHub. I'll probably store the data in SQLite before doing any calculations, but exporting it to other formats shouldn't be too difficult.
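The store-in-SQLite-then-export approach might look roughly like the Python sketch below. This is not the archiver mentioned above, just an illustration using the standard `sqlite3` and `csv` modules; the table layout, column names, and the helpers `open_archive`, `save_matchup`, and `export_csv` are all assumptions.

```python
import csv
import io
import sqlite3


def open_archive(path=":memory:"):
    """Open (or create) the archive database. The schema is a made-up example."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS matchups (
               season INTEGER,
               week INTEGER,
               team TEXT,
               opponent TEXT,
               team_score REAL,
               opponent_score REAL,
               PRIMARY KEY (season, week, team)
           )"""
    )
    return conn


def save_matchup(conn, season, week, team, opponent, team_score, opponent_score):
    """Store one result. INSERT OR REPLACE makes re-running the archiver safe."""
    conn.execute(
        "INSERT OR REPLACE INTO matchups VALUES (?, ?, ?, ?, ?, ?)",
        (season, week, team, opponent, team_score, opponent_score),
    )
    conn.commit()


def export_csv(conn, out):
    """Write every archived matchup to a file-like object as CSV."""
    cur = conn.execute(
        "SELECT season, week, team, opponent, team_score, opponent_score "
        "FROM matchups ORDER BY season, week, team"
    )
    writer = csv.writer(out)
    writer.writerow([col[0] for col in cur.description])  # header from column names
    writer.writerows(cur)


conn = open_archive()
save_matchup(conn, 2008, 4, "Team A", "Team B", 6, 5)
save_matchup(conn, 2008, 4, "Team A", "Team B", 6, 5)  # duplicate is replaced, not added
out = io.StringIO()
export_csv(conn, out)
print(out.getvalue())
```

Passing a file path to `open_archive` instead of the default `":memory:"` would keep the archive on disk between runs.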
Hey crater, it looks like you were saying that you think there might be some gaps in terms of what data is available in the APIs for previous seasons (in terms of trying to access it right now). As far as I'm aware, ALL of our data that's available for a current season of a game should also be available for past seasons of that game. Definitely let me know if you're seeing that break down anywhere, but I'm pretty sure that's the case. Also, feel free to leave specific requests for what type of data you're trying to access and I can try to help formulate useful URLs to hit.
EDIT: Performance-wise, I should note that some of these queries over past seasons may perform significantly worse because most of the data in question would have gotten out of our caches by now. So just a heads up. Try not to pound services from previous seasons too hard. :D
I've noticed the performance difference, so local storage will help all parties. Or maybe a simple local cache mechanism could be created to cache either the returned XML or the parsed objects (e.g., a SimpleXML object in PHP).
Also, crater and I were comparing what's available online via the Yahoo site, not the API, which is why we wondered. For example, go to your league's 2005 page (at yahoo.com) and you won't see which players competed in week #4. Thankfully, this information is available via the API, which I recently confirmed. That's the difference.
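A simple local cache along those lines could be sketched in Python as follows. It is an assumption-heavy illustration: the `XmlCache` class, the `fetch` callable (whatever function actually performs the authenticated API request), and the example URL are hypothetical. The sketch stores each XML response on disk keyed by a hash of the request URL, and waits between uncached requests so that past-season services are not hit too hard.

```python
import hashlib
import tempfile
import time
from pathlib import Path


class XmlCache:
    """Disk cache for raw XML responses, keyed by request URL (illustrative only)."""

    def __init__(self, fetch, cache_dir, min_interval=1.0):
        self.fetch = fetch                # callable(url) -> XML string; supplied by the caller
        self.cache_dir = Path(cache_dir)
        self.cache_dir.mkdir(parents=True, exist_ok=True)
        self.min_interval = min_interval  # seconds to wait between real requests
        self._last_fetch = None

    def _path_for(self, url):
        digest = hashlib.sha256(url.encode("utf-8")).hexdigest()
        return self.cache_dir / f"{digest}.xml"

    def get(self, url):
        path = self._path_for(url)
        if path.exists():
            # Cache hit: no request is made at all.
            return path.read_text(encoding="utf-8")
        if self._last_fetch is not None:
            wait = self.min_interval - (time.monotonic() - self._last_fetch)
            if wait > 0:
                time.sleep(wait)          # throttle uncached requests
        xml = self.fetch(url)
        self._last_fetch = time.monotonic()
        path.write_text(xml, encoding="utf-8")
        return xml


# Usage with a stand-in fetch function; a real one would call the API.
calls = []

def fake_fetch(url):
    calls.append(url)
    return "<fantasy_content/>"

cache = XmlCache(fake_fetch, tempfile.mkdtemp(), min_interval=0.0)
url = "https://example.invalid/league/2005/week/4"  # placeholder, not a real API URL
first = cache.get(url)
second = cache.get(url)                             # served from disk
print(len(calls))
```

Because results from completed seasons should not change, the sketch never expires entries; a cache for the current season would need some expiry rule.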
That's awesome. Sean, thanks for giving us a perspective on what data should be there (and how performance might be affected in retrieving it). Phil, it sounds like you are making strides in figuring out what can be pulled from previous seasons and how to pull it. I'm mostly illiterate when it comes to developing these API requests, but if there's any place you need help with what you're working on, let me know.