Is there a way to remedy poor YQL results from sites with non-XML-compatible markup?

I've stumbled across some issues involving some sites' coding practices and YQL results. Essentially, a couple of my test sources nest content such as div containers and images inside spans. I thought that was incredibly odd, but didn't think much of it until I saw the YQL results, where the markup had been rearranged a bit. Typically I wouldn't take issue with this, but it cost me a couple of days of headaches trying to figure out why some things didn't match up with the original source.
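
To make the pattern concrete, here is a minimal sketch (Python, standard library only) that scans raw HTML for block-level elements sitting inside a span, which is the kind of nesting a tidying parser is likely to restructure. The tag lists and the names `SpanNestingChecker` and `find_block_in_span` are purely illustrative assumptions, not part of YQL or of any site's code.

```python
from html.parser import HTMLParser

# Assumed, non-exhaustive list of block-level tags that are not valid inside <span>.
BLOCK_TAGS = {"div", "p", "ul", "ol", "table", "section", "article"}
# Void elements never get a closing tag, so they are not tracked on the stack.
VOID_TAGS = {"img", "br", "hr", "input", "meta", "link"}


class SpanNestingChecker(HTMLParser):
    """Records block-level tags opened while a <span> is still open."""

    def __init__(self):
        super().__init__()
        self.stack = []      # currently open (non-void) tags
        self.problems = []   # (tag, line number) for each offending element

    def handle_starttag(self, tag, attrs):
        if tag in BLOCK_TAGS and "span" in self.stack:
            self.problems.append((tag, self.getpos()[0]))
        if tag not in VOID_TAGS:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        # Pop back to the nearest matching open tag; ignore stray closers.
        if tag in self.stack:
            while self.stack and self.stack.pop() != tag:
                pass


def find_block_in_span(html):
    """Return a list of (tag, line) pairs for block tags nested in spans."""
    checker = SpanNestingChecker()
    checker.feed(html)
    checker.close()
    return checker.problems


if __name__ == "__main__":
    sample = '<span><div><img src="a.png"></div></span>\n<div><span>ok</span></div>'
    print(find_block_in_span(sample))
```

Running something like this against the original source (fetched server-side, for example with cURL) would at least flag which pages are likely to come back rearranged.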

Is it possible to remedy this while still using YQL, or should I look to server-side solutions (cURL) for these kinds of inconsistencies?

