Pipes don't work with some rss-uris?

There are URLs of RSS feeds that (sometimes) don't work in Pipes.

For example:

- http://newsfeed.zeit.de/politik/index
- http://www.linksjugend-solid.de/feed/
- http://allfacebook.de/feed/

The error messages I get are like this:

This Pipe ran successfully but encountered some problems: warning Error fetching http://allfacebook.de/feed/. Response: OK (200). Error: Invalid XML document. Root cause: org.xml.sax.SAXParseException; lineNumber: 1; columnNumber: 1; Content is not allowed in prolog.

warning Error fetching http://www.linksjugend-solid.de/feed/. Response: OK (200). Error: Invalid XML document. Root cause: org.xml.sax.SAXParseException; lineNumber: 1; columnNumber: 1; Content is not allowed in prolog.

To me these URLs seem to be correct. What can I do?

3 Replies
  • Add http://www.skysports.com/rss/0,20514,11661,00.xml and http://www.skysports.com/rss/0,20514,11095,00.xml to that list. They sometimes work, but more often than not they throw the same error you show above.

    https://pipes.yahoo.com/pipes/pipe.info?_id=bde5a5a3da3ab537ddca93eab248b602

  • Hello to both of you,

    Let's decompose the problem and try to answer it, then I'll give you my thoughts about it. I'll give you a general solution to try out, and ask you a few questions about the uses of your pipes afterwards.

    The cases are:

    • There are occurrences of perfectly working feeds and pipes
    • There are occurrences of "Invalid XML document" errors for a few feeds.

    This means the URLs are correct and Pipes has no problem fetching them, since they are fetched (though only some of the time). So the first thing to check is whether there are actually times when the feeds work. I did this for all 3 URLs quoted by Eyke Eikin, and this is the case.

    Next is to analyse the problem: "Invalid XML document", i.e. the fetched page is not an RSS feed. In my experience this is probably caused by server downtime, authentication issues, or rate limits. The second thing to check is the actual page the URL points to when the error occurs, so go to the URL in your web browser: if there is no problem, it probably isn't a downtime issue, which leaves only two possibilities. Next, try to fetch the page (not the RSS) using Pipes, through the XPath Fetch Page module for example, so that the request comes from the same server.

    At that point we will be much more aware of what's going on.
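Outside of Pipes, the same check can be approximated with a small script that looks at the first bytes of the response body. The following is only a minimal sketch, not something Pipes itself does: the names `diagnose_feed_body` and `fetch_body`, the labels returned, and the User-Agent string are all hypothetical. It rests on the assumption that "Content is not allowed in prolog" at line 1, column 1 means something other than XML markup sits at the very start of the body (for example a plain-text error message, a compressed body, or stray bytes).

```python
import urllib.request
import xml.etree.ElementTree as ET

UTF8_BOM = b"\xef\xbb\xbf"
GZIP_MAGIC = b"\x1f\x8b"


def diagnose_feed_body(body: bytes) -> str:
    """Return a short, hypothetical label describing what `body` looks like."""
    if not body:
        return "empty"
    if body.startswith(GZIP_MAGIC):
        # Body is still gzip-compressed, so an XML parser sees binary data.
        return "gzip-compressed"
    if body.startswith(UTF8_BOM):
        # A byte order mark precedes the markup; some parser setups reject it.
        return "utf8-bom"
    if body[:1].isspace():
        # Whitespace before an XML declaration makes the document invalid.
        return "leading-whitespace"
    if not body.startswith(b"<"):
        # Plain text such as a rate-limit or login message.
        return "not-markup"
    head = body[:200].lower()
    if head.startswith(b"<!doctype html") or head.startswith(b"<html"):
        return "html-page"
    try:
        ET.fromstring(body)
    except ET.ParseError:
        return "malformed-xml"
    return "ok"


def fetch_body(url: str, timeout: float = 10.0) -> bytes:
    """Fetch the raw response body; the User-Agent value is an arbitrary choice."""
    request = urllib.request.Request(url, headers={"User-Agent": "feed-check/0.1"})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()
```

Calling `diagnose_feed_body(fetch_body(url))` a few times for one of the feeds above, at moments when the pipe reports the error, should show whether the server is answering with something other than a feed. A result from your own machine does not prove what the Pipes server receives, since rate limits and blocking can depend on the requesting address.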

    A general fix for the 3 potential issues I've highlighted is to use FeedBurner (or a similar service), as it provides a de facto proxy for the feed. Whether to feedburn the sources or the pipe's URL depends on how you use the feed you've created, so here come the questions: how many requests per minute/hour/day does the pipe get? Do you call the same source several times in a single pipe?
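To illustrate why a proxy in front of the source can help with rate limits, here is a minimal sketch of a time-limited cache. It assumes the benefit comes from the origin server seeing fewer requests; it is not how FeedBurner or Pipes is implemented. The class name `CachedFetcher`, the `fetch` callable, and the 30-minute default are hypothetical choices for illustration.

```python
import time
from typing import Callable, Dict, Tuple


class CachedFetcher:
    """Reuse a fetched body for `ttl_seconds` instead of re-requesting the source."""

    def __init__(
        self,
        fetch: Callable[[str], bytes],
        ttl_seconds: float = 1800.0,  # assumed default, not a documented value
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        # Maps URL -> (time the body was stored, body).
        self._cache: Dict[str, Tuple[float, bytes]] = {}

    def get(self, url: str) -> bytes:
        now = self._clock()
        hit = self._cache.get(url)
        if hit is not None and now - hit[0] < self._ttl:
            # Still fresh: the source server is not contacted at all.
            return hit[1]
        body = self._fetch(url)
        self._cache[url] = (now, body)
        return body
```

With a wrapper like this, calling the same source several times within the time window results in a single request to the origin, which is the situation the questions above are trying to pin down.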

    Best
