My first pipes

Hello! I'm just starting to learn Pipes. I want to create a news feed in RSS format, but the result is not quite what I wanted: I cannot remove the unnecessary lines. Here's a link to my pipe: http://pipes.yahoo.com/pipes/pipe.info?_id=9d79e2bdd07dde99e31737f589b778fc. Please tell me how to solve this.

Thanks in advance, Kanat

4 Replies
  • Hello,

    Your regexes are wrong. Check out this website for more information about them.

    For your application at hand, clean the strings in two steps: first take out the junk in front of the wanted string, then the junk after it. E.g., try:

    In item.title replace ^.*\<a[^\>]*\> with (nothing) [ ]g [x]s [ ]m [ ]i -> you get the title plus an invisible </a> markup left over. The regex reads as: "from the first character of the string (^), match any number of any character (. is any character; the star is the quantifier: 0 or more) until you find a < character (as this character can have some significance in a regex, you need to escape it with a backslash), followed by an a character, followed by any number of any character except > (the brackets specify a set of characters to match; if the set begins with ^, it specifies the characters NOT to match), followed by a > character." The whole matched string is replaced by nothing, leaving the title and the last markup.

    In item.title replace \<.*$ with (nothing) [ ]g [x]s [ ]m [ ]i -> this matches from the first markup character to the end of the string ($) and replaces it with nothing: that leaves only the title.

    In item.link replace ^.*href=' with (nothing) [ ]g [x]s [ ]m [ ]i -> As in the first regex, this matches everything up to what marks the start of the actual link, i.e. the href=', and replaces it with nothing.

    In item.link replace '.*$ with (nothing) [ ]g [x]s [ ]m [ ]i -> Finally, this matches from the first non-URL character (the ') to the end of the string.

    Et voilà :)
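
For readers who want to try the two-step cleaning outside Pipes, here is a minimal sketch in Python. The sample string `RAW` and the helper names `clean_title` and `clean_link` are made up for illustration, since the thread does not show the feed's real markup. It assumes the ticked "s" box corresponds to `re.DOTALL` and the unticked "g" box to a single substitution (`count=1`). The backslashes before < and > are dropped because Python's `re` does not need them.

```python
import re

# Hypothetical raw field value; the real feed's markup is not shown in the thread.
RAW = """<td class="news">
<a href='http://example.com/news/1'>Breaking story</a></td>"""


def clean_title(raw):
    # Step 1: remove the junk in front, up to and including the opening <a ...> tag.
    # re.DOTALL lets "." match newlines; count=1 replaces only the first match.
    text = re.sub(r"^.*<a[^>]*>", "", raw, count=1, flags=re.DOTALL)
    # Step 2: remove the junk after, from the first remaining "<" to the end.
    return re.sub(r"<.*$", "", text, count=1, flags=re.DOTALL)


def clean_link(raw):
    # Step 1: remove everything up to and including href='.
    text = re.sub(r"^.*href='", "", raw, count=1, flags=re.DOTALL)
    # Step 2: remove everything from the closing quote to the end.
    return re.sub(r"'.*$", "", text, count=1, flags=re.DOTALL)


print(clean_title(RAW))  # Breaking story
print(clean_link(RAW))   # http://example.com/news/1
```

Because `.*` is greedy, the first pattern actually consumes everything up to the last `<a ...>` tag in the string, which is harmless when the field contains a single link, as assumed here.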

  • Lolo, thanks a lot! You're right: without knowledge of regular expressions, nothing can be done here. I followed your instructions and it all turned out beautifully. Simply amazing!

    Sincerely, Kanat

  • Hello again, Lolo! I'm asking for help once more: I want to improve my pipe and add the full text of the news. It more or less worked, but I'm again at a loss with cleaning the HTML tags out of the news description. Regular expressions are still hard. For example, how do I get rid of <div> and other tags? Please give a couple of examples, so it's easier to understand regular expressions.

    Thanks in advance. Kanat

  • Use the XPath Fetch Page module instead of the deprecated Fetch Page, with item.link as the source. To work out the XPath, visit the page of one of the articles. If you use Firefox, right-clicking anywhere in the page shows Inspect Element. On the story, you'll see the corresponding DOM, and from that you can deduce the XPath needed to get the article. For your source website, //div[@class="full_story"] with both boxes ticked (HTML5 parser and emit as string) will get you the entire story with a minimum of HTML left. A simple regex of the form \<[^\>]+\> will select the remaining HTML tags; replace them with nothing using the [x]g [x]s options (global substitution) and you'll get your text.
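
The same idea can be sketched in Python for experimentation. This is only an approximation of what the Pipes modules do: `PAGE` is a made-up, well-formed stand-in for an article page, `extract_story` and `strip_tags` are hypothetical helper names, and the standard library's `xml.etree.ElementTree` supports only a subset of XPath and requires well-formed markup (real pages usually need a proper HTML parser).

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical, well-formed stand-in for an article page.
PAGE = """<html><body>
<div class="menu">Home | News</div>
<div class="full_story"><p>First paragraph.</p><p>Second <b>bold</b> one.</p></div>
</body></html>"""


def extract_story(page):
    # Select the story container, analogous to //div[@class="full_story"],
    # and emit it as a string that still contains its inner markup.
    root = ET.fromstring(page)
    node = root.find(".//div[@class='full_story']")
    return ET.tostring(node, encoding="unicode")


def strip_tags(html):
    # "<", then one or more characters that are not ">", then ">".
    # re.sub replaces every match by default, like the ticked "g" box.
    return re.sub(r"<[^>]+>", "", html)


story = strip_tags(extract_story(PAGE)).strip()
print(story)  # First paragraph.Second bold one.
```

Note that the paragraphs run together once the tags are gone; replacing each tag with a space instead of nothing is one way to keep words apart. The negated class `[^>]` already matches newlines, so the "s" option makes no difference to this particular pattern.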

