Your regexes are wrong. Check out this website for more information about regular expressions.
For the application at hand, clean the strings in two steps: first remove the junk in front of the wanted string, then the junk after it. For example, try:
In item.title replace ^.*\<a[^\>]*\> with (nothing) [ ]g [x]s [ ]m [ ]i -> you get the title with an invisible </a> markup left after it. The regex reads as: "from the first character of the string (^), match any number of any character (. is any character, and the star is the quantifier: 0 or more) until you find a < character (as this character can have some significance in a regex, it is escaped with a backslash), followed by an a character, followed by any number of any character except > (the brackets specify a set of characters to match; if the set begins with ^, it specifies the characters NOT to match), followed by a > character." All of the matched string is replaced by nothing, leaving the title and the last markup.
In item.title replace \<.*$ with (nothing) [ ]g [x]s [ ]m [ ]i -> this starts matching at the first markup character and runs to the end of the string ($); replacing the match with nothing leaves only the title.
In item.link replace ^.*href=' with (nothing) [ ]g [x]s [ ]m [ ]i -> As in the first regex, this matches everything up to and including what marks the start of the actual link, i.e. the href=', and replaces it with nothing.
In item.link replace '.*$ with (nothing) [ ]g [x]s [ ]m [ ]i -> Finally, this matches from the first non-URL character (the ') to the end of the string.
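Outside of Pipes, the same two-step cleaning can be tried in Python, which may make it easier to experiment with the patterns. This is only a sketch: the sample title and link strings and the helper name strip_front_then_back are made up for illustration, re.DOTALL is assumed to play the role of the [x]s box, and count=1 stands in for the unticked g box.

```python
import re

# Hypothetical feed values, invented for this illustration
title = "<p>junk <a href='http://example.com/story'>My headline</a> more junk</p>"
link = "<a class=\"x\" href='http://example.com/story'>My headline</a>"


def strip_front_then_back(text, front, back):
    """Remove the junk in front of the wanted string, then the junk after it."""
    # re.DOTALL lets . match newlines too (assumed equivalent of the s option);
    # count=1 replaces only one match (the g box is not ticked)
    text = re.sub(front, "", text, count=1, flags=re.DOTALL)
    return re.sub(back, "", text, count=1, flags=re.DOTALL)


# The four patterns are the ones from the rules above, copied verbatim
clean_title = strip_front_then_back(title, r"^.*\<a[^\>]*\>", r"\<.*$")
clean_link = strip_front_then_back(link, r"^.*href='", r"'.*$")

print(clean_title)
print(clean_link)
```

Note that .* is greedy, so if a string contained several <a ...> tags, the first pattern would strip everything up to the last one.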
Asking for help again
I would like to improve my pipe and add the full text of the news. It seems to have worked out, but I am again at a loss with cleaning the HTML tags out of the news description. Regular expressions are still hard for me. For example, how do I get rid of <div> and other tags? Please give a couple of examples, so that regular expressions are easier to understand.
Use the XPath Fetch Page module instead of the deprecated Fetch Page, with item.link as the source. To work out the XPath, visit the page of one of the articles. If you use Firefox, a right-click anywhere in the page offers Inspect Element. Inspecting the story shows the corresponding DOM, and from that you can deduce the XPath needed to get the article. For your source website, //div[@class="full_story"] with both boxes ticked (HTML5 parser and emit as string) will get you the entire story with a minimum of HTML left. A simple regex of the form \<[^\>]+\> will then match the HTML tags that are left, and replacing them with nothing using the [x]g [x]s options (Global Substitution) gives you your text.
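To see what the tag-stripping regex does, here is a small Python sketch. The story_html fragment is a made-up stand-in for the string the XPath step emits, not real output from your source website. re.sub replaces every match by default, which corresponds to the g option.

```python
import re

# Hypothetical fragment standing in for the string emitted by the XPath step
story_html = (
    '<div class="full_story"><p>First paragraph.</p>\n'
    "<p>Second <b>bold</b> one.</p></div>"
)

# A < character, then one or more characters that are not >, then a > character.
# re.sub replaces all matches unless a count is given (like the g option).
story_text = re.sub(r"\<[^\>]+\>", "", story_html)

print(story_text)
```

This is fine for lightly marked-up text like the above, but a regex of this kind can be fooled by a > inside an attribute value, so treat it as a quick cleanup rather than a full HTML parser.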