displaying unicode characters well

If I display information from the internet or from encoded text files containing special characters (á, ö, ß etc.), the widget does not display these characters but their character codes. If these characters are put into a string directly, they are displayed correctly. For example, 'Mönchfeld' fetched from an external data source is displayed as 'M&#246;nchfeld'. The same happens with other encoded forms of the name.

Is there a way to display these characters correctly, or do I have to write search-and-replace loops for every special character that might occur?

5 Replies
  • We have run into this issue as well with Internet feeds that do not always obey UTF-8 encoding. We developed a server-side script to scrub the content before it is sent to the Widget. Is a server-side script something that could be used with your Widget?
  • A server-side script may be an option, but if there is no built-in string function for this, I think it would be easier to write one myself. Only a few special characters in Western European languages need replacing, 23 in total (umlauts, accented vowels, n with tilde). So I think I will implement my own function that scans the HTML data for these character codes and replaces them with the real letters. The easiest way may be to scan the string for '#' first and, only if it occurs, search for the codes and replace them.
  • The issue is that your data contains HTML entities (the &#246; is an HTML character reference for ö). If the server-side script Brian suggested is not an option, then I would do as you indicated and use a regexp to match those sequences and replace them with the real characters. Remember that the platform is JavaScript, so this problem has already been solved and you don't need to solve it again. See:

    http://search.yahoo.com/search?p=javascrip...l+entity+decode

    That said, to save you some time, I would likely use this if it were me:

    http://phpjs.org/functions/html_entity_decode:424

    -Jeremy
    QUOTE (Jeremy Johnstone @ Feb 11 2010, 08:50 AM)
    http://phpjs.org/functions/html_entity_decode:424


    Fantastic recommendation. Always something good.
  • I use this solution now:

    QUOTE
    sched1[i][3] = sched1[i][3].replace(/&#246;/g,"ö");


    ... and so on, with one such line for every special character I need.
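
For anyone who would rather not keep one replace() line per character, the regexp approach discussed in this thread can be folded into a single pass. The following is only a minimal sketch: decodeEntities and NAMED_ENTITIES are hypothetical names, not part of the widget platform or of phpjs, and the named-entity table is an assumed subset that would need extending. Numeric references such as &#246; are decoded generically, so only named ones need table entries.

```javascript
// Hypothetical lookup table for named entities (assumed subset, extend as needed).
// Values use \u escapes so the source file's own encoding does not matter.
var NAMED_ENTITIES = {
  auml: "\u00E4", ouml: "\u00F6", uuml: "\u00FC",   // ä ö ü
  Auml: "\u00C4", Ouml: "\u00D6", Uuml: "\u00DC",   // Ä Ö Ü
  szlig: "\u00DF",                                  // ß
  aacute: "\u00E1", eacute: "\u00E9",               // á é
  ntilde: "\u00F1",                                 // ñ
  amp: "&"
};

// Hypothetical helper: decodes decimal (&#246;), hexadecimal (&#xF6;) and
// known named (&ouml;) references in one pass over the string.
function decodeEntities(text) {
  var pattern = /&(#[xX][0-9a-fA-F]+|#[0-9]+|[A-Za-z][A-Za-z0-9]*);/g;
  return text.replace(pattern, function (match, body) {
    if (body.charAt(0) === "#") {
      var isHex = body.charAt(1) === "x" || body.charAt(1) === "X";
      var code = parseInt(body.substring(isHex ? 2 : 1), isHex ? 16 : 10);
      // String.fromCharCode only handles values up to 0xFFFF;
      // anything larger is left untouched.
      return code <= 0xFFFF ? String.fromCharCode(code) : match;
    }
    // Unknown named entities are left as they were.
    if (Object.prototype.hasOwnProperty.call(NAMED_ENTITIES, body)) {
      return NAMED_ENTITIES[body];
    }
    return match;
  });
}
```

With this in place, the line from the post above would become sched1[i][3] = decodeEntities(sched1[i][3]); and no further per-character lines should be needed. Because the replacement happens in a single pass, text that is escaped twice (for example &amp;#246;) is only unwrapped by one level per call.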
