JavaScript and Unicode

ASCII allows for only 128 different characters, and 33 of those are allocated to control characters, leaving only 95 printable characters (including the space). Many of the world's languages use characters that do not fit within this small group. There are also a lot of special symbols used in specialized areas such as mathematics. To cater for this much larger range of characters in a standard way, the Unicode character set was developed. It matches ASCII for the first 128 characters but provides a way of specifically identifying over a million different characters (1,114,112 code points in total). All of the characters you are ever likely to need can be found somewhere in the Unicode charts.

There are three different ways of encoding Unicode characters: UTF-8, UTF-16, and UTF-32. All three allow all characters to be used; the only difference is how many bytes each character takes. UTF-8 uses a single byte for each of the first 128 characters (the ASCII range) and two, three, or four bytes for those that need it, UTF-16 uses two bytes for most characters and four bytes for the rest, and UTF-32 always uses four bytes per character.
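The size difference between the three encodings can be checked directly. The following is a minimal sketch, assuming an environment that provides the standard TextEncoder (current browsers and Node.js do); the function name encodedSizes is made up for this illustration.

```javascript
// Hypothetical helper: count the bytes one string occupies in each encoding.
function encodedSizes(text) {
  // Array.from walks the string by code point, not by 16-bit unit.
  const codePoints = Array.from(text).length;
  return {
    utf8: new TextEncoder().encode(text).length, // TextEncoder always emits UTF-8
    utf16: text.length * 2, // JavaScript strings are sequences of 16-bit units
    utf32: codePoints * 4,  // a fixed four bytes per code point
  };
}

console.log(encodedSizes("A"));            // { utf8: 1, utf16: 2, utf32: 4 }
console.log(encodedSizes("\u00e9"));       // e with acute accent: utf8 is 2
console.log(encodedSizes("\u20ac"));       // euro sign: utf8 is 3
console.log(encodedSizes("\ud83d\ude00")); // grinning face emoji: all three are 4
```

For text that is mostly ASCII, UTF-8 is the most compact of the three, which is one reason it is the usual choice for web pages.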

Unicode characters are specified in JavaScript by typing a backslash, a lowercase "u", and then the four-digit hexadecimal number of the character's UTF-16 code unit. This gives you the ability to reference the 65,536 values in the first and most commonly used range of Unicode from within your JavaScript code. Characters beyond that range are written as two such escapes in a row (a surrogate pair).
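A few escapes in use may make this concrete. This is only a sketch; the variable names are arbitrary, and the last line relies on the \u{...} code point escape, which was added in ES2015 and so is not available in older engines.

```javascript
// Four-digit escapes for characters in the first 65,536.
const capitalA = "\u0041"; // A
const eAcute = "\u00e9";   // e with acute accent
const euro = "\u20ac";     // euro sign

console.log(capitalA === "A");             // true
console.log(eAcute.length);                // 1
console.log(euro.charCodeAt(0) === 0x20ac); // true

// A character beyond that range takes two escapes (a surrogate pair).
const grin = "\ud83d\ude00"; // U+1F600, grinning face
console.log(grin.length);                       // 2
console.log(grin.codePointAt(0).toString(16));  // 1f600

// ES2015 and later also accept the code point directly inside braces.
console.log("\u{1F600}" === grin); // true
```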

To make it easier for you to see what a given Unicode value entered into your JavaScript represents, simply enter the four-digit hexadecimal value that you intend to use into the following form, and the actual character that it represents will be displayed in bold red text in the sentence below. (Note that nothing will happen if you enter anything other than a valid four-digit hexadecimal number.)

\u [four hexadecimal digits] is the Unicode character: " ".
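The same lookup can be done in a few lines of code. The sketch below is not the script behind the form; charFromHex is a hypothetical name, and returning null for invalid input is an assumption standing in for the form's "nothing will happen" behavior.

```javascript
// Hypothetical helper: turn four hexadecimal digits into the character they name.
function charFromHex(hex) {
  // Accept exactly four hexadecimal digits and nothing else.
  if (!/^[0-9a-fA-F]{4}$/.test(hex)) {
    return null; // assumption: invalid input yields no result
  }
  // parseInt with radix 16 reads the digits; fromCharCode builds the character.
  return String.fromCharCode(parseInt(hex, 16));
}

console.log(charFromHex("0041")); // A
console.log(charFromHex("00e9") === "\u00e9"); // true
console.log(charFromHex("41"));   // null (too short)
console.log(charFromHex("zzzz")); // null (not hexadecimal)
```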

One thing that you do need to watch for if you start using Unicode characters in your web page is that there are a few JavaScript functions that do not handle Unicode properly. Each of these functions has been replaced in JavaScript by a new function that does support Unicode, but you will still see a lot of code around using the old non-compliant functions. For example, a lot of people still use escape(), which encodes characters above 255 in a non-standard %uXXXX form, instead of encodeURI() or encodeURIComponent(), which encode every character as UTF-8.
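The difference shows up as soon as the text contains anything outside ASCII. The following sketch compares the three functions on the same string; the example.com address and the sample text are made up for illustration.

```javascript
const word = "caf\u00e9 \u20ac5"; // "cafe" with an accented e, a space, a euro sign, and 5

// The old function: Latin-1 style %XX and a non-standard %uXXXX form.
console.log(escape(word)); // caf%E9%20%u20AC5

// The replacements: every character is encoded as UTF-8 bytes.
console.log(encodeURIComponent(word)); // caf%C3%A9%20%E2%82%AC5

// encodeURI is meant for a whole address, so it leaves : / ? = & alone.
console.log(encodeURI("http://example.com/menu?item=" + word));
// http://example.com/menu?item=caf%C3%A9%20%E2%82%AC5

// encodeURIComponent is meant for a single value, so it encodes them.
console.log(encodeURIComponent("a&b=c")); // a%26b%3Dc
console.log(encodeURI("a&b=c"));          // a&b=c
```

As a rule of thumb, use encodeURIComponent() on each value placed into a query string and encodeURI() on a complete address.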

©2014 About.com. All rights reserved.