J2ME and Unicode

When you build a MIDlet that will appear in markets around the world, language handling becomes a core concern. The original Java platform, designed for desktop and server environments, already offered Unicode support at the language level. However, mobile devices bring a different set of constraints - display size, limited memory, and diverse font engines. Understanding how to harness Unicode correctly in J2ME ensures your application can display Japanese, Arabic, Cyrillic, or any other script without garbling or crashes.

Why Unicode is Essential for J2ME Applications

Unicode assigns a unique number, called a code point, to every character it encodes. That includes the 128 characters of ASCII, Latin letters with diacritics, Greek and Cyrillic letters, Chinese ideographs, and even emoji. The ISO/IEC 10646 standard defines the same character repertoire and code points. The result is a language-agnostic representation that any Unicode-compliant system can interpret.

Before Unicode, developers relied on platform‑specific encodings. An English‑only phone might use a simple 7‑bit scheme that worked fine for Latin text. When you tried to show Japanese, the same byte values could produce unreadable squares or cause the device to crash. Interoperability suffered, especially when exchanging data between systems that used different code pages.

J2ME embraces Unicode natively. When you declare a String in your code, Java stores it as a sequence of 16-bit Unicode (UTF-16) code units. If you write String hello = "こんにちは";, the compiler converts the literal to Unicode, provided it reads the source file with the correct encoding; writing the characters as \uXXXX escapes avoids that dependency altogether. The runtime then interprets the string the same way on every device. However, this advantage is only realized if the device's display subsystem can render the characters.
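To make the 16-bit storage concrete, the sketch below inspects the greeting using only String.length(), String.charAt(), and Integer.toHexString(), which are also part of CLDC. The class name CodeUnitDemo and the helper describe() are illustrative names, not part of any J2ME API.

```java
// Illustrative sketch: CodeUnitDemo and describe() are hypothetical names.
class CodeUnitDemo {

    // Returns the UTF-16 code units of s as space-separated hex values.
    static String describe(String s) {
        StringBuffer out = new StringBuffer();
        for (int i = 0; i < s.length(); i++) {
            if (i > 0) {
                out.append(' ');
            }
            out.append(Integer.toHexString(s.charAt(i)));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Escapes keep the source file pure ASCII, so the result does not
        // depend on the encoding the compiler assumes for the file.
        String hello = "\u3053\u3093\u306B\u3061\u306F";
        System.out.println(hello.length());   // 5 code units
        System.out.println(describe(hello));  // 3053 3093 306b 3061 306f
    }
}
```

Each of the five hiragana characters occupies exactly one code unit, so length() matches the number of visible characters here. Characters outside the Basic Multilingual Plane take two code units each.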

Devices that ship with J2ME usually provide only a handful of built-in fonts, and these are often limited to Latin or a single East Asian script. When you target multiple markets with the emulator, you'll need to supply additional font resources and instruct the Toolkit to use them. Many developers mistakenly assume Unicode support alone guarantees multilingual display, but without the correct fonts, Unicode characters will still render as boxes or missing glyphs.

Industry giants such as Apple, IBM, Microsoft, and Oracle have all adopted Unicode in their products. Standards like XML, Java, and LDAP specify Unicode as the encoding of choice. By aligning your MIDlet with these standards, you future‑proof it for a wide range of devices, including those that use custom or proprietary rendering engines.

Beyond display, Unicode also influences sorting, comparison, and locale-specific operations such as case conversion. Note that CLDC and MIDP do not include the java.text package, so Java SE classes such as java.text.RuleBasedCollator are not available to a MIDlet; String.compareTo simply compares UTF-16 code unit values. On devices that implement the optional Mobile Internationalization API (JSR 238), javax.microedition.global.StringComparator offers locale-aware comparison, which is what you need to sort names correctly in German or Russian, for example. For an application that offers localized content or user input in various scripts, ignoring these subtleties can lead to a frustrating user experience.
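The difference between code-unit order and the order a user expects is easy to see with String.compareTo, which compares the numeric values of UTF-16 code units. The sketch below sorts four words with a plain insertion sort; the class and method names are illustrative.

```java
// Illustrative sketch: CodeUnitOrderDemo is a hypothetical name.
class CodeUnitOrderDemo {

    // Sorts the array in place using String.compareTo (code-unit order).
    static void sort(String[] a) {
        for (int i = 1; i < a.length; i++) {
            String key = a[i];
            int j = i - 1;
            while (j >= 0 && a[j].compareTo(key) > 0) {
                a[j + 1] = a[j];
                j--;
            }
            a[j + 1] = key;
        }
    }

    public static void main(String[] args) {
        // "\u00C4rger" is the German word written with A-umlaut.
        String[] words = { "Zimmer", "\u00C4rger", "apfel", "Apfel" };
        sort(words);
        for (int i = 0; i < words.length; i++) {
            System.out.println(words[i]);
        }
        // Order: Apfel, Zimmer, apfel, then the A-umlaut word, because
        // 'A' (0x41) < 'Z' (0x5A) < 'a' (0x61) < A-umlaut (0xC4).
    }
}
```

A German dictionary would list the umlaut word among the A entries, so a user-facing list needs locale-aware comparison rather than compareTo.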

Preparing Your J2ME Project for Multilingual Support

Getting Unicode working in your MIDlet is a two‑step process: first, you must ensure the emulator or device can display the desired characters; second, you must provide the right font resources to the Toolkit. The following steps outline how to make a J2ME project ready for multiple languages.

Installing a Unicode Font on Your Development Machine

Choose a font that contains glyphs for the scripts you need. Arial Unicode MS is a popular choice because it covers tens of thousands of characters across most major scripts. Microsoft no longer distributes it for free, so you may need to purchase it or use an alternative with wide coverage, such as the Noto fonts available from Google Fonts. Once you have the font file, install it on your operating system so that the Sun Wireless Toolkit can locate it.

Duplicating the Default Device Profile

The Toolkit includes a set of device profiles in the devices directory of its installation (typically wtklib\devices). To create a Unicode-enabled version, copy the existing DefaultColorPhone folder and rename the copy to UnicodePhone. Inside that folder, rename DefaultColorPhone.properties to UnicodePhone.properties. These files define the screen resolution, available memory, and, crucially, the fonts that the emulator should use.

Editing the Properties File to Reference the Unicode Font

Open UnicodePhone.properties in a text editor and locate the lines whose keys start with the prefix "font." (including the dot). The default file references generic fonts such as SansSerif or Monospaced. Replace each of those references with Arial Unicode MS (or your chosen Unicode font), keeping the style and size suffixes. A sample snippet looks like this after modification:

    font.default=Arial Unicode MS-plain-10
    font.softButton=Arial Unicode MS-plain-11
    font.system.plain.small=Arial Unicode MS-plain-9
    font.system.plain.medium=Arial Unicode MS-plain-11
    font.system.plain.large=Arial Unicode MS-plain-14
    font.system.bold.small=Arial Unicode MS-bold-9
    font.system.bold.medium=Arial Unicode MS-bold-11
    font.system.bold.large=Arial Unicode MS-bold-14
    font.system.italic.small=Arial Unicode MS-italic-9
    font.system.italic.medium=Arial Unicode MS-italic-11
    font.system.italic.large=Arial Unicode MS-italic-14
    font.system.bold.italic.small=Arial Unicode MS-bolditalic-9
    font.system.bold.italic.medium=Arial Unicode MS-bolditalic-11
    font.system.bold.italic.large=Arial Unicode MS-bolditalic-14
    font.monospace.plain.small=Arial Unicode MS-plain-9
    font.monospace.plain.medium=Arial Unicode MS-plain-11
    font.monospace.plain.large=Arial Unicode MS-plain-14

Save the file, then restart the emulator. When you run a MIDlet that displays Unicode characters, the emulator should now render them correctly.
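Editing that many entries by hand is error-prone, so a small desktop-side helper can do the substitution. The sketch below is a hypothetical utility, not part of the Toolkit; it assumes each font entry has the key=family-style-size layout shown above and leaves every other line untouched.

```java
// Illustrative sketch: FontPropertyRewriter is a hypothetical desktop helper.
class FontPropertyRewriter {

    // Replaces the font family in a "font.<key>=<family>-<style>-<size>"
    // line, keeping the style and size. Other lines are returned unchanged.
    static String rewriteLine(String line, String family) {
        if (!line.startsWith("font.")) {
            return line;
        }
        int eq = line.indexOf('=');
        if (eq < 0) {
            return line;
        }
        String value = line.substring(eq + 1).trim();
        // Style and size are the last two dash-separated fields.
        int sizeDash = value.lastIndexOf('-');
        int styleDash = sizeDash > 0 ? value.lastIndexOf('-', sizeDash - 1) : -1;
        if (styleDash < 0) {
            return line;
        }
        return line.substring(0, eq + 1) + family + value.substring(styleDash);
    }

    public static void main(String[] args) {
        System.out.println(rewriteLine("font.default=SansSerif-plain-10",
                "Arial Unicode MS"));
    }
}
```

Feeding each line of the copied properties file through rewriteLine and writing the results back gives the same outcome as the manual edit. Check the output against your Toolkit version's file before relying on it, since the layout is an assumption.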

Testing with a Simple MIDlet

To confirm everything is set up, create a small MIDlet that displays "Hello World" in Japanese. The string literal uses a \uXXXX escape sequence for each character, which keeps the source file pure ASCII. A minimal implementation looks like this:

    import javax.microedition.lcdui.*;
    import javax.microedition.midlet.*;

    public class SimpleUnicodeTest extends MIDlet {
        public void startApp() {
            Display display = Display.getDisplay(this);
            StringItem msg = new StringItem("Japanese Hello World", "\u3053\u3093\u306B\u3061\u306F\u4E16\u754C");
            Form form = new Form("Unicode Test");
            form.append(msg);
            display.setCurrent(form);
        }

        public void pauseApp() {}

        public void destroyApp(boolean unconditional) {}
    }

Run the MIDlet in the UnicodePhone emulator. The screen should show the greeting in Japanese characters, confirming that your font and property settings work.

Loading Multilingual Resources at Runtime

Hard‑coding text into your MIDlet makes internationalization difficult. Instead, store language strings in external files or resource bundles and load them when the application starts. The following techniques illustrate how to read Unicode text from a file and convert placeholder escape sequences into actual characters.

Reading a UTF‑8 Text File

Suppose you keep a file named lang_ja.txt inside your JAR. Each line contains either a Unicode string or literal \uXXXX sequences. To read it, wrap the resource stream in an InputStreamReader with the UTF-8 charset. The method below returns the file's content as a single string:

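    import java.io.*;

    // Reads a UTF-8 text resource from the JAR and returns its content.
    // CLDC has neither StringBuilder nor try-with-resources, so this version
    // uses StringBuffer and closes the stream in a finally block.
    public String readUnicodeFile(String filename) {
        StringBuffer buffer = new StringBuffer();
        InputStream is = getClass().getResourceAsStream(filename);
        if (is == null) {
            return "";  // resource not found in the JAR
        }
        try {
            InputStreamReader isr = new InputStreamReader(is, "UTF-8");
            int ch;
            while ((ch = isr.read()) != -1) {
                buffer.append((char) ch);
            }
        } catch (IOException e) {
            System.out.println(e);
        } finally {
            try { is.close(); } catch (IOException ignored) { }
        }
        return buffer.toString();
    }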

When the file contains actual Unicode characters, the returned string holds them correctly. If the file instead uses \uXXXX escape sequences, you'll need an extra step to translate those into real characters.
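Because each line of the file holds one string, the content usually has to be split into lines, and the CLDC String class has no split method. The following sketch decodes a UTF-8 stream and collects its non-empty lines in a Vector. The class name LanguageLines is hypothetical, and main() feeds it an in-memory stream so the example also runs on a desktop JVM; in a MIDlet the stream would come from getResourceAsStream.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.util.Vector;

// Illustrative sketch: LanguageLines is a hypothetical helper class.
class LanguageLines {

    // Decodes a UTF-8 stream and returns its non-empty lines in file order.
    // A raw Vector is used because CLDC predates generics.
    static Vector readLines(InputStream in) throws IOException {
        InputStreamReader reader = new InputStreamReader(in, "UTF-8");
        Vector lines = new Vector();
        StringBuffer current = new StringBuffer();
        int ch;
        while ((ch = reader.read()) != -1) {
            if (ch == '\n') {
                addIfNotEmpty(lines, current);
            } else if (ch != '\r') {   // tolerate Windows line endings
                current.append((char) ch);
            }
        }
        addIfNotEmpty(lines, current);  // last line may lack a newline
        return lines;
    }

    private static void addIfNotEmpty(Vector lines, StringBuffer current) {
        if (current.length() > 0) {
            lines.addElement(current.toString());
        }
        current.setLength(0);
    }

    public static void main(String[] args) throws IOException {
        byte[] utf8 = "\u3053\u3093\u306B\u3061\u306F\r\n\u4E16\u754C\n".getBytes("UTF-8");
        Vector lines = readLines(new ByteArrayInputStream(utf8));
        System.out.println(lines.size() + " lines read");
    }
}
```

Skipping empty lines is a design choice of this sketch; keep them if line numbers carry meaning in your file format.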

Converting Escape Sequences to Unicode

Consider a line that reads \u3053\u3093\u306B\u3061\u306F\u4E16\u754C. Each \uXXXX sequence represents one 16-bit code unit. The following helper method parses the string and builds a new string with the decoded characters:

    // Replaces each backslash-u escape (backslash, 'u', four hex digits) with
    // the UTF-16 code unit it names; all other characters are copied as-is.
    private String convertEscapedToUnicode(String escaped) {
        StringBuffer result = new StringBuffer();
        int i = 0;
        while (i < escaped.length()) {
            char c = escaped.charAt(i);
            if (c == '\\' && i + 5 < escaped.length() && escaped.charAt(i + 1) == 'u') {
                try {
                    String hex = escaped.substring(i + 2, i + 6);
                    result.append((char) Integer.parseInt(hex, 16));
                    i += 6;
                    continue;
                } catch (NumberFormatException e) {
                    // Not a valid escape; fall through and copy the backslash.
                }
            }
            result.append(c);
            i++;
        }
        return result.toString();
    }

Apply this method after reading the file, then use the returned string in your UI components.
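Going the other way can be handy on the development machine: translated text can be turned into ASCII-only \uXXXX sequences before it is packaged, much as the native2ascii tool shipped with older JDKs did. The sketch below is a hypothetical helper, not part of any J2ME API. It also escapes the backslash itself so that decoding stays unambiguous.

```java
// Illustrative sketch: EscapeEncoder is a hypothetical desktop-side helper.
class EscapeEncoder {

    private static final String HEX = "0123456789ABCDEF";

    // Returns s with every non-ASCII code unit, and every backslash,
    // written as a backslash-u escape with four uppercase hex digits.
    static String toEscaped(String s) {
        StringBuffer out = new StringBuffer();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c < 0x80 && c != '\\') {
                out.append(c);
            } else {
                out.append('\\').append('u');
                // Emit the four hex digits, most significant first.
                for (int shift = 12; shift >= 0; shift -= 4) {
                    out.append(HEX.charAt((c >> shift) & 0xF));
                }
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(toEscaped("\u3053\u3093\u306B\u3061\u306F"));
    }
}
```

Because the encoder works on code units, a character outside the Basic Multilingual Plane comes out as two escapes (a surrogate pair), which a decoder that appends one char per escape reassembles correctly.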

Organizing Multiple Language Files

For a fully internationalized MIDlet, store a separate file for each locale - lang_en.txt, lang_ja.txt, lang_fr.txt, etc. At runtime, determine the device’s locale via System.getProperty("microedition.locale") and load the appropriate file. This approach keeps the codebase clean and allows translators to edit plain text files without touching Java code.
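A minimal sketch of the lookup follows. It assumes the lang_xx.txt naming used above, files stored at the root of the JAR, and English as the fallback. The microedition.locale property may be null, and its value typically combines language and country (for example "ja-JP"), so the helper keeps only the language part. The class and method names are hypothetical.

```java
// Illustrative sketch: LanguageFileResolver is a hypothetical helper class.
class LanguageFileResolver {

    // Assumption: English is the fallback when no locale is reported.
    static final String DEFAULT_LANGUAGE = "en";

    // Maps a locale such as "ja-JP" to a resource name such as "/lang_ja.txt".
    static String resourceFor(String locale) {
        String lang = DEFAULT_LANGUAGE;
        if (locale != null && locale.length() >= 2) {
            int sep = locale.indexOf('-');
            if (sep < 0) {
                sep = locale.indexOf('_');  // some devices use an underscore
            }
            lang = (sep > 0 ? locale.substring(0, sep) : locale).toLowerCase();
        }
        return "/lang_" + lang + ".txt";
    }

    public static void main(String[] args) {
        // On a desktop JVM this property is normally unset, so the
        // fallback name is printed.
        String locale = System.getProperty("microedition.locale");
        System.out.println(resourceFor(locale));
    }
}
```

If getResourceAsStream returns null for the resolved name, the MIDlet has no file for that language and should retry with the fallback file.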

When your MIDlet is deployed, the JAR bundles all language files. In a Toolkit project, files placed in the project's res directory are packaged into the JAR when you build; alternatively, you can place them in the same package as your MIDlet class. Either way, access them via getResourceAsStream. The key is that the files must be encoded in UTF-8 to match the charset passed to InputStreamReader; otherwise, non-Latin characters will be decoded incorrectly when the file is read.

By following these practices - installing a comprehensive Unicode font, configuring the emulator’s properties, and loading external language resources - you create a J2ME application that displays content correctly in any supported script. The result is a smooth experience for users worldwide, without the pitfalls of garbled text or application crashes.
