Chapter 16 Writing International Applications

If you target multiple languages as well as multiple platforms, you have the potential to reach a huge audience, which greatly increases your application’s chances of success. This chapter covers what you need to do to make your application amenable to internationalization, which is sometimes abbreviated to “i18n” (an “i”, followed by the 18 letters in between, followed by an “n”).

Introduction to Internationalization

When taking your application to an international market, the first thing that comes to mind is translation. You will need to provide a set of translations for all the strings your application presents in each foreign language that it supports. In wxWidgets, you do this by providing message catalogs and loading them through the wxLocale class. This technique may differ from how you are used to implementing translations if you currently use string tables. Instead of referring to each string by an identifier and switching tables, message catalogs work by translating a string using an appropriate catalog. (Alternatively, you can use your own system for handling translations if you prefer, but be aware that messages in the wxWidgets library itself rely on catalogs.)

Without message catalogs, the untranslated strings in the source code will be used to display text in menus, buttons, and so on. If your own language contains non-ASCII characters, such as accents, you will need a separate “translation” (message catalog) for it because source code should only contain ASCII.
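To make the difference from string tables concrete, here is a minimal sketch of the catalog idea in standard C++. It is an illustration only, not how wxLocale is implemented; the class name ToyCatalog and its methods are invented for this example. The key is the untranslated string itself, and a missing entry falls back to that string.

```cpp
#include <string>
#include <unordered_map>

// A toy message catalog: the key is the untranslated source string.
// Hypothetical class for illustration; wxWidgets reads real .mo files.
class ToyCatalog
{
public:
    void Add(const std::string& original, const std::string& translated)
    {
        m_messages[original] = translated;
    }

    // Returns the translation if there is one; otherwise the original
    // string is returned unchanged, as happens without a catalog.
    std::string Translate(const std::string& original) const
    {
        auto it = m_messages.find(original);
        return it != m_messages.end() ? it->second : original;
    }

private:
    std::unordered_map<std::string, std::string> m_messages;
};
```

A German catalog that knows only “Quit” would return “Beenden” for it and hand back any other string, such as “Open”, untranslated.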

Representing text in a different language can also involve different character encodings, which means that the same byte value might represent different characters when displayed on-screen. You need to make sure that your application can correctly set up the character encodings used by your GUI elements and in your data files. You need to be aware of the specific encoding used in each case and how to translate between encodings.

Another aspect of internationalization is formatting for numbers, date, and time. Note that formatting can be different even for the same language. For example, the number represented in English by 1,234.56 is represented as 1.234,56 in Germany and as 1'234.56 in the German-speaking part of Switzerland. In the USA, the 10th of November is represented as 11/10, whereas the same date for a reader in the UK means the 11th of October. We’ll see shortly how wxWidgets can help here.
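The three number formats just mentioned can be reproduced with a short standard C++ sketch. It does not use wxWidgets; it installs a custom numpunct facet so that the result does not depend on which locales happen to be installed. The names FixedPunct and FormatNumber are made up for this example.

```cpp
#include <iomanip>
#include <locale>
#include <sstream>
#include <string>

// A numpunct facet with explicit separators (hypothetical helper).
class FixedPunct : public std::numpunct<char>
{
public:
    FixedPunct(char decimalPoint, char thousandsSep)
        : m_decimalPoint(decimalPoint), m_thousandsSep(thousandsSep) {}

protected:
    char do_decimal_point() const override { return m_decimalPoint; }
    char do_thousands_sep() const override { return m_thousandsSep; }
    std::string do_grouping() const override { return "\3"; } // groups of 3

private:
    char m_decimalPoint;
    char m_thousandsSep;
};

// Formats a value with two decimals using the given separators.
std::string FormatNumber(double value, char decimalPoint, char thousandsSep)
{
    std::ostringstream out;
    // The locale object takes ownership of the facet and deletes it.
    out.imbue(std::locale(std::locale::classic(),
                          new FixedPunct(decimalPoint, thousandsSep)));
    out << std::fixed << std::setprecision(2) << value;
    return out.str();
}
```

Calling FormatNumber(1234.56, '.', ',') gives the English form, while swapping the two separators gives the German one and using an apostrophe as the thousands separator gives the Swiss one.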

Translated strings are often longer than their English counterparts, which means that the window layout must adapt to different sizes. Sizers are best suited to solving this problem and are explained in Chapter 7, “Window Layout Using Sizers.” Another layout problem is that for Western languages, the flow of reading goes from left to right, but other languages such as Arabic and Hebrew are read from right to left (called RTL), which means that the entire layout must change orientation. There is currently no specific mechanism for implementing RTL layout in wxWidgets.

The last group of elements to be adapted to a different language or culture consists of images and sounds. For example, if you are writing a phone directory application, you might have a feature that speaks the numbers, which will be language-dependent, and you might want to display different images depending on the country.

Providing Translations

wxWidgets provides facilities for message translation using the wxLocale class and is itself fully translated into several languages. Please consult the wxWidgets home page for the most up-to-date translations.

The wxWidgets approach to internationalization closely follows the GNU gettext package. wxWidgets uses message catalogs that are binary compatible with gettext catalogs, allowing you to use all the gettext tools. No additional libraries are needed at runtime because wxWidgets can read message catalogs itself.

During program development, you will need the gettext package for working with message catalogs (or poEdit; see the next section). There are two kinds of message catalog: source catalogs, which are text files with the extension .po, and binary catalogs, which are created from the source files with the msgfmt program (part of the gettext package) and have the extension .mo. Only the binary files are needed during program execution. For each language you support, you need one message catalog.

poEdit

You don’t have to use command-line tools for maintaining your message catalogs. Vaclav Slavik has written poEdit, a graphical front-end to the gettext package available from http://www.poedit.org. poEdit, shown in Figure 16-1, helps you to maintain message catalogs, generate .mo files, and merge in changes to strings in your application code as your application changes and grows.

Figure 16-1 poEdit


Step-by-Step Guide to Using Message Catalogs

These are the steps you need to follow to create and use message catalogs:

1. Wrap literal strings in the program text that should be translated with wxGetTranslation or equivalently with the _( ) macro. Strings that will not be translated should be wrapped in wxT( ) or the alias _T( ) to make them Unicode-compatible.

2. Extract the strings to be translated from the program into a .po file. Fortunately, you don’t have to do this by hand; you can use the xgettext program or, more easily, poEdit. If you use xgettext, you can use the -k option to recognize wxGetTranslation as well as _( ). poEdit can also be configured to recognize wxGetTranslation via the catalog settings dialog.

3. Translate the strings extracted in the previous step to another language by editing the .po file or using poEdit (one .po file per language). You must also indicate the character set you are using for the translated strings.

     If you do not use poEdit, you will have to do it by hand, using your favorite text editor. The header of your .po file will look something like this:


# SOME DESCRIPTIVE TITLE.
# Copyright (C) YEAR Free Software Foundation, Inc.
# FIRST AUTHOR <EMAIL@ADDRESS>, YEAR.
#
msgid ""
msgstr ""
"Project-Id-Version: PACKAGE VERSION\n"
"POT-Creation-Date: 1999-02-19 16:03+0100\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language-Team: LANGUAGE <LL@li.org>\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=iso8859-1\n"
"Content-Transfer-Encoding: 8bit\n"


     Note the charset property in the second to last line, specifying the character set used by the catalog. All strings in the catalog are encoded using this character set. This is very important if non-Unicode encodings are used because otherwise the GUI elements cannot correctly display all characters.

4. Compile the .po file into the binary .mo file to be used by the program. You can do this within poEdit, or you might want to add it as a step in your distribution script, for example using:

  
msgfmt -o myapp.mo myapp.po


5. Set the appropriate locale in your program to use the strings for the given language (see the next section, “Using wxLocale”).
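To picture what the extraction in step 2 does, here is a toy scanner in standard C++ that collects the literals wrapped in _( ). It is only a sketch: the function name ExtractTranslatable is invented, and unlike xgettext it ignores comments, adjacent-literal concatenation, and other keywords such as wxGetTranslation.

```cpp
#include <string>
#include <vector>

// Collects the string literals that appear as _("...") in source text.
// Escape sequences are kept verbatim, as they would appear in a .po file.
std::vector<std::string> ExtractTranslatable(const std::string& source)
{
    std::vector<std::string> found;
    const std::string marker = "_(\"";
    std::string::size_type pos = 0;

    while ((pos = source.find(marker, pos)) != std::string::npos)
    {
        pos += marker.size();
        std::string literal;
        bool closed = false;

        while (pos < source.size())
        {
            char c = source[pos++];
            if (c == '\\' && pos < source.size())
            {
                // Copy the backslash and the escaped character together
                // so that an escaped quote does not end the literal.
                literal += c;
                literal += source[pos++];
            }
            else if (c == '"')
            {
                closed = true;
                break;
            }
            else
            {
                literal += c;
            }
        }

        if (closed)
            found.push_back(literal);
    }
    return found;
}
```

Run over a line such as menu->Append(wxID_EXIT, _("E&xit"), wxT("Quit this program")); the scanner reports only E&xit, because the wxT( ) string is not marked for translation.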

Under Mac OS X, you’ll need to make one modification to the Info.plist file, which describes the contents of the “application bundle.” This file (an XML text file encoded in UTF-8) should have a CFBundleDevelopmentRegion entry describing the language of the developer—such as English—and Mac OS X will query the bundle for the presence of certain resource directories to find out which languages are supported. For example, for German, this might be the directory German.lproj. Because wxWidgets applications do not use these directories for storing resource information, instead storing the translation in .mo files, the application needs to be told explicitly which languages are supported. You do this by adding a CFBundleLocalizations entry to Info.plist. It might look like this:


<key>CFBundleDevelopmentRegion</key>
<string>English</string>
<key>CFBundleLocalizations</key>
<array>
       <string>en</string>
       <string>de</string>
       <string>fr</string>
</array>


Using wxLocale

The wxLocale class encapsulates all language-dependent settings and is a generalization of the C locale concept. Normally you add a wxLocale member variable to your application class, say m_locale, and in your application OnInit function, you initialize the locale as follows:


if (m_locale.Init(wxLANGUAGE_DEFAULT,
                    wxLOCALE_LOAD_DEFAULT | wxLOCALE_CONV_ENCODING))
{
    m_locale.AddCatalog(wxT("myapp"));
}


Note that wxLocale::Init will try to find and add the wxstd.mo catalog, containing wxWidgets' own translations. The parameter wxLANGUAGE_DEFAULT means use the system language, and you can also force a certain language using the correct wxLANGUAGE_xxx code.

When you tell the wxLocale class to load a message catalog, the catalog is converted to the character set used by the user’s operating system. This is the default behavior of the wxLocale class; you can disable it by not passing wxLOCALE_CONV_ENCODING to wxLocale::Init as the second parameter.

Where does wxWidgets find its message catalogs? For each directory <DIR> in its internal list, wxWidgets looks in:

•   <DIR>/<LANG>/LC_MESSAGES

•   <DIR>/<LANG>

•   <DIR>
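The lookup pattern can be written down as a small standard C++ function that lists the candidate file names for one catalog. This is a sketch of the rule as described here, not the wxWidgets source; the function name CatalogCandidates is hypothetical, and the assumption is that the three locations are tried in the listed order for each directory in turn.

```cpp
#include <string>
#include <vector>

// Builds the candidate .mo file names for one catalog, visiting the
// three locations for every search directory in order.
std::vector<std::string> CatalogCandidates(
    const std::vector<std::string>& searchDirs,
    const std::string& lang,    // language directory name, e.g. "fr"
    const std::string& domain)  // catalog name without extension, e.g. "myapp"
{
    std::vector<std::string> candidates;
    for (const std::string& dir : searchDirs)
    {
        candidates.push_back(dir + "/" + lang + "/LC_MESSAGES/" + domain + ".mo");
        candidates.push_back(dir + "/" + lang + "/" + domain + ".mo");
        candidates.push_back(dir + "/" + domain + ".mo");
    }
    return candidates;
}
```

For the directories /usr/share/locale and the current directory, the French catalog myapp yields six candidates, starting with /usr/share/locale/fr/LC_MESSAGES/myapp.mo and ending with ./myapp.mo.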

The rules about which directories are taken into account are different on each platform:

•   On all platforms, the value of the LC_PATH environment variable is searched.

•   On Unix and Mac OS X, the wxWidgets installation directory is searched, and also /share/locale, /usr/share/locale, /usr/lib/locale, /usr/local/share/locale, and the current directory.

•   On Windows, the application directory is also searched.

You can add further search directories using the function wxLocale::AddCatalogLookupPathPrefix. For example:


wxString resDir = GetAppDir( ) + wxFILE_SEP_PATH + wxT("resources");
m_locale.AddCatalogLookupPathPrefix(resDir);

// If resDir is c:\MyApp\resources, AddCatalog will now look for the
// French catalog in these places as well as the standard dirs:
//
// c:\MyApp\resources\fr\LC_MESSAGES\myapp.mo
// c:\MyApp\resources\fr\myapp.mo
// c:\MyApp\resources\myapp.mo

m_locale.AddCatalog(wxT("myapp"));


The usual method for distributing message catalogs is to create a subdirectory for each language, using the standard canonical name, containing <appname>.mo in each directory. For example, the wxWidgets internat sample has directories fr and de representing French and German using ISO 639 language codes.

Character Encodings and Unicode

There are more characters around on Earth than can fit into the 256 possible byte values that the classical 8-bit character represents. In order to be able to display more than 256 different glyphs, another layer of indirection has been added: the character encoding or character set. (The “new and improved” solution, Unicode, will be presented later in this section.)

Thus, what is represented by the byte value 161 is determined by the character set. In the ISO 8859-1 (Latin-1) character set, this is ¡, an inverted exclamation mark. In ISO 8859-2 (Latin-2), it represents Ą (Aogonek, a capital A with an ogonek).
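The ambiguity of byte 161 can be shown in a few lines of standard C++ that map a byte to a Unicode code point under each character set. This is a deliberately tiny sketch: the names Charset and DecodeByte are invented, and the Latin-2 table contains only the single entry discussed here, so every other high byte is reported as the replacement character U+FFFD.

```cpp
enum class Charset { Latin1, Latin2 };

// Maps one byte to a Unicode code point under the given character set.
char32_t DecodeByte(unsigned char byte, Charset charset)
{
    if (byte < 0x80)
        return byte;    // ASCII is shared by both character sets
    if (charset == Charset::Latin1)
        return byte;    // Latin-1 bytes equal their code points

    // Latin-2: this sketch knows one entry; a real table has 96.
    return byte == 0xA1 ? char32_t(0x0104)    // A with ogonek
                        : char32_t(0xFFFD);   // replacement character
}
```

The same input, DecodeByte(161, ...), produces U+00A1 for Latin-1 and U+0104 for Latin-2, which is why the encoding must always travel with 8-bit text.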

When you are drawing text on a window, the system must know about the encoding used. This is called the “font encoding,” although it is just an indication of a character set. Creating a font without indicating the character set means “use the default encoding.” This is fine in most situations because the user is normally using the system in his or her language.

But if you know that something is in a different encoding, such as ISO 8859-2, then you need to create the appropriate font. For example:


wxFont myFont(10, wxFONTFAMILY_DEFAULT, wxNORMAL, wxNORMAL,
               false, wxT("Arial"), wxFONTENCODING_ISO8859_2);


Otherwise, the text will not be displayed properly on a western system, such as one whose default encoding is ISO 8859-1.

Note that there may be situations where an optimal encoding is not available. In these cases, you can try to use an alternative encoding, and if one is available, you must convert the text into this encoding. The following snippet shows this sequence: a string text in the encoding enc should be shown in the font facename. The use of wxCSConv will be explained shortly.


// We have a string in an encoding 'enc' which we want to
// display in a font called 'facename'.
//
// First we must find out whether there is a font available for
// rendering this encoding

wxString text; // Contains the text in encoding 'enc'

if (!wxFontMapper::Get( )->IsEncodingAvailable(enc, facename))
{
   // We don’t have an encoding 'enc' available in this font.
   // What alternative encodings are available?

   wxFontEncoding alternative;
   if (wxFontMapper::Get( )->GetAltForEncoding(enc, &alternative,
                                              facename, false))
   {
       // We do have a font in an 'alternative' encoding,
       // so we must convert our string into that alternative.

       wxCSConv convFrom(wxFontMapper::GetEncodingName(enc));
       wxCSConv convTo(wxFontMapper::GetEncodingName(alternative));
       text = wxString(text.wc_str(convFrom), convTo) ;

       // Create font with the encoding alternative

       wxFont myFont(10, wxFONTFAMILY_DEFAULT, wxNORMAL, wxNORMAL,
               false, facename , alternative);
       dc.SetFont(myFont);
   }
   else
   {
      // No alternative encoding is available; fall back to
      // ISO 8859-1 (Latin-1) and accept that some characters
      // may be displayed incorrectly

      wxFont myFont(10, wxFONTFAMILY_DEFAULT, wxNORMAL, wxNORMAL,
              false, facename, wxFONTENCODING_ISO8859_1);
      dc.SetFont(myFont);
    }
}
else
{
    // The font with that encoding exists, no problem.

     wxFont myFont(10, wxFONTFAMILY_DEFAULT, wxNORMAL, wxNORMAL,
               false, facename, enc);
     dc.SetFont(myFont);
}

// Finally, draw the text with the font we’ve selected.

dc.DrawText(text, 100, 100);


Converting Data

The previous code example needs a chain of bytes to be converted from one encoding to another. There are two ways to achieve this. The first, using wxEncodingConverter, is deprecated and should not be used in new code. Unless your compiler cannot handle wchar_t, you should use the character set converters (wxCSConv, base class wxMBConv).

wxEncodingConverter

This class supports only a limited subset of encodings, but if your compiler doesn’t recognize wchar_t, it is the only solution you have. For example:


wxEncodingConverter converter(enc, alternative, wxCONVERT_SUBSTITUTE);
text = converter.Convert(text);


wxCONVERT_SUBSTITUTE indicates that it should try some lossy substitutions if it cannot convert a character strictly. This means that, for example, acute capitals might be replaced by ordinary capitals and en dashes and em dashes might be replaced by “-”, and so on.
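The idea of lossy substitution can be sketched in standard C++ as follows. The substitution table below is invented for illustration and is not the one wxEncodingConverter uses; the function name ToAsciiLossy is likewise hypothetical, and the target here is plain 7-bit ASCII rather than an arbitrary encoding.

```cpp
#include <string>

// Converts code points to 7-bit ASCII, substituting a look-alike where
// this small table has one and '?' where it does not.
std::string ToAsciiLossy(const std::u32string& text)
{
    std::string out;
    for (char32_t c : text)
    {
        if (c < 0x80)
        {
            out += static_cast<char>(c);   // already ASCII
            continue;
        }
        switch (c)
        {
            case 0x00C1: out += 'A'; break;   // A with acute
            case 0x00C9: out += 'E'; break;   // E with acute
            case 0x2013:                      // en dash
            case 0x2014: out += '-'; break;   // em dash
            default:     out += '?'; break;   // no substitute known
        }
    }
    return out;
}
```

A strict conversion would have to reject such characters; the lossy one keeps the text readable at the cost of losing the accents and the dash widths.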

wxCSConv (wxMBConv)

Unicode solves the ambiguity problem mentioned earlier by using 16 or even 32 bits in a wide character (wchar_t) to store all characters in a “global encoding.” This means that you don’t have to deal with encodings unless you need to read or write data in an 8-bit format, which as we know does not have enough information and needs an indication of its encoding.

Even when you don’t compile wxWidgets in Unicode mode (where wchar_t is used internally to store the characters in a string), you can use these wide characters for conversions, if available. You convert from one encoding into wide character strings and then back to a different encoding. This is also used in the wxString class to offer you convenient conversions. Just bear in mind that in non-Unicode builds, wxString itself uses 8-bit characters and does not know how this string is encoded.

To transfer a wxString into a wide character array, you use the wxString::wc_str function, which takes a multi-byte converter class as its parameter. This parameter tells a non-Unicode build which encoding the string is in, but it is ignored by the Unicode build because the string is already using wide characters internally.

In a Unicode build, we can then build a string directly from these characters, but in a non-Unicode build, we must indicate which character set this should be converted to. So in the line below, convTo is ignored in Unicode builds.


text = wxString(text.wc_str(convFrom), convTo);


The character set encoding offers more possibilities than font encodings, so you’d have to convert from font encoding to character set encoding using


wxFontMapper::GetEncodingName(fontencoding);


This means that our previous task would be written as follows using character set encoding:


wxCSConv convFrom(wxFontMapper::GetEncodingName(enc));
wxCSConv convTo(wxFontMapper::GetEncodingName(alternative));
text = wxString(text.wc_str(convFrom) , convTo) ;


There are situations where you output 8-bit data directly instead of a wxString, and this can be done using a wxCharBuffer instance. So the last line would read as follows:


wxCharBuffer output = convTo.cWC2MB(text.wc_str(convFrom));


And if your input data is not a string but rather 8-bit data as well (a wxCharBuffer named input below), then you can write:


wxCharBuffer output = convTo.cWC2MB(convFrom.cMB2WC(input));
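The same two-step route, from 8-bit data to wide characters and back to 8-bit data, can be sketched without wxWidgets. The example below converts ISO 8859-1 input to UTF-8 through a std::u32string; the function names Latin1ToWide and WideToUtf8 are made up for this sketch, and the encoder does not validate its input (for example, it does not reject surrogate code points).

```cpp
#include <string>

// Step 1: 8-bit input to wide characters. In ISO 8859-1 every byte
// equals its Unicode code point, so no lookup table is needed.
std::u32string Latin1ToWide(const std::string& bytes)
{
    std::u32string wide;
    for (unsigned char b : bytes)
        wide += static_cast<char32_t>(b);
    return wide;
}

// Step 2: wide characters to 8-bit output, here encoded as UTF-8.
std::string WideToUtf8(const std::u32string& wide)
{
    std::string out;
    for (char32_t c : wide)
    {
        if (c < 0x80)
        {
            out += static_cast<char>(c);
        }
        else if (c < 0x800)
        {
            out += static_cast<char>(0xC0 | (c >> 6));
            out += static_cast<char>(0x80 | (c & 0x3F));
        }
        else if (c < 0x10000)
        {
            out += static_cast<char>(0xE0 | (c >> 12));
            out += static_cast<char>(0x80 | ((c >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (c & 0x3F));
        }
        else
        {
            out += static_cast<char>(0xF0 | (c >> 18));
            out += static_cast<char>(0x80 | ((c >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((c >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (c & 0x3F));
        }
    }
    return out;
}
```

Composing the two, WideToUtf8(Latin1ToWide(input)), mirrors the shape of the cMB2WC and cWC2MB calls: the wide string in the middle is the common ground between the two 8-bit encodings.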


A few global converter objects are available; for example, wxConvISO8859_1 is an object, and wxConvCurrent is a pointer to a converter that uses the C library locale. There are also subclasses of wxMBConv that are optimized for certain encoding tasks, namely wxMBConvUTF7, wxMBConvUTF8, wxMBConvUTF16LE/BE, and wxMBConvUTF32LE/BE. The latter two are typedefed to wxMBConvUTF16/32 using the byte order native to the machine. For more information, see the topic “wxMBConv Classes Overview” in the wxWidgets reference manual.

Converting Outside of a Temporary Buffer

As just discussed, the conversion classes allow you to easily convert from one string encoding to another. However, most conversions return either a newly created wxString or a temporary buffer. There are instances where we might need to perform a conversion and then hold the result for later processing. This is done by copying the results of a conversion into separate storage.

Consider the case of sending strings between computers, such as over a socket. We should agree on a protocol for what type of string encoding to use; otherwise, platforms with different default encodings would garble received strings. The sender could convert to UTF-8, and the receiver could then convert from UTF-8 into its default encoding.

The following short example demonstrates how to use a combination of techniques to convert a string of any encoding into UTF-8, store the result in a char* for sending over the socket, and then later convert that raw UTF-8 data back into a wxString.


// Convert the string to UTF-8
const wxCharBuffer ConvertToUTF8(wxString anyString)
{
    return wxConvUTF8.cWC2MB( anyString.wc_str(*wxConvCurrent) ) ;
}

// Use the raw UTF-8 data passed to build a wxString
wxString ConvertFromUTF8(const char* rawUTF8)
{
    return wxString(wxConvUTF8.cMB2WC(rawUTF8), *wxConvCurrent);
}

// Test our wxString<->UTF-8 conversion
void StringConversionTest(wxString anyString)
{
    // Convert to UTF-8, keep the char buffer around
    const wxCharBuffer bUTF8 = ConvertToUTF8(anyString);

    // wxCharBuffer has an implicit conversion operator for char *
    const char *cUTF8 = bUTF8 ;

    // Rebuild the string
    wxString stringCopy = ConvertFromUTF8(cUTF8);

    // The two strings should be equal
    wxASSERT(anyString == stringCopy);
}


Help Files

You will want to distribute a separate help file for each supported language. Your help controller initialization will select the appropriate help file name according to the current locale, perhaps using wxLocale::GetName to form the file name, or simply using _( ) to translate to the appropriate file name. For example:


m_helpController->Initialize(_("help_english"));


If you are using wxHtmlHelpController, you need to make sure that all the HTML files contain the META tag, for example:


<meta http-equiv="Content-Type" content="text/html; charset=iso8859-2">


You also need to make sure that the project file (extension HHP) contains one additional line in the OPTIONS section:


Charset=iso8859-2


This additional entry tells the HTML help controller what encoding is used in contents and index tables.

Numbers and Dates

The current locale is also used for formatting numbers and dates. The printf-based formatting in wxString takes care of this automatically. Consider this snippet:


wxString::Format(wxT("%.1f") , myDouble);


Here, Format uses the correct decimal separator. And for date formatting:


wxDateTime t = wxDateTime::Now( );
wxString str = t.Format( );


Format presents the current date and time in the correct language and format. Have a look at the API documentation to see all the possibilities for passing a format description to the Format call—just don’t forget that you will probably have to translate this format string as well for using it in a different language because the sequence for different parts of the date can differ among languages.
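The reason the format string itself needs translating can be seen with a short standard C++ sketch based on strftime. The helper FormatDayMonth is invented for this example; in an application, the pattern would come from the message catalog rather than being passed in by hand.

```cpp
#include <ctime>
#include <string>

// Formats a day and month using a caller-supplied strftime pattern.
std::string FormatDayMonth(int day, int month, const char* pattern)
{
    std::tm date = {};
    date.tm_mday = day;
    date.tm_mon = month - 1;      // tm_mon counts months from 0
    date.tm_year = 2005 - 1900;   // arbitrary year, unused by the patterns

    char buffer[32];
    std::size_t length = std::strftime(buffer, sizeof buffer, pattern, &date);
    return std::string(buffer, length);
}
```

With the pattern "%m/%d", the 10th of November comes out as 11/10, the form a reader in the USA expects; a catalog for the UK would translate the pattern to "%d/%m" and produce 10/11 for the same date.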

If you want to add the correct thousands separator or just want to know the correct character for the decimal point, you can ask the locale object for the corresponding strings using the GetInfo method:


wxString info = m_locale.GetInfo(wxLOCALE_THOUSANDS_SEP,
                                   wxLOCALE_CAT_NUMBER) ;


Other Media

You can also load other data such as images and sounds depending on the locale. You can use the same mechanism as for text, for example:


wxBitmap bitmap(_("flag.png"));


This code will cause flag.png to appear on your list of strings to translate, so you just translate the string flag.png into the appropriate file name for your platform, for example de/flag.png. Make sure that the translated versions are also available as true files in your application, or you can load them from a compressed archive (refer to Chapter 14, “Files and Streams”).

A Simple Sample

To illustrate some of the concepts that we’ve covered, you can find a little sample in examples/chap16 on the CD-ROM. It shows some strings and a flag graphic for three languages: English, French, and German. You can change the language from the File menu, which will change the menu strings, the wxStaticText strings, and the flag graphic to suit the newly selected language. To demonstrate the different behavior of _( ) and wxT( ), the menu help in the status line remains in English.

Figure 16-2 The internationalization samples


The sample’s application class contains a wxLocale pointer and a function SelectLanguage that will re-create the locale object with the appropriate language. This is the application class declaration and implementation:


class MyApp : public wxApp
{
public:
    ~MyApp( ) ;

    // Initialize the application
    virtual bool OnInit( );

    // Recreates m_locale according to lang
    void SelectLanguage(int lang);

private:
    wxLocale* m_locale; // 'our' locale
};

IMPLEMENT_APP(MyApp)

bool MyApp::OnInit( )
{
    wxImage::AddHandler( new wxPNGHandler );

    m_locale = NULL;
    SelectLanguage( wxLANGUAGE_DEFAULT );

    MyFrame *frame = new MyFrame(_("i18n wxWidgets App"));

    frame->Show(true);
    return true;
}

void MyApp::SelectLanguage(int lang)
{
    delete m_locale;
    m_locale = new wxLocale( lang );
    m_locale->AddCatalog( wxT("i18n") );
}

MyApp::~MyApp( )
{
    delete m_locale;
}


There are two functions of particular interest in the frame class: SetupStrings and OnChangeLanguage. SetupStrings sets the labels and re-creates the menu bar, using translations for all the strings apart from the menu help strings, as follows:


void MyFrame::SetupStrings( )
{
    m_helloString->SetLabel(_("Welcome to International Sample"));
    m_todayString->SetLabel( wxString::Format(_("Now is %s") ,
        wxDateTime::Now( ).Format( ).c_str( ) ) );
    m_thousandString->SetLabel( wxString::Format(
        _("12345 divided by 10 is written as %.1f") , 1234.5 ) );
    m_flag->SetBitmap(wxBitmap( _("flag.png") , wxBITMAP_TYPE_PNG ));

    // create a menu bar
    wxMenu *menuFile = new wxMenu;

    // the "About" item should be in the help menu
    wxMenu *helpMenu = new wxMenu;
    helpMenu->Append(wxID_ABOUT, _("&About...\tF1"),
                     wxT("Show about dialog"));

    menuFile->Append(wxID_NEW, _("Change language..."),
                     wxT("Select a new language"));
    menuFile->AppendSeparator( );
    menuFile->Append(wxID_EXIT, _("E&xit\tAlt-X"),
                     wxT("Quit this program"));

    wxMenuBar *menuBar = new wxMenuBar( );
    menuBar->Append(menuFile, _("&File"));
    menuBar->Append(helpMenu, _("&Help"));

    wxMenuBar* formerMenuBar = GetMenuBar( );

    SetMenuBar(menuBar);
    delete formerMenuBar;

    SetStatusText(_("Welcome to wxWidgets!"));
}


OnChangeLanguage is called when the user wants to specify a new language, and it maps the user’s selection into a locale identifier such as wxLANGUAGE_GERMAN. This identifier is passed to MyApp::SelectLanguage before SetupStrings is called to change the labels and flag bitmap. Here is the implementation of OnChangeLanguage:


void MyFrame::OnChangeLanguage(wxCommandEvent& event)
{
    wxArrayInt languageCodes;
    wxArrayString languageNames;

    languageCodes.Add(wxLANGUAGE_GERMAN);
    languageNames.Add(_("German"));

    languageCodes.Add(wxLANGUAGE_FRENCH);
    languageNames.Add(_("French"));

    languageCodes.Add(wxLANGUAGE_ENGLISH);
    languageNames.Add(_("English"));

    int lang = wxGetSingleChoiceIndex( _("Select language:"),
                             _("Language"), languageNames );

    if ( lang != -1 )
    {
        wxGetApp( ).SelectLanguage(languageCodes[lang]);
        SetupStrings( );
    }
}


Summary

We’ve discussed the variety of ways in which wxWidgets helps you handle translations as well as formatting issues related to time and date, currency, and so on. You should work with someone familiar with the target languages or locales who will be able to find differences that you might have missed.

For another example of a translated application, see samples/internat in your wxWidgets distribution. It demonstrates translation of strings in menu items and dialogs for ten languages.

Next, we’ll take a look at how you can make your applications perform several tasks at once with multithreading.
