Chapter 25. Manage Localization

This chapter covers the following topics:

  • Localization and internationalization

This chapter covers the following exam objective:

  • Objective 1.6: Given a scenario, configure localization options.

The shell gives you amazing power over your systems. You can perform simple tasks, such as copying files and running programs. You can combine many tasks into one, perform repetitive tasks with a few keystrokes, and even offload simple decision making to the shell. At first glance, it’s an imposing interface. With a bit of knowledge, you can start using the shell’s advanced features.

“Do I Know This Already?” Quiz

The “Do I Know This Already?” quiz enables you to assess whether you should read this entire chapter or simply jump to the “Exam Preparation Tasks” section for review. If you are in doubt, read the entire chapter. Table 25-1 outlines the major headings in this chapter and the corresponding “Do I Know This Already?” quiz questions. You can find the answers in Appendix A, “Answers to the ‘Do I Know This Already?’ Quizzes and Review Questions.”

Table 25-1 “Do I Know This Already?” Foundation Topics Section-to-Question Mapping

Foundation Topics Section    Questions Covered in This Section
Time Zones                   1–3
Character Encoding           4
Representing Locales         5

Caution

The goal of self-assessment is to gauge your mastery of the topics in this chapter. If you do not know the answer to a question or are only partially sure of the answer, you should mark that question as wrong for purposes of the self-assessment. Giving yourself credit for an answer you correctly guess skews your self-assessment results and might provide you with a false sense of security.

1. You’re about to call a friend in Thunder Bay, but you don’t know what time zone she is in. Which of the following would help you determine what time it is in Thunder Bay?

a. tzselect

b. TZ=America/Thunder_Bay date

c. LC_TIME=America/Thunder_Bay date

d. date --timezone=America/Thunder_Bay

2. Which of the following commands can be used to set the system date? (Choose all that apply.)

a. dateconfig

b. date

c. timedatectl

d. localectl

3. In which directory are time zone files found?

a. /etc/timezone

b. /usr/share/zoneinfo

c. /usr/bin/zoneinfo

d. /etc/zoneinfo

4. Which of the following is a valid character set?

a. ASCODE

b. AFT-8

c. Unicode

d. None of these are valid character sets.

5. Which command can be used to change both the locale and the keyboard layout?

a. systemctl

b. kbctl

c. lckbctl

d. None of these are correct.

Foundation Topics

You’ll see different ways of writing numbers, currency, and times in different places. In the United States, it’s conventional to write dates with the month first. Travel north to Canada, and you’ll see the month in different spots. North America is standardized on using commas to group the thousands in a number and using periods to separate integers from decimals. Fly over to France, though, and you’ll see people using commas to separate the integers from the decimals. In addition to these variations, different countries use different currencies.

Internationalization and localization are two concepts that allow a computer to store information one way but display it in a way that suits the conventions of the user. Internationalization allows a system to display information in different ways, and localization is a process that bundles up all the regional changes for a single location into a locale.

Time Zones

The most readily visible localization features have to do with time. Every location on the planet belongs to a time zone. A time zone is defined as an offset from Coordinated Universal Time (UTC). UTC+0 is also commonly called Greenwich Mean Time (GMT) because it is centered on the meridian that runs through Greenwich, London.

Many locations observe some form of daylight saving time (DST). DST is a system in which clocks are moved ahead an hour in the summer to take advantage of the longer daylight hours. Each DST zone sets the dates when the hour is added and removed.

A Linux machine may be physically located in one time zone, but the users may connect remotely from other time zones. Unix has a simple way to handle this: All timestamps are stored in UTC, and each user is free to set the time zone of her choosing. The system then adds or removes the time zone offset before displaying the value to the user.

Unix stores time as seconds since midnight UTC on January 1, 1970. This special date is known as the epoch, or the birthday of Unix. As of mid-2019 the current Unix timestamp was over 1.56 billion.
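As a minimal sketch of this idea, the following converts between Unix timestamps and readable dates. It assumes GNU date (reading a timestamp with -d @N is a GNU extension), and the variable names and the sample timestamp are only illustrative.

```shell
# Current time as seconds since the epoch
now=$(date +%s)
echo "Seconds since the epoch: $now"

# Timestamp 0 is the epoch itself; -u prints the result in UTC
epoch=$(date -u -d @0 +"%Y-%m-%d %H:%M:%S")
echo "Timestamp 0 is:          $epoch"

# A sample timestamp from mid-2019, converted back to a UTC date
stamp=$(date -u -d @1560000000 +"%Y-%m-%d %H:%M:%S")
echo "Timestamp 1560000000 is: $stamp"
```

Because the stored value is always UTC, the same timestamp can be rendered in any time zone at display time.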

Key Topic.

Displaying and Setting System Time

A date and time must include the time zone in order to have enough meaning to be compared with other dates and times. The date command shows the current date and time. Here is an example of its use:

$ date
Fri Jun 28 21:15:01 CDT 2019

The time zone is displayed as CDT (Central Daylight Time), which is UTC-5. You can specify different formats with the plus sign (+) and percent encodings, as shown in these examples:

$ date +"%Y-%m-%dT%H:%M%z"
2019-06-28T21:19-0500

Instead of the time zone as a word, this format uses the numeric offset from UTC, in the style of ISO 8601. When you use date -u, the time is displayed in UTC, which is written as +0000 or sometimes Z, short for Zulu, a way of referring to UTC in military and aviation circles.

The percent encodings in the date commands each have special meaning:

  • %Y: Four-digit year

  • %m: Two-digit month

  • %d: Two-digit day

  • %H: Two-digit hour in 24-hour time

  • %M: Two-digit minute

  • %z: Time zone offset

Other characters, such as the T, colon, and dashes, are displayed as is. Run man date to get a full list of encodings.
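The encodings listed above can be combined freely with literal text. The following sketch assumes GNU date and uses a fixed, hypothetical sample timestamp (read with the GNU -d @N form) so that the output is the same on every run.

```shell
# Fixed sample timestamp so the output is reproducible
ts=1560000000

# The encodings from the list, plus a literal T, colon, and dashes
iso=$(date -u -d "@$ts" +"%Y-%m-%dT%H:%M%z")
echo "$iso"      # 2019-06-08T13:20+0000

# Any text that is not a percent encoding is printed as is
label=$(date -u -d "@$ts" +"Year %Y, month %m, day %d at %H:%M")
echo "$label"    # Year 2019, month 06, day 08 at 13:20
```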

If you have administrative rights, you can also use the date command to change the system clock time. For example, to set the time to 6:20 p.m. on October 1, 2018, execute the following command:

# date 100118202018

Key Topic.

Displaying and Setting the Hardware Clock

The date command displays the system clock (the clock maintained by the operating system). This may differ from the hardware clock (also called the real-time clock), which can be displayed by running the hwclock command:

$ hwclock
Mon 01 Oct 2018 06:09:36 PM PDT  -0.414662 seconds

You can set the hardware clock by using a combination of the --set and --date options. No output is provided for the following command:

hwclock --set --date="2018-10-01 06:12:40"

Typically one of the two clocks is more accurate than the other. For example, the system clock may slowly drift away from the correct time and need to be reset from the (hopefully) more accurate hardware clock. To set the system clock to the current hardware clock time, use the following command:

hwclock --hctosys

Conversely, if you are using NTP (Network Time Protocol), the system clock will likely be more accurate than the hardware clock. To set the hardware clock to be the same as the system clock, use the following command:

# hwclock --systohc

Key Topic.

Setting Time Zones

The configuration for each time zone—and, by extension, any daylight saving time—is stored in the zoneinfo files, under /usr/share/zoneinfo. For example, /usr/share/zoneinfo/America/Winnipeg contains all the configuration for the city of Winnipeg. Unlike most configuration files in Linux, these files contain binary data and can’t be directly viewed.

The system time zone is stored in a file called /etc/localtime. It is either a copy of the appropriate zoneinfo file or a symlink to the appropriate file, depending on the distribution. Symlinks are typically used so that it’s clear which time zone is in use. For example, you could set your current time zone to Winnipeg with the following command:

# ln -sf /usr/share/zoneinfo/America/Winnipeg /etc/localtime

Key Topic.

Users who don’t have a time zone set get the zone provided by /etc/localtime as their default. They can override the setting through the TZ environment variable.

# date +%z
-0500
# TZ=Asia/Hong_Kong date +%z
+0800

In the preceding example, the system time zone is UTC-5 in the first command but is overridden just for that command to the Hong Kong time zone. The user could just as easily set TZ in his .bash_profile file to make the changes persistent.
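To see that TZ changes only how a moment is displayed, and not the moment itself, the following sketch prints one fixed timestamp in three zones. It assumes GNU date and that the zoneinfo files for these zones are installed; the timestamp and variable names are illustrative.

```shell
ts=1560000000   # sample timestamp: 13:20 UTC on June 8, 2019

# Same instant, three different displays
utc=$(date -u -d "@$ts" +"%H:%M %z")
hk=$(TZ=Asia/Hong_Kong date -d "@$ts" +"%H:%M %z")
wpg=$(TZ=America/Winnipeg date -d "@$ts" +"%H:%M %z")

echo "UTC:       $utc"
echo "Hong Kong: $hk"
echo "Winnipeg:  $wpg"
```

With the zone files present, this should report 13:20 +0000, 21:20 +0800, and 08:20 -0500. If a zone file is missing, date typically falls back to UTC without an error, so an unexpected +0000 is a hint that the zone name was not found.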

A Linux distribution includes tools to set the time zone from a menu. Depending on the distribution, this may be tzselect, tzconfig, or dpkg-reconfigure tzdata. The tzselect command helps you find the name of the time zone you want and leaves the work of making it permanent up to you. The other two commands make the changes to the /etc/localtime file for you.

Key Topic.

In addition, your distribution may store the time zone as a word in other files (for example, /etc/timezone for Debian and /etc/sysconfig/clock for Red Hat).

Key Topic.

The timedatectl Command

For the Linux+ exam, you should be aware of the relatively new command timedatectl, which can be used to view and change both the date and the time zone on the system. Example 25-1 shows an example of displaying date and time zone information.

Example 25-1 Displaying the Date and Time Zone

[root@server1 ~]# timedatectl status
      Local time: Mon 2018-10-01 18:23:23 PDT
  Universal time: Tue 2018-10-02 01:23:23 UTC
        RTC time: Mon 2018-10-01 13:32:48
        Timezone: America/Los_Angeles (PDT, -0700)
     NTP enabled: yes
NTP synchronized: yes
 RTC in local TZ: no
      DST active: yes
 Last DST change: DST began at
                   Sun 2018-03-11 01:59:59 PST
                   Sun 2018-03-11 03:00:00 PDT
 Next DST change: DST ends (the clock jumps one hour backwards) at
                   Sun 2018-11-04 01:59:59 PDT
                   Sun 2018-11-04 01:00:00 PST

The timedatectl command can also be used to modify the system clock, the RTC (real-time clock, or hardware clock), and the time zone setting.
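The following sketch shows what those changes might look like. The set-timezone, set-ntp, set-time, and set-local-rtc subcommands are part of timedatectl, but the run helper and the APPLY switch are hypothetical additions for this example: by default the commands are only printed, because actually running them requires root on a systemd-based system and changes the machine's clock.

```shell
# Hypothetical helper: print each command, and run it only when APPLY=1
run() {
    echo "+ $*"
    if [ "${APPLY:-0}" = "1" ]; then
        "$@"
    fi
}

plan=$(
    run timedatectl set-timezone America/Winnipeg    # change the system time zone
    run timedatectl set-ntp false                    # manual changes are refused while NTP sync is on
    run timedatectl set-time "2018-10-01 18:20:00"   # set the system clock
    run timedatectl set-local-rtc 0                  # keep the hardware clock in UTC
)
echo "$plan"
```

Running timedatectl list-timezones prints the zone names that set-timezone accepts.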

Key Topic.

Character Encoding

Once upon a time, computers used the American Standard Code for Information Interchange (ASCII). ASCII encodes a character into 7 bits, which means there are 128 possible characters. This would be fine if all you ever used was English, but 128 slots leave no room for the accented characters that other languages need.

Most systems store information in at least 8 bits, so computers started using the previously ignored bit to store special characters, giving 128 new spots for characters.

Vendors then started making their own character sets in the spots not already used by the English characters and punctuation and called them code pages. Byte value 200 might be an accented N in one code page and a Greek letter in another code page. If you wanted to use different characters, you had to switch to a different code page.

Some of these code pages were codified in the ISO 8859 standard, which defines the standard code pages. ISO 8859-1 is the Latin alphabet with the characters used by English and most Western European languages, and ISO 8859-9 is the Latin alphabet with Turkish characters. Confusingly, ISO 8859-3 has some of the Turkish characters along with characters from some other languages.

In the early 1990s it was clear that this was a mess, and people got together to come up with a new standard that could handle everything. Thus Unicode, a universal encoding, was born.

Key Topic.

Unicode defines each possible character as a code point, which is a number. The original ASCII set is mapped into the first 128 code points (0 through 127) for compatibility. Originally each Unicode character was encoded into 2 bytes, which meant there were 65,536 possible characters. This encoding, called UCS-2 (for 2-byte Universal Coded Character Set), ended up not being able to hold the number of characters needed for all the languages and symbols on the planet. UTF-16 (16-bit Unicode Transformation Format) fixed this by allowing any code point above U+FFFF to be represented with two 16-bit units, known as a surrogate pair.

Key Topic.

Around the same time, UTF-8 was being developed. The minimum of 2 bytes per character was not compatible with existing ASCII files. Therefore, UTF-8 encoding uses from 1 to 4 bytes to encode a character (the original design allowed up to 6), with the length of the sequence cleverly encoded in the high-order bits of the first byte. UTF-8 is fully compatible with the original 128 ASCII characters but can still represent any Unicode code point.

UTF-8 is by and large the dominant encoding.
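The variable length of UTF-8 is easy to observe from the shell. In this sketch, each character is written as octal byte escapes so that the result does not depend on the encoding of your editor or terminal; the variable names are illustrative.

```shell
# Count the bytes UTF-8 uses for four characters
a_len=$(printf 'A' | wc -c | tr -d ' ')                      # U+0041, plain ASCII
e_len=$(printf '\303\251' | wc -c | tr -d ' ')               # U+00E9, e with acute accent
euro_len=$(printf '\342\202\254' | wc -c | tr -d ' ')        # U+20AC, euro sign
emoji_len=$(printf '\360\237\230\200' | wc -c | tr -d ' ')   # U+1F600, an emoji above U+FFFF

echo "$a_len $e_len $euro_len $emoji_len"    # 1 2 3 4
```

The ASCII letter still takes a single byte, which is why existing ASCII files are already valid UTF-8.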

Representing Locales

Each locale is represented in terms of two or three variables (discussed further in Chapter 26, “BASH Scripting Essentials”):

  • Language code (ISO 639)

  • Country code (ISO 3166)

  • Encoding (optional)

It might seem odd to have both a language and a country, but consider that multiple languages may be spoken in a country and that two countries sharing a common language may speak different dialects. Just ask anyone from France what they think about how French is spoken in Quebec, Canada!

Thus, the language and country are different. ISO 639 describes language names, such as en for English, de for German, and es for Spanish. ISO 3166 is for the country. While Germany happens to be DE for country and de for language, a country doesn’t necessarily have the same designation for both. The United States and Canada, which both have English as an official language (en), are US and CA, respectively.

The encoding further describes how the characters are stored in the locale file. A particular locale file may use the old ISO 8859 encoding or the more robust Unicode, and even within Unicode there are multiple variants, such as UTF-8, UTF-16, and UTF-32.
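The three parts can be pulled out of a locale name with ordinary shell parameter expansion. This is only a sketch for names of the form language_COUNTRY.encoding; the sample name and the variable names are assumptions for illustration.

```shell
loc="en_CA.UTF-8"        # sample locale name

language=${loc%%_*}      # everything before the underscore
rest=${loc#*_}           # everything after the underscore
country=${rest%%.*}      # the part before the dot
encoding=${loc#*.}       # everything after the dot

echo "language=$language country=$country encoding=$encoding"
# language=en country=CA encoding=UTF-8
```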

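American English is in the en_US.UTF-8 locale, and Spanish as used in Spain is in es_ES.UTF-8. The locale -a command writes the encoding in a normalized form, so these appear as en_US.utf8 and es_ES.utf8. See what locales are installed on your system with the locale -a command, as shown in Example 25-2.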

Key Topic.

Example 25-2 Using the locale -a Command to See the Locales Installed on a System

# locale -a
C
C.UTF-8
en_AG
en_AG.utf8
en_AU.utf8
en_BW.utf8
en_CA.utf8
en_DK.utf8
en_GB.utf8
... output omitted ...
es_ES.utf8
es_GT.utf8
es_HN.utf8
es_MX.utf8
es_NI.utf8
POSIX

Fallback Locales

Sometimes you don’t want to deal with locales, especially if you’re writing a script that deals with output of other programs, which could change based on the user’s locale. In such a case, you can temporarily use the C locale. C, which can also be called POSIX, is a generic locale that uses plain ASCII and sorts by byte value.
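Sorting is a common place where a script benefits from the C locale. The sketch below forces LC_ALL=C for a single sort command so that the order is the same no matter what locale the user has set; the sample input is made up for the example.

```shell
# In the C locale, sort compares byte values, so uppercase letters
# come before lowercase ones regardless of the user's settings
sorted=$(printf 'b\nA\na\nB\n' | LC_ALL=C sort | paste -sd ' ' -)
echo "$sorted"    # A B a b
```

In many language-specific locales, the same input would instead be sorted with upper- and lowercase letters interleaved, which is exactly the kind of variation a script wants to avoid.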

Contents of a Locale

Each locale file contains instructions on how to display or translate a variety of items:

  • Addresses: Ordering of the various parts of an address and the zip code format

  • Collation: How to sort, such as the ordering of accented characters or whether capitalized words are grouped together or separately from lowercase

  • Measurement: Display of various units

  • Messages: Translations for system messages and errors

  • Monetary: How currency symbols are displayed and named

  • Names: Conventions for displaying people’s names

  • Numeric: How to display numbers such as the thousands and decimal separators

  • Paper: Paper sizes used in the country

  • Telephone: How telephone numbers are displayed

  • Time: Date and time formats, such as the ordering of year, month, and date, or 24-hour clock versus using a.m. and p.m.

These locale files are usually distributed with the operating system as separate packages to save on space. If you don’t need the translations, you can generate the rest of the items without installing packages by using locale-gen on systems that support it (see Example 25-3).

Example 25-3 Using locale-gen

# locale-gen fr_FR.UTF-8
Generating locales...
 fr_FR.UTF-8... done
Generation complete.
# locale -a | grep FR
fr_FR.utf8

Key Topic.

The localectl Command

Some Linux distributions include a command that changes not only the locale but also the keyboard layout. When provided the status option, the localectl command displays these values, as shown in Example 25-4.

Example 25-4 Using localectl

# localectl status
   System Locale: LANG=en_US.utf8
       VC Keymap: us
      X11 Layout: us
       X11 Model: pc105+inet
     X11 Options: terminate:ctrl_alt_bksp

You can set the locale and keyboard by using commands like the following (each localectl invocation takes a single subcommand):

# localectl set-locale LANG=de_DE.utf8
# localectl set-keymap de

The advantage of the localectl command over the locale command is that when you change a locale setting, you typically want to change the keyboard layout to match the local region, and localectl lets you do this.

How Linux Uses the Locale

Internationalization in Linux is handled with the GNU gettext library. If programmers write their applications with that library and annotate their messages correctly, the user can change the behavior with environment variables.

Multiple things can be localized, such as numbers and messages, and gettext has a series of environment variables that it checks to see which locale is appropriate. In order, these are

Key Topic.

  • LANGUAGE

  • LC_ALL

  • LC_XXX

  • LANG

The LANGUAGE variable is consulted only when printing messages; it is ignored for formatting. It can hold a colon-separated list of locales, which the system tries in order when displaying a message. You can use LC_ALL to force the locale even if some of the other variables are set.

LC_XXX gives the administrator the power to override a locale for a particular element. For example, if LANG were set to en_US.UTF-8, the user could override currency display by setting LC_MONETARY. The locale command displays the current settings, as shown in Example 25-5.

Example 25-5 Using locale

# locale
LANG=en_CA.UTF-8
LANGUAGE=en_CA:en
LC_CTYPE="en_CA.UTF-8"
LC_NUMERIC="en_CA.UTF-8"
LC_TIME="en_CA.UTF-8"
LC_COLLATE="en_CA.UTF-8"
LC_MONETARY="en_CA.UTF-8"
LC_MESSAGES="en_CA.UTF-8"
LC_PAPER="en_CA.UTF-8"
LC_NAME="en_CA.UTF-8"
LC_ADDRESS="en_CA.UTF-8"
LC_TELEPHONE="en_CA.UTF-8"
LC_MEASUREMENT="en_CA.UTF-8"
LC_IDENTIFICATION="en_CA.UTF-8"
LC_ALL=

This example is from a typical English system. You can override just parts of it:

# LC_TIME=fr_FR.UTF8 date
samedi 7 mars 2015, 23:11:23 (UTC-0600)
# LC_MESSAGES=fr_FR.UTF8 man
What manual page do you want?
# LANGUAGE='' LC_MESSAGES=fr_FR.UTF8 man
Quelle page de manuel voulez-vous ?

In Example 25-5, the time setting is switched to the French locale, and the date is displayed in French. The second command sets the messages setting to French, but the English variant is used because the higher-priority LANGUAGE is set. A French error message is used when LANGUAGE is set to nothing.
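The priority of LC_ALL over the individual LC_XXX variables can be checked in the same way. This sketch assumes GNU date (for -d @0) and uses the epoch as a fixed date; the fr_FR.UTF-8 locale is only needed for the second command.

```shell
# LC_TIME asks for French, but LC_ALL overrides every LC_XXX variable
day=$(LC_ALL=C LC_TIME=fr_FR.UTF-8 date -u -d @0 +%A)
echo "$day"    # Thursday

# With LC_ALL empty, LC_TIME takes effect if fr_FR.UTF-8 is installed;
# otherwise the C locale is typically used as a fallback
LC_ALL= LC_TIME=fr_FR.UTF-8 date -u -d @0 +%A
```

On a system with the French locale generated, the second command should print the day name in French instead.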

Exam Preparation Tasks

As mentioned in the section “How to Use This Book” in the Introduction, you have a couple of choices for exam preparation: the exercises here, Chapter 30, “Final Preparation,” and the exam simulation questions in the Pearson Test Prep Software Online.

Review All Key Topics

Review the most important topics in this chapter, noted with the Key Topic icon in the outer margin of the page. Table 25-2 lists these key topics and the page number on which each is found.

Key Topic.

Table 25-2 Key Topics for Chapter 25

Key Topic Element    Description                          Page Number
Paragraph            The date command                     653
Paragraph            The hwclock command                  654
Paragraph            The /usr/share/zoneinfo directory    655
Paragraph            The TZ environment variable          655
Paragraph            The /etc/timezone file               656
Section              The timedatectl command              656
Paragraph            ASCII                                656
Paragraph            Unicode code points                  657
Paragraph            UTF-8 encoding                       657
Example 25-2         The locale command                   658
Paragraph            The localectl command                660
List                 The locale environment variables     660

Define Key Terms

Define the following key terms from this chapter and check your answers in the glossary:

Review Questions

The answers to these review questions are in Appendix A.

1. You’re vacationing in Hawaii. Ignoring why you chose to bring your work computer on vacation, how will you change the time zone for everything on your system?

a. export TZ=US/Hawaii

b. ln -sf /usr/share/zoneinfo/US/Hawaii /etc/localtime

c. ln -sf /usr/share/zoneinfo/US/Hawaii /etc/timezone

d. echo "US/Hawaii" > /etc/profile

2. Consider this output:

# locale
LANG=en_US.UTF-8
LC_TIME="es_ES.UTF-8"
LC_MESSAGES="en_CA.UTF-8"
LC_ALL="de_DE.UTF-8"

If you were to run the date command, which locale would be used for the formatting?

a. American (US) English

b. Spanish (ES)

c. Canadian (CA) English

d. German (DE)

3. What feature allows a system to display information in different ways?

a. Locale

b. Standardization

c. Unicode

d. Internationalization

4. What is another name for Coordinated Universal Time?

a. Universal Standard Time

b. Greenwich Mean Time

c. Standard Time Clock

d. International Standard Time

5. You discover that the system clock has drifted and is no longer accurate. You do some research and find that the RTC is accurate. What command could you execute to set the system clock to the RTC?

a. hwclock --hctosys

b. hwclock --rtctosys

c. hwclock --setsysclock

d. None of these answers are correct.

6. You need to execute the date command using the Hong Kong time zone. Which of the following commands will perform this task?

a. TZ=Asia/Hong_Kong; date +%z

b. TZ=Asia/Hong_Kong | date +%z

c. date +%z TZ=Asia/Hong_Kong

d. None of these answers are correct.

7. Which of the following commands can modify both the system date and the time zone?

a. date

b. hwclock

c. timedatectl

d. systime

8. Which of the following is not included in a locale file?

a. Measurement units

b. Monetary units

c. Paper sizes

d. Time zone

9. You need to display both locale and keyboard layout. Which option to the localectl command will display these values?

a. status

b. show

c. display

d. mode

10. Which of the following is used to force the locale for default language output even if some of the other variables are set?

a. LANG

b. LC_ALL

c. LANGUAGE

d. LC_IDENTIFICATION
