Nobody can be exactly like me. Sometimes even I have trouble doing it.
--Tallulah Bankhead
The credo of “Write once, run anywhere™” means that your code will run in many places where languages and customs are different from yours. With a little care you can write programs that can adapt to these variations gracefully. Keeping your programs supple in this fashion is called internationalization. You have several tools for internationalizing your code. Using internationalization tools to adapt your program to a specific locale—such as by translating messages into the local language—is called localization.
The first tool is inherent in the language: Strings are in Unicode, which can express almost any written language on our planet. Someone must still translate the strings, and displaying the translated text to users requires fonts for those characters. Still, having Unicode is a big boost for localizing your code.
The nexus of internationalization and localization is the locale, which defines a “place.” A place can be a language, culture, or country: anything with an associated set of customs that requires changes in program behavior. Each running program has a default locale that is the user's preferred place. It is up to each program to adapt to a locale's customs as best it can. The locale concept is represented by the Locale class, which is part of the java.util package.
Given a locale, several tools can help your program behave in a locally comprehensible fashion. A common pattern is for a class to define the methods for performing locale-sensitive operations. A generic “get instance” static method of this class returns an object (possibly of a subclass) suitable for the default locale. The class will also provide an overload of each “get instance” method that takes a locale argument and returns a suitable object for a particular locale. For example, by invoking the class's getInstance methods, you can get an appropriate java.util.Calendar object that works with the user's preferred dates and times. The returned Calendar object will understand how to translate system time into dates using the customs of the default locale. If the user were Mexican, an object that was a Calendar adapted to Mexican customs could be returned. A Chinese user might get an object of a different subclass that worked under the Chinese calendar customs.
If your program displays information to the user, you will likely want to localize the output: Saying “That's not right” to someone who doesn't understand English is probably pointless, so you would like to translate (localize) the message for speakers in other locales. The resource bundle mechanisms help you make this possible by mapping string keys to arbitrary resources. You use the values returned by a resource bundle to make your program speak in other tongues—instead of writing the literal message strings in your code, you look up the strings from a resource bundle by string keys. When the program is moved to another locale, someone can translate the messages in the resource bundle and your program will work for that new locale without your changing a line of code.
The classes described in this chapter come mostly from the package java.util. There are occasional brief discussions of classes in the text internationalization and localization package java.text, with an overview of some of its capabilities in Section 24.6 on page 708, but a full discussion on this subject is outside the scope of this book.
A java.util.Locale object describes a specific place, whether cultural, political, or geographical. Using a locale, objects can localize their behavior to a user's expectations. An object that does so is called locale-sensitive. For example, date formatting can be localized by using the locale-sensitive DateFormat class (described later in this chapter), so the date written in the United Kingdom as 26/11/72 would be written 26.11.72 in Iceland, 11/26/72 in the United States, or 72.26.11 in Latvia.
A single locale represents issues of language, country, and other traditions. There can be separate locales for U.S. English, U.K. English, Australian English, Pakistani English, and so forth. Although the language is arguably in common for these locales, the customs of date, currency, and numeric representation vary.
Your code will rarely get or create Locale objects directly but instead will use the default locale that reflects the user's preference. You typically use this locale implicitly by getting resources or resource bundles as shown with other locale-sensitive classes. For example, you get the default calendar object like this:
    Calendar now = Calendar.getInstance();
The Calendar class's getInstance method looks up the default locale to configure the calendar object it returns. When you write your own locale-sensitive classes, you get the default locale from the static getDefault method of the Locale class.
If you write code that lets a user select a locale, you may need to create Locale objects. There are three constructors:
public Locale(String language, String country, String variant)
Creates a Locale object that represents the given language and country, where language is the two-letter ISO 639 code for the language (such as "et" for Estonian) and country is the two-letter ISO 3166 code for the country (such as "KY" for Cayman Islands). “Further Reading” on page 755 lists references for these codes. The variant can specify anything, such as an operating environment (such as "POSIX" or "MAC") or company or era. If you specify more than one variant, separate them with an underscore. To leave any part of the locale unspecified, use "" (an empty string), not null.
public Locale(String language, String country)
Equivalent to Locale(language, country, "").
public Locale(String language)
Equivalent to Locale(language, "", "").
The language and country can be in any case, but they will always be translated to lowercase for the language and uppercase for the country to conform to the governing standards. The variant is translated into uppercase.
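The case normalization just described can be observed with a short sketch. The class name LocaleCaseDemo and the helper terse are ours, chosen only for illustration; note also that newer JDK releases mark these constructors as deprecated, which is why the warning is suppressed here.

```java
import java.util.Locale;

final class LocaleCaseDemo {
    // Builds a locale from codes typed in any case and returns its terse form.
    // (terse is a hypothetical helper, not part of the Locale API.)
    @SuppressWarnings("deprecation") // constructors are deprecated in newer JDKs
    static String terse(String language, String country) {
        return new Locale(language, country).toString();
    }

    public static void main(String[] args) {
        // Language is lowercased and country is uppercased on construction.
        System.out.println(terse("ET", "ee"));
        // An unspecified part is the empty string, never null.
        @SuppressWarnings("deprecation")
        Locale languageOnly = new Locale("et");
        System.out.println(languageOnly.getCountry().isEmpty());
    }
}
```

With the codes shown, the first line printed should be et_EE and the second true.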
The Locale class defines static Locale objects for several well-known locales, such as CANADA_FRENCH and KOREA for countries, and KOREAN and TRADITIONAL_CHINESE for languages. These objects are simply conveniences and have no special privileges compared to any Locale object you may create.
The static method setDefault changes the default locale. The default locale is shared state and should always reflect the user's preference. If you have code that must operate in a different locale, you can specify that locale to locale-sensitive classes either as an argument when you get resources or on specific operations. You should rarely need to change the default locale.
Locale provides methods for getting the parts of the locale description. The methods getCountry, getLanguage, and getVariant return the values defined during construction. These are terse codes that most users will not know. These methods have “display” variants (getDisplayCountry, getDisplayLanguage, and getDisplayVariant) that return human-readable versions of the values. The method getDisplayName returns a human-readable summary of the entire locale description, and toString returns the terse equivalent, using underscores to separate the parts. These “display” methods return values that are localized according to the default locale.
You can optionally provide a Locale argument to any of the “display” methods to get a description of the given locale under the provided locale. For example, if we print the value of
    Locale.ITALY.getDisplayCountry(Locale.FRANCE)
we get
    Italie
the French name for Italy.
The methods getISO3Country and getISO3Language return three-character ISO codes for the country and language of the locale, respectively.
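As a rough illustration of the terse and display accessors, the following sketch prints several views of Locale.ITALY. The class LocaleDisplayDemo and its iso3 helper are hypothetical names of ours. The display strings depend on the locale data installed with your runtime, so no particular output is promised for them.

```java
import java.util.Locale;

final class LocaleDisplayDemo {
    // Joins the three-letter language and country codes (hypothetical helper).
    static String iso3(Locale locale) {
        return locale.getISO3Language() + "_" + locale.getISO3Country();
    }

    public static void main(String[] args) {
        Locale italy = Locale.ITALY;
        // Terse codes, as stored in the locale.
        System.out.println(italy.getLanguage() + " " + italy.getCountry());
        System.out.println(italy);          // toString joins parts with underscores
        System.out.println(iso3(italy));    // three-letter equivalents
        // Human-readable forms: first for a French reader, then for the default locale.
        System.out.println(italy.getDisplayCountry(Locale.FRANCE));
        System.out.println(italy.getDisplayName());
    }
}
```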
When you internationalize code, you commonly have units of meaning—such as text or sounds—that must be translated or otherwise made appropriate for each locale. If you put English text directly into your program, localizing that code is difficult—it requires finding all the strings in your program, identifying which ones are shown to users, and translating them in the code, thereby creating a second version of your program for, say, Swahili users. When you repeat this process for a large number of locales the task becomes a nightmare.
The resource bundle classes in java.util help you address this problem in a cleaner and more flexible fashion. The abstract class ResourceBundle defines methods to look up resources in a bundle by string key and to provide a parent bundle that will be searched if a bundle doesn't have a key. This inheritance feature allows one bundle to be just like another bundle except that a few resource values are modified or added. For example, a U.S. English bundle might use a U.K. English bundle for a parent, providing replacements for resources that have different spelling. ResourceBundle provides the following public methods:
public final String getString(String key) throws MissingResourceException
Returns the string stored in the bundle under the given key.
public final String[] getStringArray(String key) throws MissingResourceException
Returns the string array stored in the bundle under the given key.
public final Object getObject(String key) throws MissingResourceException
Returns the object stored in the bundle under the given key.
public abstract Enumeration getKeys()
Returns an Enumeration of the keys understood by this bundle, including all those of the parent.
Each resource bundle defines a set of string keys that map to locale-sensitive resources. These strings can be anything you like, although it is best to make them mnemonic. When you want to use the resource you look it up by name. If the resource is not found, a MissingResourceException is thrown. The resources themselves can be of any type but are commonly strings, so the getString methods are provided for convenience.
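To make the lookup methods concrete, here is a minimal sketch that instantiates a bundle directly instead of going through getBundle. It relies on ListResourceBundle, a standard subclass covered later in this chapter; the class names, keys, and values are invented for illustration.

```java
import java.util.ListResourceBundle;
import java.util.MissingResourceException;
import java.util.ResourceBundle;

final class BundleLookupDemo {
    // Hypothetical bundle holding a string, a string array, and an arbitrary object.
    static final class DemoBundle extends ListResourceBundle {
        @Override
        protected Object[][] getContents() {
            return new Object[][] {
                { "greeting", "Hello" },
                { "days", new String[] { "Mon", "Tue" } },
                { "limit", Integer.valueOf(3) },
            };
        }
    }

    // Returns the string for key, or a visible marker when the key is unknown.
    static String lookup(ResourceBundle bundle, String key) {
        try {
            return bundle.getString(key);
        } catch (MissingResourceException e) {
            return "<missing " + e.getKey() + ">";
        }
    }

    public static void main(String[] args) {
        ResourceBundle bundle = new DemoBundle();
        System.out.println(lookup(bundle, "greeting"));
        System.out.println(bundle.getStringArray("days").length);
        System.out.println(bundle.getObject("limit"));
        System.out.println(lookup(bundle, "no-such-key"));
    }
}
```

Catching MissingResourceException this way is only one option; many programs prefer to let it propagate so that a missing translation is noticed during testing.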
The following example shows an internationalized way to rewrite the “Hello, world” example. This internationalized version requires a program called GlobalHello and a resource bundle for the program's strings called GlobalRes, which will define a set of constants for the localizable strings. First, the program:
    import java.util.*;

    public class GlobalHello {
        public static void main(String[] args) {
            ResourceBundle res =
                ResourceBundle.getBundle("GlobalRes");
            String msg;
            if (args.length > 0)
                msg = res.getString(GlobalRes.GOODBYE);
            else
                msg = res.getString(GlobalRes.HELLO);
            System.out.println(msg);
        }
    }
The program first gets its resource bundle. Then it checks whether any arguments are provided on the command line. If some are, it says good-bye; otherwise it says hello. The program logic determines which message to display, but the actual string to print is looked up by key (GlobalRes.HELLO or GlobalRes.GOODBYE).
Each resource bundle is a set of associated classes and property files. In our example, GlobalRes is the name of a class that extends ResourceBundle, implementing the methods necessary to map a message key to a localized translation of that message. You define classes for the various locales for which you want to localize the messages, naming the classes to reflect the locale. For example, the bundle class that manages GlobalRes messages for the Lingala language would be GlobalRes_ln because "ln" is the two-letter code for Lingala. French would be mapped in GlobalRes_fr, and Canadian French would be GlobalRes_fr_CA, which might have a parent bundle of GlobalRes_fr.
We have chosen to make the key strings constants in the GlobalRes class. Using constants prevents errors of misspelling. If you pass literal strings such as "hello" to getString, a misspelling will show up only when the erroneous getString is executed, and that might not happen during testing. If you use constants, a misspelling will be caught by the compiler (unless you are unlucky enough to accidentally spell the name of another constant).
You find resources by calling one of two static getBundle methods in ResourceBundle: the one we used, which searches the current locale for the best available version of the bundle you name; and the other method, which lets you specify both bundle name and desired locale. A fully qualified bundle name has the form package.Bundle_la_CO_va, where package.Bundle is the general fully qualified name for the bundle class (such as GlobalRes), la is the two-letter language code (lowercase), CO is the two-letter country code (uppercase), and va is the list of variants separated by underscores. If a bundle with the full name cannot be found, the last component is dropped and the search is repeated with this shorter name. This process is repeated until only the last locale modifier is left. If even this search fails and if you invoked getBundle with a specified locale, the search is restarted with the full name of the bundle for the default locale. If this second search ends with no bundle found or if you were searching in the default locale, getBundle checks using just the bundle name. If even that bundle does not exist, getBundle throws a MissingResourceException.
For example, suppose you ask for the bundle GlobalRes, specifying a locale for an Esperanto speaker living in Kiribati who is left-handed, and the default locale of the user is for a Nepali speaker in Bhutan who works for Acme, Inc. The longest possible search would be:
    GlobalRes_eo_KI_left
    GlobalRes_eo_KI
    GlobalRes_eo
    GlobalRes_ne_BT_Acme
    GlobalRes_ne_BT
    GlobalRes_ne
    GlobalRes
The first resource bundle that is found ends the search, being considered the best available match.
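The search order can be sketched as a small name generator. This is our own simplification for illustration, not the code getBundle actually runs: BundleSearchOrder, namesFor, and searchOrder are hypothetical names, locales are passed as plain {language, country, variant} string triples, and a variant without a country is not handled.

```java
import java.util.ArrayList;
import java.util.List;

final class BundleSearchOrder {
    // Names tried for one locale, most specific first; empty parts are skipped.
    static List<String> namesFor(String base, String lang, String country, String variant) {
        List<String> names = new ArrayList<>();
        if (!variant.isEmpty()) names.add(base + "_" + lang + "_" + country + "_" + variant);
        if (!country.isEmpty()) names.add(base + "_" + lang + "_" + country);
        if (!lang.isEmpty())    names.add(base + "_" + lang);
        return names;
    }

    // Requested locale first, then the default locale, then the bare bundle name.
    static List<String> searchOrder(String base, String[] requested, String[] fallback) {
        List<String> order =
            new ArrayList<>(namesFor(base, requested[0], requested[1], requested[2]));
        for (String name : namesFor(base, fallback[0], fallback[1], fallback[2])) {
            if (!order.contains(name)) order.add(name);   // skip names already tried
        }
        order.add(base);
        return order;
    }

    public static void main(String[] args) {
        String[] requested = { "eo", "KI", "left" };
        String[] fallback  = { "ne", "BT", "Acme" };
        for (String name : searchOrder("GlobalRes", requested, fallback)) {
            System.out.println(name);
        }
    }
}
```

Run with the locales from the example, the sketch lists the same seven names in the same order.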
The examples you have seen use resource bundles to fetch strings, but remember that you can use getObject to get any type of object. Bundles are used to store images, URLs, audio sources, graphics components, and any other kind of locale-sensitive resource that can be represented by an object.
Mapping string keys to localized resource objects is usually straightforward: simply use one of the provided subclasses of ResourceBundle that implement the lookup for you, ListResourceBundle and PropertyResourceBundle.
ListResourceBundle maps a simple list of keys to their localized objects. It is an abstract subclass of ResourceBundle for which you provide a getContents method that returns an array of key/resource pairs as an array of arrays of Object. The keys must be strings, but the resources can be any kind of object. The ListResourceBundle takes this array and builds the maps for the various “get” methods. The following classes use ListResourceBundle to define a few locales for GlobalRes. First, the base bundle:
    import java.util.ListResourceBundle;

    public class GlobalRes extends ListResourceBundle {
        // keys shared by every GlobalRes bundle
        public static final String HELLO = "hello";
        public static final String GOODBYE = "goodbye";

        public Object[][] getContents() {
            return contents;
        }

        private static final Object[][] contents = {
            { GlobalRes.HELLO,   "Ciao" },
            { GlobalRes.GOODBYE, "Ciao" },
        };
    }
This is the top-level bundle: when no other bundle is found, this will be used. We have chosen Italian for the default. Before any “get” method is executed, GlobalRes.getContents will be invoked and the contents array's values will seed the data structures used by the “get” methods. ListResourceBundle uses an internal lookup table for efficient access; it does not search through your array of keys. The GlobalRes class also defines the constants that name known resources in the bundle. Here is another bundle for a more specific locale:
    import java.util.ListResourceBundle;

    public class GlobalRes_en extends ListResourceBundle {
        public Object[][] getContents() {
            return contents;
        }

        private static final Object[][] contents = {
            { GlobalRes.HELLO,   "Hello" },
            { GlobalRes.GOODBYE, "Goodbye" },
        };
    }
This bundle covers the English-language locale en. It provides specific values for each localizable string. The next bundle uses the inheritance feature:
    import java.util.ListResourceBundle;

    public class GlobalRes_en_AU extends ListResourceBundle {
        // mostly like basic English - our parent bundle

        public Object[][] getContents() {
            return contents;
        }

        private static final Object[][] contents = {
            { GlobalRes.HELLO, "G'day" },
        };
    }
This bundle is for English speakers from Australia (AU). It provides a more colloquial version of the HELLO string and inherits all other strings from the general English locale GlobalRes_en. Whenever a resource bundle is instantiated, its parent chain is established. This proceeds by successively dropping the variant, country, and language components from the base bundle name and instantiating those bundles if they exist. If they do exist then setParent is called on the preceding bundle passing the new bundle as the parent. So in our example, when GlobalRes_en_AU is created, the system will create GlobalRes_en and set it as the parent of GlobalRes_en_AU. In turn, the parent of GlobalRes_en will be the base bundle GlobalRes.
Given these classes, someone with an English-language locale (en) would get the values returned by GlobalRes_en unless the locale also specified the country Australia (AU), in which case values from GlobalRes_en_AU would be used. Everyone else would see those in GlobalRes.
PropertyResourceBundle is a subclass of ResourceBundle that reads its list of resources from a text property description. Instead of using an array of key/resource pairs, the text contains key/resource pairs as lines of the form
    key=value
Both keys and values must be strings. A PropertyResourceBundle object reads the text from an InputStream passed to the PropertyResourceBundle constructor and uses the information it reads to build a lookup table for efficient access.
The bundle search process that we described earlier actually has an additional step that looks for a file ResName.properties after it looks for a class ResName. For example, if the search process doesn't find the class GlobalRes_eo_KI_left it will then look for the file GlobalRes_eo_KI_left.properties before looking for the next resources class. If that file exists, an input stream is created for it and used to construct a PropertyResourceBundle that will read the properties from the file.
It is easier to use property files than to create subclasses of ListResourceBundle, but the files have two limitations. First, they can only define string resources whereas ListResourceBundle can define arbitrary objects. Second, the only legal character encoding for property files is the byte format of ISO 8859-1. This means that other Unicode characters must be encoded with \uxxxx escape sequences.
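The following sketch feeds property text to a PropertyResourceBundle from memory instead of from a file, which keeps the example self-contained. The class name, the keys, and the French strings are assumptions made for illustration. The accented letters are written as escapes so the text stays within ISO 8859-1 on any runtime.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.PropertyResourceBundle;
import java.util.ResourceBundle;

final class PropertyBundleDemo {
    // Hypothetical property text. The doubled backslashes keep the escapes
    // literal in the string, so the property parser is what decodes them.
    static final String FRENCH_TEXT =
        "hello=Bonjour\n" +
        "goodbye=\\u00C0 bient\\u00F4t\n";

    // Builds a bundle from in-memory text, as the constructor would from a file stream.
    static ResourceBundle load(String text) {
        byte[] bytes = text.getBytes(StandardCharsets.ISO_8859_1);
        try {
            return new PropertyResourceBundle(new ByteArrayInputStream(bytes));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        ResourceBundle french = load(FRENCH_TEXT);
        System.out.println(french.getString("hello"));
        System.out.println(french.getString("goodbye"));
    }
}
```

Whether the accented characters print correctly depends on your console's encoding, not on the bundle.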
ListResourceBundle, PropertyResourceBundle, and .properties files will be sufficient for most of your bundles, but you can create your own subclass of ResourceBundle if they are not. You must implement two methods:
protected abstract Object handleGetObject(String key) throws MissingResourceException
Returns the object associated with the given key. If the key is not defined in this bundle, it returns null, and that causes the ResourceBundle to check in the parent (if any). Do not throw MissingResourceException unless you check the parent instead of letting the bundle do it. All the “get” methods are written in terms of this one method.
public abstract Enumeration getKeys()
Returns an Enumeration of the keys understood by this bundle, including all those of the parent.
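A minimal sketch of such a subclass follows, assuming a simple in-memory map as the backing store. MapBundle and its put method are hypothetical. The parent is wired up by hand through setParent because the bundles here are created directly instead of through getBundle.

```java
import java.util.Collections;
import java.util.Enumeration;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.ResourceBundle;
import java.util.Set;

// Hypothetical map-backed bundle; not part of java.util.
final class MapBundle extends ResourceBundle {
    private final Map<String, Object> values = new HashMap<>();

    MapBundle(ResourceBundle parentBundle) {
        if (parentBundle != null) setParent(parentBundle);
    }

    MapBundle put(String key, Object value) {
        values.put(key, value);
        return this;
    }

    @Override
    protected Object handleGetObject(String key) {
        return values.get(key);   // null lets ResourceBundle consult the parent
    }

    @Override
    public Enumeration<String> getKeys() {
        Set<String> keys = new HashSet<>(values.keySet());
        if (parent != null) keys.addAll(Collections.list(parent.getKeys()));
        return Collections.enumeration(keys);
    }

    public static void main(String[] args) {
        MapBundle english = new MapBundle(null).put("hello", "Hello").put("goodbye", "Goodbye");
        MapBundle australian = new MapBundle(english).put("hello", "G'day");
        System.out.println(australian.getString("hello"));    // overridden here
        System.out.println(australian.getString("goodbye"));  // inherited from the parent
    }
}
```

Returning null from handleGetObject, instead of throwing, is what allows the inherited “get” methods to fall back to the parent chain.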
Exercise 24.1: Get GlobalHello to work with the example locales. Add some more locales, using ListResourceBundle, .properties files, and your own specific subclass of ResourceBundle.
Currency encoding is highly sensitive to locale, and the java.util.Currency class helps you properly format currency values. You obtain a Currency object from one of its static getInstance methods, one of which takes a Locale object while the other takes a currency code as a String (codes are from the ISO 4217 standard).
The Currency class does not directly map currency values into localized strings but gives you the information you need to do so. The information at your disposal is
public String getSymbol()
Returns the symbol of this currency for the default locale.
public String getSymbol(Locale locale)
Returns the symbol of this currency for the specified locale. If there is no locale-specific symbol then the ISO 4217 currency code is returned. Many currencies share the same symbol in their own locale. For example, the $ symbol represents U.S. dollars in the United States, Canadian dollars in Canada, and Australian dollars in Australia, to name but a few. The local currency symbol is usually reserved for the local currency, so each locale can change the representation used for other currencies. For example, if this currency object represents the U.S. dollar, then invoking getSymbol with a U.S. locale will return "$" because it is the local currency. However, invoking getSymbol with a Canadian locale will return "USD" (the currency code for the U.S. dollar) because the $ symbol is reserved for the Canadian dollar in the Canadian locale.
public int getDefaultFractionDigits()
Returns the default number of fraction digits used with this currency. For example, the British pound would have a value of 2 because two digits usually follow the decimal point for pence (such as in £18.29), whereas the Japanese yen would have zero because yen values typically have no fractional part (such as ¥1200). Some “currencies” are not really currencies at all (IMF Special Drawing Rights, for example), and they return -1.
public String getCurrencyCode()
Returns the ISO 4217 currency code of this currency.
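A short sketch shows these accessors at work. CurrencyInfoDemo and describe are names of ours. The symbols printed at the end come from the locale data of your runtime and may differ from the strings quoted above, so treat them as something to observe, not to rely on.

```java
import java.util.Currency;
import java.util.Locale;

final class CurrencyInfoDemo {
    // Summarizes a currency as "CODE:fractionDigits" (hypothetical helper).
    static String describe(String code) {
        Currency currency = Currency.getInstance(code);
        return currency.getCurrencyCode() + ":" + currency.getDefaultFractionDigits();
    }

    public static void main(String[] args) {
        System.out.println(describe("GBP"));   // pounds and pence
        System.out.println(describe("JPY"));   // no fractional part
        System.out.println(describe("XDR"));   // IMF Special Drawing Rights

        // A currency can also be looked up from a locale.
        Currency usd = Currency.getInstance(Locale.US);
        System.out.println(usd.getCurrencyCode());
        // Symbols vary with the locale asked and with the installed locale data.
        System.out.println(usd.getSymbol(Locale.US));
        System.out.println(usd.getSymbol(Locale.CANADA));
    }
}
```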
Exercise 24.2: Select six different locales and six different currencies, and print a table showing the currency symbol for each currency in each locale.
Time is represented as a long integer measured in milliseconds since midnight Greenwich Mean Time (GMT) January 1, 1970. This starting point for time measurement is known as the epoch. This value is signed, so negative values signify time before the beginning of the epoch. The System.currentTimeMillis method returns the current time. This value will express dates into the year A.D. 292,280,995, which should suffice for most purposes.
You can use java.util.Date to hold a time and perform some simple time-related operations. When a new Date object is created, you can specify a long value for its time. If you use the no-arg constructor, the Date object will mark the time of its creation. A Date object can be used for simple operations. For example, the simplest program to print the current time (repeated from page 37) is
    import java.util.Date;

    class Date2 {
        public static void main(String[] args) {
            Date now = new Date();
            System.out.println(now);
        }
    }
This program will produce output such as the following:
    Sun Mar 20 08:48:38 GMT+10:00 2005
Note that this is not localized output. No matter what the default locale, the date will be in this format, adjusted for the current time zone.
You can compare two dates with the before and after methods, which return true if the object on which they are invoked is before or after the other date. Or you can compare the long values you get from invoking getTime on the two objects. The method setTime lets you change the time to a different long.
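The comparison methods can be tried with two fixed instants, which keeps the result independent of the clock. DateCompareDemo and gapMillis are hypothetical names used only in this sketch.

```java
import java.util.Date;

final class DateCompareDemo {
    // Milliseconds from earlier to later, computed from the raw long values.
    static long gapMillis(Date earlier, Date later) {
        return later.getTime() - earlier.getTime();
    }

    public static void main(String[] args) {
        Date epoch = new Date(0L);        // the start of the epoch
        Date later = new Date(1000L);     // one second afterward

        System.out.println(epoch.before(later));       // true
        System.out.println(epoch.after(later));        // false
        System.out.println(gapMillis(epoch, later));   // 1000

        later.setTime(0L);                // move it back to the epoch
        System.out.println(later.equals(epoch));       // true
    }
}
```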
The Date class provides no support for localization and has effectively been replaced by the more sophisticated and locale-sensitive Calendar and DateFormat classes.
Calendars mark the passage of time. Most of the world uses the same calendar, commonly called the Gregorian calendar after Pope Gregory XIII, under whose auspices it was first instituted. Many other calendars exist in the world, and the calendar abstractions are designed to express such variations. A given moment in time is expressed as a date according to a particular calendar, and the same moment can be expressed as different dates by different calendars. The calendar abstraction is couched in the following form:
An abstract Calendar class that represents various ways of marking time
An abstract TimeZone class that represents time zone offsets and other adjustments, such as daylight saving time
An abstract java.text.DateFormat class that defines how one can format and parse date and time strings
Because the Gregorian calendar is commonly used, you also have the following concrete implementations of the abstractions:
A GregorianCalendar class
A SimpleTimeZone class for use with GregorianCalendar
A java.text.SimpleDateFormat class that formats and parses Gregorian dates and times
For example, the following code creates a GregorianCalendar object representing midnight (00:00:00), October 26, 1972, in the local time zone, then prints its value:
    Calendar cal =
        new GregorianCalendar(1972, Calendar.OCTOBER, 26);
    System.out.println(cal.getTime());
The method getTime returns a Date object for the calendar object's time, which was set by converting a year, month, and date into a millisecond-measured long. The output would be something like this (depending on your local time zone of course):
    Thu Oct 26 00:00:00 GMT+10:00 1972
You can also work directly with the millisecond time value by using getTimeInMillis and setTimeInMillis. These are equivalent to working with a Date object; for example, getTimeInMillis is equivalent to invoking getTime().getTime().
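To see the millisecond value itself, the sketch below pins the calendar to GMT so that the number does not depend on where the program runs. CalendarMillisDemo and midnightGmt are hypothetical names of ours.

```java
import java.util.Calendar;
import java.util.GregorianCalendar;
import java.util.TimeZone;

final class CalendarMillisDemo {
    // Midnight at the start of the given date, measured in GMT.
    // month is a Calendar month constant, so it is zero-based.
    static Calendar midnightGmt(int year, int month, int day) {
        Calendar cal = new GregorianCalendar(TimeZone.getTimeZone("GMT"));
        cal.clear();                 // drop the current time the constructor captured
        cal.set(year, month, day);
        return cal;
    }

    public static void main(String[] args) {
        Calendar cal = midnightGmt(1972, Calendar.OCTOBER, 26);
        long direct = cal.getTimeInMillis();
        long viaDate = cal.getTime().getTime();
        System.out.println(direct);              // 1029 days after the epoch, in milliseconds
        System.out.println(direct == viaDate);   // true: the two routes agree
    }
}
```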
The abstract Calendar class provides a large set of constants that are useful in many calendars, such as Calendar.AM and Calendar.PM for calendars that use 12-hour clocks. Some constants are useful only for certain calendars, but no calendar class is required to use such constants. In particular, the month names in Calendar (such as Calendar.JUNE) are names for the various month numbers (such as 5, because month numbers start at 0), with a special month UNDECIMBER for the thirteenth month that many calendars have. But no calendar is required to use these constants.
Each Calendar object represents a particular moment in time on that calendar. The Calendar class provides only constructors that create an object for the current time, either in the default locale and time zone or in specified ones.
Calendar objects represent a moment in time, but they are not responsible for displaying the date. That locale-sensitive procedure is the job of the DateFormat class, which will soon be described.
You can obtain a calendar object for a locale by invoking one of the static Calendar.getInstance methods. With no arguments, getInstance returns an object of the best available calendar type (currently only GregorianCalendar) for the default locale and time zone, set to the current time. The other overloads allow you to specify the locale, the time zone, or both. The static getAvailableLocales method returns an array of Locale objects for which calendars are installed on the system.
With a calendar object in hand, you can manipulate the date. The following example prints the next week of days for a given calendar object:
    public static void oneWeek(PrintStream out, Calendar cal) {
        Calendar cur = (Calendar) cal.clone(); // modifiable copy
        int dow = cal.get(Calendar.DAY_OF_WEEK);
        do {
            out.println(cur.getTime());
            cur.add(Calendar.DAY_OF_WEEK, 1);
        } while (cur.get(Calendar.DAY_OF_WEEK) != dow);
    }
First we make a copy of the calendar argument so that we can make changes without affecting the calendar we were passed.[1] Instead of assuming that there are seven days in a week (who knows what kind of calendar we were given?), we loop, printing the time and adding one day to that time, until we have printed a week's worth of days. We detect whether a week has passed by looking for the next day whose “day of the week” is the same as that of the original object.
The Calendar class defines many kinds of calendar fields for calendar objects, such as DAY_OF_WEEK in the preceding code. These calendar fields are constants used in the methods that manipulate parts of the time:
MILLISECOND
SECOND
MINUTE
HOUR
HOUR_OF_DAY
AM_PM
DAY_OF_WEEK
DAY_OF_WEEK_IN_MONTH
DAY_OF_MONTH
DATE
DAY_OF_YEAR
WEEK_OF_MONTH
WEEK_OF_YEAR
MONTH
YEAR
ERA
ZONE_OFFSET
DST_OFFSET
FIELD_COUNT
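An int is used to store values for all these calendar field types. You use these constants (or any others defined by a particular calendar class) to specify a calendar field to the following methods (always as the first argument):
| get(int field) | Returns the value of the field |
| set(int field, int value) | Sets the value of the field to the provided int |
| clear(int field) | Clears the value of the field to “unspecified” |
| isSet(int field) | Returns true if the field is set |
| add(int field, int amount) | Adds an int amount to the field |
| roll(int field, boolean up) | Rolls the field up to the next value if the second boolean argument is true, or down if it is false |
| getMinimum(int field) | Gets the minimum valid value for the field |
| getMaximum(int field) | Gets the maximum valid value for the field |
| getGreatestMinimum(int field) | Gets the highest minimum value for the field; if it varies, this can be different from getMinimum |
| getLeastMaximum(int field) | Gets the smallest maximum value for the field; if it varies, this can be different from getMaximum |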
The greatest minimum and least maximum describe cases in which a value can vary within the overall boundaries. For example, the least maximum value for DAY_OF_MONTH on the Gregorian calendar is 28 because February, the shortest month, can have as few as 28 days. The maximum value is 31 because no month has more than 31 days.
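The boundary queries, and the difference between adding to a field and rolling it, can be checked with a small sketch. The class and helper names are ours, and the calendar is pinned to GMT and to a fixed date (January 31, 2005) so the results are repeatable.

```java
import java.util.Calendar;
import java.util.GregorianCalendar;
import java.util.TimeZone;

final class CalendarFieldDemo {
    // A fresh calendar set to January 31, 2005 (GMT).
    static Calendar jan31() {
        Calendar cal = new GregorianCalendar(TimeZone.getTimeZone("GMT"));
        cal.clear();
        cal.set(2005, Calendar.JANUARY, 31);
        return cal;
    }

    // Formats as "month-day", converting the zero-based month to one-based.
    static String monthDay(Calendar cal) {
        return (cal.get(Calendar.MONTH) + 1) + "-" + cal.get(Calendar.DAY_OF_MONTH);
    }

    public static void main(String[] args) {
        Calendar cal = jan31();
        System.out.println(cal.getMinimum(Calendar.DAY_OF_MONTH));       // 1
        System.out.println(cal.getLeastMaximum(Calendar.DAY_OF_MONTH));  // 28
        System.out.println(cal.getMaximum(Calendar.DAY_OF_MONTH));       // 31

        Calendar added = jan31();
        added.add(Calendar.DAY_OF_MONTH, 1);      // carries into the next month
        System.out.println(monthDay(added));      // 2-1

        Calendar rolled = jan31();
        rolled.roll(Calendar.DAY_OF_MONTH, true); // wraps within the same month
        System.out.println(monthDay(rolled));     // 1-1
    }
}
```

The add method changes larger fields when a value overflows, whereas roll leaves them alone.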
The set method allows you to specify a date by certain calendar fields and then calculate the time associated with that date. For example, you can calculate on which day of the week a particular date falls:
    public static int dotw(int year, int month, int date) {
        Calendar cal = new GregorianCalendar();
        cal.set(Calendar.YEAR, year);
        cal.set(Calendar.MONTH, month);
        cal.set(Calendar.DATE, date);
        return cal.get(Calendar.DAY_OF_WEEK);
    }
The method dotw calculates the day of the week on the Gregorian calendar for the given date. It creates a Gregorian calendar object, sets the date fields for year, month, and day, and returns the resulting day of the week.
The clear method can be used to reset a field's value to be unspecified. You can use clear with no parameters to clear all calendar fields. The isSet method returns true if a field currently has a value set.
Three variants of set change particular fields you commonly need to manipulate, leaving unspecified fields alone:
    public void set(int year, int month, int date)
    public void set(int year, int month, int date, int hrs, int min)
    public void set(int year, int month, int date, int hrs, int min, int sec)
You can also use setTime to set the calendar's time from a Date object.
A calendar field that is out of range can be interpreted correctly. For example, January 32 can be equivalent to February 1. Whether it is treated as such or as an error depends on whether the calendar is considered to be lenient. A lenient calendar will do its best to interpret values as valid. A strict (non-lenient) calendar will not accept any values out of range, throwing IllegalArgumentException. The setLenient method takes a boolean that specifies whether parsing should be lenient; isLenient returns the current setting.
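The January 32 case can be tried directly. In this sketch (LenientDemo and resolve are hypothetical names) the strict calendar is expected to reject the date when its fields are computed, which for GregorianCalendar happens on the first get, not on the set call itself.

```java
import java.util.Calendar;
import java.util.GregorianCalendar;
import java.util.TimeZone;

final class LenientDemo {
    // Sets January 32, 2005 and reports how the calendar interprets it:
    // a one-based "month-day" string, or "rejected" if the calendar refuses.
    static String resolve(boolean lenient) {
        Calendar cal = new GregorianCalendar(TimeZone.getTimeZone("GMT"));
        cal.clear();
        cal.setLenient(lenient);
        cal.set(2005, Calendar.JANUARY, 32);
        try {
            return (cal.get(Calendar.MONTH) + 1) + "-" + cal.get(Calendar.DAY_OF_MONTH);
        } catch (IllegalArgumentException e) {
            return "rejected";
        }
    }

    public static void main(String[] args) {
        System.out.println(resolve(true));    // 2-1, that is, February 1
        System.out.println(resolve(false));   // rejected
    }
}
```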
A week can start on any day, depending on the calendar. You can discover the first day of the week with the method getFirstDayOfWeek. In a Gregorian calendar for the United States this method would return SUNDAY, whereas Ireland uses MONDAY. You can change this by invoking setFirstDayOfWeek with a valid weekday index.
Some calendars require a minimum number of days in the first week of the year. The method getMinimalDaysInFirstWeek returns that number; the method setMinimalDaysInFirstWeek lets you change it. The minimum number of days in a week is important when you are trying to determine in which week a particular date falls. For example, in some calendars, if January 1 is a Friday it may be considered part of the last week of the preceding year.
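The effect of the minimal-days setting can be seen with January 1, 2010, which fell on a Friday. The sketch sets both week parameters explicitly, so the result does not depend on the default locale; FirstWeekDemo and weekOfJan1In2010 are names we made up.

```java
import java.util.Calendar;
import java.util.GregorianCalendar;
import java.util.TimeZone;

final class FirstWeekDemo {
    // Week-of-year number for January 1, 2010, with weeks starting on Monday
    // and the given minimum number of days required in the first week.
    static int weekOfJan1In2010(int minimalDays) {
        Calendar cal = new GregorianCalendar(TimeZone.getTimeZone("GMT"));
        cal.clear();
        cal.setFirstDayOfWeek(Calendar.MONDAY);
        cal.setMinimalDaysInFirstWeek(minimalDays);
        cal.set(2010, Calendar.JANUARY, 1);
        return cal.get(Calendar.WEEK_OF_YEAR);
    }

    public static void main(String[] args) {
        // Any week containing January 1 counts as week 1.
        System.out.println(weekOfJan1In2010(1));   // 1
        // Only three days of that week fall in 2010, so with a minimum of
        // four the date belongs to the last week of 2009.
        System.out.println(weekOfJan1In2010(4));   // 53
    }
}
```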
You can compare two Calendar objects by using compareTo since Calendar implements Comparable. If you prefer, you can use the before and after methods to compare the objects.
TimeZone is an abstract class that encapsulates not only offset from GMT but also other offset issues, such as daylight saving time. As with other locale-sensitive classes, you can get the default TimeZone by invoking the static method getDefault. You can change the default time zone by passing setDefault a new TimeZone object to use, or null to reset to the original default time zone. Time zones are understood by particular calendar types, so you should ensure that the default calendar and time zone are compatible.
Each time zone has a string identifier that is interpreted by the time zone object and can be displayed to the user. These identifiers use a long form consisting of a major and minor regional name, separated by '/
'. For example, the following are all valid time zone identifiers: America/New_York
, Australia/Brisbane
, Africa/Timbuktu
. Many time zones have a short-form identifier, often just a three-letter acronym, some of which are recognized by TimeZone
for backward compatibility. You should endeavor to always use the long form—after all, while many people know that EST
stands for “Eastern Standard Time,” that doesn't tell you for which country. TimeZone
also recognizes generic identifiers expressed as the difference in time from GMT
. For example, GMT+10:00
and GMT-4:00
are both valid generic time zone identifiers. You can get an array of all the identifiers available on your system from the static method getAvailableIDs
. If you want only those for a given offset from GMT
, you can invoke getAvailableIDs
with that offset. An offset might, for example, have identifiers for both daylight saving and standard time zones.
You can find the identifier of a given TimeZone
object from getID
, and you can set it with setID
. Setting the identifier changes only the identifier on the time zone—it does not change the offset or other values. You can get the time zone for a given identifier by passing it to the static method getTimeZone
.
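The identifier methods described above can be exercised with a short sketch like the following. The class and helper names are hypothetical; the zone identifiers are the ones used in the text. One detail worth knowing is that getTimeZone does not throw for an identifier it cannot understand: it returns the GMT zone instead.

```java
import java.util.TimeZone;

class ZoneIdSketch {
    static final int HOUR_MS = 60 * 60 * 1000;

    /** Raw offset from GMT, in whole hours, of the zone with the given identifier. */
    static int rawOffsetHours(String id) {
        return TimeZone.getTimeZone(id).getRawOffset() / HOUR_MS;
    }

    /** The identifier of the zone actually returned for the requested identifier. */
    static String resolvedId(String id) {
        return TimeZone.getTimeZone(id).getID();
    }

    /** True if getAvailableIDs lists the identifier for the given raw offset. */
    static boolean listedForOffset(String id, int hours) {
        for (String candidate : TimeZone.getAvailableIDs(hours * HOUR_MS)) {
            if (candidate.equals(id)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(rawOffsetHours("America/New_York"));
        System.out.println(rawOffsetHours("GMT+10:00"));
        System.out.println(resolvedId("Nowhere/Unknown")); // falls back to GMT
        System.out.println(listedForOffset("America/New_York", -5));
    }
}
```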
A time zone can be converted into a displayable form by using one of the getDisplayName
methods, similar to those of Locale
. These methods allow you to specify whether to use the default locale or a specified one, and whether to use a short or long format. The string returned by the display methods is controlled by a DateFormat
object (which you'll see a little later). These objects maintain their own tables of information on how to format different time zones. On a given system they may not maintain information for all the supported time zones, in which case the generic identifier form is used, such as in the example on page 696.
Each time zone has a raw offset from GMT, which can be either positive or negative. You can get or set the raw offset by using getRawOffset or setRawOffset, but you should rarely need to do this.
Daylight saving time supplements the raw offset with a seasonal time shift. The value of this shift can be obtained from getDSTSavings
—the default implementation returns 3,600,000 (the number of milliseconds in an hour). You can ask whether a time zone ever uses daylight saving time during the year by invoking the method useDaylightTime
, which returns a boolean
. The method inDaylightTime
returns true
if the Date
argument you pass would fall inside daylight saving time in the zone.
You can obtain the exact offset for a time zone on a given date by specifying that date in milliseconds or by using calendar fields to specify the year and month and so on.
public int
getOffset(long date)
Returns the offset from GMT
for the given time in this time zone, taking any daylight saving time offset into account
public abstract int
getOffset(int era, int year, int month, int day, int dayOfWeek, int milliseconds)
Returns the offset from GMT
for the given time in this time zone, taking any daylight saving time offset into account. All parameters are interpreted relative to the calendar for which the particular time zone implementation is designed. The era
parameter represents calendar-specific eras, such as B.C.
and A.D.
in the Gregorian calendar.
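To see the raw offset and the daylight saving shift working together, consider this sketch for the America/New_York zone. The class and helper names are hypothetical, and the results depend on the time zone data shipped with the runtime. In summer the offset reported by getOffset differs from the raw offset by the daylight saving amount.

```java
import java.util.Calendar;
import java.util.Date;
import java.util.GregorianCalendar;
import java.util.TimeZone;

class DaylightSketch {
    static final TimeZone NEW_YORK = TimeZone.getTimeZone("America/New_York");
    static final int HOUR_MS = 60 * 60 * 1000;

    /** Noon local time in New York on the given date. */
    static Date noonOn(int year, int month, int day) {
        Calendar cal = new GregorianCalendar(NEW_YORK);
        cal.clear();
        cal.set(year, month, day, 12, 0, 0);
        return cal.getTime();
    }

    static boolean inDaylightTime(Date when) {
        return NEW_YORK.inDaylightTime(when);
    }

    /** Total offset from GMT in whole hours, including any daylight saving shift. */
    static int offsetHours(Date when) {
        return NEW_YORK.getOffset(when.getTime()) / HOUR_MS;
    }

    public static void main(String[] args) {
        Date summer = noonOn(2005, Calendar.JULY, 1);
        Date winter = noonOn(2005, Calendar.JANUARY, 15);
        System.out.println(NEW_YORK.useDaylightTime());
        System.out.println(inDaylightTime(summer) + " " + offsetHours(summer));
        System.out.println(inDaylightTime(winter) + " " + offsetHours(winter));
    }
}
```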
The GregorianCalendar
class is a concrete subclass of Calendar
that reflects UTC
(Coordinated Universal Time), although it cannot always do so exactly. Imprecise behavior is inherited from the time mechanisms of the underlying system.[2] Parts of a date are specified in UTC
standard units and ranges. Here are the ranges for GregorianCalendar
:
| YEAR | 1–292278994 |
| MONTH | 0–11 |
| DATE | Day of the month, 1–31 |
| HOUR_OF_DAY | 0–23 |
| MINUTE | 0–59 |
| SECOND | 0–59 |
| MILLISECOND | 0–999 |
The GregorianCalendar
class supports several constructors:
public
GregorianCalendar()
Creates a GregorianCalendar
object that represents the current time in the default time zone with the default locale.
public
GregorianCalendar(int year, int month, int date, int hrs, int min, int sec)
Creates a GregorianCalendar
object that represents the given date in the default time zone with the default locale.
public
GregorianCalendar(int year, int month, int date, int hrs, int min)
Equivalent to GregorianCalendar(year,month,
date,hrs,
min,0)
—that is, the beginning of the specified minute.
public
GregorianCalendar(int year, int month, int date)
Equivalent to GregorianCalendar(year,month,
date,0,
0,0)
—that is, midnight on the given date (which is considered to be the start of the day).
public
GregorianCalendar(Locale locale)
Creates a GregorianCalendar
object that represents the current time in the default time zone with the given locale
.
public
GregorianCalendar(TimeZone timeZone)
Creates a GregorianCalendar
object that represents the current time in the given timeZone
with the default locale.
public
GregorianCalendar(TimeZone zone, Locale locale)
Creates a GregorianCalendar object that represents the current time in the given zone with the given locale.
In addition to the methods it inherits from Calendar
, GregorianCalendar
provides an isLeapYear
method that returns true
if the passed in year is a leap year in that calendar.
The Gregorian calendar was preceded by the Julian calendar in many places. In a GregorianCalendar
object, the default date at which this change happened is midnight local time on October 15, 1582. This is when the first countries switched, but others changed later. The getGregorianChange
method returns the time the calendar is currently using for the change as a Date
. You can set a calendar's change-over time by using setGregorianChange
with a Date
object.
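The change-over date affects the leap-year rule, as this sketch tries to show. The class and method names are hypothetical. With the default change-over, the year 1500 follows the Julian rule (every fourth year is a leap year). Setting the change date to new Date(Long.MIN_VALUE), which the GregorianCalendar documentation describes as the way to obtain a pure Gregorian calendar, applies the Gregorian century rule to all years.

```java
import java.util.Date;
import java.util.GregorianCalendar;

class LeapYearSketch {
    /** Leap-year test using the default Julian-to-Gregorian change-over. */
    static boolean isLeapDefault(int year) {
        return new GregorianCalendar().isLeapYear(year);
    }

    /** Leap-year test with the Gregorian rules applied to all dates. */
    static boolean isLeapPureGregorian(int year) {
        GregorianCalendar cal = new GregorianCalendar();
        cal.setGregorianChange(new Date(Long.MIN_VALUE));
        return cal.isLeapYear(year);
    }

    public static void main(String[] args) {
        System.out.println(isLeapDefault(2000));        // true: divisible by 400
        System.out.println(isLeapDefault(1900));        // false: century not divisible by 400
        System.out.println(isLeapDefault(1500));        // true: Julian rule before the change-over
        System.out.println(isLeapPureGregorian(1500));  // false: Gregorian rule everywhere
    }
}
```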
The SimpleTimeZone
class is a concrete subclass of TimeZone
that expresses values for Gregorian calendars. It does not handle historical complexities, but instead projects current practices onto all times. For historical dates that precede the use of daylight saving time, for example, you will want to use a calendar with a time zone you have selected that ignores daylight saving time. For future dates, SimpleTimeZone
is probably as good a guess as any.
Date and time formatting is a separate issue from calendars, although they are closely related. Formatting is localized in a different way. Not only are the names of days and months different in different locales that share the same calendar, but also the order in which a date's components are expressed changes. In the United States it is customary in short dates to put the month before the day, so that July 5 is written as 7/5. In many European countries the day comes first, so 5 July becomes 5/7 or 5.7 or …
In the previous sections the word “date” meant a number of milliseconds since the epoch, which could be interpreted as year, month, day-of-month, hours, minutes, and seconds information. When dealing with the formatting classes you must distinguish between dates, which deal with year, month, and day-of-month information, and times, which deal with hours, minutes, and seconds.
Date and time formatting issues are text issues, so the classes for formatting are in the java.text
package—though the java.util.Formatter
class (see page 624) also supports some localized date formatting as you shall see. The Date2
program on page 695 is simple because it does not localize its output. If you want localization, you need a DateFormat
object.
DateFormat
provides several ways to format and parse dates and times. It is a subclass of the general Format
class, discussed in Section 24.6.2 on page 710. There are three kinds of formatters, each returned by different static methods: date formatters from getDateInstance
, time formatters from getTimeInstance
, and date/time formatters from getDateTimeInstance
. Each of these formatters understands four formatting styles: SHORT
, MEDIUM
, LONG
, and FULL
, which are constants defined in DateFormat
. And for each of them you can either use the default locale or specify one. For example, to get a medium date formatter in the default locale, you would use
Format fmt = DateFormat.getDateInstance(DateFormat.MEDIUM);
To get a date and time formatter that uses dates in short form and times in full form in a Japanese locale, you would use
Locale japan = new Locale("ja", "JP");
Format fmt = DateFormat.getDateTimeInstance(
    DateFormat.SHORT, DateFormat.FULL, japan
);
For all the various “get instance” methods, if both formatting style and locale are specified the locale is the last parameter. The date/time methods require two formatting styles: the first for the date part, the second for the time. The simplest getInstance
method takes no arguments and returns a date/time formatter for short formats in the default locale. The getAvailableLocales
method returns an array of Locale
objects for which date and time formatting is configured.
The following list shows how each formatting style is expressed for the same date. The output is from a date/time formatter for U.S. locales, with the same formatting mode used for both dates and times:
FULL: Friday, August 29, 1986 5:00:00 PM EDT
LONG: August 29, 1986 5:00:00 PM EDT
MEDIUM: Aug 29, 1986 5:00:00 PM
SHORT: 8/29/86 5:00 PM
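A listing like the one above could be produced by a sketch along these lines. The class and helper names are hypothetical, and the time zone is fixed to America/New_York so that the result does not depend on where the program runs. The exact punctuation and spacing of the output vary with the locale data of the Java runtime, so treat the listing above as indicative.

```java
import java.text.DateFormat;
import java.util.Calendar;
import java.util.Date;
import java.util.GregorianCalendar;
import java.util.Locale;
import java.util.TimeZone;

class DateStyleSketch {
    static final TimeZone NEW_YORK = TimeZone.getTimeZone("America/New_York");

    /** 5:00 PM on August 29, 1986, New York time. */
    static Date sampleDate() {
        Calendar cal = new GregorianCalendar(NEW_YORK);
        cal.clear();
        cal.set(1986, Calendar.AUGUST, 29, 17, 0, 0);
        return cal.getTime();
    }

    /** Formats the sample date using the same style for the date and time parts. */
    static String format(int style, Locale locale) {
        DateFormat fmt = DateFormat.getDateTimeInstance(style, style, locale);
        fmt.setTimeZone(NEW_YORK);
        return fmt.format(sampleDate());
    }

    public static void main(String[] args) {
        System.out.println("FULL: " + format(DateFormat.FULL, Locale.US));
        System.out.println("LONG: " + format(DateFormat.LONG, Locale.US));
        System.out.println("MEDIUM: " + format(DateFormat.MEDIUM, Locale.US));
        System.out.println("SHORT: " + format(DateFormat.SHORT, Locale.US));
    }
}
```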
Each DateFormat
object has an associated calendar and time zone set by the “get instance” method that created it. They are returned by getCalendar
and getTimeZone
, respectively. You can set these values by using setCalendar
and setTimeZone
. Each DateFormat
object has a reference to a NumberFormat
object for formatting numbers. You can use the methods getNumberFormat
and setNumberFormat
. (Number formatting is covered briefly in Section 24.6.2 on page 710.)
You format dates with one of several format
methods based on the formatting parameters described earlier:
public final String
format(Date date)
Returns a formatted string for date
.
public abstract StringBuffer
format(Date date, StringBuffer appendTo, FieldPosition pos)
Adds the formatted string for date
to the end of appendTo
.
public final StringBuffer
format(Object obj, StringBuffer appendTo, FieldPosition pos)
Adds the formatted string for obj
to the end of appendTo
. The object can be either a Date
or a Number
whose longValue
is a time in milliseconds.
The pos
argument is a FieldPosition
object that tracks the starting and ending index for a specific field within the formatted output. You create a FieldPosition
object by passing an integer code that represents the field that the object should track. These codes are static fields in DateFormat
, such as MINUTE_FIELD
or MONTH_FIELD
. Suppose you construct a FieldPosition
object pos
with MINUTE_FIELD
and then pass it as an argument to a format
method. When format
returns, the getBeginIndex
and getEndIndex
methods of pos
will return the start and end indices of the characters representing minutes within the formatted string. A specific formatter could also use the FieldPosition
object to align the represented field within the formatted string. To make that happen, you would first invoke the setBeginIndex
and setEndIndex
methods of pos
, passing the indices where you would like that field to start and end in the formatted string. Exactly how the formatter aligns the formatted text depends on the formatter implementation.
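Here is a minimal sketch of tracking a field with a FieldPosition object. The class and method names are hypothetical; the formatter is a short time formatter for the U.S. locale with its time zone fixed to GMT so the result is predictable.

```java
import java.text.DateFormat;
import java.text.FieldPosition;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

class FieldPositionSketch {
    /** Returns the characters that represent the minutes in the formatted time. */
    static String minutesOf(long epochMillis) {
        DateFormat fmt = DateFormat.getTimeInstance(DateFormat.SHORT, Locale.US);
        fmt.setTimeZone(TimeZone.getTimeZone("GMT"));

        // Ask the formatter to record where the minute field ends up
        FieldPosition pos = new FieldPosition(DateFormat.MINUTE_FIELD);
        StringBuffer out = fmt.format(new Date(epochMillis), new StringBuffer(), pos);

        return out.substring(pos.getBeginIndex(), pos.getEndIndex());
    }

    public static void main(String[] args) {
        // Five minutes past midnight GMT on January 1, 1970
        System.out.println(minutesOf(5 * 60 * 1000L));
    }
}
```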
A DateFormat
object can also be used to parse dates. Date parsing can be lenient or not, depending on your preference. Lenient date parsing is as forgiving as it can be, whereas strict parsing requires the format and information to be proper and complete. The default is to be lenient. You can use setLenient
to set leniency to be true
or false
. You can test leniency via isLenient
.
The parsing methods are
public Date
parse(String text)
throws ParseException
Tries to parse text
into a date and/or time. If successful, a Date
object is returned; otherwise, a ParseException
is thrown.
public abstract Date
parse(String text, ParsePosition pos)
Tries to parse text
into a date and/or time. If successful, a Date
object is returned; otherwise, returns a null
reference. When the method is called, pos
is the position at which to start parsing; at the end it will either be positioned after the parsed text or will remain unchanged if an error occurred.
public Object
parseObject(String text, ParsePosition pos)
Returns the result of parse(text,pos)
. This method is provided to fulfill the generic contract of Format
.
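The two parse methods and the effect of leniency can be compared with a sketch such as this one. The class and helper names are hypothetical. It assumes the short date format for the U.S. locale accepts input of the form month/day/year, and it uses February 30 as a date that a lenient parser reinterprets and a strict parser rejects.

```java
import java.text.DateFormat;
import java.text.ParseException;
import java.text.ParsePosition;
import java.util.Calendar;
import java.util.Date;
import java.util.GregorianCalendar;
import java.util.Locale;
import java.util.TimeZone;

class ParseSketch {
    static final TimeZone GMT = TimeZone.getTimeZone("GMT");

    static DateFormat shortUsDate(boolean lenient) {
        DateFormat fmt = DateFormat.getDateInstance(DateFormat.SHORT, Locale.US);
        fmt.setTimeZone(GMT);
        fmt.setLenient(lenient);
        return fmt;
    }

    /** Uses parse(String, ParsePosition), which returns null on failure. */
    static Date tryParse(String text, boolean lenient) {
        return shortUsDate(lenient).parse(text, new ParsePosition(0));
    }

    /** Uses parse(String), which reports failure with a ParseException. */
    static boolean strictParseThrows(String text) {
        try {
            shortUsDate(false).parse(text);
            return false;
        } catch (ParseException e) {
            return true;
        }
    }

    /** Month field of the given date, interpreted in GMT. */
    static int monthOf(Date date) {
        Calendar cal = new GregorianCalendar(GMT);
        cal.setTime(date);
        return cal.get(Calendar.MONTH);
    }

    public static void main(String[] args) {
        Date lenient = tryParse("2/30/86", true);
        System.out.println(monthOf(lenient) == Calendar.MARCH); // February 30 became March 2
        System.out.println(tryParse("2/30/86", false) == null);
        System.out.println(strictParseThrows("2/30/86"));
    }
}
```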
The class java.text.SimpleDateFormat
is a concrete implementation of DateFormat
that is used in many locales. If you are writing a DateFormat
class, you may find it useful to extend SimpleDateFormat
. SimpleDateFormat
uses methods in the DateFormatSymbols
class to get localized strings and symbols for date representation. When formatting or parsing dates, you should usually not create SimpleDateFormat
objects; instead, you should use one of the “get instance” methods to return an appropriate formatter.
DateFormat
has protected fields calendar
and numberFormat
that give direct access to the values publicly manipulated with the set and get methods.
Exercise 24.3: Write a program that takes a string argument that is parsed into the date to print, and print that date in all possible styles. How lenient will the date parsing be?
The java.util.Formatter
class, described in Chapter 22, also supports the formatting of date and time information using a supplied Date
or Calendar
object, or a date represented as a long
(or Long
). Using the available format conversions you can extract information about that date/time, including things like the day of the month, the day of the week, the year, the hour of the day, and so forth.
The output of the formatter is localized according to the locale associated with that formatter, so things like the name of the day and month will be in the correct language—however, digits themselves are not localized. Unlike DateFormat
, a formatter cannot help you with localization issues such as knowing whether the month or the day should come first in a date—it simply provides access to each individual component and your program must combine them in the right way.
A date/time conversion is indicated by a format conversion of t
(or T
for uppercase output), followed by various suffixes that indicate what is to be output and in what form. The following table lists the conversion suffixes related to times:
| H | Hour of the day for 24-hour clock format. Two digits: 00–23 |
| I | Hour of the day for 12-hour clock format. Two digits: 01–12 |
| k | Hour of the day for 24-hour clock format: 0–23 |
| l | Hour of the day for 12-hour clock format: 1–12 |
| M | Minute within the hour. Two digits: 00–59 |
| S | Seconds within the minute. Two digits: 00–60 (60 is a leap second) |
| L | Milliseconds within the second. Three digits: 000–999 |
| N | Nanoseconds within the second. Nine digits: 000000000–999999999 |
| p | Locale specific morning or afternoon marker, such as am or pm |
| z | Numeric offset from GMT, such as -0800 |
| Z | String representing the abbreviation for the time zone |
| s | Seconds since the epoch. |
| Q | Milliseconds since the epoch. |
So, for example, the following code will print out the current time in the familiar hh:mm:ss format:
System.out.printf("%1$tH:%1$tM:%1$tS %n", new Date());
The conversion suffixes that deal with dates are
| B | Full month name |
| b | Abbreviated month name |
| h | Same as 'b' |
| A | Full name of the day of the week |
| a | Short name of the day of the week |
| C | The four digit year divided by 100. Two digits: 00–99 |
| Y | Year. Four digits: 0000–9999 |
| y | Year: Two digits: 00–99 |
| j | Day of the year. Three digits: 001–999 |
| m | Month in year. Two digits: 01–99 |
| d | Day of month. Two digits: 01–99 |
| e | Day of month: 1–99 |
Naturally, the valid range for day of month, month of year, and so forth, depends on the calendar that is being used. To continue the example, the following code will print the current date in the common mm/dd/yy format:
System.out.printf("%1$tm/%1$td/%1$ty %n", new Date());
As you can see, all the information about a date or time can be extracted and you can combine the pieces in whatever way you need. Doing so, however, is rather tedious both for the writer and any subsequent readers of the code. To ease the tedium a third set of conversion suffixes provides convenient shorthands for common combinations of the other conversions:
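| R | Time in 24-hour clock hh:mm format (%tH:%tM) |
| T | Time in 24-hour clock hh:mm:ss format (%tH:%tM:%tS) |
| r | Time in 12-hour clock h:mm:ss am/pm format (%tI:%tM:%tS %Tp) |
| D | Date in mm/dd/yy format (%tm/%td/%ty) |
| F | Complete date in ISO 8601 format (%tY-%tm-%td) |
| c | Long date and time format (%ta %tb %td %tT %tZ %tY) |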
So the previous examples could be combined in the more compact and somewhat more readable
System.out.printf("%1$tT %1$tD %n", new Date());
As with all format conversions a width can be specified before the conversion indicator, to specify the minimum number of characters to output. If the converted value is shorter than the width then the output is padded with spaces. The only format flag that can be specified with the date/time conversions is the '-' flag for left-justification; if this flag is given then a width must be supplied as well.
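The conversions, shorthands, and width handling described in this section can be tried out with a sketch like the one below. The class and method names are hypothetical. A Calendar fixed to 14:07:05 GMT on March 9, 2005 is used instead of new Date() so that the results are repeatable.

```java
import java.util.Calendar;
import java.util.GregorianCalendar;
import java.util.Locale;
import java.util.TimeZone;

class TimeConversionSketch {
    /** 14:07:05 GMT on March 9, 2005. */
    static Calendar sample() {
        Calendar cal = new GregorianCalendar(TimeZone.getTimeZone("GMT"), Locale.US);
        cal.clear();
        cal.set(2005, Calendar.MARCH, 9, 14, 7, 5);
        return cal;
    }

    /** Applies the given format string to the sample calendar. */
    static String render(String pattern) {
        return String.format(Locale.US, pattern, sample());
    }

    public static void main(String[] args) {
        System.out.println(render("%1$tH:%1$tM:%1$tS")); // 14:07:05
        System.out.println(render("%tT"));               // 14:07:05, using the shorthand
        System.out.println(render("%tD"));               // 03/09/05
        System.out.println(render("%tF"));               // 2005-03-09
        System.out.println(render("[%10tR]"));           // right-justified in 10 characters
        System.out.println(render("[%-10tR]"));          // left-justified in 10 characters
    }
}
```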
The package java.text
provides several types for localizing text behavior, such as collation (comparing strings), and formatting and parsing text, numbers, and dates. You have already learned about dates in detail so in this section we look at general formatting and parsing, and collation.
Comparing strings in a locale-sensitive fashion is called collation. The central class for collation is Collator
, which provides a compare
method that takes two strings and returns an int
less than, equal to, or greater than zero as the first string is less than, equal to, or greater than the second.
As with most locale-sensitive classes, you get the best available Collator
object for a locale from a getInstance
method, either passing a specific Locale
object or specifying no locale and so using the default locale. For example, you get the best available collator to sort a set of Russian-language strings like this:
Locale russian = new Locale("ru", "");
Collator coll = Collator.getInstance(russian);
You then can use coll.compare
to determine the order of strings. A Collator
object takes locality—not Unicode equivalence—into account when comparing. For example, in a French-speaking locale, the characters ç
and c
are considered equivalent for sorting purposes. A naïve sort that used String.compareTo
would put all strings starting with ç
after all those starting with c
(indeed, it would put them after z
), but in a French locale this would be wrong. They should be sorted according to the characters that follow the initial c
or ç
characters in the strings.
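The contrast between the two orderings can be sketched as follows, with hypothetical class and method names. The strings are written with a Unicode escape (\u00e7 is ç) so that the example does not depend on the encoding of the source file.

```java
import java.text.Collator;
import java.util.Locale;

class CollationSketch {
    /** Ordering according to the best available collator for French. */
    static int frenchOrder(String a, String b) {
        return Collator.getInstance(Locale.FRENCH).compare(a, b);
    }

    /** Ordering by raw character values, ignoring locale. */
    static int codeUnitOrder(String a, String b) {
        return a.compareTo(b);
    }

    public static void main(String[] args) {
        String cedilla = "\u00e7a";  // "ça"
        // Collator: "ça" sorts before "cb" because a comes before b
        System.out.println(frenchOrder(cedilla, "cb") < 0);
        // compareTo: "ça" sorts after "cb" (and after anything starting with z)
        System.out.println(codeUnitOrder(cedilla, "cb") > 0);
        System.out.println(codeUnitOrder(cedilla, "zz") > 0);
    }
}
```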
Determining collation factors for a string can be expensive. A CollationKey
object examines a string once, so you can compare precomputed keys instead of comparing strings with a Collator
. The method Collator.getCollationKey
returns a key for a string. For example, because Collator
implements the interface Comparator
, you could use a Collator
to maintain a sorted set of strings:
import java.text.Collator;
import java.util.Iterator;
import java.util.TreeSet;

class CollatorSorting {
    private TreeSet<String> sortedStrings;

    CollatorSorting(Collator collator) {
        sortedStrings = new TreeSet<String>(collator);
    }

    void add(String str) {
        sortedStrings.add(str);
    }

    Iterator<String> strings() {
        return sortedStrings.iterator();
    }
}
Each time a new string is inserted in sortedStrings
, the Collator
is used as a Comparator
, with its compare
method invoked on various elements of the set until the TreeSet
finds the proper place to insert the string. This results in several comparisons. You can make this quicker at the cost of space by creating a TreeMap
that uses a CollationKey
to map to the original string. CollationKey
implements the interface Comparable
with a compareTo
method that can be much more efficient than using Collator.compare
.
import java.text.CollationKey;
import java.text.Collator;
import java.util.Iterator;
import java.util.TreeMap;

class CollationKeySorting {
    private TreeMap<CollationKey, String> sortedStrings;
    private Collator collator;

    CollationKeySorting(Collator collator) {
        this.collator = collator;
        sortedStrings = new TreeMap<CollationKey, String>();
    }

    void add(String str) {
        sortedStrings.put(
            collator.getCollationKey(str), str);
    }

    Iterator<String> strings() {
        return sortedStrings.values().iterator();
    }
}
The abstract Format
class provides methods to format and parse objects according to a locale. Format
declares a format
method that takes an object and returns a formatted String
, throwing IllegalArgumentException
if the object is not of a type known to the formatting object. Format
also declares a parseObject
method that takes a String
and returns an object initialized from the parsed data, throwing ParseException
if the string is not understood. Each of these methods is implemented as appropriate for the particular kind of formatting. The package java.text
provides three Format
subclasses:
DateFormat
was discussed in the previous section.
MessageFormat
helps you localize output when printing messages that contain values from your program. Because word order varies among languages, you cannot simply use a localized string concatenated with your program's values. For example, the English phrase “a fantastic menu” would in French have the word order “un menu fantastique.” A message that took adjectives and nouns from lists and displayed them in such a phrase could use a MessageFormat
object to localize the order.
NumberFormat
is an abstract class that defines a general way to format and parse various kinds of numbers for different locales. It has two subclasses: ChoiceFormat
to choose among alternatives based on number (such as picking between a singular or plural variant of a word); and DecimalFormat
to format and parse decimal numbers. (The formatting capabilities of NumberFormat
are more powerful than those provided by java.util.Formatter
.)
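To make the MessageFormat and ChoiceFormat descriptions above more concrete, here is a small sketch. The class name, method names, and pattern strings are all hypothetical; in a real program the patterns would typically come from localized resources. The MessageFormat patterns place the same two arguments in a different order for each language, and the ChoiceFormat pattern picks a phrase based on a count.

```java
import java.text.ChoiceFormat;
import java.text.MessageFormat;

class MessageSketch {
    /** Fills in a pattern whose {0} is the adjective and {1} is the noun. */
    static String phrase(String pattern, String adjective, String noun) {
        return MessageFormat.format(pattern, adjective, noun);
    }

    /** Chooses a singular, plural, or empty variant based on the count. */
    static String fileCount(int n) {
        ChoiceFormat choice =
            new ChoiceFormat("0#no files|1#one file|1<many files");
        return choice.format(n);
    }

    public static void main(String[] args) {
        System.out.println(phrase("a {0} {1}", "fantastic", "menu"));
        System.out.println(phrase("un {1} {0}", "fantastique", "menu"));
        System.out.println(fileCount(0));
        System.out.println(fileCount(1));
        System.out.println(fileCount(5));
    }
}
```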
NumberFormat
in turn has four different kinds of “get instance” methods. Each method uses either a provided Locale
object or the default locale.
getNumberInstance
returns a general number formatter/parser. This is the kind of object returned by the generic getInstance
method.
getIntegerInstance
returns a number formatter/parser that rounds floating-point values to the nearest integer.
getCurrencyInstance
returns a formatter/parser for currency values. The Currency
object used by a NumberFormat
can also be retrieved with the getCurrency
method.
getPercentInstance
returns a formatter/parser for percentages.
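The four kinds of formatter can be compared side by side with a sketch like this, using the U.S. locale throughout. The class and method names are hypothetical. The integer formatter is shown with two values that lie exactly halfway between integers, which it rounds to the even neighbor.

```java
import java.text.NumberFormat;
import java.util.Locale;

class NumberStyleSketch {
    static String number(double value) {
        return NumberFormat.getNumberInstance(Locale.US).format(value);
    }

    static String integer(double value) {
        return NumberFormat.getIntegerInstance(Locale.US).format(value);
    }

    static String percent(double value) {
        return NumberFormat.getPercentInstance(Locale.US).format(value);
    }

    /** ISO 4217 code of the currency used by the U.S. currency formatter. */
    static String currencyCode() {
        return NumberFormat.getCurrencyInstance(Locale.US)
                           .getCurrency().getCurrencyCode();
    }

    public static void main(String[] args) {
        System.out.println(number(5372.97));  // 5,372.97
        System.out.println(integer(2.5));     // 2
        System.out.println(integer(3.5));     // 4
        System.out.println(percent(0.25));    // 25%
        System.out.println(currencyCode());   // USD
    }
}
```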
Here is a method you can use to print a number using the format for several different locales:
public void reformat(double num, String[] locales) {
    for (String loc : locales) {
        Locale pl = parseLocale(loc);
        NumberFormat fmt = NumberFormat.getInstance(pl);
        System.out.print(fmt.format(num));
        System.out.println(" " + pl.getDisplayName());
    }
}

public static Locale parseLocale(String desc) {
    StringTokenizer st = new StringTokenizer(desc, "_");
    String lang = "", ctry = "", var = "";
    try {
        lang = st.nextToken();
        ctry = st.nextToken();
        var = st.nextToken();
    } catch (java.util.NoSuchElementException e) {
        ; // fine, let the others default
    }
    return new Locale(lang, ctry, var);
}
The first argument to reformat
is the number to format; the other arguments specify locales. We use a StringTokenizer
to break locale argument strings into constituent components. For example, cy_GB
will be broken into the language cy
(Welsh), the country GB
(United Kingdom), and the empty variant ""
. We create a Locale
object from each result, get a number formatter for that locale, and then print the formatted number and the locale. When run with the number 5372.97
and the locale arguments en_US
, lv
, it_CH
, and lt
, reformat
prints:
5,372.97 English (United States)
5 372,97 Latvian
5'372.97 Italian (Switzerland)
5.372,97 Lithuanian
A similar method can be written that takes a locale and a number formatted in that locale, uses the parse
method to get a Number
object, and prints the resulting value formatted according to a list of other locales:
public void parseAndReformat(String locale, String number,
                             String[] locales)
    throws ParseException
{
    Locale loc = LocalNumber.parseLocale(locale);
    NumberFormat parser = NumberFormat.getInstance(loc);
    Number num = parser.parse(number);
    for (String str : locales) {
        Locale pl = LocalNumber.parseLocale(str);
        NumberFormat fmt = NumberFormat.getInstance(pl);
        System.out.println(fmt.format(num));
    }
}
When run with the original locale it_CH
, the number string "5'372.97"
and the locale arguments en_US
, lv
, and lt
, parseAndReformat
prints:
5,372.97
5 372,97
5.372,97
Parsing requires finding boundaries in text. The class BreakIterator
provides a locale-sensitive tool for locating such break points. It has four kinds of “get instance” methods that return specific types of BreakIterator
objects:
getCharacterInstance
returns an iterator that shows valid breaks in a string for individual characters (not necessarily a char
).
getWordInstance
returns an iterator that shows word breaks in a string.
getLineInstance
returns an iterator that shows where it is proper to break a line in a string, for purposes such as wrapping text.
getSentenceInstance
returns an iterator that shows where sentence breaks occur in a string.
The following code prints each break shown by a given BreakIterator
:
static void showBreaks(BreakIterator breaks, String str) {
    breaks.setText(str);
    int start = breaks.first();
    int end = breaks.next();
    while (end != BreakIterator.DONE) {
        System.out.println(str.substring(start, end));
        start = end;
        end = breaks.next();
    }
    System.out.println(str.substring(start)); // the last
}
A BreakIterator
is a different style of iterator from the usual java.util.Iterator
objects you have seen. It provides several methods for iterating forward and backward within a string, looking for different break positions.
You should always use these boundary classes when breaking up text because the issues involved are subtle and widely varying. For example, the logical characters used in these classes are not necessarily equivalent to a single char
. Unicode characters can be combined, so it can take more than one 16-bit Unicode value to constitute a logical character. And word breaks are not necessarily spaces—some languages do not even use spaces.
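The point about logical characters can be illustrated with a short sketch; the class and method names are hypothetical. The string "e\u0301x" consists of three char values (the letter e, a combining acute accent, and the letter x), but a character iterator should report only two logical characters. The same helper is reused with a word iterator to pick out the words of a phrase.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

class LogicalCharSketch {
    /** Splits text at every boundary reported by the iterator. */
    static List<String> pieces(BreakIterator breaks, String text) {
        breaks.setText(text);
        List<String> out = new ArrayList<String>();
        int start = breaks.first();
        for (int end = breaks.next(); end != BreakIterator.DONE;
             start = end, end = breaks.next()) {
            out.add(text.substring(start, end));
        }
        return out;
    }

    /** Number of logical characters, which may be fewer than text.length(). */
    static int logicalLength(String text) {
        return pieces(BreakIterator.getCharacterInstance(Locale.US), text).size();
    }

    /** Pieces between word breaks that start with a letter or digit. */
    static List<String> words(String text) {
        List<String> out = new ArrayList<String>();
        for (String piece : pieces(BreakIterator.getWordInstance(Locale.US), text)) {
            if (Character.isLetterOrDigit(piece.charAt(0))) {
                out.add(piece);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        String accented = "e\u0301x";
        System.out.println(accented.length());        // 3 char values
        System.out.println(logicalLength(accented));  // 2 logical characters
        System.out.println(words("Hello, world"));
    }
}
```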
Never speak more clearly than you think.
--Jeremy Bernstein
[1] For historical reasons Calendar.clone
returns Object
not Calendar
, so a cast is required.
[2] Almost all modern systems assume that one day is 24*60*60 seconds. In UTC, about once every year or two an extra second, called a leap second, is added to a day to account for the wobble of the Earth. Most computer clocks are not accurate enough to reflect this distinction, so neither is the Date class. Some computer standards are defined in GMT, which is the “civil” name for the standard; UT is the scientific name for the same standard. The distinction between UTC and UT is that UTC is based on an atomic clock and UT is based on astronomical observations. For almost all practical purposes, this is an invisibly fine hair to split. See “Further Reading” on page 755 for references.