Most applications perform locale-aware operations such as working with text, dates, and time zones. The PHP Intl extension provides a good API for accessing the functions of the widely known ICU library.
The extension has been bundled with PHP since version 5.3, but it is not enabled in every build. You can check whether it is loaded by running the following command:
php -m | grep 'intl'
If the extension is not present, you can install it manually by following the installation guide. If you're using Ubuntu, you can directly run the following commands.
sudo apt-get update
sudo apt-get install php5-intl
If you're using PHP 7 on your machine, you need to add the ppa:ondrej/php PPA, update your system, and install the Intl extension.
# Add PPA
sudo add-apt-repository ppa:ondrej/php
# Update repository index
sudo apt-get update
# Install extension
sudo apt-get install php7.0-intl
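Once the package is installed, the check can also be done from PHP itself. The snippet below is a minimal sketch: it relies only on the built-in extension_loaded() function and on the INTL_ICU_VERSION constant that the extension defines. The messages it prints are illustrative.

```php
<?php
// Check whether the intl extension is available in this PHP runtime.
$hasIntl = extension_loaded('intl');

if ($hasIntl) {
    // INTL_ICU_VERSION is defined by the extension and reports the linked ICU version.
    echo 'intl is loaded, ICU version: ' . INTL_ICU_VERSION . PHP_EOL;
} else {
    echo 'intl is not loaded; MessageFormatter will be unavailable.' . PHP_EOL;
}
```

The ICU version is worth noting because the exact formatted output (digits, spacing, wording) can differ between ICU releases.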
Most modern applications are built with localization in mind. Sometimes the message is a plain string with variable placeholders; other times it's a complex pluralized string.
We're going to start with a simple message containing a placeholder. Placeholders are patterns enclosed in curly braces. Here is an example:
var_dump(
MessageFormatter::formatMessage(
"en_US",
"I have {0, number, integer} apples.",
[ 3 ]
)
);
// output
string(16) "I have 3 apples."
The arguments passed to the MessageFormatter::formatMessage method are, in order: the message locale, the message pattern, and the array of placeholder data.
The {0, number, integer} placeholder will inject the first item of the data array as a number formatted as an integer (see the table below for the list of options). We can also use named arguments for placeholders. The example below will output the same result.
var_dump(
MessageFormatter::formatMessage(
"en_US",
"I have {number_apples, number, integer} apples.",
[ 'number_apples' => 3 ]
)
);
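When the same pattern is formatted several times, it may be convenient to build one MessageFormatter object and reuse it. The sketch below assumes PHP 5.5 or later (for named arguments) and uses the en_US locale; the sample values 3 and 1000 are arbitrary. It also checks for a false return value, which format() uses to signal failure.

```php
<?php
// Build the formatter once and reuse it for several values.
$formatter = new MessageFormatter(
    'en_US',
    'I have {number_apples, number, integer} apples.'
);

foreach ([3, 1000] as $count) {
    $message = $formatter->format(['number_apples' => $count]);

    if ($message === false) {
        // format() returns false on failure; the formatter keeps the error details.
        echo 'Formatting failed: ' . $formatter->getErrorMessage() . PHP_EOL;
        continue;
    }

    echo $message . PHP_EOL;
}
// I have 3 apples.
// I have 1,000 apples.
```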
Different languages use different numeral systems, such as the Arabic and Indian ones.
The previous example targets the en_US locale. Let's change it to ar to see the difference.
var_dump(
MessageFormatter::formatMessage(
"ar",
"I have {number_apples, number, integer} apples.",
[ 'number_apples' => 3 ]
)
);
string(17) "I have ٣ apples."
We can also change it to the Bengali locale (bn).
var_dump(
MessageFormatter::formatMessage(
"bn",
"I have {number_apples, number, integer} apples.",
[ 'number_apples' => 3 ]
)
);
string(18) "I have ৩ apples."
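To compare locales side by side, the same pattern can be run in a loop. This is only a sketch: the list of locales is taken from the examples above, and the digits printed for ar and bn depend on the ICU data shipped with your PHP build, so no output is shown for them.

```php
<?php
$pattern = 'I have {number_apples, number, integer} apples.';
$results = [];

// Format the same message for each locale used in the examples above.
foreach (['en_US', 'ar', 'bn'] as $locale) {
    $results[$locale] = MessageFormatter::formatMessage(
        $locale,
        $pattern,
        ['number_apples' => 3]
    );

    echo $locale . ': ' . $results[$locale] . PHP_EOL;
}
```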
So far, we've only worked with numbers. Let's take a look at other types that we can use.
$time = time();
var_dump( MessageFormatter::formatMessage(
"en_US",
"Today is {0, date, full} - {0, time}",
array( $time )
) );
string(47) "Today is Wednesday, April 6, 2016 - 11:21:47 PM"
var_dump( MessageFormatter::formatMessage(
"en_US",
"duration: {0, duration}",
array( $time )
) );
string(23) "duration: 405,551:27:58"
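Because time() changes on every run, a fixed timestamp makes the date styles easier to compare. The sketch below assumes the UTC time zone and picks noon on April 6, 2016 (the date shown in the output above) as an arbitrary sample value.

```php
<?php
// Pin the time zone so the formatted date does not depend on server settings.
date_default_timezone_set('UTC');

// Noon UTC on April 6, 2016, as a Unix timestamp.
$timestamp = gmmktime(12, 0, 0, 4, 6, 2016);

$full = MessageFormatter::formatMessage(
    'en_US',
    'Today is {0, date, full}',
    [$timestamp]
);
$short = MessageFormatter::formatMessage(
    'en_US',
    'Today is {0, date, short}',
    [$timestamp]
);

echo $full . PHP_EOL;  // Today is Wednesday, April 6, 2016
echo $short . PHP_EOL; // same date in the locale's short numeric form
```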
We can also spell out the passed numbers.
var_dump( MessageFormatter::formatMessage(
"en_US",
"I have {0, spellout} apples",
array( 34 )
) );
string(25) "I have thirty-four apples"
It also works on different locales. Here is an example using the Arabic language.
var_dump( MessageFormatter::formatMessage(
"ar",
"لدي {0, spellout} تفاحة",
array( 34 )
) );
string(44) "لدي أربعة و ثلاثون تفاحة"
argType | argStyle |
---|---|
number | integer, currency, percent |
date | short, medium, long, full |
time | short, medium, long, full |
spellout | |
ordinal | |
duration | |
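The table lists a few options that the examples above don't exercise. As a rough illustration, the sketch below tries the ordinal type and the percent and currency styles with the en_US locale; the sentences and values are made up for the demonstration.

```php
<?php
// Pattern => value pairs covering options not shown in the earlier examples.
$examples = [
    'You are {0, ordinal} in line.'  => 3,
    'Discount: {0, number, percent}' => 0.25,
    'Total: {0, number, currency}'   => 9.99,
];

$outputs = [];
foreach ($examples as $pattern => $value) {
    $outputs[] = MessageFormatter::formatMessage('en_US', $pattern, [$value]);
}

echo implode(PHP_EOL, $outputs) . PHP_EOL;
// You are 3rd in line.
// Discount: 25%
// Total: $9.99
```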
An important part of localizing an application is handling plural messages so that the UI reads naturally. The apples example above is a good candidate. Here's what the messages should look like in each case.
When number_apples = 0: I have no apples.
When number_apples = 1: I have one apple.
When number_apples > 1: I have X apples.
var_dump( MessageFormatter::formatMessage(
"en_US",
'I have {number_apples, plural, =0{no apples} =1{one apple} other{# apples}}',
array('number_apples' => 10)
) );
// number_apples = 0
string(16) "I have no apples"
// number_apples = 1
string(16) "I have one apple"
// number_apples = 10
string(16) "I have 10 apples"
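The three outputs above can be reproduced in one run by looping over the values. This sketch reuses the pattern from the example and assumes the en_US locale.

```php
<?php
$pattern = 'I have {number_apples, plural, =0{no apples} =1{one apple} other{# apples}}';
$formatter = new MessageFormatter('en_US', $pattern);

$messages = [];
foreach ([0, 1, 10] as $count) {
    // Each value selects a different branch of the plural pattern.
    $messages[$count] = $formatter->format(['number_apples' => $count]);
    echo $count . ' => ' . $messages[$count] . PHP_EOL;
}
// 0 => I have no apples
// 1 => I have one apple
// 10 => I have 10 apples
```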
The syntax is really straightforward, and most pluralization packages adopt this syntax. Check the documentation for more details.
{data, plural, offsetValue =value{message}... other{message}}
data: the value index (or name).
plural: the argType.
offsetValue: optional, written as offset:value. It subtracts the offset from the value.
=value{message}: a value to test for equality, and the message between curly braces. We can repeat this part multiple times (=0{no apples} =1{one apple} =2{two apples}).
other{message}: the default case, like in a switch - case statement. The # character may be used to inject the data value.
In some cases, we need to print a different message for every range. The example below does this.
var_dump( MessageFormatter::formatMessage(
"en_US",
'The value of {0,number} is {0, choice,
0 # between 0 and 19 |
20 # between 20 and 39 |
40 # between 40 and 59 |
60 # between 60 and 79 |
80 # between 80 and 100 |
100 < more than 100 }',
array(60)
) );
string(38) "The value of 60 is between 60 and 79 "
The argType in this case is set to choice, and this is the syntax format:
{value, choice, choiceStyle}
The official definition from the ICU documentation is:
choiceStyle = number separator message ('|' number separator message)*
number = normal_number | ['-'] ∞ (U+221E, infinity)
normal_number = double value (unlocalized ASCII string)
separator = less_than | less_than_or_equal
less_than = '<'
less_than_or_equal = '#' | ≤ (U+2264)
Note: ICU developers discourage the use of the choice type.
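Given that note, one possible alternative is to compute the range in PHP and let a select argument pick the text. This is only a sketch of that idea, not an approach prescribed by ICU: the rangeKey() helper and its key names (from0, from20, and so on) are hypothetical, and it assumes PHP 7 for intdiv() and scalar type declarations.

```php
<?php
// Hypothetical helper: maps a value to a key that the select argument can match.
function rangeKey(int $value): string
{
    if ($value > 100) {
        return 'over';
    }

    // Snap to the lower bound of a 20-wide bucket, clamped to the 0..80 range
    // so that 100 still falls into the "80 to 100" bucket.
    $lower = min(80, max(0, intdiv($value, 20) * 20));

    return 'from' . $lower;
}

$pattern = 'The value of {value, number} is {range, select, '
    . 'from0 {between 0 and 19} '
    . 'from20 {between 20 and 39} '
    . 'from40 {between 40 and 59} '
    . 'from60 {between 60 and 79} '
    . 'from80 {between 80 and 100} '
    . 'other {more than 100}}';

$value = 60;
$message = MessageFormatter::formatMessage(
    'en_US',
    $pattern,
    ['value' => $value, 'range' => rangeKey($value)]
);

echo $message . PHP_EOL; // The value of 60 is between 60 and 79
```

The range boundaries live in PHP here, so translators only see plain keys and messages.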
Sometimes we need something like the select option UI component. Profile pages use this to update the UI messages according to the user's gender, etc. Here's an example:
var_dump( MessageFormatter::formatMessage(
"en_US",
"{gender, select, ".
"female {She has some apples} ".
"male {He has some apples.}".
"other {It has some apples.}".
"}",
array('gender' => 'female')
) );
string(19) "She has some apples"
The pattern is defined as follows:
{value, select, selectStyle}
// selectStyle
selectValue {message} (selectValue {message})*
The message argument may contain other patterns like choice and plural. The next part will explain a complex example where we combine multiple patterns. Check the ICU documentation for more details.
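A value that matches none of the listed keys falls through to the other branch. The sketch below illustrates this with a made-up value, unspecified, and slightly reworded messages.

```php
<?php
$pattern = '{gender, select, '
    . 'female {She has some apples.} '
    . 'male {He has some apples.} '
    . 'other {They have some apples.}}';

$formatter = new MessageFormatter('en_US', $pattern);

$messages = [];
foreach (['female', 'male', 'unspecified'] as $gender) {
    // "unspecified" is not a listed key, so the "other" branch is used.
    $messages[$gender] = $formatter->format(['gender' => $gender]);
    echo $gender . ' => ' . $messages[$gender] . PHP_EOL;
}
// female => She has some apples.
// male => He has some apples.
// unspecified => They have some apples.
```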
So far, we've seen some simple examples like pluralization, select, etc. Some cases are more complex than others. The ICU documentation has a very good example illustrating this. We'll build it up part by part to make it simpler to grasp.
var_dump( MessageFormatter::formatMessage(
"en_US",
"{gender_of_host, select, ".
"female {She has a party} ".
"male {He has some apples.}".
"other {He has some apples.}".
"}",
array('gender_of_host' => 'female', "num_guests" => 5, 'host' => "Hanae", 'guest' => 'Younes' )
) );
This is the same select example we used before. Next, instead of using a simple message for the female case, we customize it depending on the num_guests value by nesting a plural pattern.
var_dump( MessageFormatter::formatMessage(
"en_US",
"{gender_of_host, select, ".
"female {".
"{num_guests, plural, offset:1 ".
"=0 {{host} does not have a party.}".
"=1 {{host} invites {guest} to her party.}".
"=2 {{host} invites {guest} and one other person to her party.}".
"other {{host} invites {guest} and # other people to her party.}}}".
"male {He has some apples.}".
"other {He has some apples.}}",
array('gender_of_host' => 'female', "num_guests" => 5, 'host' => "Hanae", 'guest' => 'Younes' )
) );
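Notice that we're using offset:1 to remove one guest from the num_guests value. The explicit =0, =1 and =2 cases are matched against the original value, while # prints the value minus the offset.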
string(53) "Hanae invites Younes and 4 other people to her party."
Here's the full snippet of this example.
var_dump( MessageFormatter::formatMessage(
"en_US",
"{gender_of_host, select, ".
"female {".
"{num_guests, plural, offset:1 ".
"=0 {{host} does not have a party.}".
"=1 {{host} invites {guest} to her party.}".
"=2 {{host} invites {guest} and one other person to her party.}".
"other {{host} invites {guest} and # other people to her party.}}}".
"male {".
"{num_guests, plural, offset:1 ".
"=0 {{host} does not have a party.}".
"=1 {{host} invites {guest} to his party.}".
"=2 {{host} invites {guest} and one other person to his party.}".
"other {{host} invites {guest} and # other people to his party.}}}".
"other {".
"{num_guests, plural, offset:1 ".
"=0 {{host} does not have a party.}".
"=1 {{host} invites {guest} to their party.}".
"=2 {{host} invites {guest} and one other person to their party.}".
"other {{host} invites {guest} and # other people to their party.}}}}",
array('gender_of_host' => 'female', "num_guests" => 5, 'host' => "Hanae", 'guest' => 'Younes' )
) );
Change the number of guests to test all message types.
// num_guests = 2
string(55) "Hanae invites Younes and one other person to her party."
// num_guests = 1
string(34) "Hanae invites Younes to her party."
// num_guests = 0
string(28) "Hanae does not have a party."
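To try every guest count without editing the snippet by hand, the pattern can be assembled once and formatted in a loop. The partyPlural() helper below is hypothetical and exists only to avoid repeating the plural block three times in this demonstration; in a real translation file the full pattern would normally be stored as one string. PHP 7 is assumed for the type declarations.

```php
<?php
// Hypothetical helper: builds the nested plural pattern for one possessive pronoun.
function partyPlural(string $possessive): string
{
    return '{num_guests, plural, offset:1 '
        . '=0 {{host} does not have a party.}'
        . '=1 {{host} invites {guest} to ' . $possessive . ' party.}'
        . '=2 {{host} invites {guest} and one other person to ' . $possessive . ' party.}'
        . 'other {{host} invites {guest} and # other people to ' . $possessive . ' party.}}';
}

$pattern = '{gender_of_host, select, '
    . 'female {' . partyPlural('her') . '} '
    . 'male {' . partyPlural('his') . '} '
    . 'other {' . partyPlural('their') . '}}';

$formatter = new MessageFormatter('en_US', $pattern);

$messages = [];
foreach ([0, 1, 2, 5] as $numGuests) {
    $messages[$numGuests] = $formatter->format([
        'gender_of_host' => 'female',
        'num_guests'     => $numGuests,
        'host'           => 'Hanae',
        'guest'          => 'Younes',
    ]);
    echo $numGuests . ' => ' . $messages[$numGuests] . PHP_EOL;
}
// 0 => Hanae does not have a party.
// 1 => Hanae invites Younes to her party.
// 2 => Hanae invites Younes and one other person to her party.
// 5 => Hanae invites Younes and 4 other people to her party.
```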
There's not much to say about parsing messages; we use the pattern we used for formatting to extract data from an output message.
$messageFormatter = new MessageFormatter("en_US", 'I have {0, number}');
var_dump( $messageFormatter->parse("I have 10 apples") );
array(1) {
[0]=>
int(10)
}
Check the documentation for more details about message parsing.
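A quick way to see how formatting and parsing relate is a round trip: format a value, then parse the result with the same pattern. The sketch below uses the static MessageFormatter::parseMessage() method and assumes the en_US locale; the pattern, including the trailing word, is chosen for the demonstration.

```php
<?php
$pattern = 'I have {0, number} apples';

// Format a value, then feed the resulting string back into the parser.
$message = MessageFormatter::formatMessage('en_US', $pattern, [10]);
$parsed  = MessageFormatter::parseMessage('en_US', $pattern, $message);

if ($parsed === false) {
    // parseMessage() returns false when the string does not match the pattern.
    echo 'Parsing failed: ' . intl_get_error_message() . PHP_EOL;
} else {
    echo 'Formatted: ' . $message . PHP_EOL;
    echo 'Parsed value: ' . $parsed[0] . PHP_EOL;
}
```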