ES6 provides new ways of creating strings and adds new properties to the global String object and to its instances to make working with strings easier. Strings in JavaScript lacked features and capabilities compared with programming languages such as Python and Ruby, so ES6 enhanced strings to change that.
Before we get into the new string features, let's review JavaScript's internal character encoding and escape sequences. In the Unicode character set every character is identified by a number called a code point. A code unit is a fixed number of bits in memory used to store all or part of a code point. An encoding scheme determines the length of a code unit. A code unit is 8 bits if the UTF-8 encoding scheme is used or 16 bits if the UTF-16 encoding scheme is used. If a code point doesn't fit in one code unit it is split into multiple code units, that is, multiple code units in sequence represent a single character.
JavaScript interpreters by default interpret JavaScript source code as a sequence of UTF-16 code units. If the source code is written in the UTF-8 encoding scheme, there are various ways to tell the JavaScript interpreter to interpret it as a sequence of UTF-8 code units. JavaScript strings are always a sequence of UTF-16 code units.
Any Unicode character with a code point less than 65536 can be escaped in a JavaScript string or source code using the hexadecimal value of its code point, prefixed with \u. Escapes are six characters long. They require exactly four characters following \u. If the hexadecimal character code is only one, two or three characters long, you'll need to pad it with leading zeroes. Here is an example to demonstrate this:
var \u0061 = "\u0061\u0062\u0063";
console.log(a); //Output is "abc"
In ES5, to escape a character that requires more than 16 bits of storage, we needed two Unicode escapes. For example, to add U+1F691 to a string we had to escape it this way:
console.log("\uD83D\uDE91");
Here \uD83D and \uDE91 together are called a surrogate pair. A surrogate pair is two 16-bit code units that, when written in sequence, represent a single character.
In ES6 we can write it without surrogate pairs:
console.log("\u{1F691}");
A string stores U+1F691 as \uD83D\uDE91, so the length of the above string is still 2.
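The relationship between a code point above 0xFFFF and its surrogate pair can be sketched with a little bit arithmetic. The helper name toSurrogatePair below is hypothetical and exists only for illustration; it assumes its input is a valid code point between 0x10000 and 0x10FFFF.

```javascript
// Hypothetical helper (not part of ES6): splits a code point above 0xFFFF
// into the two UTF-16 code units of its surrogate pair.
function toSurrogatePair(codePoint) {
  const offset = codePoint - 0x10000;    // leaves a 20-bit value
  const high = 0xD800 + (offset >> 10);  // top 10 bits
  const low = 0xDC00 + (offset & 0x3FF); // bottom 10 bits
  return [high, low];
}

const [high, low] = toSurrogatePair(0x1F691);
console.log(high.toString(16), low.toString(16)); // d83d de91

// Both ways of writing the character produce the same two code units.
console.log("\u{1F691}" === "\uD83D\uDE91"); // true
console.log("\u{1F691}".length);             // 2
```

The computed values match the two escapes used in the ES5 example, which is why the ES6 form and the surrogate pair form compare as equal.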
The codePointAt() method of a string returns a non-negative integer that is the code point of the character at the given index.
Here is an example to demonstrate this:
console.log("\uD83D\uDE91".codePointAt(1));
console.log("\u{1F691}".codePointAt(1));
console.log("hello".codePointAt(2));
Output is:
56977
56977
108
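As a small sketch of how the index affects the result: the index passed to codePointAt() counts 16-bit code units, so index 0 of a surrogate pair yields the full code point while index 1 yields only the second half of the pair. The for...of loop used at the end is another ES6 feature that is not covered in this section; it is included on the assumption that walking a string one code point at a time is the goal.

```javascript
const ambulance = "\u{1F691}";

// Index 0 is the start of the surrogate pair, so the whole code point comes back.
console.log(ambulance.codePointAt(0)); // 128657 (0x1F691)

// Index 1 points at the second half of the pair.
console.log(ambulance.codePointAt(1)); // 56977 (0xDE91)

// An index past the end of the string returns undefined.
console.log(ambulance.codePointAt(2)); // undefined

// for...of walks a string one code point at a time.
const codePoints = [];
for (const ch of "a" + ambulance) {
  codePoints.push(ch.codePointAt(0));
}
console.log(codePoints.join(",")); // 97,128657
```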
The fromCodePoint() method of the String object takes a sequence of code points and returns a string. Here is an example to demonstrate this:
console.log(String.fromCodePoint(0x61, 0x62, 0x63));
console.log("\u0061\u0062" == String.fromCodePoint(0x61, 0x62));
Output is:
abc
true
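To illustrate why fromCodePoint() was added alongside the older ES5 method String.fromCharCode(), the following sketch compares the two on a code point above 0xFFFF. The variable names are chosen only for this example.

```javascript
// fromCodePoint() accepts code points above 0xFFFF and emits a surrogate pair.
const viaCodePoint = String.fromCodePoint(0x1F691);
console.log(viaCodePoint === "\uD83D\uDE91"); // true
console.log(viaCodePoint.length);             // 2

// The ES5 method fromCharCode() keeps only the low 16 bits of its argument.
const viaCharCode = String.fromCharCode(0x1F691);
console.log(viaCharCode.charCodeAt(0) === 0xF691); // true

// A value outside the Unicode range raises a RangeError.
let errorName = null;
try {
  String.fromCodePoint(0x110000);
} catch (e) {
  errorName = e.name;
}
console.log(errorName); // RangeError
```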
The repeat() method of a string constructs and returns a new string containing the specified number of copies of the string on which it was called, concatenated together. Here is an example to demonstrate this:
console.log("a".repeat(6)); //Output "aaaaaa"
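One possible use of repeat() is padding. The padLeft helper below is hypothetical and written only for illustration; it assumes padChar is a single character. The sketch also shows how repeat() behaves for a count of zero and for a negative count.

```javascript
// Hypothetical helper: left-pads `text` with `padChar` up to `width` characters.
// Assumes `padChar` is a single character.
function padLeft(text, width, padChar) {
  const missing = Math.max(0, width - text.length);
  return padChar.repeat(missing) + text;
}

console.log(padLeft("42", 5, "0"));    // 00042
console.log(padLeft("12345", 3, "0")); // 12345 (already wide enough)

// A count of 0 gives the empty string.
console.log("abc".repeat(0) === ""); // true

// A negative count throws a RangeError.
let repeatError = null;
try {
  "abc".repeat(-1);
} catch (e) {
  repeatError = e.name;
}
console.log(repeatError); // RangeError
```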
The includes() method is used to find whether one string may be found in another string, returning true or false as appropriate. Here is an example to demonstrate this:
var str = "Hi, I am a JS Developer"; console.log(str.includes("JS")); //Output "true"
It takes an optional second parameter representing the position in the string at which to begin searching. Here is an example to demonstrate this:
var str = "Hi, I am a JS Developer"; console.log(str.includes("JS", 13)); // Output "false"
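The following sketch compares includes() with the ES5 indexOf() idiom it replaces and shows a few edge cases. The variable names are chosen only for this example.

```javascript
const sentence = "Hi, I am a JS Developer";

// includes() answers the same question as the ES5 idiom indexOf(...) !== -1.
console.log(sentence.includes("JS") === (sentence.indexOf("JS") !== -1)); // true

// The search is case-sensitive.
console.log(sentence.includes("js")); // false

// "JS" starts at index 11, so a search starting there still finds it,
// while a search starting at index 12 does not.
console.log(sentence.includes("JS", 11)); // true
console.log(sentence.includes("JS", 12)); // false

// Passing a regular expression throws a TypeError.
let includesError = null;
try {
  sentence.includes(/JS/);
} catch (e) {
  includesError = e.name;
}
console.log(includesError); // TypeError
```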
The startsWith() method is used to find whether a string begins with the characters of another string, returning true or false as appropriate. Here is an example to demonstrate this:
var str = "Hi, I am a JS Developer"; console.log(str.startsWith('Hi, I am')); //Output "true"
It takes an optional second parameter representing the position in the string at which to begin searching. Here is an example to demonstrate this:
var str = "Hi, I am a JS Developer"; console.log(str.startsWith('JS Developer', 11)); //Output "true"
The endsWith() method is used to find whether a string ends with the characters of another string, returning true or false as appropriate. It also takes an optional second parameter representing the position in the string that is assumed to be the end of the string. Here is an example to demonstrate this:
var str = "Hi, I am a JS Developer";
console.log(str.endsWith("JS Developer")); //Output "true"
console.log(str.endsWith("JS", 13)); //Output "true"
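As a sketch of how startsWith() and endsWith() might be used in practice, the two helpers below check a URL scheme and a file extension. Both helper names, and the example URLs and file name, are hypothetical. The last lines contrast the meaning of the second parameter in the two methods.

```javascript
// Hypothetical helpers built on startsWith() and endsWith().
function isSecureUrl(url) {
  return url.startsWith("https://");
}

function hasExtension(fileName, extension) {
  return fileName.endsWith("." + extension);
}

console.log(isSecureUrl("https://example.com")); // true
console.log(isSecureUrl("http://example.com"));  // false
console.log(hasExtension("report.pdf", "pdf"));  // true

const greeting = "Hi, I am a JS Developer";

// The second argument of startsWith() is where matching begins...
console.log(greeting.startsWith("JS", 11)); // true

// ...while the second argument of endsWith() is where the string is
// treated as ending, so only "Hi, I am a JS" is examined here.
console.log(greeting.endsWith("JS", 13)); // true
console.log(greeting.endsWith("JS", 12)); // false
```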
Normalization is simply the process of searching and standardizing code points without changing the meaning of the string.
There are also different forms of normalization: NFC, NFD, NFKC and NFKD.
Let's understand Unicode string normalization by an example use case:
Many Unicode characters can be represented either by a single 16-bit code point or by a base character followed by a combining mark. For example, the 'é' character can be escaped in two ways:
console.log("\u00E9"); //output 'é'
console.log("e\u0301"); //output 'é'
The problem is that applying the == operator, iterating over the string, or reading its length gives unexpected results. Here is an example to demonstrate this:
var a = "\u00E9";
var b = "e\u0301";
console.log(a == b);
console.log(a.length);
console.log(b.length);
for(let i = 0; i<a.length; i++) {
  console.log(a[i]);
}
for(let i = 0; i<b.length; i++) {
  console.log(b[i]);
}
Output is:
false
1
2
é
e
́
Here both the strings display the same way but when we do various string operations on them we get different results.
The length property counts 16-bit code units, so it treats the base character and the combining mark as two characters. The == operator compares the code units one by one, so it does not recognize the two forms as equivalent. The [] operator also indexes by 16-bit code unit, so it splits the base character from its combining mark.
In this case, to solve the problem we need to convert the combining sequence to its single code point representation. This process is called normalization. To do this, ES6 provides the normalize() function. Here is an example to demonstrate this:
var a = "\u00E9".normalize();
var b = "e\u0301".normalize();
console.log(a == b);
console.log(a.length);
console.log(b.length);
for(let i = 0; i<a.length; i++) {
  console.log(a[i]);
}
for(let i = 0; i<b.length; i++) {
  console.log(b[i]);
}
Output is:
true
1
1
é
é
Here the output is as expected. normalize() returns the normalized version of the string and uses the NFC form by default.
Normalization is not limited to combining sequences; there are many other cases.
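The four normalization forms can be requested by passing the form name to normalize(). The sketch below shows composition (NFC), decomposition (NFD), and one of the other cases: a compatibility character, here the ligature U+FB01, which only the "K" forms replace. The variable names are chosen only for this example.

```javascript
const composed = "\u00E9";    // é as one code point
const decomposed = "e\u0301"; // e followed by a combining acute accent

// NFC composes, NFD decomposes.
console.log(decomposed.normalize("NFC") === composed); // true
console.log(composed.normalize("NFD") === decomposed); // true
console.log(composed.normalize("NFD").length);         // 2

// The "K" forms additionally replace compatibility characters.
// U+FB01 is the single-character ligature "fi".
const ligature = "\uFB01";
console.log(ligature.normalize("NFC") === ligature); // true (left alone)
console.log(ligature.normalize("NFKC") === "fi");    // true
```

When comparing strings that come from different sources, normalizing both sides to the same form before comparing is one way to avoid the mismatch shown earlier.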
To learn more about Unicode string normalization and normalization forms visit http://www.unicode.org/reports/tr15/.
Template strings are a new kind of literal for creating strings that makes various things easier. They provide features such as embedded expressions, multi-line strings, string interpolation, string formatting, string tagging, and so on. They are always processed and converted to a normal JavaScript string at runtime, so they can be used wherever we use normal strings.
Template strings are written using back ticks instead of single or double quotes. Here is an example of a simple template string:
let str1 = `hello!!!`; //template string
let str2 = "hello!!!";
console.log(str1 === str2); //output "true"
In ES5, to embed expressions within normal strings you would do something like this:
var a = 20;
var b = 10;
var c = "JavaScript";
var str = "My age is " + (a + b) + " and I love " + c;
console.log(str);
Output is:
My age is 30 and I love JavaScript
In ES6, template strings make it much easier to embed expressions in strings. The expressions are placed in placeholders indicated by a dollar sign and curly brackets, that is, ${expression}. The resolved values of the expressions in the placeholders and the text between them are passed to a function that resolves the template string to a normal string. The default function simply concatenates the parts into a single string. If we use a custom function to process the string parts, the template string is called a tagged template string and the custom function is called a tag function.
Here is an example that shows how to embed expressions in a template string:
let a = 20; let b = 10; let c = "JavaScript"; let str = `My age is ${a+b} and I love ${c}`; console.log(str);
Output is:
My age is 30 and I love JavaScript
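Placeholders are not limited to variables and simple arithmetic. As a sketch, with all names and values below made up for illustration, any JavaScript expression can appear inside ${}:

```javascript
const price = 19.99;
const quantity = 3;
const user = { name: "Ada" };

// Any expression is allowed inside ${}: method calls, property access,
// conditional expressions, arithmetic, and so on.
const summary = `${user.name.toUpperCase()} ordered ${quantity} item${quantity === 1 ? "" : "s"} for a total of ${(price * quantity).toFixed(2)}`;
console.log(summary); // ADA ordered 3 items for a total of 59.97

// Non-string values are converted to strings.
console.log(`${[1, 2, 3]} and ${null}`); // 1,2,3 and null
```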
Let's create a tagged template string, that is, process the string using a tag function. Let's implement the tag function to do the same thing as the default function. Here is an example to demonstrate this:
let tag = function(strings, ...values) {
  let result = "";
  for(let i = 0; i<strings.length; i++) {
    result += strings[i];
    if(i<values.length) {
      result += values[i];
    }
  }
  return result;
};
let a = 20;
let b = 10;
let c = "JavaScript";
let str = tag `My age is ${a+b} and I love ${c}`;
console.log(str);
Output is:
My age is 30 and I love JavaScript
Here our tag function is named tag, but you can name it anything else. The tag function receives an array of the string literals of the template string as its first parameter, followed by the resolved values of the expressions. The values are passed as separate arguments, so we use a rest parameter to collect them into an array.
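A tag function becomes useful once it does something other than plain concatenation. The sketch below is one possible example, not a complete sanitizer: a hypothetical escapeHtml tag that escapes a few HTML-special characters in the interpolated values while leaving the literal parts untouched. The second hypothetical function, countParts, only reports how many literal parts and values it received.

```javascript
// Hypothetical tag function: escapes HTML-special characters in the
// interpolated values while leaving the literal parts untouched.
function escapeHtml(strings, ...values) {
  const escapeValue = (value) =>
    String(value)
      .replace(/&/g, "&amp;")
      .replace(/</g, "&lt;")
      .replace(/>/g, "&gt;")
      .replace(/"/g, "&quot;");

  let result = "";
  for (let i = 0; i < strings.length; i++) {
    result += strings[i];
    if (i < values.length) {
      result += escapeValue(values[i]);
    }
  }
  return result;
}

const comment = "<script>alert(1)</script>";
const html = escapeHtml`<p>${comment}</p>`;
console.log(html); // <p>&lt;script&gt;alert(1)&lt;/script&gt;</p>

// There is always one more literal part than there are values.
function countParts(strings, ...values) {
  return [strings.length, values.length];
}
console.log(countParts`a${1}b${2}c`.join(",")); // 3,2
```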
Template strings provide a new way to create strings that contain multiple lines of text.
In ES5, we need to use the \n new line character to add line breaks. Here is an example to demonstrate this:
console.log("1\n2\n3");
Output is:
1
2
3
In ES6, using multiline string we can simply write:
console.log(`1
2
3`);
Output is:
1
2
3
In the above code we simply included new lines where we would otherwise have placed \n. While converting the template string to a normal string, the new lines are converted to \n.
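The conversion of line breaks to \n can be checked directly. The sketch below also shows a detail that is easy to overlook: any indentation written inside the literal becomes part of the string. The variable names are chosen only for this example.

```javascript
const multiLine = `1
2
3`;

// The line breaks in the source become \n characters in the string.
console.log(multiLine === "1\n2\n3");      // true
console.log(multiLine.split("\n").length); // 3

// Indentation inside the literal is part of the string too.
const indented = `a
  b`;
console.log(indented === "a\n  b"); // true
```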
A raw string is a normal string in which escaped characters aren't interpreted.
We can create a raw string using a template string. To get the raw version of a template string, use the String.raw tag function. Here is an example to demonstrate this:
let s = String.raw `xy\n${ 1 + 1 }z`;
console.log(s);
Output is:
xy\n2z
Here \n is not interpreted as a new line character; instead it remains two characters, that is, \ and n. The length of variable s would be 6.
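One place where String.raw can help is text that contains many backslashes. The sketch below uses a made-up Windows-style path to compare the raw and the normally processed versions of the same literal.

```javascript
// In a normal template string \t is a tab and \n is a line break.
const cooked = `C:\temp\new`;
// With String.raw each backslash stays in the string as written.
const raw = String.raw`C:\temp\new`;

console.log(raw);                     // C:\temp\new
console.log(raw.length);              // 11
console.log(cooked.length);           // 9
console.log(raw === "C:\\temp\\new"); // true

// Expressions are still evaluated in a raw string.
console.log(String.raw`1 + 1 = ${1 + 1}\n`); // 1 + 1 = 2\n
```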
If you create a tag function and want to return the raw string, use the raw property of the first argument. The raw property is an array that holds the raw versions of the strings in the first argument. Here is an example to demonstrate this:
let tag = function(strings, ...values) {
  return strings.raw[0];
};
let str = tag `Hello \n World!!!`;
console.log(str);
Output is:
Hello \n World!!!
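To make the difference between the first argument and its raw property concrete, the following sketch returns both versions of the same literal part. The tag function name inspect is hypothetical.

```javascript
// Hypothetical tag function that returns both views of the first literal part.
function inspect(strings) {
  return { cooked: strings[0], raw: strings.raw[0] };
}

const parts = inspect`a\nb`;

console.log(parts.cooked.length);   // 3: "a", a line break, "b"
console.log(parts.raw.length);      // 4: "a", "\", "n", "b"
console.log(parts.raw === "a\\nb"); // true
```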