Working with strings

ES6 provides new ways of creating strings and adds new properties to the global String object and to its instances to make working with strings easier. Compared with programming languages such as Python and Ruby, JavaScript strings lacked features and capabilities, and ES6 enhances strings to narrow that gap.

Before we get into the new string features, let's review JavaScript's internal character encoding and escape sequences. In the Unicode character set, every character is represented by a number called a code point (conventionally written in hexadecimal, such as U+0061). A code unit is a fixed number of bits in memory used to store a code point. The encoding scheme determines the length of a code unit: 8 bits for UTF-8 and 16 bits for UTF-16. If a code point doesn't fit in one code unit, it is split into multiple code units, that is, multiple code units in sequence represent a single character.

JavaScript interpreters by default interpret JavaScript source code as a sequence of UTF-16 code units. If the source code is written in the UTF-8 encoding scheme, there are various ways to tell the JavaScript interpreter to interpret it as a sequence of UTF-8 code units. JavaScript strings are always a sequence of UTF-16 code units.

Any Unicode character with a code point less than 65536 can be escaped in a JavaScript string or source code using the hexadecimal value of its code point, prefixed with \u. Escapes are six characters long: they require exactly four hexadecimal digits following \u. If the hexadecimal value is only one, two, or three digits long, you'll need to pad it with leading zeroes. Here is an example to demonstrate this:

var \u0061 = "\u0061\u0062\u0063"; //\u0061 is the identifier a
console.log(a); //Output is "abc"

Escaping larger code points

In ES5, to escape a character that requires more than 16 bits of storage, we needed two Unicode escapes. For example, to add the character with code point 1F691 to a string we had to escape it this way:

console.log("\uD83D\uDE91");

Here \uD83D and \uDE91 together form a surrogate pair. A surrogate pair is two 16-bit code units that, when written in sequence, represent a single character.

In ES6 we can write it without surrogate pairs:

console.log("\u{1F691}");

A string stores \u{1F691} as \uD83D\uDE91, so the length of the above string is still 2.
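The mapping from a large code point to its surrogate pair follows the standard UTF-16 rule: subtract 0x10000, then split the remaining 20 bits into two halves. Here is a minimal sketch of that rule; the helper name toSurrogatePair is hypothetical and used only for illustration:

```javascript
// Hypothetical helper: split a code point above 0xFFFF into its two UTF-16 code units.
function toSurrogatePair(codePoint) {
  if (codePoint < 0x10000 || codePoint > 0x10FFFF) {
    throw new RangeError("code point must be between 0x10000 and 0x10FFFF");
  }
  const offset = codePoint - 0x10000;       // leaves a 20-bit value
  const high = 0xD800 + (offset >> 10);     // top 10 bits go into the high surrogate
  const low = 0xDC00 + (offset & 0x3FF);    // bottom 10 bits go into the low surrogate
  return [high, low];
}

const [high, low] = toSurrogatePair(0x1F691);
console.log(high.toString(16), low.toString(16));            // d83d de91
console.log(String.fromCharCode(high, low) === "\u{1F691}"); // true
console.log("\u{1F691}".length);                             // 2
```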

The codePointAt(index) method

The codePointAt() method of a string returns a non-negative integer that is the code point of the character at the given index.

Here is an example to demonstrate this:

console.log("\uD83D\uDE91".codePointAt(1));
console.log("\u{1F691}".codePointAt(1));
console.log("hello".codePointAt(2));

Output is:

56977
56977
108
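Because codePointAt() takes an index measured in code units, an index that lands on the second half of a surrogate pair returns only that half. One way to walk a string one whole character at a time is a for...of loop, which iterates by code point. The following is a sketch; the helper name toCodePoints is hypothetical:

```javascript
// Hypothetical helper: list the code points of a string, one entry per character.
function toCodePoints(str) {
  const result = [];
  for (const ch of str) {             // for...of yields whole code points
    result.push(ch.codePointAt(0));
  }
  return result;
}

const sample = "a\u{1F691}b";
console.log(sample.length);           // 4 (code units)
console.log(toCodePoints(sample));    // [ 97, 128657, 98 ]
console.log(sample.codePointAt(1));   // 128657 (index 1 is the leading surrogate)
console.log(sample.codePointAt(2));   // 56977 (the trailing surrogate on its own)
```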

The String.fromCodePoint(number1, …, numberN) method

The fromCodePoint() method of the String object takes a sequence of code points and returns a string. Here is an example to demonstrate this:

console.log(String.fromCodePoint(0x61, 0x62, 0x63));
console.log("\u0061\u0062" == String.fromCodePoint(0x61, 0x62));

Output is:

abc
true
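It may help to contrast this with the older String.fromCharCode() method, which works on 16-bit code units and silently truncates larger values. The sketch below also shows that fromCodePoint() rejects invalid code points with a RangeError; the variable names are only for illustration:

```javascript
const viaCodePoint = String.fromCodePoint(0x1F691);
const viaCharCode = String.fromCharCode(0x1F691);  // keeps only the low 16 bits (0xF691)

console.log(viaCodePoint === "\uD83D\uDE91");          // true
console.log(viaCharCode === "\uF691");                 // true
console.log(viaCodePoint.length, viaCharCode.length);  // 2 1

// Values that are not valid code points are rejected.
let rejected = false;
try {
  String.fromCodePoint(-1);
} catch (e) {
  rejected = e instanceof RangeError;
}
console.log(rejected); // true
```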

The repeat(count) method

The repeat() method of a string constructs and returns a new string that contains the specified number of copies of the string on which it was called, concatenated together. Here is an example to demonstrate this:

console.log("a".repeat(6));      //Output "aaaaaa"
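A common use of repeat() is padding. The following is a minimal sketch, assuming a hypothetical helper named padLeft and a single-character fill string; it also shows that a count of 0 gives an empty string and a negative count throws a RangeError:

```javascript
// Hypothetical helper: pad str on the left with fill until it is width characters long.
// Assumes fill is a single character.
function padLeft(str, width, fill = " ") {
  const missing = Math.max(0, width - str.length);
  return fill.repeat(missing) + str;
}

console.log(padLeft("42", 5, "0"));   // "00042"
console.log(padLeft("12345", 3));     // "12345" (already long enough)
console.log("ab".repeat(0) === "");   // true

let threw = false;
try {
  "ab".repeat(-1);
} catch (e) {
  threw = e instanceof RangeError;
}
console.log(threw); // true
```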

The includes(string, index) method

The includes() method determines whether one string occurs within another string, returning true or false as appropriate. Here is an example to demonstrate this:

var str = "Hi, I am a JS Developer";
console.log(str.includes("JS")); //Output "true"

It takes an optional second parameter representing the position in the string at which to begin searching. Here is an example to demonstrate this:

var str = "Hi, I am a JS Developer";
console.log(str.includes("JS", 13)); // Output "false"

The startsWith(string, index) method

The startsWith() method is used to find whether a string begins with the characters of another string, returning true or false as appropriate. Here is an example to demonstrate this:

var str = "Hi, I am a JS Developer";
console.log(str.startsWith('Hi, I am')); //Output "true"

It takes an optional second parameter representing the position in the string at which to begin searching. Here is an example to demonstrate this:

var str = "Hi, I am a JS Developer";
console.log(str.startsWith('JS Developer', 11)); //Output "true"

The endsWith(string, index) method

The endsWith() method is used to find whether a string ends with the characters of another string, returning true or false as appropriate. It also takes an optional second parameter representing the position in the string that is assumed as the end of the string. Here is an example to demonstrate this:

var str = "Hi, I am a JS Developer";
console.log(str.endsWith("JS Developer"));  //Output "true"
console.log(str.endsWith("JS", 13));        //Output "true"
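To see what the optional second parameter does in each of the three methods, it can help to compare them with roughly equivalent ES5 expressions built from indexOf() and slice(). This is only a sketch for comparison; the variable names are hypothetical:

```javascript
const text = "Hi, I am a JS Developer";

// includes(search, from): search starts at index from
const includesEs5 = text.indexOf("JS", 13) !== -1;
console.log(text.includes("JS", 13) === includesEs5);                  // true (both false)

// startsWith(search, pos): the string is treated as if it began at pos
const startsWithEs5 = text.slice(11, 11 + "JS Developer".length) === "JS Developer";
console.log(text.startsWith("JS Developer", 11) === startsWithEs5);    // true (both true)

// endsWith(search, end): the string is treated as if it were end characters long
const endsWithEs5 = text.slice(0, 13).slice(-"JS".length) === "JS";
console.log(text.endsWith("JS", 13) === endsWithEs5);                  // true (both true)
```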

Normalization

Normalization is simply the process of searching and standardizing code points without changing the meaning of the string.

There are also different forms of normalization: NFC, NFD, NFKC and NFKD.

Let's understand Unicode string normalization by an example use case:

A case study

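Many Unicode characters can be represented in more than one way: as a single precomposed code point, or as a base character followed by a combining mark. (This is a different mechanism from surrogate pairs, although it causes similar surprises.) For example, the 'é' character can be escaped in two ways:

console.log("\u00E9");  //output 'é'
console.log("e\u0301"); //output 'é'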

The problem is that applying the == operator, iterating, or finding the length gives unexpected results. Here is an example to demonstrate this:

var a = "\u00E9";
var b = "e\u0301";

console.log(a == b);
console.log(a.length);
console.log(b.length);

for(let i = 0; i<a.length; i++)
{
  console.log(a[i]);
}

for(let i = 0; i<b.length; i++)
{
  console.log(b[i]);
}

Output is:

false
1
2
é
e
́

Here both the strings display the same way but when we do various string operations on them we get different results.

The length property counts 16-bit code units, so it reports 2 for a base character followed by its combining mark. The == operator compares the code units one by one, so two different sequences are unequal even when they display identically. The [] operator likewise treats every 16-bit code unit as a separate index.

In this case, to solve the problems we need to convert both strings to the same representation, here by composing the base character and its combining mark into the single precomposed character. This process is called normalization. To do this, ES6 provides the normalize() method. Here is an example to demonstrate this:

var a = "\u00E9".normalize();
var b = "e\u0301".normalize();

console.log(a == b);
console.log(a.length);
console.log(b.length);

for(let i = 0; i<a.length; i++)
{
  console.log(a[i]);
}

for(let i = 0; i<b.length; i++)
{
  console.log(b[i]);
}

Output is:

true
1
1
é
é

Here the output is as expected. normalize() returns the normalized version of the string. normalize() uses NFC form by default.

Normalization is not just used for combining character sequences; there are many other cases.
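The four normalization forms can be selected by passing their name to normalize(). The sketch below compares them on the 'é' example and on a compatibility character, the "ﬁ" ligature (U+FB01), which only the NFKC and NFKD forms replace. The helper name sameText is hypothetical:

```javascript
const composed = "\u00E9";      // é as one code point
const decomposed = "e\u0301";   // e followed by a combining acute accent
const ligature = "\uFB01";      // the fi ligature, a compatibility character

console.log(composed.normalize("NFD") === decomposed);  // true
console.log(decomposed.normalize("NFC") === composed);  // true
console.log(ligature.normalize("NFC") === ligature);    // true (canonical forms keep it)
console.log(ligature.normalize("NFKC"));                // "fi"
console.log(ligature.normalize("NFKD").length);         // 2

// Hypothetical helper: compare two strings after normalizing both to the same form.
function sameText(a, b, form = "NFC") {
  return a.normalize(form) === b.normalize(form);
}
console.log(sameText(composed, decomposed)); // true
```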

Note

The normalized version of a string is not made for displaying to the user; it's used for comparing and searching in strings.

To learn more about Unicode string normalization and normalization forms visit http://www.unicode.org/reports/tr15/.

Template strings

Template strings are a new literal for creating strings that makes various things easier. They provide features such as embedded expressions, multi-line strings, string interpolation, string formatting, string tagging, and so on. They are always processed and converted to a normal JavaScript string at runtime, therefore they can be used wherever we use normal strings.

Template strings are written using back ticks instead of single or double quotes. Here is an example of a simple template string:

let str1 = `hello!!!`; //template string
let str2 = "hello!!!";

console.log(str1 === str2); //output "true"

Expressions

In ES5, to embed expressions within normal strings you would do something like this:

var a = 20;
var b = 10;
var c = "JavaScript";
var str = "My age is " + (a + b) + " and I love " + c;

console.log(str);

Output is:

My age is 30 and I love JavaScript

In ES6, template strings make it much easier to embed expressions in strings. The expressions are placed in placeholders indicated by a dollar sign and curly brackets, that is, ${expression}. The resolved values of the expressions in the placeholders and the text between them are passed to a function that resolves the template string to a normal string. The default function just concatenates the parts into a single string. If we use a custom function to process the string parts, the template string is called a tagged template string and the custom function is called a tag function.

Here is an example that shows how to embed expressions in a template string:

let a = 20;
let b = 10;
let c = "JavaScript";
let str = `My age is ${a+b} and I love ${c}`;

console.log(str);

Output is:

My age is 30 and I love JavaScript
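A placeholder is not limited to arithmetic; any expression can go inside it, including method calls and conditional expressions. A short sketch with made-up sample data:

```javascript
const items = ["pen", "book"];   // sample data for illustration
const total = 7.5;

const plural = items.length === 1 ? "" : "s";
const summary = `${items.length} item${plural}: ${items.join(", ")} (total ${total.toFixed(2)})`;

console.log(summary); // 2 items: pen, book (total 7.50)
```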

Let's create a tagged template string, that is, process the string using a tag function. Let's implement the tag function to do the same thing as the default function. Here is an example to demonstrate this:

let tag = function(strings, ...values)
{
  let result = "";

  for(let i = 0; i<strings.length; i++)
  {
    result += strings[i];

    if(i<values.length)
    {
      result += values[i];
    }
  }

  return result;
};

let a = 20;
let b = 10;
let c = "JavaScript";
let str = tag `My age is ${a+b} and I love ${c}`;

console.log(str);

Output is:

My age is 30 and I love JavaScript

Here our tag function's name is tag, but you can name it anything else. The tag function receives the string literals of the template string as an array in its first parameter, and the resolved values of the expressions as the remaining arguments. Because the values are passed as multiple arguments, we collect them with a rest parameter.
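A tag function becomes useful when it treats the values differently from the literal text. The following is a minimal sketch, assuming a hypothetical tag named safeHtml that escapes HTML-special characters in the interpolated values only and leaves the literal parts untouched:

```javascript
// Hypothetical tag function: escapes the interpolated values, not the literal text.
function safeHtml(strings, ...values) {
  const escape = (value) =>
    String(value)
      .replace(/&/g, "&amp;")
      .replace(/</g, "&lt;")
      .replace(/>/g, "&gt;")
      .replace(/"/g, "&quot;");

  // strings always has exactly one more element than values.
  return strings.reduce(
    (result, literal, i) =>
      result + literal + (i < values.length ? escape(values[i]) : ""),
    ""
  );
}

const userInput = '<script>alert("x")</script>';
const markup = safeHtml`<p>${userInput}</p>`;

console.log(markup);
// <p>&lt;script&gt;alert(&quot;x&quot;)&lt;/script&gt;</p>
```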

Multiline strings

Template strings provide a new way to create strings that contain multiple lines of text.

In ES5, we need to use the \n new line character to add line breaks. Here is an example to demonstrate this:

console.log("1\n2\n3");

Output is:

1
2
3

In ES6, using multiline string we can simply write:

console.log(`1
2
3`);

Output is:

1
2
3

In the above code we simply included new lines where we needed to place \n. While converting the template string to a normal string, the new lines are converted to \n.
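The equivalence between the two styles can be checked directly. The sketch below also shows that any indentation inside a multiline template string becomes part of the resulting string; the variable names are only for illustration:

```javascript
const es5Style = "1\n2\n3";
const es6Style = `1
2
3`;
console.log(es5Style === es6Style); // true

// Whitespace at the start of a continued line is kept.
const indented = `a
  b`;
console.log(indented === "a\n  b"); // true
```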

Raw strings

A raw string is a normal string in which escaped characters aren't interpreted.

We can create a raw string using a template string. To get the raw version of a template string, use the String.raw tag function. Here is an example to demonstrate this:

let s = String.raw `xy\n${ 1 + 1 }z`;
console.log(s);

Output is:

xy\n2z

Here \n is not interpreted as a new line character; instead it is two characters, that is, \ and n. The length of variable s would be 6.

If you create a tag function and want to return the raw string, use the raw property of the first argument. The raw property is an array that holds the raw versions of the strings in the first argument. Here is an example to demonstrate this:

let tag = function(strings, ...values)
{
    return strings.raw[0]
};

let str = tag `Hello \n World!!!`;

console.log(str);

Output is:

Hello \n World!!!
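To make the difference between the two views concrete, a tag function can return both the normal (cooked) string and its raw counterpart. This is a sketch; the tag name inspect is hypothetical:

```javascript
// Hypothetical tag function: returns the cooked and raw versions of the first literal.
function inspect(strings) {
  return { cooked: strings[0], raw: strings.raw[0] };
}

const parts = inspect`a\nb`;

console.log(parts.cooked.length);               // 3 (a, new line, b)
console.log(parts.raw.length);                  // 4 (a, backslash, n, b)
console.log(parts.raw === "a\\nb");             // true
console.log(String.raw`xy\n${1 + 1}z`.length);  // 6
```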