Chapter 5: Reference Types

WHAT’S IN THIS CHAPTER?

Working with objects
Creating and manipulating arrays
Understanding basic JavaScript data types
Working with primitives and primitive wrappers

A reference value (object) is an instance of a specific reference type. In ECMAScript, reference types are structures used to group data and functionality together and are often incorrectly called classes. Although technically an object-oriented language, ECMAScript lacks some basic constructs that have traditionally been associated with object-oriented programming, including classes and interfaces. Reference types are also sometimes called object definitions, because they describe the properties and methods that objects should have.

Even though reference types are similar to classes, the two concepts are not equivalent. To avoid any confusion, the term class is not used in the rest of this book.

Again, objects are considered to be instances of a particular reference type. New objects are created by using the new operator followed by a constructor. A constructor is simply a function whose purpose is to create a new object. Consider the following line of code:

var person = new Object();

This code creates a new instance of the Object reference type and stores it in the variable person. The constructor being used is Object(), which creates a simple object with only the default properties and methods. ECMAScript provides a number of native reference types, such as Object, to help developers with common computing tasks.

THE OBJECT TYPE

Up to this point, most of the reference-value examples have used the Object type, which is one of the most often-used types in ECMAScript. Although instances of Object don’t have much functionality, they are ideally suited to storing and transmitting data around an application.

There are two ways to explicitly create an instance of Object. The first is to use the new operator with the Object constructor like this:

var person = new Object();
person.name = "Nicholas";
person.age = 29;

ObjectTypeExample01.htm

The other way is to use object literal notation. Object literal notation is a shorthand form of object definition designed to simplify creating an object with numerous properties. For example, the following defines the same person object from the previous example using object literal notation:

var person = {
    name : "Nicholas",
    age : 29
};

ObjectTypeExample02.htm

In this example, the left curly brace ({) signifies the beginning of an object literal, because it occurs in an expression context. An expression context in ECMAScript is a context in which a value (expression) is expected. Assignment operators indicate that a value is expected next, so the left curly brace indicates the beginning of an expression. The same curly brace, when appearing in a statement context, such as after the condition of an if statement, indicates the beginning of a block statement.

Next, the name property is specified, followed by a colon, followed by the property’s value. A comma is used to separate properties in an object literal, so there’s a comma after the string "Nicholas" but not after the value 29, because age is the last property in the object. Including a comma after the last property causes an error in Internet Explorer 7 and earlier and Opera.

Property names can also be specified as strings or numbers when using object literal notation, such as in this example:

var person = {
    "name" : "Nicholas",
    "age" : 29,
    5: true
};

This example produces an object with a name property, an age property, and a property “5”. Note that numeric property names are automatically converted to strings.
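
The conversion can be observed directly. The following is a small sketch that uses console.log() in place of alert() (an assumption made so that the snippet also runs outside a browser); it reads the property back with both a number and a string and checks the type of every property name:

```javascript
var person = {
    "name" : "Nicholas",
    "age" : 29,
    5 : true
};

// The numeric key 5 is stored as the string "5",
// so both lookups reach the same property.
console.log(person[5]);            //true
console.log(person["5"]);          //true

// Every property name comes back as a string.
var keyTypes = [];
for (var key in person) {
    keyTypes.push(typeof key);
}
console.log(keyTypes.join(","));   //string,string,string
```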

It’s also possible to create an object with only the default properties and methods using object literal notation by leaving the space between the curly braces empty, such as this:

var person = {};                    //same as new Object()
person.name = "Nicholas";
person.age = 29;

This example is equivalent to the first one in this section, though it looks a little strange. For readability, it’s recommended to use object literal notation only when you’re going to specify properties.

When defining an object via object literal notation, the Object constructor is never actually called (Firefox 2 and earlier did call the Object constructor; this was changed in Firefox 3).

Though it’s acceptable to use either method of creating Object instances, developers tend to favor object literal notation, because it requires less code and visually encapsulates all related data. In fact, object literals have become a preferred way of passing a large number of optional arguments to a function, such as in this example:

function displayInfo(args) {
    var output = "";
                   
    if (typeof args.name == "string"){
        output += "Name: " + args.name + "\n";
    }

    if (typeof args.age == "number") {
        output += "Age: " + args.age + "\n";
    }
                   
    alert(output);
}
                   
displayInfo({ 
    name: "Nicholas", 
    age: 29
});
                   
displayInfo({
    name: "Greg"
});

ObjectTypeExample04.htm

Here, the function displayInfo() accepts a single argument named args. The argument may come in with a property called name or age or both or neither of those. The function is set up to test for the existence of each property using the typeof operator and then to construct a message to display based on availability. This function is then called twice, each time with different data specified in an object literal. The function works correctly in both cases.

This pattern for argument passing is best used when there is a large number of optional arguments that can be passed into the function. Generally speaking, named arguments are easier to work with but can get unwieldy when there are numerous optional arguments. The best approach is to use named arguments for those that are required and an object literal to encompass multiple optional arguments.
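
A minimal sketch of that recommendation follows. The function name describePerson() and the option names age and title are hypothetical, chosen only for illustration: the required value is passed as a named argument, and the optional values travel together in a single object literal. The sketch returns a string and logs it with console.log() rather than calling alert(), so that it also runs outside a browser.

```javascript
// name is required; options is an object literal holding optional settings.
function describePerson(name, options) {
    options = options || {};        //allow the second argument to be omitted

    var output = "Name: " + name;

    if (typeof options.age == "number") {
        output += ", Age: " + options.age;
    }

    if (typeof options.title == "string") {
        output += ", Title: " + options.title;
    }

    return output;
}

console.log(describePerson("Nicholas", { age: 29 }));    //Name: Nicholas, Age: 29
console.log(describePerson("Greg"));                     //Name: Greg
```

Because the optional values are looked up by name, new options can be added later without changing the order of arguments at any existing call site.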

Although object properties are typically accessed using dot notation, which is common to many object-oriented languages, it’s also possible to access properties via bracket notation. When you use bracket notation, a string containing the property name is placed between the brackets, as in this example:

alert(person["name"]);    //"Nicholas"
alert(person.name);       //"Nicholas"

Functionally, there is no difference between the two approaches. The main advantage of bracket notation is that it allows you to use variables for property access, such as in this example:

var propertyName = "name";
alert(person[propertyName]);    //"Nicholas"

You can also use bracket notation when the property name contains characters that would cause a syntax error in dot notation, or when the name is a keyword or reserved word. For example:

person["first name"] = "Nicholas";

Since the name "first name" contains a space, you can’t use dot notation to access it. Property names can contain nonalphanumeric characters; you just need to use bracket notation to access them.

Generally speaking, dot notation is preferred unless variables are necessary to access properties by name.
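
As an illustration of the case where variables are necessary, the following sketch reads several properties whose names are held in an array, including one that dot notation cannot express. The variable names propertyNames and summary are hypothetical, and console.log() stands in for alert() so that the snippet also runs outside a browser.

```javascript
var person = {
    name : "Nicholas",
    age : 29
};
person["first name"] = "Nicholas";   //not reachable with dot notation

var propertyNames = ["name", "age", "first name"];
var summary = [];

for (var i = 0; i < propertyNames.length; i++) {
    var propertyName = propertyNames[i];
    //bracket notation evaluates the variable to find the property
    summary.push(propertyName + "=" + person[propertyName]);
}

console.log(summary.join("; "));    //name=Nicholas; age=29; first name=Nicholas
```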

THE ARRAY TYPE

After the Object type, the Array type is probably the most used in ECMAScript. An ECMAScript array is very different from arrays in most other programming languages. As in other languages, ECMAScript arrays are ordered lists of data, but unlike in other languages, they can hold any type of data in each slot. This means that it’s possible to create an array that has a string in the first position, a number in the second, an object in the third, and so on. ECMAScript arrays are also dynamically sized, automatically growing to accommodate any data that is added to them.

Arrays can be created in two basic ways. The first is to use the Array constructor, as in this line:

var colors = new Array();

If you know the number of items that will be in the array, you can pass the count into the constructor, and the length property will automatically be created with that value. For example, the following creates an array with an initial length value of 20:

var colors = new Array(20);

The Array constructor can also be passed items that should be included in the array. The following creates an array with three string values:

var colors = new Array("red", "blue", "green");

An array can be created with a single value by passing it into the constructor. This gets a little bit tricky, because providing a single argument that is a number always creates an array with the given number of items, whereas an argument of any other type creates a one-item array that contains the specified value. Here’s an example:

var colors = new Array(3);      //create an array with three items
var names = new Array("Greg");  //create an array with one item, the string "Greg"

ArrayTypeExample01.htm

It’s possible to omit the new operator when using the Array constructor. It has the same result, as you can see here:

var colors = Array(3);      //create an array with three items
var names = Array("Greg");  //create an array with one item, the string "Greg"
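
The single-argument behavior is easy to get wrong, so here is a short sketch of the cases side by side. The variable names are hypothetical, and console.log() is used in place of alert() so that the snippet also runs outside a browser. The last part shows that a numeric argument that is not a valid length causes a RangeError.

```javascript
var sized = new Array(3);         //one numeric argument: sets the length
var single = new Array("3");      //one non-numeric argument: becomes the only item
var pair = new Array(3, 4);       //two arguments: both become items

console.log(sized.length);        //3
console.log(single.length);       //1
console.log(pair.length);         //2

// A numeric argument that is not a valid array length throws a RangeError.
var errorName = "";
try {
    new Array(-1);
} catch (ex) {
    errorName = ex.name;
}
console.log(errorName);           //RangeError
```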

The second way to create an array is by using array literal notation. An array literal is specified by using square brackets and placing a comma-separated list of items between them, as in this example:

var colors = ["red", "blue", "green"]; //creates an array with three strings
var names = [];                        //creates an empty array
var values = [1,2,];                   //AVOID! Creates an array with 2 or 3 items
var options = [,,,,,];                 //AVOID! creates an array with 5 or 6 items

ArrayTypeExample02.htm

In this code, the first line creates an array with three string values. The second line creates an empty array by using empty square brackets. The third line shows the effects of leaving a comma after the last value in an array literal: in Internet Explorer 8 and earlier, values becomes a three-item array containing the values 1, 2, and undefined; in all other browsers, values is a two-item array containing the values 1 and 2. This is due to a bug regarding array literals in the Internet Explorer implementation of ECMAScript through version 8 of the browser. Another instance of this bug is shown in the last line, which creates an array with either five (in Internet Explorer 9+, Firefox, Opera, Safari, and Chrome) or six (in Internet Explorer 8 and earlier) items. By omitting values between the commas, each item gets a value of undefined, which is logically the same as calling the Array constructor and passing in the number of items. However, because of the inconsistent implementation of early versions of Internet Explorer, using this syntax is strongly discouraged.
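
For reference, here is a short sketch of how a standards-compliant engine (that is, anything other than Internet Explorer 8 and earlier) treats these literals. It uses console.log() rather than alert() so that it also runs outside a browser. The syntax is still best avoided; the snippet only makes the counts concrete.

```javascript
var values = [1, 2, ];           //trailing comma is ignored
var options = [, , , , , ];      //five commas produce five empty positions

console.log(values.length);      //2
console.log(options.length);     //5
console.log(options[0]);         //undefined
console.log(0 in options);       //false - the position was never filled
```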

As with objects, the Array constructor isn’t called when an array is created using array literal notation (except in Firefox prior to version 3).

To get and set array values, you use square brackets and provide the zero-based numeric index of the value, as shown here:

var colors = ["red", "blue", "green"];           //define an array of strings
alert(colors[0]);                                //display the first item
colors[2] = "black";                             //change the third item
colors[3] = "brown";                             //add a fourth item

The index provided within the square brackets indicates the value being accessed. If the index is less than the number of items in the array, then it will return the value stored in the corresponding item, as colors[0] displays "red" in this example. Setting a value works in the same way, replacing the value in the designated position. If a value is set to an index that is past the end of the array, as with colors[3] in this example, the array length is automatically expanded to be that index plus 1 (so the length becomes 4 in this example because the index being used is 3).

The number of items in an array is stored in the length property, which always returns 0 or more, as shown in the following example:

var colors = ["red", "blue", "green"];    //creates an array with three strings
var names = [];                           //creates an empty array
                   
alert(colors.length);    //3
alert(names.length);     //0

A unique characteristic of length is that it’s not read-only. By setting the length property, you can easily remove items from or add items to the end of the array. Consider this example:

var colors = ["red", "blue", "green"];    //creates an array with three strings
colors.length = 2;
alert(colors[2]);        //undefined

ArrayFilterExample03.htm

Here, the array colors starts out with three values. Setting the length to 2 removes the last item (in position 2), making it no longer accessible using colors[2]. If the length were set to a number greater than the number of items in the array, the new items would each get filled with the value of undefined, such as in this example:

var colors = ["red", "blue", "green"];    //creates an array with three strings
colors.length = 4;
alert(colors[3]);        //undefined

ArrayFilterExample04.htm

This code sets the length of the colors array to 4 even though it contains only three items. Position 3 does not exist in the array, so trying to access its value results in the special value undefined being returned.

The length property can also be helpful in adding items to the end of an array, as in this example:

var colors = ["red", "blue", "green"];    //creates an array with three strings
colors[colors.length] = "black";          //add a color (position 3)
colors[colors.length] = "brown";          //add another color (position 4)

ArrayFilterExample05.htm

The last item in an array is always at position length - 1, so the next available open slot is at position length. Each time an item is added after the last one in the array, the length property is automatically updated to reflect the change. That means colors[colors.length] assigns a value to position 3 in the second line of this example and to position 4 in the last line. The new length is automatically calculated when an item is placed into a position that’s outside of the current array size, which is done by adding 1 to the position, as in this example:

var colors = ["red", "blue", "green"];    //creates an array with three strings
colors[99] = "black";                     //add a color (position 99)
alert(colors.length);  //100

ArrayFilterExample06.htm

In this code, the colors array has a value inserted into position 99, resulting in a new length of 100 (99 + 1). Each of the other items, positions 3 through 98, doesn’t actually exist and so returns undefined when accessed.
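
The difference between a position that does not exist and a position that holds undefined can be checked with the in operator. This is a small sketch using console.log() in place of alert() (an assumption made so that it also runs outside a browser):

```javascript
var colors = ["red", "blue", "green"];
colors[99] = "black";

console.log(colors.length);      //100
console.log(colors[50]);         //undefined
console.log(50 in colors);       //false - position 50 was never created
console.log(99 in colors);       //true

// Compare with a position that exists but holds the value undefined.
colors[50] = undefined;
console.log(colors[50]);         //undefined
console.log(50 in colors);       //true
```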

Arrays can contain a maximum of 4,294,967,295 items, which should be plenty for almost all programming needs. If you try to add more than that number, an exception occurs. Trying to create an array with an initial size approaching this maximum may cause a long-running script error.

Detecting Arrays

Ever since ECMAScript 3 was defined, one of the classic problems has been truly determining whether a given object is an array. When dealing with a single web page, and therefore a single global scope, the instanceof operator works well:

if (value instanceof Array){
    //do something on the array
}

The one problem with instanceof is that it assumes a single global execution context. If you are dealing with multiple frames in a web page, you’re really dealing with multiple distinct global execution contexts and therefore multiple versions of the Array constructor. If you were to pass an array from one frame into a second frame, that array has a different constructor function than an array created natively in the second frame, so instanceof Array returns false for it there.

To work around this problem, ECMAScript 5 introduced the Array.isArray() method. The purpose of this method is to definitively determine if a given value is an array regardless of the global execution context in which it was created. Example usage:

if (Array.isArray(value)){
    //do something on the array
}

Internet Explorer 9+, Firefox 4+, Safari 5+, Opera 10.5+, and Chrome have all implemented Array.isArray(). For definitive array detection in browsers that haven’t yet implemented this method, see the section titled “Safe Type Detection” in Chapter 22.
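
The frame problem can be reproduced outside a browser. The sketch below assumes a Node.js environment and uses its built-in vm module to stand in for a second frame; that substitution is an assumption made for illustration, since the discussion above concerns browser frames. The isArrayFallback() function is a hypothetical name for one commonly used check based on Object.prototype.toString(); it is offered as a sketch and is not necessarily the technique described in Chapter 22.

```javascript
var vm = require("vm");

// Create an array inside a separate global context, much as a second frame would.
var foreignArray = vm.runInNewContext("[1, 2, 3]");
var localArray = [1, 2, 3];

console.log(localArray instanceof Array);      //true
console.log(foreignArray instanceof Array);    //false - different Array constructor
console.log(Array.isArray(foreignArray));      //true

// Fallback check that does not depend on a particular Array constructor.
function isArrayFallback(value) {
    return Object.prototype.toString.call(value) == "[object Array]";
}

console.log(isArrayFallback(foreignArray));    //true
console.log(isArrayFallback({ length: 0 }));   //false
```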

Conversion Methods

As mentioned previously, all objects have toLocaleString(), toString(), and valueOf() methods. When called on an array, toString() returns a comma-separated string that contains the string equivalents of each value in the array, which is to say that each item has its toString() method called to create the final string. The valueOf() method returns the array itself, so displaying its result produces the same string. Take a look at this example:

var colors = ["red", "blue", "green"];    //creates an array with three strings
alert(colors.toString());    //red,blue,green
alert(colors.valueOf());     //red,blue,green
alert(colors);               //red,blue,green

ArrayFilterExample07.htm

In this code, toString() is first called explicitly to return the string representation of the array, which combines the strings, separating them by commas. The valueOf() call returns the array itself, and the last line likewise passes the array directly into alert(). Because alert() expects a string, it calls toString() behind the scenes in both cases to get the same result as when toString() is called directly.

The toLocaleString() method may end up returning the same value as toString(), but not always. When toLocaleString() is called on an array, it creates a comma-delimited string of the array values. The only difference between this and toString() is that toLocaleString() calls each item’s toLocaleString() instead of toString() to get its string value. Consider the following example:

var person1 = {
    toLocaleString : function () {
        return "Nikolaos";
    },
    
    toString : function() {
        return "Nicholas";
    }
};
                   
var person2 = {
    toLocaleString : function () {
        return "Grigorios";
    },
    
    toString : function() {
        return "Greg";
    }
};
                   
var people = [person1, person2];
alert(people);                      //Nicholas,Greg
alert(people.toString());           //Nicholas,Greg
alert(people.toLocaleString());     //Nikolaos,Grigorios

ArrayTypeExample08.htm

Here, two objects are defined, person1 and person2. Each object defines both a toString() method and a toLocaleString() method that return different values. An array, people, is created to contain both objects. When passed into alert(), the output is "Nicholas,Greg", because the toString() method is called on each item in the array (the same as when toString() is called explicitly on the next line). When toLocaleString() is called on the array, the result is "Nikolaos,Grigorios", because this calls toLocaleString() on each array item.

The inherited methods toLocaleString() and toString() each return the array items as a comma-separated string. It’s possible to construct a string with a different separator using the join() method. The join() method accepts one argument, which is the string separator to use, and returns a string containing all items. Consider this example:

var colors = ["red", "green", "blue"];
alert(colors.join(","));      //red,green,blue
alert(colors.join("||"));     //red||green||blue

ArrayTypeJoinExample01.htm

Here, the join() method is used on the colors array to duplicate the output of toString(). By passing in a comma, the result is a comma-separated list of values. On the last line, double pipes are passed in, resulting in the string "red||green||blue". If no value or undefined is passed into the join() method, then a comma is used as the separator. Internet Explorer 7 and earlier incorrectly use the string "undefined" as the separator.

If an item in the array is null or undefined, it is represented by an empty string in the result of join(), toLocaleString(), and toString().
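
A quick sketch of that rule, using console.log() in place of alert() so that it also runs outside a browser. Each null or undefined item contributes nothing between its separators:

```javascript
var values = [1, null, undefined, 2];

console.log(values.join("-"));       //1---2
console.log(values.toString());      //1,,,2
console.log(values.join());          //1,,,2 - comma is the default separator
```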

Stack Methods

One of the interesting things about ECMAScript arrays is that they provide methods that make an array behave like other data structures. An array object can act just like a stack, which is one of a group of data structures that restrict the insertion and removal of items. A stack is referred to as a last-in-first-out (LIFO) structure, meaning that the most recently added item is the first one removed. The insertion (called a push) and removal (called a pop) of items in a stack occur at only one point: the top of the stack. ECMAScript arrays provide push() and pop() specifically to allow stack-like behavior.

The push() method accepts any number of arguments and adds them to the end of the array, returning the array’s new length. The pop() method, on the other hand, removes the last item in the array, decrements the array’s length, and returns that item. Consider this example:

var colors = new Array();                      //create an array
var count = colors.push("red", "green");       //push two items
alert(count);  //2
                   
count = colors.push("black");                  //push another item on
alert(count);  //3
                   
var item = colors.pop();                       //get the last item
alert(item);   //"black"
alert(colors.length);  //2

ArrayTypeExample09.htm

In this code, an array is created for use as a stack (note that there’s no special code required to make this work; push() and pop() are default methods on arrays). First, two strings are pushed onto the end of the array using push(), and the result is stored in the variable count (which gets the value of 2). Then, another value is pushed on, and the result is once again stored in count. Because there are now three items in the array, push() returns 3. When pop() is called, it returns the last item in the array, which is the string "black". The array then has only two items left.

The stack methods may be used in combination with all of the other array methods as well, as in this example:

var colors = ["red", "blue"];
colors.push("brown");              //add another item
colors[3] = "black";               //add an item
alert(colors.length);  //4
                   
var item = colors.pop();           //get the last item
alert(item);  //"black"

ArrayTypeExample10.htm

Here, an array is initialized with two values. A third value is added via push(), and a fourth is added by direct assignment into position 3. When pop() is called, it returns the string "black", which was the last value added to the array.
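
To show where LIFO behavior is useful, here is a minimal sketch that checks whether the brackets in a string are balanced. The function name isBalanced() and the pairs lookup object are hypothetical and exist only for this illustration. Each opening bracket is pushed, and each closing bracket must match the most recently pushed opener, which is exactly what pop() returns.

```javascript
// Returns true if every opening bracket in text is closed in the right order.
function isBalanced(text) {
    var pairs = { ")" : "(", "]" : "[", "}" : "{" };
    var stack = [];

    for (var i = 0; i < text.length; i++) {
        var ch = text.charAt(i);

        if (ch == "(" || ch == "[" || ch == "{") {
            stack.push(ch);                      //remember the opening bracket
        } else if (pairs.hasOwnProperty(ch)) {
            if (stack.pop() != pairs[ch]) {      //most recent opener must match
                return false;
            }
        }
    }

    return stack.length == 0;                    //nothing may be left open
}

console.log(isBalanced("a[0] = (b + {c: 1})"));  //true
console.log(isBalanced("(]"));                   //false
```

Calling pop() on an empty array returns undefined, so a closing bracket with no matching opener is also reported as unbalanced.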

Queue Methods

Just as stacks restrict access in a LIFO data structure, queues restrict access in a first-in-first-out (FIFO) data structure. A queue adds items to the end of a list and retrieves items from the front of the list. Because the push() method adds items to the end of an array, all that is needed to emulate a queue is a method to retrieve the first item in the array. The array method for this is called shift(), which removes the first item in the array and returns it, decrementing the length of the array by one. Using shift() in combination with push() allows arrays to be used as queues:

var colors = new Array();                      //create an array
var count = colors.push("red", "green");       //push two items
alert(count);  //2
                   
count = colors.push("black");                  //push another item on
alert(count);  //3
                   
var item = colors.shift();                     //get the first item
alert(item);   //"red"
alert(colors.length);  //2

ArrayTypeExample11.htm

This example creates an array of three colors using the push() method. The call to shift() then retrieves the first item in the array, which is "red". With that item removed, "green" is moved into the first position and "black" is moved into the second, leaving the array with two items.

ECMAScript also provides an unshift() method for arrays. As the name indicates, unshift() does the opposite of shift(): it adds any number of items to the front of an array and returns the new array length. By using unshift() in combination with pop(), it’s possible to emulate a queue in the opposite direction, where new values are added to the front of the array and values are retrieved off the back, as in this example:

var colors = new Array();                      //create an array
var count = colors.unshift("red", "green");    //add two items to the front
alert(count);  //2

count = colors.unshift("black");               //add another item to the front
alert(count);  //3

var item = colors.pop();                       //get the last item
alert(item);   //"green"
alert(colors.length);  //2

ArrayTypeExample12.htm

In this code, an array is created and then populated by using unshift(). First "red" and "green" are added to the array, and then "black" is added, resulting in an order of "black", "red", "green". When pop() is called, it removes the last item, "green", and returns it.

Internet Explorer 7 and earlier always return undefined, instead of the new length of the array, for unshift(). Internet Explorer 8 returns the length correctly when not in compatibility mode.
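
The two queue directions can be compared in a short sketch. The job names and variable names are hypothetical, and console.log() is used in place of alert() so that the snippet also runs outside a browser. In both cases the item that was added first is the item that comes out first.

```javascript
// Forward queue: add at the back with push(), remove from the front with shift().
var queue = [];
queue.push("first job");
queue.push("second job");
queue.push("third job");

var processed = [];
while (queue.length > 0) {
    processed.push(queue.shift());   //always take the oldest item
}
console.log(processed.join(","));    //first job,second job,third job

// Reverse queue: add at the front with unshift(), remove from the back with pop().
var reverseQueue = [];
reverseQueue.unshift("first job");
reverseQueue.unshift("second job");

var oldest = reverseQueue.pop();
console.log(oldest);                 //first job
```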

Reordering Methods

Two methods deal directly with the reordering of items already in the array: reverse() and sort(). As one might expect, the reverse() method simply reverses the order of items in an array. Take this code for example:

var values = [1, 2, 3, 4, 5];
values.reverse();
alert(values);       //5,4,3,2,1

ArrayTypeExample13.htm

Here, the array’s values are initially set to 1, 2, 3, 4, and 5, in that order. Calling reverse() on the array reverses the order to 5, 4, 3, 2, 1. This method is fairly straightforward but doesn’t provide much flexibility, which is where the sort() method comes in.

By default, the sort() method puts the items in ascending order — with the smallest value first and the largest value last. To do this, the sort() method calls the String() casting function on every item and then compares the strings to determine the correct order. This occurs even if all items in an array are numbers, as in this example:

var values = [0, 1, 5, 10, 15];
values.sort();
alert(values);    //0,1,10,15,5

ArrayTypeExample14.htm

Even though the values in this example begin in correct numeric order, the sort() method changes that order based on their string equivalents. So even though 5 is less than 10, the string "10" comes before "5" when doing a string comparison, so the array is updated accordingly. Clearly, this is not an optimal solution in many cases, so the sort() method allows you to pass in a comparison function that indicates which value should come before which.

A comparison function accepts two arguments and returns a negative number if the first argument should come before the second, a zero if the arguments are equal, or a positive number if the first argument should come after the second. Here’s an example of a simple comparison function:

function compare(value1, value2) {
    if (value1 < value2) {
        return -1;
    } else if (value1 > value2) {
        return 1;
    } else {
        return 0;
    }
}

ArrayTypeExample15.htm

This comparison function works for most data types and can be used by passing it as an argument to the sort() method, as in the following example:

var values = [0, 1, 5, 10, 15];
values.sort(compare);
alert(values);    //0,1,5,10,15

When the comparison function is passed to the sort() method, the numbers remain in the correct order. Of course, the comparison function could produce results in descending order if you simply switch the return values like this:

function compare(value1, value2) {
    if (value1 < value2) {
        return 1;
    } else if (value1 > value2) {
        return -1;
    } else {
        return 0;
    }
}
                   
var values = [0, 1, 5, 10, 15];
values.sort(compare);
alert(values);    //15,10,5,1,0

ArrayTypeExample16.htm

In this modified example, the comparison function returns 1 if the first value is smaller than the second and -1 if the first value is larger, which is the reverse of the original function. Swapping the return values means the larger value will come first and the array will be sorted in descending order. Of course, if you just want to reverse the order of the items in the array, reverse() is a much faster alternative than sorting.

Both reverse() and sort() return a reference to the array on which they were applied.

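A much simpler version of the comparison function can be used with numeric types and with objects whose valueOf() method returns numeric values (such as the Date object). In either case, you can simply subtract one value from the other. Subtracting the first value from the second, as shown here, sorts in descending order; swapping the operands (value1 - value2) sorts in ascending order: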

function compare(value1, value2){
    return value2 - value1;
}

Because comparison functions work by returning a number less than zero, zero, or a number greater than zero, the subtraction operation handles all of the cases appropriately.
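
The effect of operand order can be seen in a brief sketch. The function names compareAscending() and compareDescending() are hypothetical, and console.log() is used in place of alert() so that the snippet also runs outside a browser. Calling slice() with no arguments copies the array first, so each sort works on its own copy and the original order is preserved for the next call.

```javascript
function compareAscending(value1, value2) {
    return value1 - value2;      //negative when value1 is smaller
}

function compareDescending(value1, value2) {
    return value2 - value1;      //negative when value1 is larger
}

var values = [10, 1, 15, 0, 5];
var ascending = values.slice().sort(compareAscending);
var descending = values.slice().sort(compareDescending);

console.log(ascending.join(","));     //0,1,5,10,15
console.log(descending.join(","));    //15,10,5,1,0

// Date objects work too, because subtraction uses their numeric valueOf().
var dates = [new Date(2011, 0, 1), new Date(2009, 5, 15), new Date(2010, 11, 31)];
dates.sort(compareAscending);
console.log(dates[0].getFullYear());  //2009
```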

Manipulation Methods

There are various ways to work with the items already contained in an array. The concat() method, for instance, allows you to create a new array based on all of the items in the current array. This method begins by creating a copy of the array and then appending the method arguments to the end and returning the newly constructed array. When no arguments are passed in, concat() simply clones the array and returns it. If one or more arrays are passed in, concat() appends each item in these arrays to the end of the result. If the values are not arrays, they are simply appended to the end of the resulting array. Consider this example:

var colors = ["red", "green", "blue"];
var colors2 = colors.concat("yellow", ["black", "brown"]);
                   
alert(colors);     //red,green,blue        
alert(colors2);    //red,green,blue,yellow,black,brown

ArrayTypeConcatExample01.htm

This code begins with the colors array containing three values. The concat() method is called on colors, passing in the string "yellow" and an array containing "black" and "brown". The result, stored in colors2, contains "red", "green", "blue", "yellow", "black", and "brown". The original array, colors, remains unchanged.
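
One detail worth seeing in code is that concat() unpacks only the arrays passed directly as arguments; an array nested inside one of them is appended as a single item. The sketch below uses hypothetical variable names and console.log() in place of alert() so that it also runs outside a browser.

```javascript
var colors = ["red", "green"];

// The argument array is unpacked, but the array inside it is kept whole.
var nested = colors.concat(["blue", ["yellow", "purple"]]);
console.log(nested.length);               //4
console.log(nested[3] instanceof Array);  //true
console.log(nested[3].join(","));         //yellow,purple

// With no arguments, concat() returns a separate copy of the array.
var copy = colors.concat();
copy.push("black");
console.log(copy.join(","));              //red,green,black
console.log(colors.join(","));            //red,green
```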

The next method, slice(), creates an array that contains one or more items already contained in an array. The slice() method may accept one or two arguments: the starting and stopping positions of the items to return. If only one argument is present, the method returns all items between that position and the end of the array. If there are two arguments, the method returns all items between the start position and the end position, not including the item in the end position. Keep in mind that this operation does not affect the original array in any way. Consider the following:

var colors = ["red", "green", "blue", "yellow", "purple"];
var colors2 = colors.slice(1);
var colors3 = colors.slice(1,4);
                   
alert(colors2);   //green,blue,yellow,purple
alert(colors3);   //green,blue,yellow

ArrayTypeSliceExample01.htm

In this example, the colors array starts out with five items. Calling slice() and passing in 1 yields an array with four items, omitting "red" because the operation began copying from position 1, which contains "green". The resulting colors2 array contains "green", "blue", "yellow", and "purple". The colors3 array is constructed by calling slice() and passing in 1 and 4, meaning that the method will begin copying from the item in position 1 and stop copying at the item in position 3. As a result, colors3 contains "green", "blue", and "yellow".

If either the start or end position of slice() is a negative number, then that number is added to the length of the array to determine the appropriate location. For example, calling slice(-2, -1) on an array with five items is the same as calling slice(3, 4). If the end position is smaller than the start, then an empty array is returned.
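
These rules can be confirmed with a short sketch, again using console.log() in place of alert() so that it also runs outside a browser. The variable names are hypothetical.

```javascript
var colors = ["red", "green", "blue", "yellow", "purple"];

var negative = colors.slice(-2, -1);     //same positions as slice(3, 4)
var positive = colors.slice(3, 4);
var lastTwo = colors.slice(-2);          //from position 3 to the end
var empty = colors.slice(3, 1);          //end position before start position

console.log(negative.join(","));    //yellow
console.log(positive.join(","));    //yellow
console.log(lastTwo.join(","));     //yellow,purple
console.log(empty.length);          //0
console.log(colors.length);         //5 - the original array is unchanged
```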

Perhaps the most powerful array method is splice(), which can be used in a variety of ways. The main purpose of splice() is to insert items into the middle of an array, but there are three distinct ways of using this method. They are as follows:

Deletion — Any number of items can be deleted from the array by specifying just two arguments: the position of the first item to delete and the number of items to delete. For example, splice(0, 2) deletes the first two items.
Insertion — Items can be inserted into a specific position by providing three or more arguments: the starting position, 0 (the number of items to delete), and the item to insert. Optionally, you can specify a fourth parameter, fifth parameter, or any number of other parameters to insert. For example, splice(2, 0, "red", "green") inserts the strings "red" and "green" into the array at position 2.
Replacement — Items can be inserted into a specific position while simultaneously deleting items, if you specify three or more arguments: the starting position, the number of items to delete, and any number of items to insert. The number of items to insert doesn’t have to match the number of items to delete. For example, splice(2, 1, "red", "green") deletes one item at position 2 and then inserts the strings "red" and "green" into the array at position 2.

The splice() method always returns an array that contains any items that were removed from the array (or an empty array if no items were removed). These three uses are illustrated in the following code:

var colors = ["red", "green", "blue"];
var removed = colors.splice(0,1);                  //remove the first item
alert(colors);     //green,blue
alert(removed);    //red - one item array
                   
removed = colors.splice(1, 0, "yellow", "orange"); //insert two items at position 1
alert(colors);     //green,yellow,orange,blue
alert(removed);    //empty array
                   
removed = colors.splice(1, 1, "red", "purple");    //insert two values, remove one
alert(colors);     //green,red,purple,orange,blue
alert(removed);    //yellow - one item array

ArrayTypeSpliceExample01.htm

This example begins with the colors array containing three items. When splice() is called the first time, it simply removes the first item, leaving colors with the items "green" and "blue". The second time splice() is called, it inserts two items at position 1, resulting in colors containing "green", "yellow", "orange", and "blue". No items are removed at this point, so an empty array is returned. The last time splice() is called, it removes one item, beginning in position 1, and inserts "red" and "purple". After all of this code has been executed, the colors array contains "green", "red", "purple", "orange", and "blue".

Location Methods

ECMAScript 5 adds two item location methods to array instances: indexOf() and lastIndexOf(). Each of these methods accepts two arguments: the item to look for and an optional index from which to start looking. The indexOf() method starts searching from the front of the array (item 0) and continues to the back, whereas lastIndexOf() starts from the last item in the array and continues to the front.

The methods each return the position of the item in the array or -1 if the item isn’t in the array. An identity comparison is used when comparing the first argument to each item in the array, meaning that the items must be strictly equal as if compared using ===. Here are some examples of this usage:

var numbers = [1,2,3,4,5,4,3,2,1];
                   
alert(numbers.indexOf(4));        //3
alert(numbers.lastIndexOf(4));    //5
                   
alert(numbers.indexOf(4, 4));     //5
alert(numbers.lastIndexOf(4, 4)); //3
                   
var person = { name: "Nicholas" };
var people = [{ name: "Nicholas" }];
var morePeople = [person];
                   
alert(people.indexOf(person));     //-1
alert(morePeople.indexOf(person)); //0

ArrayIndexOfExample01.htm

The indexOf() and lastIndexOf() methods make it trivial to locate specific items inside of an array and are supported in Internet Explorer 9+, Firefox 2+, Safari 3+, Opera 9.5+, and Chrome.

Iterative Methods

ECMAScript 5 defines five iterative methods for arrays. Each of the methods accepts two arguments: a function to run on each item and an optional scope object in which to run the function (affecting the value of this). The function passed into one of these methods will receive three arguments: the array item value, the position of the item in the array, and the array object itself. Depending on the method, the results of this function’s execution may or may not affect the method’s return value. The iterative methods are as follows:

every() — Runs the given function on every item in the array and returns true if the function returns true for every item.
filter() — Runs the given function on every item in the array and returns an array of all items for which the function returns true.
forEach() — Runs the given function on every item in the array. This method has no return value.
map() — Runs the given function on every item in the array and returns the result of each function call in an array.
some() — Runs the given function on every item in the array and returns true if the function returns true for any one item.

These methods do not change the values contained in the array.
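The optional second argument mentioned earlier is easiest to see in a small sketch. Here a hypothetical limits object (not part of the original examples) is passed as the scope object, so this inside the callback refers to it; console.log() is used in place of alert() so the code can run outside a browser.

```javascript
var limits = { min: 2 };   //hypothetical scope object
var numbers = [1,2,3,4,5,4,3,2,1];

var aboveMin = numbers.filter(function(item, index, array){
    //this is the limits object because it was passed as the second argument
    return item > this.min;
}, limits);

console.log(aboveMin.join(","));  //3,4,5,4,3
console.log(numbers.join(","));   //1,2,3,4,5,4,3,2,1 - unchanged
```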

Of these methods, the two most similar are every() and some(), which both query the array for items matching some criteria. For every(), the passed-in function must return true for every item in order for the method to return true; otherwise, it returns false. The some() method, on the other hand, returns true if even one of the items causes the passed-in function to return true. Here is an example:

var numbers = [1,2,3,4,5,4,3,2,1];
                   
var everyResult = numbers.every(function(item, index, array){
    return (item > 2);
});
                   
alert(everyResult);       //false
                   
var someResult = numbers.some(function(item, index, array){
    return (item > 2);
});
                   
alert(someResult);       //true

ArrayEveryAndSomeExample01.htm

This code calls both every() and some() with a function that returns true if the given item is greater than 2. For every(), the result is false, because only some of the items fit the criteria. For some(), the result is true, because at least one of the items is greater than 2.

The next method is filter(), which uses the given function to determine if an item should be included in the array that it returns. For example, to return an array of all numbers greater than 2, the following code can be used:

var numbers = [1,2,3,4,5,4,3,2,1];
                   
var filterResult = numbers.filter(function(item, index, array){
    return (item > 2);
});
                   
alert(filterResult);   //[3,4,5,4,3]

ArrayFilterExample01.htm

Here, an array containing the values 3, 4, 5, 4, and 3 is created and returned by the call to filter(), because the passed-in function returns true for each of those items. This method is very helpful when querying an array for all items matching some criteria.

The map() method also returns an array. Each item in the array is the result of running the passed-in function on the original array item in the same location. For example, you can multiply every number in an array by two and get back an array of the results, as shown here:

var numbers = [1,2,3,4,5,4,3,2,1];
                   
var mapResult = numbers.map(function(item, index, array){
    return item * 2;
});
                   
alert(mapResult);   //[2,4,6,8,10,8,6,4,2]

ArrayMapExample01.htm

The code in this example returns an array containing the result of multiplying each number by two. This method is helpful when creating arrays whose items correspond to one another.

The last method is forEach(), which simply runs the given function on every item in an array. It has no return value and is essentially the same as iterating over an array using a for loop. Here’s an example:

var numbers = [1,2,3,4,5,4,3,2,1];
                   
numbers.forEach(function(item, index, array){
    //do something here
});
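As a more concrete sketch of that equivalence, the following totals the same array once with forEach() and once with a for loop. The variable names total and loopTotal are illustrative, and console.log() replaces alert() so the code can run outside a browser.

```javascript
var numbers = [1,2,3,4,5,4,3,2,1];

var total = 0;
var returned = numbers.forEach(function(item, index, array){
    total += item;
});

//the same work done with a for loop
var loopTotal = 0;
for (var i = 0; i < numbers.length; i++){
    loopTotal += numbers[i];
}

console.log(total);      //25
console.log(loopTotal);  //25
console.log(returned);   //undefined - forEach() has no return value
```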

All of these array methods ease the processing of arrays by performing a number of different operations. The iterative methods are supported in Internet Explorer 9+, Firefox 2+, Safari 3+, Opera 9.5+, and Chrome.

Reduction Methods

ECMAScript 5 also introduced two reduction methods for arrays: reduce() and reduceRight(). Both methods iterate over all items in the array and build up a value that is ultimately returned. The reduce() method does this starting at the first item and traveling toward the last, whereas reduceRight() starts at the last and travels toward the first.

Both methods accept two arguments: a function to call on each item and an optional initial value upon which the reduction is based. The function passed into reduce() or reduceRight() accepts four arguments: the previous value, the current value, the item’s index, and the array object. Any value returned from the function is automatically passed in as the first argument for the next item. When no initial value is supplied, the first iteration occurs on the second item in the array, so the first argument is the first item in the array and the second argument is the second item in the array. When an initial value is supplied, the first iteration occurs on the first item, and the first argument is the initial value.

You can use the reduce() method to perform operations such as adding all numbers in an array. Here’s an example:

var values = [1,2,3,4,5];
var sum = values.reduce(function(prev, cur, index, array){
    return prev + cur;
});
alert(sum); //15

ArrayReductionExample01.htm

The first time the callback function is executed, prev is 1 and cur is 2. The second time, prev is 3 (the result of adding 1 and 2), and cur is 3 (the third item in the array). This sequence continues until all items have been visited and the result is returned.

The reduceRight() method works in the same way, just in the opposite direction. Consider the following example:

var values = [1,2,3,4,5];
var sum = values.reduceRight(function(prev, cur, index, array){
    return prev + cur;
});
alert(sum); //15

In this version of the code, prev is 5 and cur is 4 the first time the callback function is executed. The result is the same, of course, since the operation is simple addition.

The decision to use reduce() or reduceRight() depends solely on the direction in which the items in the array should be visited. They are exactly equal in every other way.
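Two points from this section are not covered by the sample code: the optional initial value, and a case where direction changes the result. A minimal sketch of both follows; the letters array and the variable names are made up for illustration, and console.log() replaces alert().

```javascript
var values = [1,2,3,4,5];

//with an initial value, the first call receives prev = 10 and cur = 1
var sumFromTen = values.reduce(function(prev, cur, index, array){
    return prev + cur;
}, 10);

//string concatenation is order-sensitive, so direction matters here
var letters = ["a", "b", "c"];
var forward = letters.reduce(function(prev, cur){
    return prev + cur;
});
var backward = letters.reduceRight(function(prev, cur){
    return prev + cur;
});

console.log(sumFromTen);  //25
console.log(forward);     //abc
console.log(backward);    //cba
```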

The reduction methods are supported in Internet Explorer 9+, Firefox 3+, Safari 4+, Opera 10.5+, and Chrome.

THE DATE TYPE

The ECMAScript Date type is based on an early version of java.util.Date from Java. As such, the Date type stores dates as the number of milliseconds that have passed since midnight on January 1, 1970 UTC (Coordinated Universal Time). This millisecond format can express instants roughly 285,616 years before or after January 1, 1970, although ECMA-262 limits Date objects themselves to exactly 100,000,000 days on either side of that date.

To create a date object, use the new operator along with the Date constructor, like this:

var now = new Date();

DateTypeExample01.htm

When the Date constructor is used without any arguments, the created object is assigned the current date and time. To create a date based on another date or time, you must pass in the millisecond representation of the date (the number of milliseconds after midnight, January 1, 1970 UTC). To aid in this process, ECMAScript provides two methods: Date.parse() and Date.UTC().
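As a small sketch of the millisecond representation, passing a number to the constructor creates a date that many milliseconds after the epoch. The variable names are illustrative; toISOString() is an ECMAScript 5 method, and console.log() is used instead of alert().

```javascript
var msPerDay = 24 * 60 * 60 * 1000;

var epoch = new Date(0);               //midnight, January 1, 1970 UTC
var oneDayLater = new Date(msPerDay);  //exactly one day after the epoch

console.log(epoch.getTime());            //0
console.log(epoch.toISOString());        //1970-01-01T00:00:00.000Z
console.log(oneDayLater.toISOString());  //1970-01-02T00:00:00.000Z
```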

The Date.parse() method accepts a string argument representing a date. It attempts to convert the string into a millisecond representation of a date. ECMA-262 fifth edition defines which date formats Date.parse() should support, filling in a void left by the third edition. All implementations must now support the following date formats:

month/date/year (such as 6/13/2004)
month_name date, year (such as January 12, 2004)
day_of_week month_name date year hours:minutes:seconds time_zone (such as Tue May 25 2004 00:00:00 GMT-0700)
ISO 8601 extended format YYYY-MM-DDTHH:mm:ss.sssZ (such as 2004-05-25T00:00:00). This works only in ECMAScript 5–compliant implementations.

For instance, to create a date object for May 25, 2004, you can use the following code:

var someDate = new Date(Date.parse("May 25, 2004"));

DateTypeExample01.htm

If the string passed into Date.parse() doesn’t represent a date, then it returns NaN. The Date constructor will call Date.parse() behind the scenes if a string is passed in directly, meaning that the following code is identical to the previous example:

var someDate = new Date("May 25, 2004");

This code produces the same result as the previous example.
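The NaN case is worth guarding against in practice. The following is a minimal sketch, assuming an ECMAScript 5 environment for the ISO 8601 string; the input "hello" is just an arbitrary non-date string, and console.log() replaces alert().

```javascript
//a string that is not a date produces NaN
var bad = Date.parse("hello");
console.log(isNaN(bad));                //true

//the constructor accepts the same string but yields an invalid date
var badDate = new Date("hello");
console.log(isNaN(badDate.getTime()));  //true

//an ISO 8601 string with an explicit Z is interpreted as UTC
var iso = Date.parse("2004-05-25T00:00:00Z");
console.log(iso === Date.UTC(2004, 4, 25));  //true
```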

There are a lot of quirks surrounding the Date type and its implementation in various browsers. There is a tendency to replace out-of-range values with the current value to produce an output, so when trying to parse "January 32, 2007", some browsers will interpret it as "February 1, 2007", whereas Opera tends to insert the current day of the current month, returning "January current_day, 2007". This means running the code on September 21 returns "January 21, 2007".

The Date.UTC() method also returns the millisecond representation of a date but constructs that value using different information than Date.parse(). The arguments for Date.UTC() are the year, the zero-based month (January is 0, February is 1, and so on), the day of the month (1 through 31), and the hours (0 through 23), minutes, seconds, and milliseconds of the time. Of these arguments, only the first two (year and month) are required. If the day of the month isn’t supplied, it’s assumed to be 1, while all other omitted arguments are assumed to be 0. Here are two examples of Date.UTC() in action:

//January 1, 2000 at midnight GMT
var y2k = new Date(Date.UTC(2000, 0));
                   
//May 5, 2005 at 5:55:55 PM GMT
var allFives = new Date(Date.UTC(2005, 4, 5, 17, 55, 55));

DateTypeUTCExample01.htm

Two dates are created in this example. The first date is for midnight (GMT) on January 1, 2000, which is represented by the year 2000 and the month 0 (which is January). Because the other arguments are filled in (the day of the month as 1 and everything else as 0), the result is the first day of the month at midnight. The second date represents May 5, 2005, at 5:55:55 PM GMT. Even though the date and time contain only fives, creating this date requires some different numbers: the month must be set to 4 because months are zero-based, and the hour must be set to 17 because hours are represented as 0 through 23. The rest of the arguments are as expected.

As with Date.parse(), Date.UTC() is mimicked by the Date constructor but with one major difference: the date and time created are in the local time zone, not in GMT. However, the Date constructor takes the same arguments as Date.UTC(), so if the first argument is a number, the constructor assumes that it is the year of a date, the second argument is the month, and so on. The preceding example can then be rewritten as this:

//January 1, 2000 at midnight in local time
var y2k = new Date(2000, 0);
                   
//May 5, 2005 at 5:55:55 PM local time
var allFives = new Date(2005, 4, 5, 17, 55, 55);

DateTypeConstructorExample01.htm

This code creates the same two dates as the previous example, but this time both dates are in the local time zone as determined by the system settings.
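The difference between the two approaches can be made visible by comparing them directly. In this sketch (variable names are illustrative, console.log() replaces alert()), the gap between the two dates should equal the local time-zone offset reported by getTimezoneOffset(), which is described later in this chapter.

```javascript
var utcFives = new Date(Date.UTC(2005, 4, 5, 17, 55, 55));
var localFives = new Date(2005, 4, 5, 17, 55, 55);

//each date reads 17 hours in its own frame of reference
console.log(utcFives.getUTCHours());  //17
console.log(localFives.getHours());   //17

//the difference in minutes is the local offset from UTC
var diffMinutes = (localFives.getTime() - utcFives.getTime()) / 60000;
console.log(diffMinutes === localFives.getTimezoneOffset());  //true
```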

ECMAScript 5 adds Date.now(), which returns the millisecond representation of the date and time at which the method is executed. This method makes it trivial to use Date objects for code profiling, such as:

//get start time
var start = Date.now();
 
//call a function
doSomething();
 
//get stop time
var stop = Date.now(),
    result = stop - start;

The Date.now() method has been implemented in Internet Explorer 9+, Firefox 3+, Safari 3+, Opera 10.5+, and Chrome. For browsers that don’t yet support this method, you can simulate the same behavior by using the + operator to convert a Date object into a number:

//get start time
var start = +new Date();
 
//call a function
doSomething();
 
//get stop time
var stop = +new Date(),
    result = stop - start;

Inherited Methods

As with the other reference types, the Date type overrides toLocaleString(), toString(), and valueOf(), though unlike the previous types, each method returns something different. The Date type’s toLocaleString() method returns the date and time in a format appropriate for the locale in which the browser is being run. This often means that the format includes AM or PM for the time and doesn’t include any time-zone information (the exact format varies from browser to browser). The toString() method typically returns the date and time with time-zone information, and the time is typically indicated in 24-hour notation (hours ranging from 0 to 23). The following list displays the formats that various browsers use for toLocaleString() and toString() when representing the date/time of February 1, 2007, at midnight PST (Pacific Standard Time) in the “en-US” locale:

Internet Explorer 8

toLocaleString() – Thursday, February 01, 2007 12:00:00 AM

toString() – Thu Feb 1 00:00:00 PST 2007

Firefox 3.5

toLocaleString() – Thursday, February 01, 2007 12:00:00 AM

toString() – Thu Feb 01 2007 00:00:00 GMT-0800 (Pacific Standard Time)

Safari 4

toLocaleString() – Thursday, February 01, 2007 00:00:00

toString() – Thu Feb 01 2007 00:00:00 GMT-0800 (Pacific Standard Time)

Chrome 4

toLocaleString() – Thu Feb 01 2007 00:00:00 GMT-0800 (Pacific Standard Time)

toString() – Thu Feb 01 2007 00:00:00 GMT-0800 (Pacific Standard Time)

Opera 10

toLocaleString() – 2/1/2007 12:00:00 AM

toString() – Thu, 01 Feb 2007 00:00:00 GMT-0800

As you can see, there are some pretty significant differences between the formats that browsers return for each method. These differences mean toLocaleString() and toString() are really useful only for debugging purposes, not for display purposes.

The valueOf() method for the Date type doesn’t return a string at all, because it is overridden to return the milliseconds representation of the date so that operators (such as less-than and greater-than) will work appropriately for date values. Consider this example:

var date1 = new Date(2007, 0, 1);          //"January 1, 2007"
var date2 = new Date(2007, 1, 1);          //"February 1, 2007"
                   
alert(date1 < date2);  //true
alert(date1 > date2);  //false

DateTypeValueOfExample01.htm

The date January 1, 2007, comes before February 1, 2007, so it would make sense to say that the former is less than the latter. Because the milliseconds representation of January 1, 2007, is less than that of February 1, 2007, the less-than operator returns true when the dates are compared, providing an easy way to determine the order of dates.
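Because valueOf() returns a number, dates can also be subtracted, but note that the equality operators still compare object references rather than values. The sketch below uses Date.UTC() so the arithmetic does not depend on the local time zone; the variable names are illustrative and console.log() replaces alert().

```javascript
var date1 = new Date(Date.UTC(2007, 0, 1));  //January 1, 2007 UTC
var date2 = new Date(Date.UTC(2007, 1, 1));  //February 1, 2007 UTC

//subtraction converts both dates to milliseconds
var days = (date2 - date1) / (24 * 60 * 60 * 1000);
console.log(days);  //31

//equality compares object identity, so compare the millisecond values instead
var sameMoment = new Date(date1.getTime());
console.log(date1 == sameMoment);                       //false
console.log(date1.getTime() === sameMoment.getTime());  //true
```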

Date-Formatting Methods

There are several Date type methods used specifically to format the date as a string. They are as follows:

toDateString() — Displays the date’s day of the week, month, day of the month, and year in an implementation-specific format.
toTimeString() — Displays the date’s hours, minutes, seconds, and time zone in an implementation-specific format.
toLocaleDateString() — Displays the date’s day of the week, month, day of the month, and year in an implementation- and locale-specific format.
toLocaleTimeString() — Displays the date’s hours, minutes, and seconds in an implementation-specific format.
toUTCString() — Displays the complete UTC date in an implementation-specific format.

The output of these methods, as with toLocaleString() and toString(), varies widely from browser to browser and therefore can’t be employed in a user interface for consistent display of a date.

There is also a method called toGMTString(), which is equivalent to toUTCString() and is provided for backwards compatibility. However, the specification recommends that new code use toUTCString() exclusively.

Date/Time Component Methods

The remaining methods of the Date type (listed in the following table) deal directly with getting and setting specific parts of the date value. Note that references to a UTC date mean the date value when interpreted without a time-zone offset (the date when converted to GMT).

METHOD	DESCRIPTION
getTime()	Returns the milliseconds representation of the date; same as valueOf().
setTime(milliseconds)	Sets the milliseconds representation of the date, thus changing the entire date.
getFullYear()	Returns the four-digit year (2007 instead of just 07).
getUTCFullYear()	Returns the four-digit year of the UTC date value.
setFullYear(year)	Sets the year of the date. The year must be given with four digits (2007 instead of just 07).
setUTCFullYear(year)	Sets the year of the UTC date. The year must be given with four digits (2007 instead of just 07).
getMonth()	Returns the month of the date, where 0 represents January and 11 represents December.
getUTCMonth()	Returns the month of the UTC date, where 0 represents January and 11 represents December.
setMonth(month)	Sets the month of the date, which is any number 0 or greater. Numbers greater than 11 add years.
setUTCMonth(month)	Sets the month of the UTC date, which is any number 0 or greater. Numbers greater than 11 add years.
getDate()	Returns the day of the month (1 through 31) for the date.
getUTCDate()	Returns the day of the month (1 through 31) for the UTC date.
setDate(date)	Sets the day of the month for the date. If the date is greater than the number of days in the month, the month value also gets increased.
setUTCDate(date)	Sets the day of the month for the UTC date. If the date is greater than the number of days in the month, the month value also gets increased.
getDay()	Returns the date’s day of the week as a number (where 0 represents Sunday and 6 represents Saturday).
getUTCDay()	Returns the UTC date’s day of the week as a number (where 0 represents Sunday and 6 represents Saturday).
getHours()	Returns the date’s hours as a number between 0 and 23.
getUTCHours()	Returns the UTC date’s hours as a number between 0 and 23.
setHours(hours)	Sets the date’s hours. Setting the hours to a number greater than 23 also increments the day of the month.
setUTCHours(hours)	Sets the UTC date’s hours. Setting the hours to a number greater than 23 also increments the day of the month.
getMinutes()	Returns the date’s minutes as a number between 0 and 59.
getUTCMinutes()	Returns the UTC date’s minutes as a number between 0 and 59.
setMinutes(minutes)	Sets the date’s minutes. Setting the minutes to a number greater than 59 also increments the hour.
setUTCMinutes(minutes)	Sets the UTC date’s minutes. Setting the minutes to a number greater than 59 also increments the hour.
getSeconds()	Returns the date’s seconds as a number between 0 and 59.
getUTCSeconds()	Returns the UTC date’s seconds as a number between 0 and 59.
setSeconds(seconds)	Sets the date’s seconds. Setting the seconds to a number greater than 59 also increments the minutes.
setUTCSeconds(seconds)	Sets the UTC date’s seconds. Setting the seconds to a number greater than 59 also increments the minutes.
getMilliseconds()	Returns the date’s milliseconds.
getUTCMilliseconds()	Returns the UTC date’s milliseconds.
setMilliseconds(milliseconds)	Sets the date’s milliseconds.
setUTCMilliseconds(milliseconds)	Sets the UTC date’s milliseconds.
getTimezoneOffset()	Returns the number of minutes that the local time zone is offset from UTC. For example, Eastern Standard Time returns 300. This value changes when an area goes into Daylight Saving Time.
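Because the built-in formatting methods vary between browsers, the component methods in the table are one way to build a consistent string by hand. The following is only a sketch: pad() and formatUTC() are hypothetical helpers, not part of ECMAScript, and console.log() replaces alert(). It also shows the rollover behavior described for setUTCDate() and setUTCMonth().

```javascript
//hypothetical helper: left-pad a number to two digits
function pad(n){
    return n < 10 ? "0" + n : "" + n;
}

//hypothetical helper: format a date as YYYY-MM-DD using the UTC getters
function formatUTC(date){
    return date.getUTCFullYear() + "-" +
           pad(date.getUTCMonth() + 1) + "-" +   //months are zero-based
           pad(date.getUTCDate());
}

var d = new Date(Date.UTC(2007, 0, 31));
var before = formatUTC(d);
console.log(before);         //2007-01-31

d.setUTCDate(32);            //January has 31 days, so this rolls into February
var afterDate = formatUTC(d);
console.log(afterDate);      //2007-02-01

d.setUTCMonth(12);           //month 12 rolls into January of the next year
var afterMonth = formatUTC(d);
console.log(afterMonth);     //2008-01-01
```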

THE REGEXP TYPE

ECMAScript supports regular expressions through the RegExp type. Regular expressions are easy to create using syntax similar to Perl, as shown here:

var expression = /pattern/flags;

The pattern part of the expression can be any simple or complicated regular expression, including character classes, quantifiers, grouping, lookaheads, and backreferences. Each expression can have zero or more flags indicating how the expression should behave. Three supported flags represent matching modes, as follows:

g — Indicates global mode, meaning the pattern will be applied to all of the string instead of stopping after the first match is found.
i — Indicates case-insensitive mode, meaning the case of the pattern and the string are ignored when determining matches.
m — Indicates multiline mode, meaning the pattern will continue looking for matches after reaching the end of one line of text.

A regular expression is created using a combination of a pattern and these flags to produce different results, as in this example:

/*
 * Match all instances of "at" in a string.
 */
var pattern1 = /at/g;
                   
/*
 * Match the first instance of "bat" or "cat", regardless of case.
 */
var pattern2 = /[bc]at/i;
                   
/*
 * Match all three-character combinations ending with "at", regardless of case.
 */
var pattern3 = /.at/gi;
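The effect of the i and m flags can be observed with a quick sketch like this one. It uses the test() method, which is covered later in this chapter; the sample strings and variable names are made up for illustration, and console.log() replaces alert().

```javascript
var twoLines = "Bat\ncat";

//without i, case must match exactly
var caseSensitive = /cat/.test("CAT");
var caseInsensitive = /cat/i.test("CAT");

//without m, ^ matches only at the very start of the string
var singleLine = /^cat/.test(twoLines);
var multiLine = /^cat/m.test(twoLines);

console.log(caseSensitive);    //false
console.log(caseInsensitive);  //true
console.log(singleLine);       //false
console.log(multiLine);        //true
```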

As with regular expressions in other languages, all metacharacters must be escaped when used as part of the pattern. The metacharacters are as follows:

( [ { \ ^ $ | ) ] } ? * + .

Each metacharacter has one or more uses in regular-expression syntax and so must be escaped by a backslash when you want to match the character in a string. Here are some examples:

/*
 * Match the first instance of "bat" or "cat", regardless of case.
 */
var pattern1 = /[bc]at/i;
                   
/*
 * Match the first instance of "[bc]at", regardless of case.
 */
var pattern2 = /\[bc\]at/i;
                   
/*
 * Match all three-character combinations ending with "at", regardless of case.
 */
var pattern3 = /.at/gi;
                   
/*
 * Match all instances of ".at", regardless of case.
 */
var pattern4 = /\.at/gi;

In this code, pattern1 matches the first instance of "bat" or "cat", regardless of case. To match "[bc]at" directly, both square brackets need to be escaped with a backslash, as in pattern2. In pattern3, the dot indicates that any character can precede "at" to be a match. If you want to match ".at", then the dot needs to be escaped, as in pattern4.
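The difference an escape makes can be confirmed with a short sketch. The variable names and test strings here are illustrative, test() is described later in this chapter, and console.log() replaces alert().

```javascript
var anyCharAt = /.at/;               //dot matches any character
var literalDotAt = /\.at/;           //escaped dot matches only "."
var literalBrackets = /\[bc\]at/i;   //escaped brackets match "[" and "]"

console.log(anyCharAt.test("cat"));           //true
console.log(literalDotAt.test("cat"));        //false
console.log(literalDotAt.test("file.at"));    //true
console.log(literalBrackets.test("bat"));     //false
console.log(literalBrackets.test("[bc]at"));  //true
```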

The preceding examples all define regular expressions using the literal form. Regular expressions can also be created by using the RegExp constructor, which accepts two arguments: a string pattern to match and an optional string of flags to apply. Any regular expression that can be defined using literal syntax can also be defined using the constructor, as in this example:

/*
 * Match the first instance of "bat" or "cat", regardless of case.
 */
var pattern1 = /[bc]at/i;
                   
/*
 * Same as pattern1, just using the constructor.
 */
var pattern2 = new RegExp("[bc]at", "i");

Here, pattern1 and pattern2 define equivalent regular expressions. Note that both arguments of the RegExp constructor are strings (regular-expression literals should not be passed into the RegExp constructor). Because the pattern argument of the RegExp constructor is a string, there are some instances in which you need to double-escape characters. All metacharacters must be double-escaped, as must characters that are already escaped, such as \n (the \ character, which is normally escaped in strings as \\, becomes \\\\ when used in a regular-expression string). The following table shows some patterns in their literal form and the equivalent string that would be necessary to use the RegExp constructor.

LITERAL PATTERN	STRING EQUIVALENT
/\[bc\]at/	"\\[bc\\]at"
/\.at/	"\\.at"
/name\/age/	"name\\/age"
/\d.\d{1,2}/	"\\d.\\d{1,2}"
/\w\\hello\\123/	"\\w\\\\hello\\\\123"
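A quick sketch shows what goes wrong when the extra backslash is left out. The variable names are illustrative, test() is described later in this chapter, and console.log() replaces alert(). In a string literal, "\d" is simply the letter d, so the under-escaped pattern looks for the letter rather than a digit.

```javascript
var fromLiteral = /\d.\d{1,2}/;
var fromString = new RegExp("\\d.\\d{1,2}");    //correctly double-escaped
var underEscaped = new RegExp("\d.\d{1,2}");    //becomes the pattern d.d{1,2}

console.log(fromString.source === fromLiteral.source);  //true
console.log(fromString.test("1.23"));                   //true
console.log(underEscaped.test("1.23"));                 //false
console.log(underEscaped.test("dxd"));                  //true
```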

Keep in mind that creating a regular expression using a literal is not exactly the same as creating a regular expression using the RegExp constructor. In ECMAScript 3, regular-expression literals always share the same RegExp instance, while creating a new RegExp via constructor always results in a new instance. Consider the following:

var re = null,
    i;
 
for (i=0; i < 10; i++){
    re = /cat/g;
    re.test("catastrophe");
}
 
for (i=0; i < 10; i++){
    re = new RegExp("cat", "g");
    re.test("catastrophe");
}

In the first loop, there is only one instance of RegExp created for /cat/g, even though it is specified in the body of the loop. Instance properties (mentioned in the next section) are not reset, so calling test() fails every other time through the loop. This happens because "cat" is found in the first call to test(), but the second call begins its search from index 3 (the end of the last match) and can’t find it. Because that search reaches the end of the string, the subsequent call to test() starts at the beginning again.

The second loop uses the RegExp constructor to create the regular expression each time through the loop. Each call to test() returns true since a new instance of RegExp is created for each iteration.

ECMAScript 5 clarifies the behavior of regular-expression literals by explicitly stating that regular-expression literals must create new instances of RegExp as if the RegExp constructor were called directly. This change was made in Internet Explorer 9+, Firefox 4+, and Chrome.
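In an ECMAScript 5 engine, the alternating failure described for the first loop no longer appears with a literal inside the loop, but it can still be reproduced by deliberately reusing one global expression. The following sketch does that; the results array is illustrative and console.log() replaces alert().

```javascript
var shared = /cat/g;   //a single instance reused on every iteration
var results = [];
var i;

for (i = 0; i < 4; i++){
    //lastIndex carries over between calls because the instance is shared
    results.push(shared.test("catastrophe"));
}

console.log(results.join(","));  //true,false,true,false
```

Creating the expression inside the loop, or resetting lastIndex to 0 before each call, avoids the alternation.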

RegExp Instance Properties

Each instance of RegExp has the following properties that allow you to get information about the pattern:

global — A Boolean value indicating whether the g flag has been set.
ignoreCase — A Boolean value indicating whether the i flag has been set.
lastIndex — An integer indicating the character position where the next match will be attempted in the source string. This value always begins as 0.
multiline — A Boolean value indicating whether the m flag has been set.
source — The string source of the regular expression. This is always returned as if specified in literal form (without opening and closing slashes) rather than a string pattern as passed into the constructor.

These properties are helpful in identifying aspects of a regular expression; however, they typically don’t have much use, because the information is available in the pattern declaration. Here’s an example:

var pattern1 = /\[bc\]at/i;
                   
alert(pattern1.global);     //false
alert(pattern1.ignoreCase); //true
alert(pattern1.multiline);  //false
alert(pattern1.lastIndex);  //0
alert(pattern1.source);     //"\[bc\]at"
                   
var pattern2 = new RegExp("\\[bc\\]at", "i");
                   
alert(pattern2.global);     //false
alert(pattern2.ignoreCase); //true
alert(pattern2.multiline);  //false
alert(pattern2.lastIndex);  //0
alert(pattern2.source);     //"\[bc\]at"

RegExpInstancePropertiesExample01.htm

Note that the source properties of each pattern are equivalent even though the first pattern is in literal form and the second uses the RegExp constructor. The source property normalizes the string into the form you’d use in a literal.

RegExp Instance Methods

The primary method of a RegExp object is exec(), which is intended for use with capturing groups. This method accepts a single argument, which is the string on which to apply the pattern, and returns an array of information about the first match or null if no match was found. The returned array, though an instance of Array, contains two additional properties: index, which is the location in the string where the pattern was matched, and input, which is the string that the expression was run against. In the array, the first item is the string that matches the entire pattern. Any additional items represent captured groups inside the expression (if there are no capturing groups in the pattern, then the array has only one item). Consider the following:

var text = "mom and dad and baby";
var pattern = /mom( and dad( and baby)?)?/gi;
                   
var matches = pattern.exec(text);
alert(matches.index);    //0
alert(matches.input);    //"mom and dad and baby"
alert(matches[0]);       //"mom and dad and baby"
alert(matches[1]);       //" and dad and baby"
alert(matches[2]);       //" and baby"

RegExpExecExample01.htm

In this example, the pattern has two capturing groups. The innermost one matches " and baby", and its enclosing group matches " and dad" or " and dad and baby". When exec() is called on the string, a match is found. Because the entire string matches the pattern, the index property on the matches array is set to 0. The first item in the array is the entire matched string, the second contains the contents of the first capturing group, and the third contains the contents of the second capturing group.
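Because both groups are optional, it can help to see what exec() returns when a group does not take part in the match. This sketch runs the same kind of pattern against a shorter, made-up string; console.log() replaces alert().

```javascript
var shortText = "mom and dad";
var familyPattern = /mom( and dad( and baby)?)?/i;

var partial = familyPattern.exec(shortText);

console.log(partial[0]);      //mom and dad
console.log(partial[1]);      // and dad (with a leading space)
console.log(partial[2]);      //undefined - the inner group did not participate
console.log(partial.length);  //3 - one slot per group, matched or not
```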

The exec() method returns information about one match at a time even if the pattern is global. When the global flag is not specified, calling exec() on the same string multiple times will always return information about the first match. With the g flag set on the pattern, each call to exec() moves further into the string looking for matches, as in this example:

var text = "cat, bat, sat, fat";        
var pattern1 = /.at/;
                   
var matches = pattern1.exec(text);        
alert(matches.index);        //0
alert(matches[0]);           //cat
alert(pattern1.lastIndex);   //0
                   
matches = pattern1.exec(text);        
alert(matches.index);        //0
alert(matches[0]);           //cat
alert(pattern1.lastIndex);   //0
                   
var pattern2 = /.at/g;
                   
var matches = pattern2.exec(text);        
alert(matches.index);        //0
alert(matches[0]);           //cat
alert(pattern2.lastIndex);   //3
                   
matches = pattern2.exec(text);        
alert(matches.index);        //5
alert(matches[0]);           //bat
alert(pattern2.lastIndex);   //8

RegExpExecExample02.htm

The first pattern in this example, pattern1, is not global, so each call to exec() returns the first match only ("cat"). The second pattern, pattern2, is global, so each call to exec() returns the next match in the string until the end of the string is reached. Note also how the pattern’s lastIndex property is affected. In global matching mode, lastIndex is set after each successful call to exec() to the position immediately following the match, but it remains unchanged in nonglobal mode.
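Because a global pattern resumes from lastIndex, exec() can be called in a loop to walk through every match. The following is a minimal sketch of that idiom; the variable names found and match are illustrative only, and console.log() stands in for alert() so that the code can also run outside a browser.

```javascript
var text = "cat, bat, sat, fat";
var pattern = /.at/g;
var found = [];
var match;

// exec() returns null once no further match exists, which ends the loop
while ((match = pattern.exec(text)) !== null) {
    found.push(match[0] + "@" + match.index);
}

console.log(found.join(" "));     //cat@0 bat@5 sat@10 fat@15
console.log(pattern.lastIndex);   //0 (reset after the final, failed search)
```

Leaving off the g flag in a loop like this would make it run forever, because a nonglobal pattern returns the first match on every call and never returns null.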

A deviation in the Internet Explorer implementation of JavaScript causes lastIndex to always be updated, even in nonglobal mode.

Another method of regular expressions is test(), which accepts a string argument and returns true if the pattern matches the argument and false if it does not. This method is useful when you want to know if a pattern is matched, but you have no need for the actual matched text. The test() method is often used in if statements, such as the following:

var text = "000-00-0000";        
var pattern = /\d{3}-\d{2}-\d{4}/;
                   
if (pattern.test(text)){
    alert("The pattern was matched.");
}

In this example, the regular expression tests for a specific numeric sequence. If the input text matches the pattern, then a message is displayed. This functionality is often used for validating user input, when you care only if the input is valid, not necessarily why it’s invalid.
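When test() is used for validation, it can help to anchor the pattern with ^ and $ so that the entire input must match rather than just a portion of it. The sketch below contrasts the two; the function name isValidFormat() and the sample strings are hypothetical.

```javascript
var loosePattern = /\d{3}-\d{2}-\d{4}/;      //matches anywhere in the string
var strictPattern = /^\d{3}-\d{2}-\d{4}$/;   //must match the whole string

function isValidFormat(value) {
    return strictPattern.test(value);
}

console.log(loosePattern.test("id: 000-00-0000!"));   //true
console.log(isValidFormat("id: 000-00-0000!"));       //false
console.log(isValidFormat("000-00-0000"));            //true
```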

The inherited methods of toLocaleString() and toString() each return the literal representation of the regular expression, regardless of how it was created. Consider this example:

var pattern = new RegExp("\\[bc\\]at", "gi");
alert(pattern.toString());          // /\[bc\]at/gi
alert(pattern.toLocaleString());    // /\[bc\]at/gi

RegExpToStringExample01.htm

Even though the pattern in this example is created using the RegExp constructor, the toLocaleString() and toString() methods return the pattern as if it were specified in literal format.

The valueOf() method for a regular expression returns the regular expression itself.

RegExp Constructor Properties

The RegExp constructor function has several properties. (These would be considered static properties in other languages.) These properties apply to all regular expressions that are in scope, and they change based on the last regular-expression operation that was performed. Another unique element of these properties is that they can be accessed in two different ways. Each property has a verbose property name and a shorthand name (except in Opera, which doesn’t support the short names). The RegExp constructor properties are listed in the following table.

VERBOSE NAME	SHORT NAME	DESCRIPTION
input	$_	The last string matched against. This is not implemented in Opera.
lastMatch	$&	The last matched text. This is not implemented in Opera.
lastParen	$+	The last matched capturing group. This is not implemented in Opera.
leftContext	$`	The text that appears in the input string prior to lastMatch.
multiline	$*	A Boolean value specifying whether all expressions should use multiline mode. This is not implemented in IE or Opera.
rightContext	$'	The text that appears in the input string after lastMatch.

These properties can be used to extract specific information about the operation performed by exec() or test(). Consider this example:

var text = "this has been a short summer";
var pattern = /(.)hort/g;
                   
/*
 * Note: Opera doesn't support input, lastMatch, lastParen, or multiline.
 * Internet Explorer doesn't support multiline.
 */        
if (pattern.test(text)){
    alert(RegExp.input);               //this has been a short summer
    alert(RegExp.leftContext);         //this has been a            
    alert(RegExp.rightContext);        // summer
    alert(RegExp.lastMatch);           //short
    alert(RegExp.lastParen);           //s
    alert(RegExp.multiline);           //false
}

RegExpConstructorPropertiesExample01.htm

This code creates a pattern that searches for any character followed by "hort" and puts a capturing group around the first letter. The various properties are used as follows:

The input property contains the original string.
The leftContext property contains the characters of the string before the word "short", and the rightContext property contains the characters after the word "short".
The lastMatch property contains the last string that matches the entire regular expression, which is "short".
The lastParen property contains the last matched capturing group, which is "s" in this case.

These verbose property names can be replaced with the short property names, although you must use bracket notation to access them, as shown in the following example, because most are illegal identifiers in ECMAScript:

var text = "this has been a short summer";
var pattern = /(.)hort/g;
                   
/*
 * Note: Opera doesn't support short property names.
 * Internet Explorer doesn't support multiline.
 */        
if (pattern.test(text)){
    alert(RegExp.$_);               //this has been a short summer
    alert(RegExp["$`"]);            //this has been a            
    alert(RegExp["$'"]);            // summer
    alert(RegExp["$&"]);            //short
    alert(RegExp["$+"]);            //s
    alert(RegExp["$*"]);            //false
}

RegExpConstructorPropertiesExample02.htm

There are also constructor properties that store up to nine capturing-group matches. These properties are accessed via RegExp.$1, which contains the first capturing-group match through RegExp.$9, which contains the ninth capturing-group match. These properties are filled in when calling either exec() or test(), allowing you to do things like this:

var text = "this has been a short summer";
var pattern = /(..)or(.)/g;
      
if (pattern.test(text)){
    alert(RegExp.$1);       //sh
    alert(RegExp.$2);       //t
}

RegExpConstructorPropertiesExample03.htm

In this example, a pattern with two capturing groups is created and tested against a string. Even though test() simply returns a Boolean value, the properties $1 and $2 are filled in on the RegExp constructor.
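The same captured text is also available on the array returned by exec(), which avoids depending on state shared through the RegExp constructor. A minimal sketch, using a nonglobal version of the same pattern:

```javascript
var text = "this has been a short summer";
var pattern = /(..)or(.)/;

var matches = pattern.exec(text);
console.log(matches[0]);      //short
console.log(matches[1]);      //sh
console.log(matches[2]);      //t
console.log(matches.index);   //16
```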

Pattern Limitations

Although ECMAScript’s regular-expression support is fully developed, it does lack some of the advanced regular-expression features available in languages such as Perl. The following features are not supported in ECMAScript regular expressions (for more information, see www.regular-expressions.info):

The \A and \Z anchors (matching the start or end of a string, respectively)
Lookbehinds
Union and intersection classes
Atomic grouping
Unicode support (except for matching a single character at a time)
Named capturing groups
The s (single-line) and x (free-spacing) matching modes
Conditionals
Regular-expression comments

Despite these limitations, ECMAScript’s regular-expression support is powerful enough for doing most pattern-matching tasks.

THE FUNCTION TYPE

Some of the most interesting parts of ECMAScript are its functions, primarily because functions actually are objects. Each function is an instance of the Function type that has properties and methods just like any other reference type. Because functions are objects, function names are simply pointers to function objects and are not necessarily tied to the function itself. Functions are typically defined using function-declaration syntax, as in this example:

function sum (num1, num2) {
    return num1 + num2;
}

This is almost exactly equivalent to using a function expression, such as this:

var sum = function(num1, num2){
    return num1 + num2;
};

In this code, a variable sum is defined and initialized to be a function. Note that there is no name included after the function keyword, because it’s not needed — the function can be referenced by the variable sum. Also note that there is a semicolon after the end of the function, just as there would be after any variable initialization.

The last way to define functions is by using the Function constructor, which accepts any number of arguments. The last argument is always considered to be the function body, and the previous arguments enumerate the new function’s arguments. Take this for example:

var sum = new Function("num1", "num2", "return num1 + num2");   //not recommended

This syntax is not recommended because it causes a double interpretation of the code (once for the regular ECMAScript code and once for the strings that are passed into the constructor) and thus can affect performance. However, it’s important to think of functions as objects and function names as pointers — this syntax is great at representing that concept.

Because function names are simply pointers to functions, they act like any other variable containing a pointer to an object. This means it’s possible to have multiple names for a single function, as in this example:

function sum(num1, num2){
    return num1 + num2;
}        
alert(sum(10,10));    //20
                   
var anotherSum = sum;        
alert(anotherSum(10,10));  //20
                   
sum = null;        
alert(anotherSum(10,10));  //20

FunctionTypeExample01.htm

This code defines a function named sum() that adds two numbers together. A variable, anotherSum, is declared and set equal to sum. Note that using the function name without parentheses accesses the function pointer instead of executing the function. At this point, both anotherSum and sum point to the same function, meaning that anotherSum() can be called and a result returned. When sum is set to null, it severs its relationship with the function, although anotherSum() can still be called without any problems.

No Overloading (Revisited)

Thinking of function names as pointers also explains why there can be no function overloading in ECMAScript. Recall the following example from Chapter 3:

function addSomeNumber(num){
    return num + 100;
}
                   
function addSomeNumber(num) {
    return num + 200;
}
                   
var result = addSomeNumber(100);    //300

In this example, it’s clear that declaring two functions with the same name always results in the last function overwriting the previous one. This code is almost exactly equivalent to the following:

var addSomeNumber = function (num){
    return num + 100;
};
                   
addSomeNumber = function (num) {
    return num + 200;
};
                   
var result = addSomeNumber(100);    //300

In this rewritten code, it’s much easier to see exactly what is going on. The variable addSomeNumber is simply being overwritten when the second function is created.

Function Declarations versus Function Expressions

Throughout this section, the function declaration and function expression are referred to as being almost equivalent. This hedging is due to one major difference in the way that a JavaScript engine loads data into the execution context. Function declarations are read and available in an execution context before any code is executed, whereas function expressions aren’t complete until the execution reaches that line of code. Consider the following:

alert(sum(10,10));
function sum(num1, num2){
    return num1 + num2;
}

FunctionDeclarationExample01.htm

This code runs perfectly, because function declarations are read and added to the execution context before the code begins running through a process called function declaration hoisting. As the code is being evaluated, the JavaScript engine does a first pass for function declarations and pulls them to the top of the source tree. So even though the function declaration appears after its usage in the actual source code, the engine changes this to hoist the function declarations to the top. Changing the function declaration to an equivalent function expression, as in the following example, will cause an error during execution:

alert(sum(10,10));
var sum = function(num1, num2){
    return num1 + num2;
};

FunctionInitializationExample01.htm

This updated code will cause an error, because the function is part of an initialization statement, not part of a function declaration. That means the function isn’t available in the variable sum until the assignment on the second line has been executed, which won’t happen, because the first line throws an error: sum is still undefined at that point and so cannot be called.

Aside from this difference in when the function is available by the given name, the two syntaxes are equivalent.
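The difference can be observed from inside a single function, as in the following sketch. The names probe, declared, and expressed are hypothetical, and the error is caught so that the remaining code can still run.

```javascript
function probe() {
    var report = {};

    report.declared = typeof declared;     //function declaration is hoisted with its body
    report.expressed = typeof expressed;   //only the variable is hoisted, not its value

    try {
        expressed();                       //calling undefined throws an error
    } catch (ex) {
        report.error = ex.name;
    }

    function declared() {}
    var expressed = function () {};

    return report;
}

var report = probe();
console.log(report.declared);    //function
console.log(report.expressed);   //undefined
console.log(report.error);       //TypeError
```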

It is possible to have named function expressions that look like declarations, such as var sum = function sum() {}. See Chapter 7 for a longer discussion on function expressions.

Functions as Values

Because function names in ECMAScript are nothing more than variables, functions can be used any place any other value can be used. This means it’s possible not only to pass a function into another function as an argument but also to return a function as the result of another function. Consider the following function:

function callSomeFunction(someFunction, someArgument){
    return someFunction(someArgument);
}

This function accepts two arguments. The first argument should be a function, and the second argument should be a value to pass to that function. Any function can then be passed in as follows:

function add10(num){
    return num + 10;
}
                   
var result1 = callSomeFunction(add10, 10);
alert(result1);   //20
                   
function getGreeting(name){
    return "Hello, " + name;
}
                   
var result2 = callSomeFunction(getGreeting, "Nicholas");
alert(result2);   //"Hello, Nicholas"

FunctionAsAnArgumentExample01.htm

The callSomeFunction() function is generic, so it doesn’t matter what function is passed in as the first argument — the result will always be returned from the first argument being executed. Remember that to access a function pointer instead of executing the function, you must leave off the parentheses, so the variables add10 and getGreeting are passed into callSomeFunction() instead of their results being passed in.

Returning a function from a function is also possible and can be quite useful. For instance, suppose that you have an array of objects and want to sort the array on an arbitrary object property. A comparison function for the array’s sort() method accepts only two arguments, which are the values to compare, but really you need a way to indicate which property to sort by. This problem can be addressed by defining a function to create a comparison function based on a property name, as in the following example:

function createComparisonFunction(propertyName) {
                   
    return function(object1, object2){
        var value1 = object1[propertyName];
        var value2 = object2[propertyName];
                   
        if (value1 < value2){
            return -1;
        } else if (value1 > value2){
            return 1;
        } else {
            return 0;
        }
    };
}

FunctionReturningFunctionExample01.htm

This function’s syntax may look complicated, but essentially it’s just a function inside of a function, preceded by the return operator. The propertyName argument is accessible from the inner function and is used with bracket notation to retrieve the value of the given property. Once the property values are retrieved, a simple comparison can be done. This function can be used as in the following example:

var data = [{name: "Zachary", age: 28}, {name: "Nicholas", age: 29}];
                   
data.sort(createComparisonFunction("name"));
alert(data[0].name);  //Nicholas
                   
data.sort(createComparisonFunction("age"));
alert(data[0].name);  //Zachary

In this code, an array called data is created with two objects. Each object has a name property and an age property. By default, the sort() method would call toString() on each object to determine the sort order, which wouldn’t give logical results in this case. Calling createComparisonFunction("name") creates a comparison function that sorts based on the name property, which means the first item will have the name "Nicholas" and an age of 29. When createComparisonFunction("age") is called, it creates a comparison function that sorts based on the age property, meaning the first item will be the one with its name equal to "Zachary" and age equal to 28.

Function Internals

Two special objects exist inside a function: arguments and this. The arguments object, as discussed in Chapter 3, is an array-like object that contains all of the arguments that were passed into the function. Though its primary use is to represent function arguments, the arguments object also has a property named callee, which is a pointer to the function that owns the arguments object. Consider the following classic factorial function:

function factorial(num){
    if (num <= 1) {
        return 1;
    } else {
        return num * factorial(num-1);
    }
}

Factorial functions are typically defined to be recursive, as in this example, which works fine when the name of the function is set and won’t be changed. However, the proper execution of this function is tightly coupled with the function name "factorial". It can be decoupled by using arguments.callee as follows:

function factorial(num){
    if (num <= 1) {
        return 1;
    } else {
        return num * arguments.callee(num-1);
    }
}

FunctionTypeArgumentsExample01.htm

In this rewritten version of the factorial() function, there is no longer a reference to the name "factorial" in the function body, which ensures that the recursive call will happen on the correct function no matter how the function is referenced. Consider the following:

var trueFactorial = factorial;
                   
factorial = function(){
    return 0;
};
                   
alert(trueFactorial(5));   //120
alert(factorial(5));       //0

Here, the variable trueFactorial is assigned the value of factorial, effectively storing the function pointer in a second location. The factorial variable is then reassigned to a function that simply returns 0. Without using arguments.callee in the original factorial() function’s body, the call to trueFactorial() would return 0. However, with the function decoupled from the function name, trueFactorial() correctly calculates the factorial, and factorial() is the only function that returns 0.

The other special object is called this, which operates similarly to the this object in Java and C#, though it isn’t exactly the same. It is a reference to the context object that the function is operating on, often called the this value. (When a function is called in the global scope of a web page, the this object points to window.) Consider the following:

window.color = "red";
var o = { color: "blue" };
                   
function sayColor(){
    alert(this.color);
}
                   
sayColor();     //"red"
                   
o.sayColor = sayColor;
o.sayColor();   //"blue"

FunctionTypeThisExample01.htm

The function sayColor() is defined globally but references the this object. The value of this is not determined until the function is called, so its value may not be consistent throughout the code execution. When sayColor() is called in the global scope, it outputs "red" because this is pointing to window, which means this.color evaluates to window.color. By assigning the function to the object o and then calling o.sayColor(), the this object points to o, so this.color evaluates to o.color and "blue" is displayed.

Remember that function names are simply variables containing pointers, so the global sayColor() function and o.sayColor() point to the same function even though they execute in different contexts.
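As a small sketch of this point, the same function object can be attached to two different objects and will report a different this value depending on which object it is called through. The names o1, o2, and getColor are illustrative.

```javascript
var o1 = { color: "blue" };
var o2 = { color: "green" };

function getColor() {
    return this.color;
}

// both properties hold a pointer to the one function
o1.getColor = getColor;
o2.getColor = getColor;

console.log(o1.getColor());                 //blue
console.log(o2.getColor());                 //green
console.log(o1.getColor === o2.getColor);   //true
```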

ECMAScript 5 also formalizes an additional property on a function object: caller. Though not defined in ECMAScript 3, all browsers except earlier versions of Opera supported this property, which contains a reference to the function that called this function or null if the function was called from the global scope. For example:

function outer(){
    inner();
}
 
function inner(){
    alert(inner.caller);
}
 
outer();

FunctionTypeArgumentsCallerExample01.htm

This code displays an alert with the source text of the outer() function. Because outer() calls inner(), then inner.caller points back to outer(). For looser coupling, you can also access the same information via arguments.callee.caller:

function outer(){
    inner();
}
 
function inner(){
    alert(arguments.callee.caller);
}
 
outer();

FunctionTypeArgumentsCallerExample02.htm

The caller property is supported in all versions of Internet Explorer, Firefox, Chrome, and Safari, as well as Opera 9.6.

When function code executes in strict mode, attempting to access arguments.callee results in an error. ECMAScript 5 also defines arguments.caller, which also results in an error in strict mode and is always undefined outside of strict mode. This is to clear up confusion between arguments.caller and the caller property of functions. These changes were made as security additions to the language, so third-party code could not inspect other code running in the same context.

Strict mode places one additional restriction: you cannot assign a value to the caller property of a function. Doing so results in an error.
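Where arguments.callee is unavailable, a named function expression offers one way to keep a recursive function decoupled from the variable that refers to it. The following is a sketch of that alternative rather than the only approach; the inner name f is arbitrary.

```javascript
var factorial = function f(num) {
    "use strict";
    if (num <= 1) {
        return 1;
    } else {
        return num * f(num - 1);   //f always refers to this function
    }
};

var trueFactorial = factorial;

factorial = function () {
    return 0;
};

console.log(trueFactorial(5));   //120
console.log(factorial(5));       //0
```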

Function Properties and Methods

Functions are objects in ECMAScript and therefore, as mentioned previously, have properties and methods. Each function has two properties: length and prototype. The length property indicates the number of named arguments that the function expects, as in this example:

function sayName(name){
    alert(name);
}      
                   
function sum(num1, num2){
    return num1 + num2;
}
                   
function sayHi(){
    alert("hi");
}
                   
alert(sayName.length);  //1
alert(sum.length);      //2
alert(sayHi.length);    //0

FunctionTypeLengthPropertyExample01.htm

This code defines three functions, each with a different number of named arguments. The sayName() function specifies one argument, so its length property is set to 1. Similarly, the sum() function specifies two arguments, so its length property is 2, and sayHi() has no named arguments, so its length is 0.
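Note that length describes the function's definition, not any particular call. The sketch below compares it with arguments.length, which reflects how many arguments were actually passed; the function name countArgs is hypothetical.

```javascript
function countArgs(num1, num2) {
    return arguments.length;   //number of arguments actually passed
}

console.log(countArgs.length);      //2
console.log(countArgs(1));          //1
console.log(countArgs(1, 2, 3));    //3
```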

The prototype property is perhaps the most interesting part of the ECMAScript core. The prototype is the actual location of all instance methods for reference types, meaning methods such as toString() and valueOf() actually exist on the prototype and are then accessed from the object instances. This property is very important in terms of defining your own reference types and inheritance. (These topics are covered in Chapter 6.) In ECMAScript 5, the prototype property is not enumerable and so will not be found using for-in.

There are two additional methods for functions: apply() and call(). These methods both call the function with a specific this value, effectively setting the value of the this object inside the function body. The apply() method accepts two arguments: the value of this inside the function and an array of arguments. This second argument may be an instance of Array, but it can also be the arguments object. Consider the following:

function sum(num1, num2){
    return num1 + num2;
}
                   
function callSum1(num1, num2){
    return sum.apply(this, arguments);    //passing in arguments object
}
                   
function callSum2(num1, num2){
    return sum.apply(this, [num1, num2]); //passing in array
}
                   
alert(callSum1(10,10));   //20
alert(callSum2(10,10));   //20

FunctionTypeApplyMethodExample01.htm

In this example, callSum1() executes the sum() method, passing in this as the this value (which is equal to window because it’s being called in the global scope) and also passing in the arguments object. The callSum2() method also calls sum(), but it passes in an array of the arguments instead. Both functions will execute and return the correct result.

In strict mode, the this value of a function called without a context object is not coerced to window. Instead, this becomes undefined unless explicitly set by either attaching the function to an object or using apply() or call().
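A brief sketch of the strict mode behavior follows. The function name getThisStrict is illustrative, and strict mode is enabled for that function only.

```javascript
function getThisStrict() {
    "use strict";
    return this;
}

var o = { color: "blue" };

console.log(getThisStrict() === undefined);   //true, no context object
console.log(getThisStrict.call(o) === o);     //true, set explicitly

o.getThis = getThisStrict;
console.log(o.getThis() === o);               //true, attached to an object
```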

The call() method exhibits the same behavior as apply(), but arguments are passed to it differently. The first argument is the this value, but the remaining arguments are passed directly into the function. When using call(), arguments must be enumerated explicitly, as in this example:

function sum(num1, num2){
    return num1 + num2;
}
                   
function callSum(num1, num2){
    return sum.call(this, num1, num2);
}
                   
alert(callSum(10,10));   //20

FunctionTypeCallMethodExample01.htm

Using the call() method, callSum() must pass in each of its arguments explicitly. The result is the same as using apply(). The decision to use either apply() or call() depends solely on the easiest way for you to pass arguments into the function. If you intend to pass in the arguments object directly or if you already have an array of data to pass in, then apply() is the better choice; otherwise, call() may be a more appropriate choice. (If there are no arguments to pass in, these methods are identical.)
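To illustrate the choice, the sketch below uses the built-in Math.max() method, which accepts any number of numeric arguments. When the values already sit in an array, apply() can spread them out; when they are known individually, call() lists them directly. The variable names are illustrative.

```javascript
var values = [3, 7, 5];

var maxFromArray = Math.max.apply(Math, values);   //array supplies the arguments
var maxFromList = Math.max.call(Math, 3, 7, 5);    //arguments listed one by one

console.log(maxFromArray);   //7
console.log(maxFromList);    //7
```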

The true power of apply() and call() lies not in their ability to pass arguments but rather in their ability to augment the this value inside of the function. Consider the following example:

window.color = "red";
var o = { color: "blue" };
                   
function sayColor(){
    alert(this.color);
}
                   
sayColor();            //red
                   
sayColor.call(this);   //red
sayColor.call(window); //red
sayColor.call(o);      //blue

FunctionTypeCallExample01.htm

This example is a modified version of the one used to illustrate the this object. Once again, sayColor() is defined as a global function, and when it’s called in the global scope, it displays "red" because this.color evaluates to window.color. You can then call the function explicitly in the global scope by using sayColor.call(this) and sayColor.call(window), which both display "red". Running sayColor.call(o) switches the context of the function such that this points to o, resulting in a display of "blue".

The advantage of using call() (or apply()) to augment the scope is that the object doesn’t need to know anything about the method. In the first version of this example, the sayColor() function was placed directly on the object o before it was called; in the updated example, that step is no longer necessary.
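One common use of this ability is borrowing a method from another type. In the sketch below, the arguments object is not an Array and has no slice() method of its own, yet the array slice() method can be run against it by way of call(). The function name toArray is hypothetical.

```javascript
function toArray() {
    // run Array's slice() with the arguments object as its this value
    return Array.prototype.slice.call(arguments);
}

var result = toArray("a", "b", "c");
console.log(result instanceof Array);   //true
console.log(result.join("-"));          //a-b-c
```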

ECMAScript 5 defines an additional method called bind(). The bind() method creates a new function instance whose this value is bound to the value that was passed into bind(). For example:

window.color = "red";
var o = { color: "blue" };
                   
function sayColor(){
    alert(this.color);
}
var objectSayColor = sayColor.bind(o);
objectSayColor();   //blue

FunctionTypeBindMethodExample01.htm

Here, a new function called objectSayColor() is created from sayColor() by calling bind() and passing in the object o. The objectSayColor() function has a this value equivalent to o, so calling the function, even as a global call, results in the string "blue" being displayed. The advantages of this technique are discussed in Chapter 22.
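A bound function keeps its this value even when a later call tries to supply a different one. The following sketch, with the hypothetical objects counter and other, shows a bound method continuing to update the object it was bound to.

```javascript
var counter = {
    count: 0,
    increment: function () {
        this.count++;
        return this.count;
    }
};

var boundIncrement = counter.increment.bind(counter);

boundIncrement();
boundIncrement();
console.log(counter.count);                //2

// the this value passed to call() is ignored by the bound function
var other = { count: 100 };
console.log(boundIncrement.call(other));   //3
console.log(other.count);                  //100
```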

The bind() method is supported in Internet Explorer 9+, Firefox 4+, Safari 5.1+, Opera 12+, and Chrome.

For functions, the inherited methods toLocaleString() and toString() always return the function’s code. The exact format of this code varies from browser to browser — some return your code exactly as it appeared in the source code, including comments, whereas others return the internal representation of your code, which has comments removed and possibly some code changes that the interpreter made. Because of these differences, you can’t rely on what is returned for any important functionality, though this information may be useful for debugging purposes. The inherited method valueOf() simply returns the function itself.

PRIMITIVE WRAPPER TYPES

Three special reference types are designed to ease interaction with primitive values: the Boolean type, the Number type, and the String type. These types can act like the other reference types described in this chapter, but they also have a special behavior related to their primitive-type equivalents. Every time a primitive value is read, an object of the corresponding primitive wrapper type is created behind the scenes, allowing access to any number of methods for manipulating the data. Consider the following example:

var s1 = "some text";
var s2 = s1.substring(2);

In this code, s1 is a variable containing a string, which is a primitive value. On the next line, the substring() method is called on s1 and stored in s2. Primitive values aren’t objects, so logically they shouldn’t have methods, though this still works as you would expect. In truth, there is a lot going on behind the scenes to allow this seamless operation. When s1 is accessed in the second line, it is being accessed in read mode, which is to say that its value is being read from memory. Any time a string value is accessed in read mode, the following three steps occur:

1. Create an instance of the String type.

2. Call the specified method on the instance.

3. Destroy the instance.

You can think of these three steps as they’re used in the following three lines of ECMAScript code:

var s1 = new String("some text");
var s2 = s1.substring(2);
s1 = null;

This behavior allows the primitive string value to act like an object. These same three steps are repeated for Boolean and numeric values using the Boolean and Number types, respectively.

The major difference between reference types and primitive wrapper types is the lifetime of the object. When you instantiate a reference type using the new operator, it stays in memory until it goes out of scope, whereas automatically created primitive wrapper objects exist for only one line of code before they are destroyed. This means that properties and methods cannot be added to primitive values at runtime. Take this for example:

var s1 = "some text";
s1.color = "red";
alert(s1.color);   //undefined

Here, the second line attempts to add a color property to the string s1. However, when s1 is accessed on the third line, the color property is gone. This happens because the String object that was created in the second line is destroyed by the time the third line is executed. The third line creates its own String object, which doesn’t have the color property.

It is possible to create the primitive wrapper objects explicitly using the Boolean, Number, and String constructors. This should be done only when absolutely necessary, because it is often confusing for developers as to whether they are dealing with a primitive or reference value. Calling typeof on an instance of a primitive wrapper type returns "object", and all primitive wrapper objects convert to the Boolean value true.
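The differences between a primitive and an explicitly created wrapper can be seen side by side in the following sketch. The variable names primitive and wrapper are illustrative.

```javascript
var primitive = "some text";
var wrapper = new String("some text");

wrapper.color = "red";   //an explicit object keeps its properties

console.log(typeof primitive);        //string
console.log(typeof wrapper);          //object
console.log(wrapper.color);           //red
console.log(wrapper == primitive);    //true, the object is converted for comparison
console.log(wrapper === primitive);   //false, different types

var label = new Boolean(false) ? "truthy" : "falsy";
console.log(label);                   //truthy, because every object converts to true
```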

The Object constructor also acts as a factory method and is capable of returning an instance of a primitive wrapper based on the type of value passed into the constructor. For example:

var obj = new Object("some text");
alert(obj instanceof String);   //true

When a string is passed into the Object constructor, an instance of String is created; a number argument results in an instance of Number, while a Boolean argument returns an instance of Boolean.
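The same factory behavior can be checked for the other two primitive types. This is only a sketch with illustrative variable names, using console.log() in place of alert():

```javascript
var fromString = new Object("some text");
var fromNumber = new Object(25);
var fromBoolean = new Object(false);

console.log(fromString instanceof String);     // true
console.log(fromNumber instanceof Number);     // true
console.log(fromBoolean instanceof Boolean);   // true

// valueOf() hands back the primitive that was wrapped.
console.log(typeof fromNumber.valueOf());      // "number"
```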

Keep in mind that calling a primitive wrapper constructor using new is not the same as calling the casting function of the same name. For example:

var value = "25";
var number = Number(value);   //casting function
alert(typeof number);   //"number"
 
var obj = new Number(value);    //constructor
alert(typeof obj);              //"object"

In this example, the variable number is filled with a primitive number value of 25 while the variable obj is filled with an instance of Number. For more on casting functions, see Chapter 3.

Even though it’s not recommended to create primitive wrapper objects explicitly, their functionality is important in being able to manipulate primitive values. Each primitive wrapper type has methods that make data manipulation easier.

The Boolean Type

The Boolean type is the reference type corresponding to the Boolean values. To create a Boolean object, use the Boolean constructor and pass in either true or false, as in the following example:

var booleanObject = new Boolean(true);

Instances of Boolean override the valueOf() method to return a primitive value of either true or false. The toString() method is also overridden to return a string of "true" or "false" when called. Unfortunately, not only are Boolean objects of little use in ECMAScript, they can actually be rather confusing. The problem typically occurs when trying to use Boolean objects in Boolean expressions, as in this example:

var falseObject = new Boolean(false);
var result = falseObject && true;
alert(result);  //true
                   
var falseValue = false;
result = falseValue && true;
alert(result);  //false

BooleanTypeExample01.htm

In this code, a Boolean object is created with a value of false. That same object is then ANDed with the primitive value true. In Boolean math, false AND true is equal to false. However, in this line of code, it is the object named falseObject being evaluated, not its value (false). As discussed earlier, all objects are automatically converted to true in Boolean expressions, so falseObject actually is given a value of true in the expression. Then, true ANDed with true is equal to true.

There are a couple of other differences between the primitive and the reference Boolean types. The typeof operator returns "boolean" for the primitive but "object" for the reference. Also, a Boolean object is an instance of the Boolean type and will return true when used with the instanceof operator, whereas a primitive value returns false, as shown here:

alert(typeof falseObject);   //"object"
alert(typeof falseValue);    //"boolean"
alert(falseObject instanceof Boolean);  //true
alert(falseValue instanceof Boolean);   //false

It’s very important to understand the difference between a primitive Boolean value and a Boolean object — it is recommended to never use the latter.
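If a Boolean object does turn up, for example from code you do not control, one way to sidestep the pitfall is to read its primitive value with valueOf() before using it in a Boolean expression. A minimal sketch, with illustrative variable names and console.log() in place of alert():

```javascript
var wrappedFalse = new Boolean(false);

// The object itself is truthy, so && evaluates to the right-hand operand.
var objectResult = wrappedFalse && true;

// valueOf() returns the primitive false, so && short-circuits to false.
var primitiveResult = wrappedFalse.valueOf() && true;

console.log(objectResult);      // true
console.log(primitiveResult);   // false
```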

The Number Type

The Number type is the reference type for numeric values. To create a Number object, use the Number constructor and pass in any number. Here’s an example:

var numberObject = new Number(10);

NumberTypeExample01.htm

As with the Boolean type, the Number type overrides valueOf(), toLocaleString(), and toString(). The valueOf() method returns the primitive numeric value represented by the object, whereas the other two methods return the number as a string. As mentioned in Chapter 3, the toString() method optionally accepts a single argument indicating the radix in which to represent the number, as shown in the following examples:

var num = 10;
alert(num.toString());       //"10"
alert(num.toString(2));      //"1010"
alert(num.toString(8));      //"12"
alert(num.toString(10));     //"10"
alert(num.toString(16));     //"a"

NumberTypeExample01.htm

Aside from the inherited methods, the Number type has several additional methods used to format numbers as strings.

The toFixed() method returns a string representation of a number with a specified number of decimal places, as in this example:

var num = 10;
alert(num.toFixed(2));    //"10.00"

NumberTypeExample01.htm

Here, the toFixed() method is given an argument of 2, which indicates how many decimal places should be displayed. As a result, the method returns the string "10.00", filling out the empty decimal places with zeros. If the number has more than the given number of decimal places, the result is rounded to the nearest decimal place, as shown here:

var num = 10.005;
alert(num.toFixed(2));    //"10.01"

The rounding nature of toFixed() may be useful for applications dealing with currency, though it’s worth noting that rounding using this method differs between browsers. Internet Explorer through version 8 incorrectly rounds numbers in the range {(−0.94,−0.5], [0.5,0.94)} when zero is passed to toFixed(). In these cases, Internet Explorer returns 0 when it should return either -1 or 1 (depending on the sign); other browsers behave as expected, and Internet Explorer 9 fixes this issue.
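One further caveat for currency code, separate from the browser issue just described: toFixed() rounds the binary floating-point value that is actually stored, which is not always exactly the decimal literal you typed. The following sketch uses values chosen for illustration and console.log() in place of alert(). It shows two literals that both appear to end in a 5 yet round in different directions, followed by the zero-digit cases as a standards-conforming engine handles them:

```javascript
var roundsDown = (1.005).toFixed(2);    // stored value is slightly below 1.005
var roundsUp = (10.005).toFixed(2);     // stored value is slightly above 10.005

var halfPositive = (0.5).toFixed(0);
var halfNegative = (-0.5).toFixed(0);   // parentheses keep the sign with the number

console.log(roundsDown);     // "1.00"
console.log(roundsUp);       // "10.01"
console.log(halfPositive);   // "1"
console.log(halfNegative);   // "-1"
```

When exact decimal results matter, a common approach is to keep amounts as whole numbers of the smallest unit (cents, for example) and format them only for display.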

The toFixed() method can represent numbers with 0 through 20 decimal places. Some browsers may support larger ranges, but this is the typically implemented range.

Another method related to formatting numbers is the toExponential() method, which returns a string with the number formatted in exponential notation (aka e-notation). Just as with toFixed(), toExponential() accepts one argument, which is the number of decimal places to output. Consider this example:

var num = 10;
alert(num.toExponential(1));    //"1.0e+1"

This code outputs "1.0e+1" as the result. Typically, this small number wouldn’t be represented using e-notation. If you want to have the most appropriate form of the number, the toPrecision() method should be used instead.

The toPrecision() method returns either the fixed or the exponential representation of a number, depending on which makes the most sense. This method takes one argument, which is the total number of digits to use to represent the number (not including exponents). Here’s an example:

var num = 99;
alert(num.toPrecision(1));    //"1e+2"
alert(num.toPrecision(2));    //"99"
alert(num.toPrecision(3));    //"99.0"

NumberTypeExample01.htm

In this example, the first task is to represent the number 99 with a single digit, which results in "1e+2", otherwise known as 100. Because 99 cannot accurately be represented by just one digit, the method rounded up to 100, which can be represented using just one digit. Representing 99 with two digits yields "99" and with three digits returns "99.0". The toPrecision() method essentially determines whether to call toFixed() or toExponential() based on the numeric value you’re working with; all three methods round up or down to accurately represent a number with the correct number of decimal places.

The toPrecision() method can represent numbers with 1 through 21 significant digits. Some browsers may support larger ranges, but this is the typically implemented range.
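To compare the three formatting methods side by side, the following sketch applies them to the same value. The numbers and variable names are illustrative only, and console.log() stands in for alert():

```javascript
var measurement = 1234.5678;

console.log(measurement.toFixed(2));         // "1234.57"
console.log(measurement.toExponential(2));   // "1.23e+3"
console.log(measurement.toPrecision(2));     // "1.2e+3"  (two digits cannot show 1234 in fixed form)
console.log(measurement.toPrecision(6));     // "1234.57" (six digits fit in fixed form)

var tiny = 0.000123;
console.log(tiny.toPrecision(2));            // "0.00012"
```

Note that the argument to toFixed() and toExponential() counts digits after the decimal point, whereas the argument to toPrecision() counts all significant digits.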

Similar to the Boolean object, the Number object gives important functionality to numeric values but really should not be instantiated directly because of the same potential problems. The typeof and instanceof operators work differently when dealing with primitive numbers versus reference numbers, as shown in the following examples:

var numberObject = new Number(10);
var numberValue = 10;
alert(typeof numberObject);   //"object"
alert(typeof numberValue);    //"number"
alert(numberObject instanceof Number);  //true
alert(numberValue instanceof Number);   //false

Primitive numbers always return "number" when typeof is called on them, whereas Number objects return "object". Similarly, a Number object is an instance of Number, but a primitive number is not.

The String Type

The String type is the object representation for strings and is created using the String constructor as follows:

var stringObject = new String("hello world");

StringTypeExample01.htm

The methods of a String object are available on all string primitives. All three of the inherited methods — valueOf(), toLocaleString(), and toString() — return the object’s primitive string value.

Each instance of String contains a single property, length, which indicates the number of characters in the string. Consider the following example:

var stringValue = "hello world";
alert(stringValue.length);   //"11"

This example outputs "11", the number of characters in "hello world". Note that length counts 16-bit code units, so a double-byte character (as opposed to an ASCII character, which uses just one byte) is still counted as one; only characters outside the Basic Multilingual Plane, which are stored as surrogate pairs, count as two.
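The following sketch shows how length behaves for different kinds of characters. It rests on the fact that ECMAScript strings are sequences of 16-bit code units; the sample strings are illustrative, and console.log() replaces alert():

```javascript
var asciiText = "hello world";
var doubleByteText = "\u4E2D\u6587";   // two CJK characters, one code unit each
var surrogatePair = "\uD83D\uDE00";    // a single emoji stored as two code units

console.log(asciiText.length);        // 11
console.log(doubleByteText.length);   // 2
console.log(surrogatePair.length);    // 2
```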

The String type has a large number of methods to aid in the dissection and manipulation of strings in ECMAScript.

Character Methods

Two methods access specific characters in the string: charAt() and charCodeAt(). These methods each accept a single argument, which is the character’s zero-based position. The charAt() method simply returns the character in the given position as a single-character string. (There is no character type in ECMAScript.) For example:

var stringValue = "hello world";
alert(stringValue.charAt(1));   //"e"

The character in position 1 of "hello world" is "e", so calling charAt(1) returns "e". If you want the character’s character code instead of the actual character, then calling charCodeAt() is the appropriate choice, as in the following example:

var stringValue = "hello world";
alert(stringValue.charCodeAt(1));   //outputs "101"

This example outputs "101", which is the character code for the lowercase "e" character.

ECMAScript 5 defines another way to access an individual character. Supporting browsers allow you to use bracket notation with a numeric index to access a specific character in the string, as in this example:

var stringValue = "hello world";
alert(stringValue[1]);   //"e"

Individual character access using bracket notation is supported in Internet Explorer 8 and all current versions of Firefox, Safari, Chrome, and Opera. If this syntax is used in Internet Explorer 7 or earlier, the result is undefined (though not the special value undefined).
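The three access styles differ in what they return for a position that does not exist. A minimal sketch, assuming an engine that supports bracket notation; the variable names are illustrative and console.log() stands in for alert():

```javascript
var greeting = "hello world";
var pastTheEnd = greeting.length;   // one beyond the last valid position

var viaCharAt = greeting.charAt(pastTheEnd);
var viaCharCodeAt = greeting.charCodeAt(pastTheEnd);
var viaBrackets = greeting[pastTheEnd];

console.log(viaCharAt === "");   // true: charAt() returns an empty string
console.log(viaCharCodeAt);      // NaN
console.log(viaBrackets);        // undefined

// A character code can be turned back into a character.
console.log(String.fromCharCode(greeting.charCodeAt(1)));   // "e"
```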

String-Manipulation Methods

Several methods manipulate the values of strings. The first of these methods is concat(), which is used to concatenate one or more strings to another, returning the concatenated string as the result. Consider the following example:

var stringValue = "hello ";
var result = stringValue.concat("world");
alert(result);            //"hello world"
alert(stringValue);       //"hello "

The result of calling the concat() method on stringValue in this example is "hello world" — the value of stringValue remains unchanged. The concat() method accepts any number of arguments, so it can create a string from any number of other strings, as shown here:

var stringValue = "hello ";
var result = stringValue.concat("world", "!");
alert(result);            //"hello world!"
alert(stringValue);       //"hello "

This modified example concatenates "world" and "!" to the end of "hello ". Although the concat() method is provided for string concatenation, the addition operator (+) is used more often and, in most cases, actually performs better than the concat() method even when concatenating multiple strings.
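Both approaches convert non-string operands to strings, so they produce the same result when a string is combined with other types of values. A short sketch with made-up values, using console.log() in place of alert():

```javascript
var label = "Total: ";

var viaConcat = label.concat(3, " items, in stock: ", true);
var viaPlus = label + 3 + " items, in stock: " + true;

console.log(viaConcat);               // "Total: 3 items, in stock: true"
console.log(viaConcat === viaPlus);   // true
console.log(label);                   // "Total: " (unchanged)
```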

ECMAScript provides three methods for creating string values from a substring: slice(), substr(), and substring(). All three methods return a substring of the string they act on, and all accept either one or two arguments. The first argument is the position where capture of the substring begins; the second argument, if used, indicates where the operation should stop. For slice() and substring(), this second argument is the position before which capture is stopped (all characters up to this point are included except the character at that point). For substr(), the second argument is the number of characters to return. If the second argument is omitted in any case, it is assumed that the ending position is the length of the string. Just as with the concat() method, slice(), substr(), and substring() do not alter the value of the string itself — they simply return a primitive string value as the result, leaving the original unchanged. Consider this example:

var stringValue = "hello world";
alert(stringValue.slice(3));        //"lo world"
alert(stringValue.substring(3));    //"lo world"
alert(stringValue.substr(3));       //"lo world"
alert(stringValue.slice(3, 7));     //"lo w"
alert(stringValue.substring(3,7));  //"lo w"
alert(stringValue.substr(3, 7));    //"lo worl"

StringTypeManipulationMethodsExample01.htm

In this example, slice(), substr(), and substring() are used in the same manner and, in most cases, return the same value. When given just one argument, 3, all three methods return "lo world", because the second "l" in "hello" is in position 3. When given two arguments, 3 and 7, slice() and substring() return "lo w" (the "o" in "world" is in position 7, so it is not included), while substr() returns "lo worl", because the second argument specifies the number of characters to return.
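For non-negative arguments, the relationship between the methods can be stated directly: substr(start, count) returns the same characters as slice(start, start + count) and substring(start, start + count). The sketch below checks this with illustrative variable names and console.log() in place of alert():

```javascript
var phrase = "hello world";
var start = 3;
var count = 4;

var viaSubstr = phrase.substr(start, count);               // position and length
var viaSlice = phrase.slice(start, start + count);         // position and end position
var viaSubstring = phrase.substring(start, start + count); // position and end position

console.log(viaSubstr);      // "lo w"
console.log(viaSlice);       // "lo w"
console.log(viaSubstring);   // "lo w"
```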

There are different behaviors for these methods when an argument is a negative number. For the slice() method, a negative argument is treated as the length of the string plus the negative argument.

For the substr() method, a negative first argument is treated as the length of the string plus the number, whereas a negative second number is converted to 0. For the substring() method, all negative numbers are converted to 0. Consider this example:

var stringValue = "hello world";
alert(stringValue.slice(-3));         //"rld"
alert(stringValue.substring(-3));     //"hello world"
alert(stringValue.substr(-3));        //"rld"
alert(stringValue.slice(3, -4));      //"lo w"
alert(stringValue.substring(3, -4));  //"hel"
alert(stringValue.substr(3, -4));     //"" (empty string)

StringTypeManipulationMethodsExample01.htm

This example clearly indicates the differences between the three methods. When slice() and substr() are called with a single negative argument, they act the same. This occurs because -3 is translated into 8 (the length plus the argument), effectively making the calls slice(8) and substr(8). The substring() method, on the other hand, returns the entire string, because -3 is translated to 0.

Because of a deviation in the Internet Explorer implementation of JavaScript, passing in a negative number to substr() results in the original string being returned. Internet Explorer 9 fixes this issue.

When the second argument is negative, the three methods act differently from one another. The slice() method translates the second argument to 7, making the call equivalent to slice(3, 7) and so returning "lo w". For the substring() method, the second argument gets translated to 0, making the call equivalent to substring(3, 0), which is actually equivalent to substring(0,3), because this method expects that the smaller number is the starting position and the larger one is the ending position. For the substr() method, the second argument is also converted to 0, which means there should be zero characters in the returned string, leading to the return value of an empty string.
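The argument handling described here can be verified by doing the translation by hand. The sketch below is illustrative only (the variable names are not from the original examples, and console.log() replaces alert()); it also shows that slice(), unlike substring(), does not swap its arguments:

```javascript
var sample = "hello world";   // length is 11

// slice(): a negative argument becomes length plus the argument.
var startFromEnd = sample.length + -3;   // 8
var endFromEnd = sample.length + -4;     // 7

console.log(sample.slice(-3) === sample.slice(startFromEnd));        // true, both "rld"
console.log(sample.slice(3, -4) === sample.slice(3, endFromEnd));    // true, both "lo w"

// substring(): negatives become 0, then the smaller value is used as the start.
console.log(sample.substring(3, -4) === sample.substring(0, 3));     // true, both "hel"
console.log(sample.substring(7, 3));   // "lo w" (arguments swapped)

// slice() does not swap, so a start beyond the end yields an empty string.
console.log(sample.slice(7, 3) === "");   // true
```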

String Location Methods

There are two methods for locating substrings within another string: indexOf() and lastIndexOf(). Both methods search a string for a given substring and return the position (or -1 if the substring isn’t found). The difference between the two is that the indexOf() method begins looking for the substring at the beginning of the string, whereas the lastIndexOf() method begins looking from the end of the string. Consider this example:

var stringValue = "hello world";
alert(stringValue.indexOf("o"));         //4
alert(stringValue.lastIndexOf("o"));     //7

StringTypeLocationMethodsExample01.htm

Here, the first occurrence of the string "o" is at position 4, which is the "o" in "hello". The last occurrence of the string "o" is in the word "world", at position 7. If there is only one occurrence of "o" in the string, then indexOf() and lastIndexOf() return the same position.

Each method accepts an optional second argument that indicates the position to start searching from within the string. This means that the indexOf() method will start searching from that position and go toward the end of the string, ignoring everything before the start position, whereas lastIndexOf() starts searching from the given position and continues searching toward the beginning of the string, ignoring everything between the given position and the end of the string. Here’s an example:

var stringValue = "hello world";
alert(stringValue.indexOf("o", 6));         //7
alert(stringValue.lastIndexOf("o", 6));     //4

When the second argument of 6 is passed into each method, the results are the opposite of those in the previous example. This time, indexOf() returns 7 because it starts searching the string from position 6 (the letter "w") and continues to position 7, where "o" is found. The lastIndexOf() method returns 4 because the search starts from position 6 and continues back toward the beginning of the string, where it encounters the "o" in "hello". Using this second argument allows you to locate all instances of a substring by looping calls to indexOf() or lastIndexOf(), as in the following example:

var stringValue = "Lorem ipsum dolor sit amet, consectetur adipisicing elit";
var positions = new Array();
var pos = stringValue.indexOf("e");
                   
while(pos > -1){
    positions.push(pos);
    pos = stringValue.indexOf("e", pos + 1);
}
    
alert(positions);    //"3,24,32,35,52"

StringTypeLocationMethodsExample02.htm

This example works through a string by constantly increasing the position at which indexOf() should begin. It begins by getting the initial position of "e" in the string and then enters a loop that continually passes in the last position plus one to indexOf(), ensuring that the search continues after the last substring instance. Each position is stored in the positions array so the data can be used later.
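The same loop can be wrapped in a reusable function. The helper below is a hypothetical sketch, not part of the original examples: the name findAllPositions and the guard against an empty search string are assumptions, and console.log() stands in for alert().

```javascript
// Hypothetical helper: returns every position at which search occurs in text.
function findAllPositions(text, search) {
    var positions = [];
    if (search === "") {
        return positions;   // an empty search string would match at every position
    }
    var pos = text.indexOf(search);
    while (pos > -1) {
        positions.push(pos);
        pos = text.indexOf(search, pos + 1);   // resume just after the last match
    }
    return positions;
}

var lorem = "Lorem ipsum dolor sit amet, consectetur adipisicing elit";
console.log(findAllPositions(lorem, "e").join(","));    // "3,24,32,35,52"
console.log(findAllPositions(lorem, "or").join(","));   // "1,15"
console.log(findAllPositions(lorem, "xyz").length);     // 0
```

Because the search resumes at the last position plus one rather than plus the length of the search string, overlapping matches are also reported.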

The trim() Method

ECMAScript 5 introduces a trim() method on all strings. The trim() method creates a copy of the string, removes all leading and trailing white space, and then returns the result. For example:

var stringValue = "    hello world    ";
var trimmedStringValue = stringValue.trim();
alert(stringValue);            //"    hello world    "
alert(trimmedStringValue);     //"hello world"

Note that since trim() returns a copy of a string, the original string remains intact with leading and trailing white space in place. This method has been implemented in Internet Explorer 9+, Firefox 3.5+, Safari 5+, Opera 10.5+, and Chrome. Firefox 3.5+, Safari 5+, and Chrome 8+ also support two nonstandard trimLeft() and trimRight() methods that remove white space only from the beginning or end of the string, respectively.
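For browsers that lack trim(), a regular expression can approximate it. The function below is a hypothetical fallback rather than an exact equivalent: the name trimString is made up for this sketch, and the set of characters matched by \s in older engines may differ slightly from the white space that a native trim() removes. As elsewhere, console.log() replaces alert().

```javascript
// Hypothetical fallback: uses the native trim() when it exists.
function trimString(text) {
    if (typeof text.trim === "function") {
        return text.trim();
    }
    // Strip runs of white space anchored at the start or the end.
    return text.replace(/^\s+|\s+$/g, "");
}

var padded = "    hello world    ";
var trimmed = trimString(padded);

console.log("[" + trimmed + "]");   // "[hello world]"
console.log("[" + padded + "]");    // "[    hello world    ]" (original unchanged)
```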

String Case Methods

The next set of methods involves case conversion. Four methods perform case conversion: toLowerCase(), toLocaleLowerCase(), toUpperCase(), and toLocaleUpperCase(). The toLowerCase() and toUpperCase() methods are the original methods, modeled after the same methods in java.lang.String. The toLocaleLowerCase() and toLocaleUpperCase() methods are intended to be implemented based on a particular locale. In many locales, the locale-specific methods are identical to the generic ones; however, a few languages (such as Turkish) apply special rules to Unicode case conversion, and this necessitates using the locale-specific methods for proper conversion. Here are some examples:

var stringValue = "hello world";
alert(stringValue.toLocaleUpperCase());  //"HELLO WORLD"
alert(stringValue.toUpperCase());        //"HELLO WORLD"
alert(stringValue.toLocaleLowerCase());  //"hello world"
alert(stringValue.toLowerCase());        //"hello world"

StringTypeCaseMethodExample01.htm

This code outputs "HELLO WORLD" for both toLocaleUpperCase() and toUpperCase(), just as "hello world" is output for both toLocaleLowerCase() and toLowerCase(). Generally speaking, if you do not know the language in which the code will be running, it is safer to use the locale-specific methods.

String Pattern-Matching Methods

The String type has several methods designed to pattern-match within the string. The first of these methods is match(), which is essentially the same as calling a RegExp object’s exec() method. The match() method accepts a single argument, which is either a regular-expression string or a RegExp object. Consider this example:

var text = "cat, bat, sat, fat";        
var pattern = /.at/;
                   
//same as pattern.exec(text)
var matches = text.match(pattern);
alert(matches.index);        //0
alert(matches[0]);           //"cat"
alert(pattern.lastIndex);   //0