Exploring Node.js Modules

Modules and packages are the building blocks for breaking down your application into smaller pieces. A module encapsulates some functionality, primarily JavaScript functions, while hiding implementation details and exposing an API for the module. Modules can be distributed by third parties and installed for use by our modules. An installed module is called a package. 

The npm package repository is a huge library of modules that's available for all Node.js developers to use. Within that library are hundreds of thousands of packages you can use to accelerate the development of your application.

Since modules and packages are the building blocks of your application, understanding how they work is vital to your success with Node.js. By the end of this chapter, you will have a solid grounding in both CommonJS and ES6 modules, how to structure the modules in an application, how to manage dependencies on third-party packages, and how to publish your own packages.

In this chapter, we will cover the following topics:

  • Definitions of all types of Node.js modules and how to structure both simple and complex modules
  • Using CommonJS and ES2015/ES6 modules and when to use each
  • Understanding how Node.js finds modules and installed packages, so you can better structure your application
  • Using the npm package management system (and Yarn) to manage application dependencies, to publish packages, and to record administrative scripts for the project

So, let's get on with it.

Defining a Node.js module

Modules are the basic building blocks for constructing Node.js applications. A Node.js module encapsulates functions, hiding details inside a well-protected container, and exposing an explicitly declared API.

When Node.js was created, the ES6 module system, of course, did not yet exist. Ryan Dahl therefore based the Node.js module system on the CommonJS standard. The examples we've seen so far are modules written in that format. With ES2015/ES2016, a new module format was created for use by all JavaScript implementations. This new module format is used by front-end engineers in their in-browser JavaScript code, by Node.js engineers, and by any other JavaScript implementation.

Because ES6 modules are now the standard module format, the Node.js Technical Steering Committee (TSC) committed to first-class support for ES6 modules alongside the CommonJS format. Starting with Node.js 14.x, the Node.js TSC delivered on that promise.

Every source file used in an application on the Node.js platform is a module. Over the next few sections, we'll examine the different types of modules, starting with the CommonJS module format. 

Throughout this book, we'll identify traditional Node.js modules as CommonJS modules, and the new module format as ES6 modules.

To start our exploration of Node.js modules, we must, of course, start at the beginning.

Examining the traditional Node.js module format

We already saw CommonJS modules in action in the previous chapter. It's now time to see what they are and how they work. 

In the ls.js example in Chapter 2, Setting Up Node.js, we wrote the following code to pull in the fs module, giving us access to its functions:

const fs = require('fs'); 

The require function is given a module identifier, and it searches for the module named by that identifier. If found, it loads the module definition into the Node.js runtime and makes its functions available. In this case, the fs object contains the code (and data) exported by the fs module. The fs module is part of the Node.js core and provides filesystem functions.

By declaring fs as const, we gain a little assurance against coding mistakes. If we mistakenly assigned a value to fs, the program would fail immediately; as a const, we know the reference to the fs module will not be changed.

The file, ls.js, is itself a module because every source file we use on Node.js is a module. In this case, it does not export anything but is instead a script that consumes other modules.

What does it mean to say the fs object contains the code exported by the fs module? In a CommonJS module, Node.js provides an object, module, with which the module's author describes the module. Within this object is a field, module.exports, containing the functions and data exported by the module. The return value of the require function is this module.exports object; it is the interface the module provides to other modules. Anything added to the module.exports object is available to other pieces of code, and everything else is hidden. As a convenience, the module.exports object is also available as exports.

The module object contains several fields that you might find useful. Refer to the online Node.js documentation for details.

Because exports is an alias of module.exports, the following two lines of code are equivalent:

exports.funcName = function(arg, arg1) { ... };
module.exports.funcName = function(arg, arg1) { ... };

Whether you use module.exports or exports is up to you. However, do not ever do anything like the following:

exports = function(arg, arg1) { ... };

Any assignment to exports will break the alias, and it will no longer be equivalent to module.exports. Assignments to exports.something are okay, but assigning to exports will cause failure. If your intent is to assign a single object or function to be returned by require, do this instead:

module.exports = function(arg, arg1) { ... };

Some modules do export a single function because that's how the module author envisioned delivering the desired functionality.

When we said ls.js does not export anything, we meant that ls.js did not assign anything to module.exports.

To give us a brief example, let's create a simple module, named simple.js:

var count = 0;
exports.next = function() { return ++count; };
exports.hello = function() {
    return "Hello, world!";
};

We have one variable, count, which is not attached to the exports object, and two functions, next and hello, which are attached. Because count is not attached to exports, it is private to the module. 

Any module can have private implementation details that are not exported and are therefore not available to any other code.

Now, let's use the module we just wrote:

$ node
> const s = require('./simple');
undefined
> s.hello();
'Hello, world!'
> s.next();
1
> s.next();
2
> s.next();
3

> console.log(s.count);
undefined
undefined
>

The exports object in the module is the object that is returned by require('./simple'). Therefore, each call to s.next calls the next function in simple.js. Each returns (and increments) the value of the local variable, count. An attempt to access the private field, count, shows it's unavailable from outside the module.

This is how Node.js solves the global object problem of browser-based JavaScript. The variables that look like they are global variables are only global to the module containing the variable. These variables are not visible to any other code.

The Node.js package format is derived from the CommonJS module system (http://commonjs.org). When developed, the CommonJS team aimed to fill a gap in the JavaScript ecosystem. At that time, there was no standard module system, making it trickier to package JavaScript applications. The require function, the exports object, and other aspects of Node.js modules come directly from the CommonJS Modules/1.0 spec.

Node.js injects the module object into each module. It also injects two other variables: __dirname and __filename. These help code in a module know where it is located in the filesystem, and are primarily used for loading other files via a path relative to the module's location. 

For example, one can store assets like CSS or image files in a directory relative to the module. An app framework can then make the files available via an HTTP server. In Express, we do so with this code snippet:

app.use('/assets/vendor/jquery', express.static(
    path.join(__dirname, 'node_modules', 'jquery')));

This says that HTTP requests on the /assets/vendor/jquery URL are to be handled by the static handler in Express, from the contents of a directory relative to the directory containing the module. Don't worry about the details because we'll discuss this more carefully in a later chapter. Just notice that __dirname is useful to calculate a filename relative to the location of the module source code.

To see it in action, create a file named dirname.js containing the following:

console.log(`dirname: ${__dirname}`);
console.log(`filename: ${__filename}`);

This lets us see the values we receive:

$ node dirname.js 
dirname: /home/david/Chapter03
filename: /home/david/Chapter03/dirname.js

Simple enough, but as we'll see later, these values are not directly available in ES6 modules.

Now that we've got a taste for CommonJS modules, let's take a look at ES2015 modules.

Examining the ES6/ES2015 module format

ES6 modules are a new module format designed for all JavaScript environments. While Node.js has always had a good module system, browser-side JavaScript has not. That meant the browser-side community had to use non-standardized solutions. The CommonJS module format was one of those non-standard solutions, which was borrowed for use in Node.js. Therefore, ES6 modules are a big improvement for the entire JavaScript world, by getting everyone on the same page with a common module format and mechanisms.

An issue we have to deal with is the file extension to use for ES6 modules. Node.js needs to know whether to parse using the CommonJS or ES6 module syntax. To distinguish between them, Node.js uses the file extension .mjs to denote ES6 modules, and .js to denote CommonJS modules. However, that's not the entire story since Node.js can be configured to recognize the .js files as ES6 modules. We'll give the exact particulars later in this chapter.

The ES6 and CommonJS modules are conceptually similar. Both support exporting data and functions from a module, and both support hiding implementation inside a module. But they are very different in many practical ways.

Let's start with defining an ES6 module. Create a file named simple2.mjs in the same directory as the simple.js example that we looked at earlier:

let count = 0;
export function next() { return ++count; }
function squared() { return Math.pow(count, 2); }
export function hello() {
    return "Hello, world!";
}
export default function() { return count; }
export const meaning = 42;
export let nocount = -1;
export { squared };

This is similar to simple.js but with a few additions to demonstrate further features. As before, count is a private variable that isn't exported, and next is an exported function that increments count.

The export keyword declares what is being exported from an ES6 module. In this case, we have several exported functions and two exported variables. The export keyword can be put in front of any top-level declaration, such as variable, function, or class declarations:

export function next() { ... }

The effect of this is similar to the following:

module.exports.next = function() { ... }

The intent of both is essentially the same: to make a function or other object available to code outside the module. But instead of explicitly creating an object, module.exports, we're simply declaring what is to be exported. A statement such as export function next() is a named export, meaning the exported function (as here) or object has a name, and that code outside the module uses that name to access the object. As we see here, named exports can be functions or objects, and they may also be class definitions.

The default export from a module, defined with export default, can be done once per module. The default export is what code outside the module accesses when using the module object itself, rather than one of the module's named exports.

You can also declare something, such as the squared function, and then export it later.

Now let's see how to use the ES2015 module. Create a simpledemo.mjs file with the following:

import * as simple2 from './simple2.mjs';

console.log(simple2.hello());
console.log(`${simple2.next()} ${simple2.squared()}`);
console.log(`${simple2.next()} ${simple2.squared()}`);
console.log(`${simple2.default()} ${simple2.squared()}`);
console.log(`${simple2.next()} ${simple2.squared()}`);
console.log(`${simple2.next()} ${simple2.squared()}`);
console.log(`${simple2.next()} ${simple2.squared()}`);
console.log(simple2.meaning);

The import statement does what it says: it imports objects exported from a module. Because it uses the import * as foo syntax, it imports everything from the module, attaching everything to an object, in this case named simple2. This version of the import statement is most similar to a traditional Node.js require statement because it creates an object with fields containing the objects exported from the module.

This is how the code executes:

$ node simpledemo.mjs 
Hello, world!
1 1
2 4
2 4
3 9
4 16
5 25
42

In the past, the ES6 module format was hidden behind an option flag, --experimental-modules, but as of Node.js 13.2 that flag is no longer required. Accessing the default export is accomplished by accessing the field named default. Accessing an exported value, such as the meaning field, is done without parentheses because it is a value and not a function.

Now to see a different way to import objects from a module, create another file, named simpledemo2.mjs, containing the following:

import {
    default as simple, hello, next, meaning
} from './simple2.mjs';
console.log(hello());
console.log(next());
console.log(next());
console.log(simple());
console.log(next());
console.log(next());
console.log(next());
console.log(meaning);

In this case, the import is treated similarly to an ES2015 destructuring assignment. With this style of import, we specify exactly what is to be imported, rather than importing everything. Furthermore, instead of attaching the imported things to a common object, and therefore writing simple2.next(), the imported things are accessed using their simple names, as in next().

The import for default as simple is the way to declare an alias of an imported thing. In this case, it is necessary so that the default export has a name other than default.
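The as clause is not limited to the default export; any named import can be renamed. A self-contained sketch using the core path module (saved with an .mjs extension):

```javascript
// alias-demo.mjs -- renaming named imports with the `as` clause.
import { join as joinPath, sep as separator } from 'path';

console.log(joinPath('a', 'b')); // 'a/b' on POSIX systems
console.log(typeof separator);   // 'string'
```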

Node.js modules can be used from the ES2015 .mjs code. Create a file named ls.mjs containing the following:

import { promises as fs } from 'fs';

async function listFiles() {
    const files = await fs.readdir('.');
    for (const file of files) {
        console.log(file);
    }
}

listFiles().catch(err => { console.error(err); });

This is a reimplementation of the ls.js example in Chapter 2, Setting Up Node.js. In both cases, we're using the promises submodule of the fs package. To do this with the import statement, we access the promises export from the fs module, and use the as clause to rename it to fs. This way we can use an async function rather than deal with callbacks.

Otherwise, we have an async function, listFiles, that performs filesystem operations to read filenames from a directory. Because listFiles is async, it returns a Promise, and we must catch any errors using a .catch clause.

Executing the script gives the following:

$ node ls.mjs
ls.mjs
module1.js
module2.js
simple.js
simple2.mjs
simpledemo.mjs
simpledemo2.mjs

The last thing to note about ES2015 module code is that the import and export statements must be top-level code. Try putting an export inside a simple block like this:

{
    export const meaning = 42;
}

That innocent bit of code results in an error:

$ node badexport.mjs 
file:///home/david/Chapter03/badexport.mjs:2
export const meaning = 42;
^^^^^^

SyntaxError: Unexpected token 'export'
at Loader.moduleStrategy (internal/modules/esm/translators.js:83:18)
at async link (internal/modules/esm/module_job.js:36:21)

While there are a few more details about the ES2015 modules, these are their most important attributes.

Remember that the objects injected into CommonJS modules are not available to ES6 modules. The __dirname and __filename objects are the most important, since there are many cases where we compute a filename relative to the currently executing module. Let us explore how to handle that issue.

Injected objects in ES6 modules

Just as for CommonJS modules, certain objects are injected into ES6 modules. However, ES6 modules do not receive the __dirname and __filename objects, or the other objects that are injected into CommonJS modules.

The import.meta meta-property is the only value injected into ES6 modules. In Node.js it contains a single field, url. This is the URL from which the currently executing module was loaded.

Using import.meta.url, we can compute __dirname and __filename.

Computing the missing __dirname variable in ES6 modules

If we make a duplicate of dirname.js as dirname.mjs, so it will be interpreted as an ES6 module, we get the following:

$ cp dirname.js dirname.mjs
$ node dirname.mjs
console.log(`dirname: ${__dirname}`);
^
ReferenceError: __dirname is not defined
at file:///home/david/Chapter03/dirname.mjs:1:25
at ModuleJob.run (internal/modules/esm/module_job.js:109:37)
at async Loader.import (internal/modules/esm/loader.js:132:24)

Since __dirname and __filename are not part of the JavaScript specification, they are not available within ES6 modules. Enter the import.meta.url object, from which we can compute __dirname and __filename. To see it in action, create a dirname-fixed.mjs file containing the following:

import { fileURLToPath } from 'url';
import { dirname } from 'path';

console.log(`import.meta.url: ${import.meta.url}`);

const __filename = fileURLToPath(import.meta.url);
const __dirname = dirname(__filename);

console.log(`dirname: ${__dirname}`);
console.log(`filename: ${__filename}`);

We are importing a couple of useful functions from the url and path core packages. While we could take the import.meta.url value and do our own computations, these functions already exist. The computation extracts the pathname portion of the module URL to produce __filename, and then uses dirname to produce __dirname.

$ node dirname-fixed.mjs 
import.meta.url: file:///home/david/Chapter03/dirname-fixed.mjs
dirname: /home/david/Chapter03
filename: /home/david/Chapter03/dirname-fixed.mjs

And we see the file:// URL of the module, and the computed values for __dirname and __filename using the built-in core functions.

We've talked about both the CommonJS and ES6 module formats, and now it's time to talk about using them together in an application.

Using CommonJS and ES6 modules together

Node.js supports two module formats for JavaScript code: the CommonJS format originally developed for Node.js, and the new ES6 module format. The two are conceptually similar, but there are many practical differences. Because of this, we will face situations of using both in the same application and will need to know how to proceed.

First is the question of file extensions and recognizing which module format to use. The ES6 module format is used in the following situations:

  • Files whose filename ends in .mjs.
  • Files whose filename ends in .js, when the nearest package.json contains a field named type with the value module.
  • If the node binary is executed with the --input-type=module flag, then any code passed through the --eval or --print argument, or piped in via STDIN (the standard input), is interpreted as ES6 module code.
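These rules are easy to verify from the command line. For instance, this sketch passes a one-line ES6 module through --eval; on a POSIX system it prints the path separator, /:

```shell
$ node --input-type=module --eval "import { sep } from 'path'; console.log(sep)"
/
```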

That's fairly straightforward. ES6 modules are in files named with the .mjs extension, unless you've declared in the package.json that the package defaults to ES6 modules, in which case files named with the .js extension are also interpreted as ES6 modules.

The CommonJS module format is used in the following situations:

  • Files whose filename ends in .cjs.
  • Files whose filename ends in .js, when the nearest package.json does not contain a type field, or contains a type field with the value commonjs.
  • If the node binary is executed without the --input-type flag, or with the --input-type=commonjs flag, then any code passed through the --eval or --print argument, or piped in via STDIN (the standard input), is interpreted as CommonJS module code.

Again this is straightforward, with Node.js defaulting to CommonJS modules for .js files. If the package is explicitly declared to default to CommonJS modules, then Node.js will interpret its .js files as CommonJS.

The Node.js team strongly recommends that package authors include a type field in package.json, even if the type is commonjs.

Consider a package.json with this declaration:

{
    "type": "module",
    ...
}

This, of course, informs Node.js that the package defaults to ES6 modules. Therefore, this command interprets the module as an ES6 module:

$ node my-module.js

This command does the same, even without the package.json entry, because --input-type applies to code arriving via the standard input:

$ node --input-type=module < my-module.js

If instead the type field had the value commonjs, or the --input-type flag specified commonjs, or if both were missing entirely, then my-module.js would be interpreted as a CommonJS module.

These rules also apply to the import statement, the import() function, and the require() function. We will cover those commands in more depth in a later section. In the meantime, let's learn how the import() function partly resolves the inability to use ES6 modules in a CommonJS module. 

Using ES6 modules from CommonJS using import()

The import statement in ES6 modules is a statement, and not a function like require(). This means that import can only be given a static string, and you cannot compute the module identifier to import. Another limitation is that import only works in ES6 modules, and therefore a CommonJS module cannot load an ES6 module. Or, can it?

Since the import() function is available in both CommonJS and ES6 modules, that means we should be able to use it to import ES6 modules in a CommonJS module. 

To see how this works, create a file named simple-dynamic-import.js containing the following:

async function simpleFn() {
    const simple2 = await import('./simple2.mjs');
    console.log(simple2.hello());
    console.log(simple2.next());
    console.log(simple2.next());
    console.log(`count = ${simple2.default()}`);
    console.log(`Meaning: ${simple2.meaning}`);
}

simpleFn().catch(err => { console.error(err); });

This is a CommonJS module that's using an ES6 module we created earlier. It simply calls a few of the functions, nothing exciting except that it is using an ES6 module even though we said earlier that the import statement only works in ES6 modules. Let's see this module in action:

$ node simple-dynamic-import.js 
Hello, world!
1
2
count = 2
Meaning: 42

This is a CommonJS module successfully executing code contained in an ES6 module simply by using import().

Notice that import() was called not in the global scope of the module, but inside an async function. As we saw earlier, the ES6 module keyword statements like export and import must appear at the top level. However, import() returns a Promise, which limits our ability to use its result in the global scope.

The import statement is itself an asynchronous process, and by extension the import() function is asynchronous, while the Node.js require() function is synchronous. 

In this case, we executed import() inside an async function using the await keyword. Therefore, even if import() were used in the global scope, it would be tricky to get a global-scope variable to hold a reference to that module. To see why, let's rewrite that example as simple-dynamic-import-fail.js:

const simple2 = import('./simple2.mjs');
console.log(simple2);
console.log(simple2.hello());
console.log(simple2.next());
console.log(simple2.next());
console.log(`count = ${simple2.default()}`);
console.log(`Meaning: ${simple2.meaning}`);

It's the same code but running in the global scope. In the global scope, we cannot use the await keyword, so we should expect that simple2 will contain a pending Promise. Running the script gives us this failure:

$ node simple-dynamic-import-fail.js 
Promise { <pending> }
/home/david/Chapter03/simple-dynamic-import-fail.js:4
console.log(simple2.hello());
^
TypeError: simple2.hello is not a function
at Object.<anonymous> (/home/david/Chapter03/simple-dynamic-import-fail.js:4:21)
at Module._compile (internal/modules/cjs/loader.js:1139:30)
at Object.Module._extensions..js (internal/modules/cjs/loader.js:1159:10)
at Module.load (internal/modules/cjs/loader.js:988:32)
at Function.Module._load (internal/modules/cjs/loader.js:896:14)
at Function.executeUserEntryPoint [as runMain] (internal/modules/run_main.js:71:12)
at internal/main/run_main_module.js:17:47

We see that simple2 does indeed contain a pending Promise, meaning that import() has not yet finished. Since simple2 does not contain a reference to the module, attempts to call the exported function fail.

The best we could do in the global scope is to attach the .then and .catch handlers to the import() function call. That would wait until the Promise transitions to either a success or failure state, but the loaded module would be inside the callback function. We'll see this example later in the chapter.
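As a minimal sketch of that pattern (loading the core path module so the example is self-contained), the module namespace is only available inside the callback:

```javascript
// Global-scope dynamic import: handle the Promise with .then/.catch.
import('path')
    .then(path => {
        console.log(typeof path.join); // 'function'
    })
    .catch(err => { console.error(err); });
```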

Let's now see how modules hide implementation details.

Hiding implementation details with encapsulation in CommonJS and ES6 modules

We've already seen a couple of examples of how modules hide implementation details, with the simple.js example and the programs we examined in Chapter 2, Setting Up Node.js. Let's take a closer look.

Node.js modules provide a simple encapsulation mechanism to hide implementation details while exposing an API. To review, in CommonJS modules the exposed API is assigned to the module.exports object, while in ES6 modules the exposed API is declared with the export keyword. Everything else inside a module is not available to code outside the module.

In practice, CommonJS modules are treated as if they were written as follows:

(function(exports, require, module, __filename, __dirname) {
// Module code actually lives in here
});

Thus, everything within the module is contained within an anonymous private namespace context. This is how the global object problem is resolved: everything in a module that looks global is actually contained within a private context. This also explains how the injected variables are actually injected into the module. They are parameters to the function that creates the module.

The other advantage is code safety. Because the private code in a module is stashed in a private namespace, it is impossible for code outside the module to access the private code or data.

Let's take a look at a practical demonstration of this encapsulation. Create a file named module1.js containing the following:

const A = "value A";
const B = "value B";
exports.values = function() {
    return { A: A, B: B };
};

Then, create a file named module2.js containing the following:

const util = require('util');
const A = "a different value A";
const B = "a different value B";
const m1 = require('./module1');
console.log(`A=${A} B=${B} values=${util.inspect(m1.values())}`);
console.log(`${m1.A} ${m1.B}`);
const vals = m1.values();
vals.B = "something completely different";
console.log(util.inspect(vals));
console.log(util.inspect(m1.values()));

Using these two modules we can see how each module is its own protected bubble.

Then run it as follows:

$ node module2.js 
A=a different value A B=a different value B values={ A: 'value A', B: 'value B' }
undefined undefined
{ A: 'value A', B: 'something completely different' }
{ A: 'value A', B: 'value B' }

This artificial example demonstrates the encapsulation of the values in module1.js from those in module2.js. The A and B values in module1.js don't overwrite A and B in module2.js because they're encapsulated within module1.js. The values function in module1.js does allow code in module2.js access to the values; however, module2.js cannot directly access those values. We can modify the object module2.js received from module1.js, but doing so does not change the values within module1.js.
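The mutation fails to stick because values() builds a fresh object on every call. A module that handed out its internal object directly would leak its state. This self-contained sketch uses hypothetical factory functions to stand in for the two module styles:

```javascript
// Returns a fresh object per call -- callers cannot reach internal state.
function makeSafeModule() {
    const A = "value A";
    return { values: () => ({ A }) };
}

// Exposes the internal object directly -- all callers share and can mutate it.
function makeLeakyModule() {
    const state = { A: "value A" };
    return { state };
}

const safe = makeSafeModule();
safe.values().A = "mutated";
console.log(safe.values().A);  // 'value A' -- the mutation did not stick

const leaky = makeLeakyModule();
leaky.state.A = "mutated";
console.log(leaky.state.A);    // 'mutated' -- the change is visible everywhere
```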

In Node.js, modules can also be data, not just code.

Using JSON modules

Node.js supports using require('./path/to/file-name.json') to import a JSON file in a CommonJS module. It is equivalent to the following code:

const fs = require('fs');
module.exports = JSON.parse(
    fs.readFileSync('/path/to/file-name.json', 'utf8'));

That is, the JSON file is read synchronously, and the text is parsed as JSON. The resultant object is available as the object exported from the module. Create a file named data.json, containing the following:

{
    "hello": "Hello, world!",
    "meaning": 42
}

Now create a file named showdata.js containing the following:

const data = require('./data.json');
console.log(data);

It will execute as follows:

$ node showdata.js 
{ hello: 'Hello, world!', meaning: 42 }

The console.log function outputs information to the Terminal. When it receives an object, it prints the object's content like this. This demonstrates that require correctly read the JSON file, since the resulting object matches the JSON text.

In an ES6 module, this is done with the import statement and requires a special flag. Create a file named showdata-es6.mjs containing the following:

import * as data from './data.json';
console.log(data);

So far that is equivalent to the CommonJS version of this script, but using import rather than require.

$ node --experimental-modules --experimental-json-modules showdata-es6.mjs 
(node:12772) ExperimentalWarning: The ESM module loader is experimental.
[Module] { default: { hello: 'Hello, world!', meaning: 42 } }

Currently, using import to load a JSON file is an experimental feature. Enabling the feature requires these command-line arguments, causing this warning to be printed. We also see that instead of data being an anonymous object, it is an object with the type Module.

Now let's look at how to use ES6 modules on some older Node.js releases.

Supporting ES6 modules on older Node.js versions

ES6 module support first appeared as an experimental feature in Node.js 8.5 and became fully supported in Node.js 14. With the right tools, we can use it on earlier Node.js implementations. 

For an example of using Babel to transpile ES6 code for older Node.js versions, see https://blog.revillweb.com/using-es2015-es6-modules-with-babel-6-3ffc0870095b.

A better method of using ES6 modules on Node.js 6.x is the esm package. Simply do the following:

$ nvm install 6
Downloading and installing node v6.14.1...
Downloading https://nodejs.org/dist/v6.14.1/node-v6.14.1-darwin-x64.tar.xz...
######################################################################## 100.0%
Computing checksum with shasum -a 256
Checksums matched!
Now using node v6.14.1 (npm v3.10.10)
$ nvm use 6
Now using node v6.14.1 (npm v3.10.10)
$ npm install esm
... npm output
$ node --require esm simpledemo.mjs
Hello, world!
1 1
2 4
2 4
3 9
4 16
5 25
42

There are two ways to use this module: 

  • In a CommonJS module, invoke require('esm').
  • On the command line, use --require esm, as shown here.

In both cases, the effect is the same: the esm module is loaded. It only needs to be loaded once, and we do not have to call any of its methods. Instead, esm retrofits ES6 module support into the Node.js runtime, and is compatible with version 6.x and later.

While we can use this module to retrofit ES6 module support, it does not retrofit other features such as async functions. Successfully executing the ls.mjs example requires support for both async functions and arrow functions. Since Node.js 6.x supports neither, the ls.mjs example will load correctly but will still fail because it uses other unsupported features.

$ node --version
v6.14.1
$ node --require esm ls.mjs
/Users/David/chap03/ls.mjs:5
(async () => {
^

SyntaxError: Unexpected token (
at exports.runInThisContext (vm.js:53:16)
at Module._compile (module.js:373:25)

It is, of course, possible to use Babel in such cases to convert the full set of ES2015+ features to run on older Node.js releases.

For more information about esm, see: 
https://medium.com/web-on-the-edge/es-modules-in-node-today-32cff914e4b. The article describes an older release of the esm module, at the time named @std/esm.

The current documentation for the esm package is available at: https://www.npmjs.com/package/esm.

In this section, we've learned about how to define a Node.js module and various ways to use both CommonJS and ES6 modules. But we've left out some very important things: what a module identifier is, and all the ways to locate and use modules. In the next section, we cover these topics.

Finding and loading modules using require and import

In the course of learning about modules for Node.js, we've used the require and import features without going into detail about how modules are found and all the options available. The algorithm for finding Node.js modules is very flexible. It supports finding modules that are siblings of the currently executing module, modules that have been installed locally to the current project, and modules that have been installed globally.

Both require and import take a module identifier. The algorithm Node.js uses resolves the module identifier to a file containing the module, so that Node.js can load it.

Understanding the module resolution algorithm is one key to success with Node.js. This algorithm determines how best to structure the code in a Node.js application. While debugging problems with loading the correct version of a given package, we need to know how Node.js finds packages.

First, we must consider several types of modules, starting with the simple file modules we've already used.

Understanding file modules

The CommonJS and ES6 modules we've just looked at are what the Node.js documentation describes as file modules. Such a module is contained within a single file, whose filename ends with .js, .cjs, .mjs, .json, or .node. Files ending in .node are compiled from C or C++ source code, or even other languages such as Rust, while the others are, of course, written in JavaScript or JSON. 

The module identifier of a file module must start with ./ or ../. This signals Node.js that the module identifier refers to a local file. As should already be clear, this module identifier refers to a pathname relative to the currently executing module.

It is also possible to use an absolute pathname as the module identifier. In a CommonJS module, such an identifier might be /path/to/some/directory/my-module.js. In an ES6 module, since the module identifier is actually a URL, then we must use a file:// URL like file:///path/to/some/directory/my-module.mjs. There are not many cases where we would use an absolute module identifier, but the capability does exist.

One difference between CommonJS and ES6 modules is the ability to use extension-less module identifiers. The CommonJS module loader allows this, as in the following example, which you should save as extensionless.js:

const simple = require('./simple');

console.log(simple.hello());
console.log(`${simple.next()}`);
console.log(`${simple.next()}`);

This uses an extension-less module identifier to load a module we've already discussed, simple.js. We can also run the script with the node command using an extension-less identifier:

$ node ./extensionless
Hello, world!
1
2

But if we specify an extension-less identifier for an ES6 module:

$ node ./simpledemo2
internal/modules/cjs/loader.js:964
throw err;
^
Error: Cannot find module '/home/david/Chapter03/simpledemo2'
at Function.Module._resolveFilename (internal/modules/cjs/loader.js:961:17)
at Function.Module._load (internal/modules/cjs/loader.js:854:27)
at Function.executeUserEntryPoint [as runMain] (internal/modules/run_main.js:71:12)
at internal/main/run_main_module.js:17:47 {
code: 'MODULE_NOT_FOUND',
requireStack: []
}

We get an error message making it clear that Node.js could not resolve the filename. Similarly, in an ES6 module, the filename given to the import statement must include the file extension.

Next, let's discuss another side effect of ES6 module identifiers being a URL.

The ES6 import statement takes a URL

The module identifier in the ES6 import statement is a URL. There are several important considerations. 

Since Node.js only supports file:// URLs, we cannot retrieve a module from a web server. There are obvious security implications, and a corporate security team would rightfully get anxious if modules could be loaded from http:// URLs.

Referencing a file with an absolute pathname must use the file:///path/to/file.ext syntax, as mentioned earlier. This is different from require, where we would use /path/to/file.ext instead.

Since ? and # have special significance in a URL, they also have special significance to the import statement, as in the following example:

import './module-name.mjs?query=1'

This loads the module named module-name.mjs with a query string containing query=1. By default, this is ignored by the Node.js module loader, but there is an experimental loader hook feature by which you can do something with the module identifier URL.

The next type of module to consider is those baked into Node.js, the core modules.

Understanding the Node.js core modules

Some modules are pre-compiled into the Node.js binary. These are the core Node.js modules documented on the Node.js website at https://nodejs.org/api/index.html.

They start out as source code within the Node.js build tree. The build process compiles them into the binary so that the modules are always available.

We've already seen how the core modules are used. In a CommonJS module, we might use the following:

const http = require('http');
const fs = require('fs').promises;

And the equivalent in an ES6 module would be as follows:

import http from 'http';
import { promises as fs } from 'fs';

In both cases, we're loading the http and fs core modules that would then be used by other code in the module.

Moving on, we will next talk about more complex module structures.

Using a directory as a module

We commonly organize stuff into a directory structure. The stuff here is a technical term referring to internal file modules, data files, template files, documentation, tests, assets, and more. Node.js allows us to create an entry-point module into such a directory structure.

For example, with a module identifier like ./some-library that refers to a directory, there must be a file named index.js, index.cjs, index.mjs, or index.node in that directory. In such a case, the module loader loads the appropriate index module even though the module identifier did not reference a full pathname. The full pathname is computed by appending the name of the index file it finds to the directory path.

One common use for this is that the index module provides an API for a library stored in the directory, while the other modules in the directory contain what's meant to be private implementation details.

This may be a little confusing because the word module is being overloaded with two meanings. In some cases, a module is a file, and in other cases, a module is a directory containing one or more file modules.

While overloading the word module this way might be a little confusing, it's going to get even more so as we consider the packages we install from other sources.

Comparing installed packages and modules

Every programming platform supports the distribution of libraries or packages that are meant to be used in a wide array of applications. For example, where the Perl community has CPAN, the Node.js community has the npm registry. An installed Node.js package has the same structure as the directory modules we just described, in that the package format is simply a directory containing a package.json file along with the code and other files comprising the package.

There is the same risk of confusion caused by overloading the word module, since an installed package is typically structured as the directory-as-module concept just described. Therefore, it's useful to refer to an installed package with the word package.

The package.json file describes the package. A minimal set of fields is defined by Node.js, as follows:

{
  "name": "some-library",
  "main": "./lib/some-library.js"
}

The name field gives the name of the package. If the main field is present, it names the JavaScript file to load instead of index.js when the package is loaded. Package manager applications like npm and Yarn support many more fields in package.json, which they use to manage dependencies, versions, and everything else.

If there is no package.json, then Node.js will look for either index.js or index.node. In such a case, require('some-library') will load the file module in /path/to/some-library/index.js.

Installed packages are kept in a directory named node_modules. When JavaScript source code has require('some-library') or import 'some-library', Node.js searches through one or more node_modules directories to find the named package.

Notice that the module identifier, in this case, is just the package name. This is different from the file and directory module identifiers we studied earlier since both those are pathnames. In this case, the module identifier is somewhat abstract, and that's because Node.js has an algorithm for finding packages within the nested structure of the node_modules directories.

To understand how that works, we need a deeper dive into the algorithm.

Finding the installed package in the file system 

One key to why the Node.js package system is so flexible is the algorithm used to search for packages.

For a given require, import(), or import statement, Node.js searches upward in the file system from the directory containing the statement. It is looking for a directory named node_modules containing a module satisfying the module identifier.

For example, with a source file named /home/david/projects/notes/foo.js and a require or import statement requesting the module identifier bar.js, Node.js tries the following options:

As just said, the search starts at the same level of the file system as foo.js. Node.js will look either for a file module named bar.js or else a directory named bar.js containing a module as described earlier in Using a Directory as a module. Node.js will check for this package in the node_modules directory next to foo.js and in every directory above that file. It will not, however, descend into any directory such as express or express/node_modules. The traversal only moves upward in the file system, not downward.

While some third-party packages have names ending in .js, the vast majority do not. Therefore, we will typically use require('bar'). Also, third-party installed packages are typically delivered as a directory containing a package.json file and some JavaScript files. Therefore, in the typical case, the package module identifier is bar, and Node.js will find a directory named bar in one of the node_modules directories and access the package from that directory.

This act of searching upward in the file system means Node.js supports nested installation of packages. A Node.js package that in turn depends on other packages will have its own node_modules directory; that is, the bar package might depend on the fred package. The package manager application might install fred as /home/david/projects/notes/node_modules/bar/node_modules/fred.

In such a case, when a JavaScript file in the bar package calls require('fred'), its search for modules starts in /home/david/projects/notes/node_modules/bar/node_modules, where it will find the fred package. But if the package manager detects that other packages used by notes also use the fred package, it will install it as /home/david/projects/notes/node_modules/fred.

Because the search algorithm traverses the file system upward, it will find fred in either location.

The last thing to note is that this nesting of node_modules directories can be arbitrarily deep. While the package manager applications try to install packages in a flat hierarchy, it may be necessary to nest them deeply.

One reason for doing so is to enable using two or more versions of the same package.

Handling multiple versions of the same installed package

The Node.js package identifier resolution algorithm allows us to install two or more versions of the same package. Returning to the hypothetical notes project, notice that the fred package is installed not just for the bar package but also for the express package. 

Looking at the algorithm, we know that require('fred') in the bar package, and in the express package, will be satisfied by the corresponding fred package installed locally to each.

Normally, package manager applications will detect the two instances of the fred package and install only one. But suppose the bar package requires fred version 1.2, while the express package requires fred version 2.1.

In such a case, the package manager application will detect the incompatibility and install two versions of the fred package, as follows:

  • In /home/david/projects/notes/node_modules/bar/node_modules, it will install fred version 1.2.
  • In /home/david/projects/notes/node_modules/express/node_modules, it will install fred version 2.1.

When the express package executes require('fred') or import 'fred', it will be satisfied by the package in /home/david/projects/notes/node_modules/express/node_modules/fred. Likewise, the bar package will be satisfied by the package in /home/david/projects/notes/node_modules/bar/node_modules/fred. In both cases, the bar and express packages have the correct version of the fred package available. Neither is aware there is another version of fred installed.

The node_modules directory is meant for packages required by an application. Node.js also supports installing packages in a global location so they can be used by multiple applications.

Searching for globally installed packages

We've already seen that with npm we can perform a global install of a package. For example, command-line tools like hexy or babel are convenient if installed globally. In such a case the package is installed in another folder outside of the project directory. Node.js has two strategies for finding globally installed packages.

Similar to the PATH variable, the NODE_PATH environment variable can be used to list additional directories in which to search for packages. On Unix-like operating systems, NODE_PATH is a colon-separated list of directories, and on Windows it is semicolon-separated. In both cases, it is similar to how the PATH variable is interpreted, meaning that NODE_PATH has a list of directory names in which to find installed modules.

The NODE_PATH approach is not recommended because of the surprising behavior that can occur if people are unaware that this variable must be set. If a module located in a directory referenced by NODE_PATH is required for proper functioning and the variable is not set, the application will likely fail. The best practice is for all dependencies to be explicitly declared, which in Node.js means listing all dependencies in the package.json file so that npm or Yarn can manage them.

This variable was implemented before the module resolution algorithm just described was finalized. Because of that algorithm, NODE_PATH is largely unnecessary. 

There are three additional locations that can hold modules:

  • $HOME/.node_modules
  • $HOME/.node_libraries
  • $PREFIX/lib/node

In this case, $HOME is what you expect (the user's home directory), and $PREFIX is the directory where Node.js is installed.

Some recommend against using global packages. The rationale is the desire for repeatability and deployability. If you've tested an app and all its code is conveniently located within a directory tree, you can copy that tree for deployment to other machines. But, what if the app depended on some other file that was magically installed elsewhere on the system? Will you remember to deploy such files? The application author might write documentation saying to install this then install that and install something-else before running npm install, but will the users of the application correctly follow all those steps? 

The best installation instruction is simply to run npm install or yarn install. For that to work, all dependencies must be listed in package.json.
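For example, a hypothetical application's package.json might declare its dependencies like this (the package names and version ranges are illustrative); running npm install or yarn install in the directory then installs everything listed:

```json
{
  "name": "notes",
  "version": "1.0.0",
  "dependencies": {
    "express": "^4.16.0",
    "cookie-parser": "^1.4.0"
  }
}
```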

Before moving forward, let's review the different kinds of module identifiers.

Reviewing module identifiers and pathnames

That was a lot of details spread out over several sections. It's useful, therefore, to quickly review how the module identifiers are interpreted when using the require, import(), or import statements:

  • Relative module identifiers: These begin with ./ or ../. The pathname is interpreted using POSIX filesystem semantics, relative to the location of the file containing the require or import statement. That is, a module identifier beginning with ./ is looked for in the current directory, whereas one starting with ../ is looked for in the parent directory.
  • Absolute module identifiers: These begin with / (or file:// for ES6 modules) and are, of course, looked for in the root of the filesystem. This is not a recommended practice.
  • Top-level module identifiers: These do not begin with those strings and are just the module name. These must be stored in a node_modules directory, and the Node.js runtime has a nicely flexible algorithm for locating the correct node_modules directory.
  • Core modules: These are the same as the top-level module identifiers, in that there is no prefix, but the core modules are prebaked into the Node.js binary.

In all cases, except for the core modules, the module identifier resolves to a file that contains the actual module, and which is loaded by Node.js. Therefore, what Node.js does is to compute the mapping between the module identifier and the actual file name to load.

Using a package manager application is not required. The Node.js module resolution algorithm does not depend on a package manager, like npm or Yarn, to set up the node_modules directories. There is nothing magical about those directories, and it is possible to use other means to construct a node_modules directory containing installed packages. But the simplest mechanism is to use a package manager application.

Some packages offer what we might call sub-packages included with the main package. Let's see how to use them.

Using deep import module specifiers

In addition to a simple module identifier like require('bar'), Node.js lets us directly access modules contained within a package. A different module specifier is used that starts with the module name, adding what's called a deep import path. For a concrete example, let's look at the mime module (https://www.npmjs.com/package/mime), which handles mapping a file name to its corresponding MIME type.

In the normal case, you use require('mime') to use the package. However, the authors of this package developed a lite version of this package that leaves out a lot of vendor-specific MIME types. For that version, you use require('mime/lite') instead. And of course, in an ES6 module, you use import 'mime' and import 'mime/lite', as appropriate.

The specifier mime/lite is an example of a deep import module specifier.

With such a module identifier, Node.js first locates the node_modules directory containing the main package; in this case, that is the mime package. By default, the deep import module is simply a pathname relative to the package directory, for example, /path/to/node_modules/mime/lite. Going by the rules we've already examined, it will be satisfied by a file named lite.js or by a directory named lite containing a file named index.js or index.mjs.

But it is possible to override the default behavior and have the deep import specifier refer to a different file within the module.

Overriding a deep import module identifier

The deep import module identifier used by code using the package does not have to be the pathname used within the package source. We can put declarations in package.json describing the actual pathname for each deep import identifier. For example, a package with interior modules named ./src/cjs-module.js and ./src/es6-module.mjs can be remapped with this declaration in package.json:

{
  "exports": {
    "./cjsmodule": "./src/cjs-module.js",
    "./es6module": "./src/es6-module.mjs"
  }
}

With this, code using such a package can load the inner module using require('module-name/cjsmodule') or import 'module-name/es6module'. Notice that the filenames do not have to match what's exported.

In a package.json file using this exports feature, a request for an inner module not listed in exports will fail. Supposing the package has a ./src/hidden-module.js file, calling require('module-name/src/hidden-module.js') will fail.

All these modules and packages are meant to be used in the context of a Node.js project. Let's take a brief look at a typical project.

Studying an example project directory structure

A typical Node.js project is a directory containing a package.json file declaring the characteristics of the package, especially its dependencies. That, of course, describes a directory module, meaning that each project is itself a module. At the end of the day, we create applications, for example, an Express application, and these applications depend on one or more (possibly thousands of) packages that are to be installed:

This is an Express application (we'll start using Express in Chapter 5, Your First Express Application) containing a few modules installed in the node_modules directory. A typical Express application uses app.js as the main module for the application, and has code and asset files distributed in the public, routes, and views directories. Of course, the project dependencies are installed in the node_modules directory.

But let's focus on the content of the node_modules directory versus the actual project files. In this screenshot, we've selected the express package. Notice that it has a package.json file and an index.js file. With those two files, Node.js will recognize the express directory as a module, and calling require('express') or import 'express' will be satisfied by this directory.

The express directory has its own node_modules directory, in which are installed two packages. The question is, why are those packages installed in express/node_modules rather than as a sibling of the express package?

Earlier we discussed what happens if two modules (modules A and B) list a dependency on different versions of the same module (C). In such a case, the package manager application will install two versions of C, one as A/node_modules/C and the other as B/node_modules/C. The two copies of C are thus located such that the module search algorithm will cause module A and module B to have the correct version of module C.

That's the situation we see with express/node_modules/cookie. To verify this, we can use an npm command to query for all references to the module:

$ npm ls cookie
notes@0.0.0 /Users/David/chap05/notes
├─┬ cookie-parser@1.3.5
│ └── cookie@0.1.3
└─┬ express@4.13.4
  └── cookie@0.1.5

This says the cookie-parser module depends on version 0.1.3 of cookie, while Express depends on version 0.1.5.

Now that we can recognize what a module is and how they're found in the file system, let's discuss when we can use each of the methods to load modules.

Loading modules using require, import, and import()

Obviously require is used in CommonJS modules, and import is used in ES6 modules, but there are some details to go over. We've already discussed the format and filename differences between CommonJS and ES6 modules, so let's focus here on loading the modules.

The require function is only available in CommonJS modules, and it is used for loading a CommonJS module. The module is loaded synchronously, meaning that when the require function returns, the module is completely loaded.

By default, a CommonJS module cannot load an ES6 module. But as we saw with the simple-dynamic-import.js example, a CommonJS module can load an ES6 module using import(). Since the import() function is an asynchronous operation, it returns a Promise, and we, therefore, cannot use the resulting module as a top-level object. But we can use it inside a function:

module.exports.usesES6module = async function() {
  const es6module = await import('./es6-module.mjs');
  return es6module.functionCall();
};

And at the top-level of a Node.js script, the best we can do is the following:

import('./simple2.mjs')
  .then(simple2 => {
    console.log(simple2.hello());
    console.log(simple2.next());
    console.log(simple2.next());
    console.log(`count = ${simple2.default()}`);
    console.log(`Meaning: ${simple2.meaning}`);
  })
  .catch(err => {
    console.error(err);
  });

It's the same as the simple-dynamic-import.js example, but we are explicitly handling the Promise returned by import() rather than using an async function. While we could assign simple2 to a global variable, other code using that variable would have to accommodate the possibility the assignment hasn't yet been made.

The module object provided by import() contains the fields and functions exported with the export statements in the ES6 module. As we see here, the default export has the default name.

In other words, using an ES6 module in a CommonJS module is possible, so long as we accommodate waiting for the module to finish loading before using it.

The import statement is used to load ES6 modules, and it only works inside an ES6 module. The module specifier you hand to the import statement is interpreted as a URL.

An ES6 module can have multiple named exports. In the simple2.mjs we used earlier, these are the functions next, squared, and hello, and the values meaning and nocount. ES6 modules can have a single default export, as we saw in simple2.mjs.

With simpledemo2.mjs, we saw that we can import only the required things from the module:

import { default as simple, hello, next } from './simple2.mjs';

In this case, we use the exports as just the name, without referring to the module: simple(), hello(), and next().

It is possible to import just the default export:

import simple from './simple2.mjs';

In this case, we can invoke the function as simple(). We can also use what's called a namespace import, which is similar to how we import CommonJS modules:

import * as simple from './simple2.mjs';

console.log(simple.hello());
console.log(simple.next());
console.log(simple.next());
console.log(simple.default());
console.log(simple.meaning);

In this case, each property exported from the module is a property of the named object in the import statement. 

An ES6 module can also use import to load a CommonJS module. Loading the simple.js module we used earlier is accomplished as follows:

import simple from './simple.js';
console.log(simple.next());
console.log(simple.next());
console.log(simple.hello());

This is similar to the default export method shown for ES6 modules, and we can think of the module.exports object inside the CommonJS module as the default export. Indeed, the import can be rewritten as follows:

import { default as simple } from './simple.js';

This demonstrates that the CommonJS module.exports object is surfaced as default when imported.

We've learned a lot about using modules in Node.js. This included the different types of modules, and how to find them in the file system. Our next step is to learn about package management applications and the npm package repository.

Using npm – the Node.js package management system

As described in Chapter 2, Setting up Node.js, npm is a package management and distribution system for Node.js. It has become the de facto standard for distributing modules (packages) for use with Node.js. Conceptually, it's similar to tools such as apt-get (Debian), rpm/yum (Red Hat/Fedora), MacPorts/Homebrew (macOS), CPAN (Perl), or PEAR (PHP). Its purpose is publishing and distributing Node.js packages over the internet using a simple command-line interface. In recent years, it has also become widely used for distributing front-end libraries like jQuery and Bootstrap that are not Node.js modules. With npm, you can quickly find packages to serve specific purposes, download them, install them, and manage packages you've already installed.

The npm application builds on the package format for Node.js, which in turn is largely based on the CommonJS package specification. It uses the same package.json file that's supported natively by Node.js, but with additional fields for additional functionality.

The npm package format

An npm package is a directory structure with a package.json file describing the package. This is exactly what was referred to earlier as a directory module, except that npm recognizes many more package.json tags than Node.js does. The starting point for npm's package.json file is the CommonJS Packages/1.0 specification. The documentation for the npm package.json implementation is accessed using the following command:

$ npm help package.json

A basic package.json file is as follows:

{
  "name": "packageName",
  "version": "1.0",
  "main": "mainModuleName",
  "bin": "./path/to/program"
}

npm recognizes many more fields than this, and we'll go over some of them in the coming sections. The file is in JSON format, which, as a JavaScript programmer, you should be familiar with.

There is a lot to cover concerning the npm package.json format, and we'll do so over the following sections.

Accessing npm helpful documentation

The main npm command has a long list of subcommands for specific package management operations. These cover every aspect of the life cycle of publishing packages (as a package author), and downloading, using, or removing packages (as an npm consumer).

You can view the list of these commands just by typing npm (with no arguments). If you see one you want to learn more about, view the help information:

$ npm help <command>

The help text will be shown on your screen.

Help information is also available on the npm website at: https://docs.npmjs.com/cli-documentation/.

Before we can look for and install Node.js packages, we must have a project directory initialized.

Initializing a Node.js package or project with npm init

The npm tool makes it easy to initialize a Node.js project directory. Such a directory contains at the minimum a package.json file and one or more Node.js JavaScript files. 

All Node.js project directories are therefore modules, going by the definition we learned earlier. However, in many cases, a Node.js project is not meant to export any functionality but instead is an application. Such a project will likely require other Node.js packages, and those packages will be declared in the package.json file so that they're easy to install using npm. The other common use case of a Node.js project is a package of functionality meant to be used by other Node.js packages or applications. These also consist of a package.json file plus one or more Node.js JavaScript files, but in this case, they're Node.js modules that export functions and can be loaded using require, import(), or import.

What this means is the key to initializing a Node.js project directory is creating the package.json file.

While the package.json file can be created by hand – it's just a JSON file, after all – the npm tool provides a convenient method:

$ mkdir example-package
$ cd example-package/
$ npm init
This utility will walk you through creating a package.json file.
It only covers the most common items, and tries to guess sensible defaults.

See `npm help json` for definitive documentation on these fields
and exactly what they do.

Use `npm install <pkg>` afterwards to install a package and
save it as a dependency in the package.json file.

Press ^C at any time to quit.
package name: (example-package)
version: (1.0.0)
description: This is an example of initializing a Node.js project
entry point: (index.js)
test command: mocha
git repository:
keywords: example, package
author: David Herron <[email protected]>
license: (ISC)
About to write to /home/david/example-package/package.json:

{
"name": "example-package",
"version": "1.0.0",
"description": "This is an example of initializing a Node.js project",
"main": "index.js",
"scripts": {
"test": "mocha"
},
"keywords": [
"example",
"package"
],
"author": "David Herron <[email protected]>",
"license": "ISC"
}

Is this OK? (yes) yes

In a blank directory, run npm init, answer the questions, and as quick as that you have the starting point for a Node.js project.

This is, of course, a starting point, and as you write the code for your project it will often be necessary to use other packages.

Finding npm packages

By default, npm packages are retrieved over the internet from the public package registry maintained on http://npmjs.com. If you know the module name, it can be installed simply by typing the following:

$ npm install moduleName  

But what if you don't know the module name? How do you discover the interesting modules? The website http://npmjs.com publishes a searchable index of the modules in the registry. The npm package also has a command-line search function to consult the same index:

Of course, upon finding a module, it's installed as follows:

$ npm install acoustid  

The npm repository uses a few package.json fields to aid in finding packages.

The package.json fields that help finding packages

For a package to be easily found in the npm repository requires a good package name, package description, and keywords. The npm search function scans those package attributes and presents them in search results.

The relevant package.json fields are as follows:

{ ...
"description": "My wonderful package that walks dogs",
"homepage": "http://npm.dogs.org/dogwalker/",
"author": "[email protected]",
"keywords": [ "dogs", "dog walking" ]
... }

The npm view command shows us information from package.json file for a given package, and with the --json flag we're shown the raw JSON.

The name tag is of course the package name, and it is used in URLs and command names, so choose one that's safe for both. If you desire to publish a package in the public npm repository, it's helpful to check whether a particular name is already being used by searching on https://npmjs.com or by using the npm search command.

The description tag is a short description that's meant as a brief/terse description of the package. 

It is the name and description tags that are shown in npm search results.

The keywords tag is where we list attributes of the package. The npm website contains pages listing all packages using a particular keyword. These keyword indexes are useful when searching for a package since it lists the related packages in one place, and therefore when publishing a package it's useful to land on the correct keyword pages.

Another source is the contents of the README.md file. This file should be added to the package to provide basic package documentation. This file is shown on the package page on npmjs.com, and therefore it is important for this file to convince potential users of your package to actually use it. As the file name implies, this is a Markdown file.

Once you have found a package to use, you must install it in order to use the package.

Installing an npm package

The npm install command makes it easy to install packages upon finding one of your dreams, as follows:

$ npm install express
/home/david/projects/notes/
- [email protected]
...  

The named module is installed in node_modules in the current directory. During the installation process, the package is set up. This includes installing any packages it depends on and running the preinstall and postinstall scripts. Of course, installing the dependent packages also involves the same installation process of installing dependencies and executing pre-install and post-install scripts. 

Some packages in the npm repository have a package scope prepended to the package name. The package name in such cases is presented as @scope-name/package-name, or, for example, @akashacms/plugins-footnotes. In such a package, the name field in package.json contains the full package name with its @scope.

We'll discuss dependencies and scripts later. In the meantime, we notice that a version number was printed in the output, so let's discuss package version numbers.

Installing a package by version number

Version number matching in npm is powerful and flexible. With it, we can target a specific release of a given package or any version number range. By default, npm installs the latest version of the named package, as we did in the previous section. Whether you take the default or specify a version number, npm will determine what to install.

The package version is declared in the package.json file, so let's look at the relevant fields:

{ ...
"version": "1.2.1",
"dist-tags": {
"latest": "1.2.1"
},
... }

The version field obviously declares the current package version. The dist-tags field lists symbolic tags that the package maintainer can use to aid their users in selecting the correct version. This field is maintained by the npm dist-tag command.

The npm install command supports these variants:

$ npm install package-name@tag
$ npm install package-name@version
$ npm install package-name@version-range

The last two are what they sound like. You can specify [email protected] to target a precise version, or express@">4.1.0 < 5.0" to target a range of Express V4 versions. We might use that specific expression because Express 5.0 might include breaking changes.

The version match specifiers include the following choices:

  • Exact version match: 1.2.3
  • At least version N: >1.2.3
  • Up to version N: <1.2.3
  • Between two releases: >=1.2.3 <1.3.0

The @tag attribute is a symbolic name such as @latest@stable, or @canary. The package owner assigns these symbolic names to specific version numbers and can reassign them as desired. The exception is @latest, which is updated whenever a new release of the package is published.

For more documentation, run these commands: npm help json and npm help npm-dist-tag.

In selecting the correct package to use, sometimes we want to use packages that are not in the npm repository.

Installing packages from outside the npm repository

As awesome as the npm repository is, we don't want to push everything we do through their service. This is especially true for internal development teams who cannot publish their code for all the world to see. Fortunately, Node.js packages can be installed from other locations. Details about this are in npm help package.json in the dependencies section. Some examples are as follows:

  • URL: You can specify any URL that downloads a tarball, that is, a .tar.gz file. For example, GitHub or GitLab repositories can easily export a tarball URL. Simply go to the Releases tab to find them.
  • Git URL: Similarly, any Git repository can be accessed with the right URL, for example:
$ npm install git+ssh://user@hostname:project.git#git-tag
  • GitHub shortcut: For GitHub repositories, you can list just the repository specifier, such as expressjs/express. A tag or a commit can be referenced using expressjs/express#tag-name.
  • GitLab, BitBucket, and GitHub URL shortcuts: In addition to the GitHub shortcut, npm supports a special URL format for specific Git services with URLs like github:user/repo, bitbucket:user/repo, and gitlab:user/repo.
  • Local filesystem: You can install from a local directory using a URL with the: file:../../path/to/dir.

Sometimes we need to install a package for use by several projects, without requiring that each project installs the package.

Global package installs

In some instances, you want to install a module globally, so that it can be used from any directory. For example, the Grunt or Babel build tools are widely useful, and conceivably you will find it useful if these tools are installed globally. Simply add the -g option:

$ npm install -g grunt-cli  

If you get an error, and you're on a Unix-like system (Linux/Mac), you may need to run this with sudo:

$ sudo npm install -g grunt-cli

This variant, of course, runs npm install with elevated permissions.

The npm website offers a guideline with more information at https://docs.npmjs.com/resolving-eacces-permissions-errors-when-installing-packages-globally.

If a local package install lands in node_modules, where does a global package install land? On a Unix-like system, it lands in PREFIX/lib/node_modules, and on Windows, it lands in PREFIX/node_modules. In this case, PREFIX means the directory where Node.js is installed. You can inspect the location of the directory as follows:

$ npm config get prefix
/opt/local

The algorithm used by Node.js for the require function automatically searches the directory for packages if the package is not found elsewhere.

ES6 modules do not support global packages.

Many believe it is not a good idea to install packages globally, which we will look at next.

Avoiding global module installation

Some in the Node.js community now frown on installing packages globally. One rationale is that a software project is more reliable if all its dependencies are explicitly declared. If a build tool such as Grunt is required but is not explicitly declared in package.json, the users of the application would have to receive instructions to install Grunt, and they would have to follow those instructions. 

Users being users, they might skip over the instructions, fail to install the dependency, and then complain the application doesn't work. Surely, most of us have done that once or twice.

It's recommended to avoid this potential problem by installing everything locally via one mechanism—the npm install command.

There are two strategies we use to avoid using globally installed Node.js packages. For the packages that install commands, we can configure the PATH variable, or use npx to run the command. In some cases, a package is used only during development and can be declared as such in package.json.

Maintaining package dependencies with npm

The npm install command by itself, with no package name specified, installs the packages listed in the dependencies section of package.json. Likewise, the npm update command compares the installed packages against the dependencies and against what's available in the npm repository and updates any package that is out of date in regards to the repository. 

These two commands make it easy and convenient to set up a project, and to keep it up to date as dependencies are updated. The package author simply lists all the dependencies, and npm installs or updates the dependencies required for using the package. What happens is npm looks in package.json for the dependencies or devDependencies fields, and it works out what to do from there.

You can manage the dependencies manually by editing package.json. Or you can use npm to assist you with editing the dependencies. You can add a new dependency like so:

$ npm install akasharender --save  

With the --save flag, npm will add a dependencies tag to package.json:

"dependencies": { 
    "akasharender": "^0.7.8" 
} 

With the added dependency, when your application is installed, npm will now install the package along with any other dependencies listed in package.json file.

The devDependencies lists modules used during development and testing. The field is initialized the same as the preceding one, but with the --save-dev flag. The devDependencies can be used to avoid some cases where one might instead perform a global package install.

By default, when npm install is run, modules listed in both dependencies and devDependencies are installed. Of course, the purpose of having two dependency lists is to control when each set of dependencies is installed.

$ npm install --production  

This installs the "production" version, which means to install only the modules listed in dependencies and none of the devDependencies modules. For example, if we use a build tool like Babel in development, the tool should not be installed in production.

While we can manually maintain dependencies in package.json, npm can handle this for us.

Automatically updating package.json dependencies

With npm@5 (also known as npm version 5), one change was that it's no longer required to add --save to the npm install command. Instead, npm by default acts as if you ran the command with --save, and will automatically add the dependency to package.json. This is meant to simplify using npm, and it is arguably more convenient that npm now does this. At the same time, it can be very surprising and inconvenient for npm to go ahead and modify package.json for you. The behavior can be disabled by using the --no-save flag, or it can be permanently disabled using the following:

$ npm config set save false

The npm config command supports a long list of settable options for tuning the behavior of npm. See npm help config for the documentation and npm help 7 config for the list of options.

Now let's talk about the one big use for package dependencies: to fix or avoid bugs.

Fixing bugs by updating package dependencies

Bugs exist in every piece of software. An update to the Node.js platform may break an existing package, as might an upgrade to packages used by the application. Your application may trigger a bug in a package it uses. In these and other cases, fixing the problem might be as simple as updating a package dependency to a later (or earlier) version.

First, identify whether the problem exists in the package or in your code. After determining it's a problem in another package, investigate whether the package maintainers have already fixed the bug. Is the package hosted on GitHub or another service with a public issue queue? Look for an open issue on this problem. That investigation will tell you whether to update the package dependency to a later version. Sometimes, it will tell you to revert to an earlier version; for example, if the package maintainer introduced a bug that doesn't exist in an earlier version.

Sometimes, you will find that the package maintainers are unprepared to issue a new release. In such a case, you can fork their repository and create a patched version of their package. In such a case, your package might use a Github URL referencing your patched package.

One approach to fixing this problem is pinning the package version number to one that's known to work. You might know that version 6.1.2 was the last release against which your application functioned and that starting with version 6.2.0 your application breaks. Hence, in package.json:

"dependencies": {
"module1": "6.1.2"
}

This freezes your dependency on the specific version number. You're free, then, to take your time updating your code to work against later releases of the module. Once your code is updated, or the upstream project is updated, change the dependency appropriately.

When listing dependencies in package.json, it's tempting to be lazy, but that leads to trouble.

Explicitly specifying package dependency version numbers

As we've said several times in this chapter, explicitly declaring your dependencies is A Good Thing. We've already touched on this, but it's worth reiterating and to see how npm makes this easy to accomplish.

The first step is ensuring that your application code is checked into a source code repository. You probably already know this, and even have the best of intentions to ensure that everything is checked in. With Node.js, each module should have its own repository rather than putting every single last piece of code in one repository.

Each module can then progress on its own timeline. A breakage in one module is easy to back out by changing the version dependency in package.json.

The next step is to explicitly declare all dependencies of every module. The goal is simplifying and automating the process of setting up every module. Ideally, on the Node.js platform, the module setup is as simple as running npm install.

Any additional required steps can be forgotten or executed incorrectly. An automated setup process eliminates several kinds of potential mistakes.

With the dependencies and devDependencies sections of package.json, we can explicitly declare not only the dependencies but the precise version numbers.

The lazy way of declaring dependencies is putting * in the version field. That uses the latest version in the npm repository. This will seem to work, until that one day the maintainers of that package introduce a bug. You'll type npm update, and all of a sudden your code doesn't work. You'll head over to the GitHub site for the package, look in the issue queue, and possibly see that others have already reported the problem you're seeing. Some of them will say that they've pinned on the previous release until this bug is fixed. What that means is their package.json file does not depend on * for the latest version, but on a specific version number before the bug was created.

Don't do the lazy thing, do the smart thing.

The other aspect of explicitly declaring dependencies is to not implicitly depend on global packages. Earlier, we said that some people in the Node.js community caution against installing modules in the global directories. This might seem like an easy shortcut to sharing code between applications. Just install it globally, and you don't have to install the code in each application.

But, doesn't that make deployment harder? Will the new team member be instructed on all the special files to install here and there to make the application run? Will you remember to install that global module on all destination machines?

For Node.js, that means listing all the module dependencies in package.json, and then the installation instructions are simply npm install, followed perhaps by editing a configuration file.

While most packages in the npm repository are libraries with an API, some are tools we can run from the command line.

Packages that install commands

Some packages install command-line programs. A side effect of installing such packages is a new command that you can type at the shell prompt or use in shell scripts. An example is the hexy program that we briefly used in Chapter 2Setting Up Node.js. Another example is the widely used Grunt or Babel build tools.

The recommendation to explicitly declare all dependencies in package.json applies to command-line tools as well as any other package. Therefore these packages will typically be installed locally. This requires special care in setting up the PATH environment variable correctly. As you should already be aware, the PATH variable is used on both Unix-like systems and Windows to list the directories in which the command-line shell searches for commands.

The command can be installed to one of two places:

  • Global install: It is installed either to a directory, such as /usr/local, or to the bin directory where Node.js was installed. The npm bin -g command tells you the absolute pathname for this directory. In this case, it's unlikely you'll have to modify the PATH environment variable.
  • Local install: Installs to node_modules/.bin in the package where the module is being installed, the npm bin command tells you the absolute pathname for the directory. Because the directory is inconveniently located to run commands, a change to the PATH variable is useful.

To run the command, simply type the command name at a shell prompt. This works correctly if the directory where the command is installed happens to be in the PATH variable. Let's look at how to configure the PATH variable to handle locally installed commands.

Configuring the PATH variable to handle locally installed commands

Assume we have installed the hexy command like so:

$ npm install hexy

As a local install, this creates a command as node_modules/.bin/hexy. We can attempt to use it as follows:

$ hexy package.json 
-bash: hexy: command not found

But this breaks because the command is not in a directory listed in the PATH. The workaround is to use the full pathname or relative pathname:

$ ./node_modules/.bin/hexy package.json
... hexy output

But obviously typing the full or partial pathname is not a user-friendly way to execute the command. We want to use the commands installed by modules, and we want a simple process for doing so. This means, we must add an appropriate value in the PATH variable, but what is it?

For global package installations, the executable lands in a directory that is probably already in your PATH variable, like /usr/bin or /usr/local/bin. Local package installations require special handling. The full path for the node_modules/.bin directory varies for each project, and obviously it won't work to add the full path for every node_modules/.bin directory to your PATH.

Adding ./node_modules/.bin to the PATH variable (or, on Windows, . ode_modules.bin) works great. Any time your shell is in the root of a Node.js project, it will automatically find locally installed commands from Node.js packages.

How we do this depends on the command shell you use and your operating system.

On a Unix-like system, the command shells are bash and csh. Your PATH variable would be set up in one of these ways:

$ export PATH=./node_modules/.bin:${PATH}     # bash
$ setenv PATH ./node_modules/.bin:${PATH} # csh

The next step is adding the command to your login scripts so the variable is always set. On bash, add the corresponding line to ~/.bashrc, and on csh add it to ~/.cshrc.

Once this is accomplished the command-line tool executes correctly.

Configuring the PATH variable on Windows

On Windows, this task is handled through a system-wide settings panel:

This pane of the System Properties panel is found by searching for PATH in the Windows Settings screen. Click on the Environment Variables button, then select the Path variable, and finally click on the Edit button. On the screen here, click the New button to add an entry to this variable, and enter . ode_modules.bin as shown. You'll have to restart any open command shell windows. Once you do, the effect will be as shown previously.

As easy as it is to modify the PATH variable, we don't want to do this in all circumstances.

Avoiding modifications to the PATH variable

What if you don't want to add these variables to your PATH at all times? The npm-path module may be of interest. This is a small program that computes the correct PATH variable for your shell and operating system. See the package at https://www.npmjs.com/package/npm-path.

Another option is to use the npx command to execute such commands. This tool is automatically installed alongside the npm command. This command either executes commands from a locally installed package or it silently installs commands in a global cache:

$ npx hexy package.json

Using npx is this easy.

Of course, once you've installed some packages, they'll go out of date and need to be updated.

Updating packages you've installed when they're outdated

The coder codes, updating their package, leaving you in the dust unless you keep up.

To find out whether your installed packages are out of date, use the following command:

$ npm outdated  

The report shows the current npm packages, the currently installed version, as well as the current version in the npm repository. Updating outdated packages is very simple:

$ npm update express
$ npm update 

Specifying a package name updates just the named package. Otherwise, it updates every package that would be printed by npm outdated.

Npm handles more than package management, it has a decent built-in task automation system.

Automating tasks with scripts in package.json

The npm command handles not just installing packages, it can also be used to automate running tasks related to the project. In package.json, we can add a field, scripts, containing one or more command strings. Originally scripts were meant to handle tasks related to installing an application, such as compiling native code, but they can be used for much more. For example, you might have a deployment task using rsync to copy files to a server. In package.json, you can add this:

{ ...
"scripts: {
"deploy": "rsync --archive --delete local-dir user@host:/path/to/dest-dir
}
... }

What's important here is that we can add any script we like, and the scripts entry records the command to run:

$ npm run deploy

Once it has been recorded in scripts, running the command is this easy.

There is a long list of "lifecycle events" for which npm has defined script names. These include the following:

  • install, for when the package is installed
  • uninstall, for when it is uninstalled
  • test, for running a test suite
  • start and stop, for controlling a server defined by the package

Package authors are free to define any other script they like. 

For the full list of predefined script names, see the documentation: https://docs.npmjs.com/misc/scripts

Npm also defines a pattern for scripts that run before or after another script, namely to prepend pre or post to the script name. Therefore the pretest script runs before the test script, and the posttest script runs afterward.

A practical example is to run a test script in a prepublish script to ensure the package is tested before publishing it to the npm repository:

{
"scripts": {
"test": "cd test && mocha",
"prepublish": "npm run test"
}
}

With this combination, if the test author types npm publish, the prepublish script will cause the test script to run, which in turn uses mocha to run the test suite.

It is a well-known best practice to automate all administrative tasks, if only so that you never forget how to run those tasks. Creating the scripts entries for every such task not only prevents you from forgetting how to do things, but it also documents the administrative tasks for the benefit of others.

Next, let's talk about how to ensure the Node.js platform on which a package is executed supports the required features.

Declaring Node.js version compatibility

It's important that your Node.js software runs on the correct version of Node.js. The primary reason being that the Node.js platform features required by your package are available every time your package is run. Therefore, the package author must know which Node.js releases are compatible with the package, and then describe in package.json that compatibility.

This dependency is declared in package.json using the engines tag:

"engines": { 
    "node": ">= 8.x <=10.x" 
} 

The version string is similar to what we can use in dependencies and devDependencies. In this case, we've defined that the package is compatible with Node.js 8.x, 9.x, and 10.x. 

Now that we know how to construct a package, let's talk about publishing packages.

Publishing an npm package

All those packages in the npm repository came from people like you with an idea of a better way of doing something. It is very easy to get started with publishing packages.

Online docs about publishing packages can be found at https://docs.npmjs.com/getting-started/publishing-npm-packages.

Also consider this: https://xkcd.com/927/.

You first use the npm adduser command to register yourself with the npm repository. You can also sign up with the website. Next, you log in using the npm login command.

Finally, while sitting in the package root directory, use the npm publish command. Then, stand back so that you don't get stampeded by the crush of thronging fans, or, maybe not. There are several zillion packages in the repository, with hundreds of packages added every day. To get yours to stand out, you will require some marketing skill, which is another topic beyond the scope of this book.

It is suggested that your first package be a scoped package, for example, @my-user-name/my-great-package

We've learned a lot in this section about using npm to manage and publish packages. But npm is not the only game in town for managing Node.js packages.

The Yarn package management system

As powerful as npm is, it is not the only package management system for Node.js. Because the Node.js core team does not dictate a package management system, the Node.js community is free to roll up their sleeves and develop any system they feel best. That the vast majority of us use npm is a testament to its value and usefulness. But, there is a significant competitor.

Yarn (see https://yarnpkg.com/en/) is a collaboration between engineers at Facebook, Google, and several other companies. They proclaim that Yarn is ultrafast, ultra-secure (by using checksums of everything), and ultrareliable (by using a yarn-lock.json file to record precise dependencies).

Instead of running their own package repository, Yarn runs on top of the npm package repository at npmjs.com. This means that the Node.js community is not forked by Yarn, but enhanced by having an improved package management tool.

The npm team responded to Yarn in npm@5 (also known as npm version 5) by improving performance and by introducing a package-lock.json file to improve reliability. The npm team has implemented additional improvements in npm@6. 

Yarn has become very popular and is widely recommended over npm. They perform extremely similar functions, and the performance is not that different from npm@5. The command-line options are worded differently. Everything we've discussed for npm is also supported by Yarn, albeit with slightly different command syntax. An important benefit Yarn brings to the Node.js community is that the competition between Yarn and npm seems to be breeding faster advances in Node.js package management overall.

To get you started, these are the most important commands:

  • yarn add: Adds a package to use in your current package
  • yarn init: Initializes the development of a package
  • yarn install: Installs all the dependencies defined in a package.json file
  • yarn publish: Publishes a package to a package manager
  • yarn remove: Removes an unused package from your current package

Running yarn by itself does the yarn install behavior. There are several other commands in Yarn, and yarn help will list them all.

Summary

You learned a lot in this chapter about modules and packages for Node.js. Specifically, we covered implementing modules and packages for Node.js, the different module structures we can use, the difference between CommonJS and ES6 modules, managing installed modules and packages, how Node.js locates modules, the different types of modules and packages, how and why to declare dependencies on specific package versions, how to find third-party packages, and we gained a good grounding in using npm or Yarn to manage the packages we use and to publish our own packages.

Now that you've learned about modules and packages, we're ready to use them to build applications, which we'll look at in the next chapter.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
18.191.223.123