© Vasan Subramanian 2019
Vasan Subramanian, Pro MERN Stack, https://doi.org/10.1007/978-1-4842-4391-6_6

6. MongoDB

Vasan Subramanian, Bangalore, Karnataka, India

In this chapter, I’ll take up MongoDB, the database layer and the M in the MERN stack. Until now, we had an array of issues in the Express server’s memory that we used as the database. We’ll replace this with real persistence and read and write the list of issues from a MongoDB database.

To achieve this, we’ll need to install MongoDB (or use it on the cloud), get used to its shell commands, install a Node.js driver to access it from Node.js, and finally modify the server code so that the API calls read and write from a MongoDB database instead of the in-memory array of issues.

MongoDB Basics

This is an introductory section, where we will not be modifying the application. We’ll look at these core concepts in this section: MongoDB, documents, and collections. Then, we’ll set up MongoDB and explore these concepts with examples using the mongo shell to read and write to the database.

Documents

MongoDB is a document database, which means that the equivalent of a record is a document, or an object. In a relational database, you organize data in terms of rows and columns, whereas in a document database, an entire object can be written as a document.

For simple objects, this may seem no different from a relational database. But let’s say you have objects with nested objects (called embedded documents) and arrays. Now, when using a relational database, this will typically need multiple tables. For example, in a relational database, an Invoice object may be stored in a combination of an invoice table (to store the invoice details such as the customer address and delivery details) and an invoice_items table (to store the details of each item that is part of the shipment). In MongoDB, the entire Invoice object would be stored as one document. That’s because a document can contain arrays and other objects in a nested manner and the contained objects don’t have to be separated out into other documents.

A document is a data structure composed of field and value pairs. The values of fields may include objects, arrays, and arrays of objects and so on, as deeply nested as you want it to be. MongoDB documents are similar to JSON objects, so it is easy to think of them as JavaScript objects. Compared to a JSON object, a MongoDB document has support not only for the primitive data types—Boolean, numbers, and strings—but also other common data types such as dates, timestamps, regular expressions, and binary data.

An Invoice object may look like this:
{
  "invoiceNumber" : 1234,
  "invoiceDate" : ISODate("2018-10-12T05:17:15.737Z"),
  "billingAddress" : {
    "name" : "Acme Inc.",
    "line1" : "106 High Street",
    "city" : "New York City",
    "zip" : "110001-1234"
  },
  "items" : [
    {
      "description" : "Compact Flourescent Lamp",
      "quantity" : 4,
      "price" : 12.48
    },
    {
      "description" : "Whiteboard",
      "quantity" : 1,
      "price" : 5.44
    }
  ]
}

In this document, there are numbers, strings, and a date data type. Further, there is a nested object (billingAddress) and an array of objects (items).

Collections

A collection is like a table in a relational database: it is a set of documents. Just like in a relational database, the collection can have a primary key and indexes. But there are a few differences compared to a relational database.

A primary key is mandated in MongoDB, and it has the reserved field name _id. Even if the _id field is not supplied when creating a document, MongoDB creates this field and auto-generates a unique key for every document. More often than not, the auto-generated ID can be used as is, since it is convenient and guaranteed to produce unique keys even when multiple clients are writing to the database simultaneously. MongoDB uses a special data type called ObjectId for the primary key.
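
As an aside, an ObjectId is not opaque: per the MongoDB documentation, its first 4 bytes (the first 8 hexadecimal characters) encode the document's creation time as seconds since the epoch. A minimal JavaScript sketch, using a hypothetical sample ObjectId, shows how the timestamp can be recovered:

```javascript
// The first 8 hex characters of an ObjectId encode its creation time
// in seconds since the epoch. The sample value below is hypothetical.
const objectIdHex = '5bbc487a69d13abf04edf857';
const seconds = parseInt(objectIdHex.substring(0, 8), 16);
const createdAt = new Date(seconds * 1000);
console.log(createdAt.toISOString()); // prints 2018-10-09T06:19:38.000Z for this sample
```

This is why sorting on _id roughly corresponds to sorting by creation time when the IDs are auto-generated.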

The _id field is automatically indexed. Apart from this, indexes can be created on other fields, and this includes fields within embedded documents and array fields. Indexes are used to efficiently access a subset of documents in a collection.

Unlike a relational database, MongoDB does not require you to define a schema for a collection. The only requirement is that all documents in a collection must have a unique _id, but the actual documents may have completely different fields. In practice, though, all documents in a collection do have the same fields. Although a flexible schema may seem very convenient for schema changes during the initial stages of an application, this can cause problems if some kind of schema checking is not added in the application code.

As of version 3.6, MongoDB supports the concept of a schema, though it is optional. You can read all about MongoDB schemas at https://docs.mongodb.com/manual/core/schema-validation/index.html . A schema can enforce allowed and required fields and their data types, just like GraphQL can. But it can also validate other things, like string lengths and minimum and maximum values for integers.

But as of version 3.6, the errors generated because of schema violations do not give enough detail about which of the validation checks failed. This may improve in future versions of MongoDB, at which point it will be worth considering full-fledged schema checks. For the Issue Tracker application, we’ll not use the schema validation feature of MongoDB; instead, we’ll implement all necessary validations in the back-end code.

Databases

A database is a logical grouping of many collections. Since there are no foreign keys like in a SQL database, the concept of a database is nothing but a logical partitioning namespace. Most database operations read or write from a single collection, but $lookup, a stage in an aggregation pipeline, is equivalent to a join in SQL databases. This stage can only combine documents from collections within the same database.
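
As a sketch of what such a join looks like, here is the shape of a $lookup stage (the collection and field names here are hypothetical, continuing the earlier invoice example):

```javascript
// A $lookup aggregation stage, MongoDB's equivalent of a SQL join.
// It matches documents in the 'from' collection (which must be in the
// same database) against the input documents.
const lookupStage = {
  $lookup: {
    from: 'invoice_items',         // the other collection to join with
    localField: 'invoiceNumber',   // field in the input documents
    foreignField: 'invoiceNumber', // field in the 'from' collection
    as: 'items',                   // name of the output array field
  },
};
// In the mongo shell, this stage would be used as:
//   db.invoices.aggregate([ lookupStage ])
```

We will not need $lookup for the Issue Tracker application, but it is useful to know that joins are possible when denormalization is not an option.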

Further, taking backups and other administrative tasks work on the database as a unit. A database connection is restricted to accessing only one database, so to access multiple databases, multiple connections are required. Thus, it is useful to keep all the collections of an application in one database, though a database server can host multiple databases.

Query Language

Unlike the universal English-like SQL in a relational database, the MongoDB query language is made up of methods to achieve various operations. The main methods for read and write operations are the CRUD methods. Other methods include aggregation, text search, and geospatial queries.

All methods operate on a collection and take parameters as JavaScript objects that specify the details of the operation. Each method has its own specification. For example, to insert a document, the only argument needed is the document itself. For querying, the parameters are a query filter and a list of fields to return (also called the projection).

The query filter is a JavaScript object consisting of zero or more properties, where the property name is the name of the field to match on and the property value consists of another object with an operator and a value. For example, to match all documents with the field invoiceNumber that are greater than 1,000, the following query filter can be used:
{ "invoiceNumber": { $gt: 1000 } }

Since there is no "language" for querying or updating, the query filters can be very easily constructed programmatically.
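
Because a filter is just a JavaScript object, it can be assembled in ordinary code. Here is a minimal sketch (the function name is hypothetical) that builds the invoiceNumber filter shown above from a variable:

```javascript
// Build a query filter programmatically: the threshold comes from a
// variable rather than being hard-coded into a query string.
function makeInvoiceFilter(minInvoiceNumber) {
  const filter = {};
  filter.invoiceNumber = { $gt: minInvoiceNumber };
  return filter;
}

console.log(JSON.stringify(makeInvoiceFilter(1000)));
// prints {"invoiceNumber":{"$gt":1000}}
```

The same technique works for adding or removing filter fields conditionally, something that requires string concatenation or a query builder in SQL.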

Unlike relational databases, MongoDB encourages denormalization, that is, storing related parts of a document as embedded subdocuments rather than as separate collections (tables) as in a relational database. Take the example of people (name, gender, etc.) and their contact information (primary address, secondary address, etc.). In a relational database, this would require separate tables for People and Contacts, and then a join on the two tables when all of the information is needed together. In MongoDB, on the other hand, it can be stored as a list of contacts within the same People document. That’s because a join of collections is not natural to most methods in MongoDB: the most convenient method, find(), can operate on only one collection at a time.
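
As a sketch (the field names are illustrative only), such a denormalized People document could look like this:

```javascript
// A denormalized People document: contact information is embedded as an
// array of subdocuments rather than kept in a separate Contacts collection.
const person = {
  name: { first: 'John', last: 'Doe' },
  gender: 'M',
  contacts: [
    { type: 'primary', address: '106 High Street, New York City' },
    { type: 'secondary', address: '42 Elm Street, Boston' },
  ],
};
// All the information is then available from a single collection, e.g.:
//   db.people.find({ 'contacts.type': 'primary' })
```

No join is needed to retrieve a person together with their contacts, at the cost of some duplication if the same address is shared by many people.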

Installation

Before you try to install MongoDB on your computer, you may want to try out one of the hosted services that give you access to MongoDB. There are many services, but the following are popular and have a free version that you can use for a small test or sandbox application. Any of these will do quite well for the purpose of the Issue Tracker application that we’ll build as part of this book.
  • MongoDB Atlas ( https://www.mongodb.com/cloud/atlas ): I refer to this as Atlas for short. A small database (shared RAM, 512 MB storage) is available for free.

  • mLab (previously MongoLab) ( https://mlab.com/ ): mLab has announced an acquisition by MongoDB Inc. and may eventually be merged into Atlas itself. A sandbox environment is available for free, limited to 500 MB storage.

  • Compose ( https://www.compose.com ): Among many other services, Compose offers MongoDB as a service. A 30-day trial period is available, but a permanently free sandbox kind of option is not available.

Of these three, I find Atlas the most convenient because there are many options for the location of the host. When connecting to the database, it lets me choose the one closest to my location, and that minimizes latency. mLab does not give you a cluster; only individual databases can be created. Compose is not permanently free, and it is likely that you may need more than 30 days to complete this book.

The downside of any of the hosted options is that, apart from the small extra latency when accessing the database, you need an Internet connection, which means you may not be able to test your code where Internet access is unavailable, for example, on a flight. In comparison, installing MongoDB on your computer may work better, but the installation takes a bit more work than signing up for one of the cloud-based options.

Even when using one of the cloud options, you will need to download and install the mongo shell to be able to access the database remotely. Each of the services comes with instructions for this step as well. Choose version 3.6 or higher of MongoDB when signing up for any of these services. Test the signup by connecting to the cluster or database using the mongo shell, following the instructions given by the service provider.

If you choose to install MongoDB on your computer (it can be installed easily on OS X, Windows, and most distributions based on Linux), look up the installation instructions, which are different for each operating system. You may install MongoDB by following the instructions at the MongoDB website ( https://docs.mongodb.com/manual/installation/ or search for “mongodb installation” in your search engine).

Choose MongoDB version 3.6 or higher, preferably the latest, as some of the examples use features introduced only in version 3.6. Most local installation options let you install the server, the shell, and tools all in one. Check that this is the case; if not, you may have to install them separately.

After a local installation, ensure that you have started the MongoDB server (the name of the daemon or service is mongod), if it is not already started by the installation process. Test the installation by running the mongo shell like this:
$ mongo
On a Windows system, you may need to append .exe to the command. The command may require a path depending on your installation method. If the shell starts successfully, it will also connect to the local MongoDB server instance. You should see the version of MongoDB printed on the console, the database it is connecting to (the default is test), and a command prompt, like this, if you had installed MongoDB version 4.0.2 locally:
MongoDB shell version v4.0.2
connecting to: mongodb://127.0.0.1:27017
MongoDB server version: 4.0.2
>

The message you see can be slightly different from this, especially if you have installed a different version of MongoDB. But you do need to see the prompt > where you can type further commands. If, instead, you see an error message, revisit the installation and the server starting procedure.

The Mongo Shell

The mongo shell is an interactive JavaScript shell, very much like the Node.js shell. In the interactive shell, a few non-JavaScript conveniences are available over and above the full power of JavaScript. In this section, we’ll discuss the basic operations that are possible via the shell, those that are most commonly used. For a full reference of all the capabilities of the shell, you can take a look at the mongo shell documentation at https://docs.mongodb.com/manual/mongo/ .

The commands that we will be typing in the mongo shell have been collected together in a file called mongo_commands.txt. These commands have been tested to work as is on Atlas or a local installation, but you may find variations with the other options. For example, mLab lets you connect only to a database (as opposed to a cluster), so switching between databases is not possible.

Note

If you find that something is not working as expected when typing a command, cross-check the commands with the same in the GitHub repository ( https://github.com/vasansr/pro-mern-stack-2 ). This is because typos may have been introduced during the production of the book, or last-minute corrections may have missed making it to the book. The GitHub repository, on the other hand, reflects the most up-to-date and tested set of code and commands.

To work with MongoDB, you need to connect to a database. Let’s start with finding which databases are available. The command to show the current databases is:
> show databases
This will list the databases and the storage occupied by them. For example, in a fresh local installation of MongoDB, this is what you will see:
admin         0.000GB
config        0.000GB
local         0.000GB
These are system databases that MongoDB uses for its internal bookkeeping. We will not be using any of these to create our collections, so we’d better change the current database. To identify the current database, the command is:
> db
The default database a mongo shell connects to is called test and that is what you are likely to see as the output to this command. Let’s now see what collections exist in this database.
> show collections

You will find that there are no collections in this database, since it is a fresh installation. Further, you will also find that the database test was not listed when we listed the available databases. That’s because databases and collections are really created only on the first write operation to any of these.

Let’s switch to a database called issuetracker instead of using the default database:
> use issuetracker
This should result in output that confirms that the new database is issuetracker:
switched to db issuetracker
Let’s confirm that there are no collections in this database either:
> show collections
This command should return nothing. Now, let’s create a new collection. This is done by creating one document in a collection. A collection is referenced as a property of the global object db, with the same name as the collection. The collection called employees can be referred to as db.employees. Let’s insert a new document in the employees collection using the insertOne() method. This method takes in the document to be inserted as an argument:
> db.employees.insertOne({ name: { first: 'John', last: 'Doe' }, age: 44 })
The result of this command will show you the result of the operation and the ID of the new document that was created, something like this:
{
     "acknowledged" : true,
     "insertedId" : ObjectId("5bbc487a69d13abf04edf857")
}
Apart from the insertOne() method, many methods are available on any collection. You can see the list of available methods by pressing the Tab character twice after typing "db.employees." (the period at the end is required before pressing Tab). You may find an output like the following:
db.employees.addIdIfNeeded(              db.employees.getWriteConcern(
db.employees.aggregate(                  db.employees.group(
db.employees.bulkWrite(                  db.employees.groupcmd(
db.employees.constructor                 db.employees.hasOwnProperty
db.employees.convertToCapped(            db.employees.hashAllDocs(
db.employees.convertToSingleObject(      db.employees.help(
db.employees.copyTo(                     db.employees.initializeOrderedBulkOp(
db.employees.count(                      db.employees.initializeUnorderedBulkOp(
db.employees.createIndex(                db.employees.insert(
db.employees.createIndexes(              db.employees.insertMany(
db.employees.dataSize(                   db.employees.insertOne(
...

This is the auto-completion feature of the mongo shell at work. Note that you can let the mongo shell auto-complete the name of any method by pressing the Tab character after entering the beginning few characters of the method.

Let’s now check if the document has been created in the collection. To do that, we can use the find() method on the collection. Without any arguments, this method just lists all the documents in the collection:
> db.employees.find()
This should result in displaying the document we just created, but it is not "pretty" formatted. It will be printed all in one line and may wrap around to the next line inconveniently. To get a more legible output, we can use the pretty() method on the result of the find() method:
> db.employees.find().pretty()
That should show a much more legible output, like this:
{
     "_id" : ObjectId("5bbc487a69d13abf04edf857"),
     "name" : {
            "first" : "John",
            "last" : "Doe"
     },
     "age" : 44
}
At this point in time, if you execute show collections and show databases, you will find that the employees collection and the issuetracker database have indeed been created and are listed in the output of their respective commands. Let’s insert another document in the same collection and try to deal with multiple documents in the collection:
> db.employees.insertOne({ name: { first: 'Jane', last: 'Doe' }, age: 54 })
Now, since we have the full power of JavaScript in the shell, let’s exercise some of it to get a taste. Instead of printing the results onscreen, let’s collect the results into a JavaScript array variable. The result of the find() method is a cursor that can be iterated. The cursor object has methods other than pretty(), one of which is toArray(). This method reads all the documents from the query and places them in an array. Let’s use this method and assign its result to an array variable.
> let result = db.employees.find().toArray()
Now, the variable result should be an array with two elements, each an employee document. Let’s use the JavaScript array method forEach() to iterate through them and print the first names of each employee:
> result.forEach((e) => print('First Name:', e.name.first))
This should give an output like this:
First Name: John
First Name: Jane
In Node.js, the console.log method is available for printing objects on the console. The mongo shell, on the other hand, provides the print() method for the same purpose, but this prints only strings. Objects need to be converted to strings before printing, using the utility function tojson(). There is also another method, called printjson(), which prints objects as JSON. Let’s use that to inspect the contents of the nested document name instead of only the first name:
> result.forEach((e) => printjson(e.name))
Now, you should see the name object expanded into first and last names, like the following:
{ "first" : "John", "last" : "Doe" }
{ "first" : "Jane", "last" : "Doe" }

The shell by itself does very little apart from providing a mechanism to access the methods of the database and collections. It is the JavaScript engine forming the basis of the shell that gives it a lot of flexibility and power.

In the next section, we will discuss more methods on the collection, such as insertOne() that you just learned about. These methods are accessible from many programming languages via a driver. The mongo shell is just another tool that can access these methods. You will find that the methods and arguments available in other programming languages are very similar to those in the mongo shell.

Exercise: MongoDB Basics

  1. Using the shell, display a list of methods available on the cursor object. Hint: Look up the mongo shell documentation for mongo Shell Help at https://docs.mongodb.com/manual/tutorial/access-mongo-shell-help/ .

Answers are available at the end of the chapter.

MongoDB CRUD Operations

Since the mongo shell is the easiest to try out, let’s explore the CRUD operations available in MongoDB using the shell itself. We will continue to use the issuetracker database we created in the previous section. But let’s clear the database so that we can start fresh. The collection object provides a convenient method for erasing itself called drop():
> db.employees.drop()
This should result in an output like this:
true

This is different from removing all the documents in the collection, because it also removes any indexes that are part of the collection.

Create

In the previous section, you briefly saw how to insert a document, and as part of that, you found how MongoDB automatically created the primary key, which is a special data type called ObjectId. Let’s now use our own ID instead of letting MongoDB auto-generate one.
> db.employees.insertOne({
  _id: 1,
  name: { first: 'John', last: 'Doe' },
  age: 44
})
This will result in the following output:
{ "acknowledged" : true, "insertedId" : 1 }
Note that the value of insertedId reflects the value that we supplied for _id. This means that instead of an ObjectId, we were able to supply our own value. Let’s try to create a new, identical document (you can use the Up Arrow key to repeat the previous command in the mongo shell). It will fail with the following error:
WriteError({
     "index" : 0,
     "code" : 11000,
     "errmsg" : "E11000 duplicate key error collection: issuetracker.employees index: _id_ dup key: { : 1.0 }",
     "op" : {
            "_id" : 1,
            "name" : {
                  "first" : "John",
                  "last" : "Doe"
            },
            "age" : 44
     }
})
This shows that the _id field continues to be a primary key and it is expected to be unique, regardless of whether it is auto-generated or supplied in the document. Now, let’s add another document, but with a new field as part of the name, say, the middle name:
> db.employees.insertOne({
  name: {first: 'John', middle: 'H', last: 'Doe'},
  age: 22
})

This works just fine, and using find(), you can see that two documents exist in the collection, but they don’t necessarily have the same schema. This is the advantage of a flexible schema: the schema can be enhanced whenever a new data element that needs to be stored is discovered, without having to explicitly modify the schema.

In this case, it is implicit that any employee document where the middle field under name is missing indicates an employee without a middle name. If, on the other hand, a field was added that didn’t have an implicit meaning when absent, its absence would have to be handled in the code. Or a migration script would have to be run that defaults the field’s value to something.
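
Such a migration boils down to a filter for documents missing the field and an update that sets a default. The following is a hedged sketch, assuming a name.middle field that should default to an empty string; the filter and update are shown as plain JavaScript objects:

```javascript
// A one-off migration sketch: default name.middle to an empty string
// wherever it is absent. In the mongo shell, these objects would be
// applied with:
//   db.employees.updateMany(filter, update)
const filter = { 'name.middle': { $exists: false } };
const update = { $set: { 'name.middle': '' } };
```

The $exists operator matches documents based on whether a field is present at all, which is exactly the distinction a flexible schema introduces.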

You will also find that the format of the _id field is different for the two documents, and even the data type is different. For the first document, the data type is an integer. For the second, it is of type ObjectId (which is why it is shown as ObjectId(...)). Thus, it’s not just the presence of fields that can differ between two documents in the same collection; even the data types of the same field can be different.

In most cases, leaving the creation of the primary key to MongoDB works just great, because you don’t have to worry about keeping it unique: MongoDB does that automatically. But, this identifier is not human-readable. In the Issue Tracker application, we want the identifier to be a number so that it can be easily remembered and talked about. But instead of using the _id field to store the human-readable identifier, let’s use a new field called id and let MongoDB auto-generate _id.

So, let’s drop the collection and start creating new documents with a new field called id.
> db.employees.drop()
> db.employees.insertOne({
  id: 1,
  name: { first: 'John', last: 'Doe' },
  age: 48
})
> db.employees.insertOne({
  id: 2,
  name: { first: 'Jane', last: 'Doe'} ,
  age: 16
})
The collection has a method that can take in multiple documents in one go. This method is called insertMany(). Let’s use that to create a few more documents in a single command:
> db.employees.insertMany([
  { id: 3, name: { first: 'Alice', last: 'A' }, age: 32 },
  { id: 4, name: { first: 'Bob', last: 'B' }, age: 64 },
])
The response to this shows the multiple insertedIds that were created, as opposed to the single insertedId for the insertOne() method, like this:
{
     "acknowledged" : true,
     "insertedIds" : [
            ObjectId("5bc6d80005fb87b8f2f5cf6f"),
            ObjectId("5bc6d80005fb87b8f2f5cf70")
     ]
}

Read

Now that there are multiple documents in the collection, let’s see how to retrieve a subset of the documents as opposed to the full list. The find() method takes in two more arguments. The first is a filter to apply to the list, and the second is a projection, a specification of which fields to retrieve.

The filter specification is an object where the property name is the field to filter on, and the property value is the value that the field needs to match. Let’s fetch one employee’s document, identified by the id being equal to 1. Since we know that there can be only one employee for the given ID, let’s use findOne() rather than find(). The method findOne() is a variation of find() that returns a single object rather than a cursor.
> db.employees.findOne({ id: 1 })
This should return the first employee document that we created, and the output will look like this:
{
     "_id" : ObjectId("5bc6d7e505fb87b8f2f5cf6d"),
     "id" : 1,
     "name" : {
            "first" : "John",
            "last" : "Doe"
     },
     "age" : 48
}

Note that we did not use pretty() here, yet the output is prettified. This is because findOne() returns a single object, and the mongo shell prettifies objects by default.

The filter is actually a shorthand for { id: { $eq: 1 } }, where $eq is the operator signifying that the value of the field id has to be equal to 1. In the generic sense, the format of a single element in the filter is fieldname: { operator: value }. Other operators for comparison are available, such as $gt for greater than, etc. Let’s try the $gte (greater than or equal to) operator for fetching a list of employees aged 30 or older:
> db.employees.find({ age: { $gte: 30 } })
That command should return three documents, because we inserted three whose age was greater than 30. If multiple fields are specified, all of them have to match, which is the same as combining them with an and operator:
> db.employees.find({ age: { $gte: 30 }, 'name.last': 'Doe'  })

The number of documents returned should now be reduced to only one, since there is only one document that matches both criteria: the last name being equal to 'Doe' as well as the age being greater than 30. Note that we used the dot notation for specifying a field embedded in a nested document. This also made us use quotes around the field name, since a property name containing a period is not a valid JavaScript identifier and has to be quoted.

To match multiple values of the same field—for example, to match age being greater than 30 and age being less than 60—the same strategy cannot be used. That’s because the filter is a regular JavaScript object, and two properties of the same name cannot exist in a document. Thus, a filter like { age: { $gte: 30 }, age: { $lte: 60 } } will not work (JavaScript will not throw an error, instead, it will pick just one of the values for the property age). An explicit $and operator has to be used, which takes in an array of objects specifying multiple field-value criteria. You can read all about the $and operator and many more operators in the operators section of the reference manual of MongoDB at https://docs.mongodb.com/manual/reference/operator/query/ .
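
The pitfall, and the correct form using $and, can be seen with plain JavaScript objects:

```javascript
// A JavaScript object literal silently keeps only the last of two
// properties with the same name, so this is NOT a valid range filter:
const broken = { age: { $gte: 30 }, age: { $lte: 60 } };
console.log(JSON.stringify(broken));
// prints {"age":{"$lte":60}} -- the $gte condition has been lost

// The explicit $and operator expresses the intent correctly:
const rangeFilter = { $and: [{ age: { $gte: 30 } }, { age: { $lte: 60 } }] };
// In the mongo shell: db.employees.find(rangeFilter)
```

Printing the filter before using it, as done here, is a simple way to catch this kind of silent property overwrite.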

When filtering on a field is a common occurrence, it’s typically a good idea to create an index on that field. The createIndex() method on the collection is meant for this purpose. It takes in an argument specifying the fields that form the index (multiple fields will form a composite index). Let’s create an index on the age field:
> db.employees.createIndex({ age: 1 })

With this index, any query that uses a filter that has the field age in it will be significantly faster, because MongoDB will use this index instead of scanning through all documents in the collection. But note that this is not a unique index, as many people can be the same age.

The age field is probably not a frequently used filter, but fetching a document based on its identifier is going to be very frequent. MongoDB automatically creates an index on the _id field, but we have used our own identifier called id, and this field is what is more likely to be used to fetch individual employees. So let’s create an index on this field. Further, it has to be unique since it identifies the employee: no two employees should have the same value for id. The second argument to createIndex() is an object that contains various attributes of the index, one of them specifying whether the index is unique. Let’s use that to create a unique index on id:
> db.employees.createIndex({ id: 1 }, { unique: true })
Now, not only will the find() method perform much better when a filter with id is supplied, but creation of a document with a duplicate id will be prevented by MongoDB. Let’s try that by re-running the insert command for the first employee:
> db.employees.insertOne({
  id: 1,
  name: { first: 'John', last: 'Doe' },
  age: 48
})
Now, you should see an error in the mongo shell like this (the ObjectId will be different for you):
WriteError({
     "index" : 0,
     "code" : 11000,
     "errmsg" : "E11000 duplicate key error collection: issuetracker.employees index: id_1 dup key: { : 1.0 }",
     "op" : {
            "_id" : ObjectId("5bc04b8569334c5ff5bb7e8c"),
            "id" : 1
               ...
     }
})

Projection

All this while, we retrieved the entire document that matched the filter. In the previous section, when we had to print only a subset of the fields of the document, we did it using a forEach() loop. But this means that the entire document is fetched from the server even when we needed only some parts of it for printing. When the documents are large, this can use up a lot of network bandwidth. To restrict the fetch to only some fields, the find() method takes a second argument called the projection. A projection specifies which fields to include or exclude in the result.

The format of this specification is an object with one or more field names as the key and the value as 0 or 1, to indicate exclusion or inclusion. But 0s and 1s cannot be combined. You can either start with nothing and include fields using 1s, or start with everything and exclude fields using 0s. The _id field is an exception; it is always included unless you specify a 0. The following will fetch all employees but only their first names and age:
> db.employees.find({}, { 'name.first': 1, age: 1 })
Note that we specified an empty filter, to say that all documents have to be fetched. This had to be done since the projection is the second argument. The previous request would have printed something like this:
{ "_id" : ObjectId("5bbc...797855"), "name" : { "first" : "John" }, "age" : 48 }
{ "_id" : ObjectId("5bbc...797856"), "name" : { "first" : "Jane" }, "age" : 16 }
{ "_id" : ObjectId("5bbc...797857"), "name" : { "first" : "Alice" }, "age" : 32 }
{ "_id" : ObjectId("5bbc...797858"), "name" : { "first" : "Bob" }, "age" : 64 }
Even though we specified only the first name and age, the field _id was automatically included. To suppress the inclusion of this field, it needs to be explicitly excluded, like this:
> db.employees.find({}, { _id: 0, 'name.first': 1, age: 1 })
Now, the output will exclude the ID and look like this:
{ "name" : { "first" : "John" }, "age" : 48 }
{ "name" : { "first" : "Jane" }, "age" : 16 }
{ "name" : { "first" : "Alice" }, "age" : 32 }
{ "name" : { "first" : "Bob" }, "age" : 64 }

Update

There are two methods—updateOne() and updateMany()—available for modifying a document. The arguments to both methods are the same, except that updateOne() stops after finding and updating the first matching document. The first argument is a query filter, the same as the filter that find() takes. The second argument is an update specification if only some fields of the object need to be changed.

When using updateOne() , the primary key or any unique identifier is what is normally used in the filter, because the filter should match only one document. The update specification is an object containing operators such as $set, whose value is another object that specifies the fields to change and their new values. Let’s update the age of the employee identified by the id 2:
> db.employees.updateOne({ id: 2 }, { $set: {age: 23 } })
This should result in the following output:
{ "acknowledged" : true, "matchedCount" : 1, "modifiedCount" : 1 }

The matchedCount returned how many documents matched the filter. If the filter had matched more than one, that number would have been returned. But since the method is supposed to modify only one document, the modified count should always be 1, unless the modification had no effect. If you run the command again, you will find that modifiedCount will be 0, since the age was already 23 for the employee with ID 2.
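For instance, running the same update a second time should produce output like this (a sketch of the expected mongo shell session; the exact formatting may vary by shell version):
> db.employees.updateOne({ id: 2 }, { $set: { age: 23 } })
{ "acknowledged" : true, "matchedCount" : 1, "modifiedCount" : 0 }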

To modify multiple documents in one shot, the updateMany() method has to be used. The format is the same as the updateOne() method, but the effect is that all documents that match will be modified. Let’s add an organization field to all employees using the updateMany() method:
> db.employees.updateMany({}, { $set: { organization: 'MyCompany' } })

Note that even though the field organization did not exist in the documents, the new value MyCompany would have been applied to all of them. Executing find() with a projection that shows only the organization field will confirm this.
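Such a confirmation could look like this (a sketch; output trimmed to the first few documents):
> db.employees.find({}, { _id: 0, organization: 1 })
{ "organization" : "MyCompany" }
{ "organization" : "MyCompany" }
...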

There is also a method to replace the complete document called replaceOne(). Instead of specifying which fields to modify, if the complete modified document is available, it can be used to just replace the existing document with the new one. Here’s an example:
> db.employees.replaceOne({ id: 4 }, {
  id: 4,
  name : { first : "Bobby" },
  age : 66
});
This command will replace the existing document with ID 4 with the new one. Because the organization and name.last fields were not specified, they will not exist in the replaced document, unlike with updateOne(), where unspecified fields are left unchanged. Fetching the replaced document should confirm this:
> db.employees.find({ id: 4 })
This should result in a document that looks as follows:
{ "_id" : ObjectId("5c38ae3da7dc439456c0281b"), "id" : 4, "name" : { "first" : "Bobby" }, "age" : 66 }

You can see that it no longer has the fields name.last and organization, because these were not specified in the document supplied to replaceOne(). The command simply replaces the document with the supplied one, except for the _id field. Being the primary key, this field cannot be changed via an updateOne() or a replaceOne().

Delete

The delete operation takes a filter and removes the document from the collection. The filter format is the same, and the variations deleteOne() and deleteMany() are both available, just as in the update operation.

Let’s delete the last document, with ID 4:
> db.employees.deleteOne({ id: 4 })
This should result in the following output, confirming that the deletion affected only one document:
{ "acknowledged" : true, "deletedCount" : 1 }
Let’s also cross-check the deletion by looking at the size of the collection. The count() method on the collection tells us how many documents it contains. Executing it now should return the value 3, because we originally inserted four documents and have just deleted one.
> db.employees.count()

Aggregate

The find() method is used to return all the documents or a subset of the documents in a collection. Many a time, instead of the list of documents, we need a summary or an aggregate, for example, the count of documents that match a certain criterion.

The count() method can surely take a filter. But what about other aggregate functions, such as sum? That is where the aggregate() method comes into play. Compared to relational databases supporting SQL, the aggregate() method performs the function of the GROUP BY clause. But it can also perform other functions such as a join, or even an unwind (expanding the documents based on arrays within), and much more.

You can look up the advanced features that the aggregate() function supports in the MongoDB documentation at https://docs.mongodb.com/manual/reference/operator/aggregation-pipeline/ but for now, let’s look at the real aggregation and grouping construct that it provides.

The aggregate() method works in a pipeline. Every stage in the pipeline takes the input from the result of the previous stage and operates as per its specification to result in a new modified set of documents. The initial input to the pipeline is, of course, the entire collection. The pipeline specification is in the form of an array of objects, each element being an object with one property that identifies the pipeline stage type and the value specifying the pipeline’s effect.

For example, the find() method can be replicated using aggregate() by using the stages $match (the filter) and $project (the projection). To perform an actual aggregation, the $group stage needs to be used. The stage’s specification includes the grouping key identified by the property _id and other fields as keys, whose values are aggregation specifications and fields on which the aggregation needs to be performed. The _id can be null to group on the entire collection.
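For instance, the earlier filter-and-projection query could be expressed as a pipeline like this (a sketch against the same employees collection; the filter value is illustrative):
> db.employees.aggregate([
  { $match: { age: { $gte: 30 } } },
  { $project: { _id: 0, 'name.first': 1, age: 1 } }
])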

Let’s try this by getting the total age of all employees in the entire collection. There will be only one stage in the pipeline: an object with a single property, $group . In its value, _id will be set to null because we don’t want to group by any field. We’ll need to sum (using the aggregate function $sum) the field age into a new field called total_age like this:
> db.employees.aggregate([
  { $group: { _id: null, total_age: { $sum: '$age' } } }
])
This should result in an output like this:
{ "_id" : null, "total_age" : 103 }
The same function, $sum, can be used to get a count of the records by simply summing the value 1:
> db.employees.aggregate([
  { $group: { _id: null, count: { $sum: 1 } } }
])
To group the aggregate by a field, we’ll need to specify the name of the field (prefixed by a $) as the value of _id. Let’s use the organization field, but before that, let’s insert a new document with an organization different from the rest of the documents (which were all set to MyCompany):
> db.employees.insertOne({
  id: 4,
  name: { first: 'Bob', last: 'B' },
  age: 64,
  organization: 'OtherCompany'
})
Now, here’s the command that aggregates the age using sum across different organizations:
> db.employees.aggregate([
  { $group: { _id: '$organization', total_age: { $sum: '$age' } } }
])
This should result in an output like this:
{ "_id" : "OtherCompany", "total_age" : 64 }
{ "_id" : "MyCompany", "total_age" : 103 }
Let’s also try another aggregate function, say average, using $avg:
> db.employees.aggregate([
  { $group: { _id: '$organization', average_age: { $avg: '$age' } } }
])
This should now result in an output like this:
{ "_id" : "OtherCompany", "average_age" : 64 }
{ "_id" : "MyCompany", "average_age" : 34.333333333333336 }

There are other aggregation functions, including minimum and maximum. For the complete set, refer to the documentation at https://docs.mongodb.com/manual/reference/operator/aggregation/group/#accumulator-operator .
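For example, $min and $max could be combined in a single $group stage like this (a sketch against the same employees collection):
> db.employees.aggregate([
  { $group: { _id: null, min_age: { $min: '$age' }, max_age: { $max: '$age' } } }
])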

Exercise: MongoDB CRUD Operations

  1. Write a simple statement to retrieve all employees who have middle names. Hint: Look up the MongoDB documentation for query operators at https://docs.mongodb.com/manual/reference/operator/query/ .

  2. Is the filter specification a JSON? Hint: Think about date objects and quotes around field names.

  3. Say an employee’s middle name was set mistakenly, and you need to remove it. Write a statement to do this. Hint: Look up the MongoDB documentation for update operators at https://docs.mongodb.com/manual/reference/operator/update/ .

  4. During index creation, what did the 1 indicate? What other valid values are allowed? Hint: Look up the MongoDB indexes documentation at https://docs.mongodb.com/manual/indexes/ .

Answers are available at the end of the chapter.

MongoDB Node.js Driver

This is the Node.js driver that lets you connect and interact with the MongoDB server. It provides methods very similar to what you saw in the mongo shell, but not exactly the same. Instead of the low-level MongoDB driver, we could use an Object Document Mapper called Mongoose, which has a higher level of abstraction and more convenient methods. But learning about the lower-level MongoDB driver may give you a better handle on the actual working of MongoDB itself, so I’ve chosen to use the low-level driver for the Issue Tracker application.

To start, let’s install the driver:
$ npm install mongodb@3

Let’s also start a new Node.js program just to try out the different ways that the driver’s methods can be used. In the next section, we’ll use some code from this trial to integrate the driver into the Issue Tracker application. Let’s call this sample Node.js program trymongo.js and place it in a new directory called scripts, to distinguish it from other files that are part of the application.

The first thing to do is make a connection to the database server. This can be done by first importing the object MongoClient from the driver, then creating a new client object from it using a URL that identifies a database to connect to, and finally calling the connect method on it, like this:
...
const { MongoClient } = require('mongodb');
const client = new MongoClient(url);
client.connect();
...

The URL should start with mongodb:// followed by the hostname or the IP address of the server to connect to. An optional port can be added using : as the separator, but it’s not required if the MongoDB server is running on the default port, 27017. It’s good practice to separate the connection parameters into a configuration file rather than keep them in a checked-in file, but we’ll do this in the next chapter. For the moment, let’s hard code this. If you have used one of the cloud providers, the URL can be obtained from the corresponding connection instructions. For the local installation, the URL will be mongodb://localhost/issuetracker. Note that the MongoDB Node.js driver accepts the database name as part of the URL itself, and it is best to specify it this way, even though a cloud provider may not show this explicitly.

Let’s add the local installation URL to trymongo.js and a commented version of cloud providers’ URLs.
...
const url = 'mongodb://localhost/issuetracker';
// Atlas URL - replace UUU with user, PPP with password, XXX with hostname
// const url = 'mongodb+srv://UUU:[email protected]/issuetracker?retryWrites=true';
// mLab URL - replace UUU with user, PPP with password, XXX with hostname
// const url = 'mongodb://UUU:[email protected]:33533/issuetracker';
...
Further, the client constructor takes in another argument with more settings for the client, one of which is whether to use the new style parser. Let’s change the constructor to pass this also, to avoid a warning in the latest Node.js driver (version 3.1).
...
const client = new MongoClient(url, { useNewUrlParser: true });
...
The connect() method is an asynchronous method and needs a callback to receive the result of the connection. The callback takes in two arguments: an error and the result. The result is the client object itself. Within the callback, a connection to the database (as opposed to a connection to the server) can be obtained by calling the db method of the client object. Thus, the callback and connection to the database can be written like this:
...
client.connect(function(err, client) {
  const db = client.db();
...
The connection to the database, db, is similar to the db variable we used in the mongo shell. In particular, it is the one that we can use to get a handle to a collection and its methods. Let’s get a handle to the collection called employees that we were using in the previous section using the mongo shell.
...
  const collection = db.collection('employees');
...

With this collection, we can do the same things we did with the mongo shell’s equivalent db.employees in the previous section. The methods are also very similar, except that they are all asynchronous. This means that the methods take in the regular arguments, but also a callback function that’s called when the operation completes. The convention in the callback functions is to pass the error as the first argument and the result of the operation as the second argument. You already saw this pattern of callback in the previous connection method.

Let’s insert a document and read it back to see how these methods work with the Node.js driver. The insertion can be written using the insertOne() method, passing in an employee document and a callback. Within the callback, let’s print the new _id that was created. Just as in the mongo shell’s insertOne command, the created ID is returned as part of the result object, in the property called insertedId.
...
  const employee = { id: 1, name: 'A. Callback', age: 23 };
  collection.insertOne(employee, function(err, result) {
    console.log('Result of insert: ', result.insertedId);
...

Note that accessing the collection and the insert operation can only be called within the callback of the connection operation, because only then do we know that the connection has succeeded. There also needs to be some amount of error handling, but let’s deal with this a little later.

Now, within the callback of the insert operation, let’s read back the inserted document, using the ID of the result. We could use either the ID we supplied (id) or the auto-generated MongoDB ID (_id). Let’s use _id just to make sure that we are able to use the result values.
...
...
    collection.find({ _id: result.insertedId })
      .toArray(function(err, docs) {
        console.log('Result of find: ', docs);
      });
...
Now that we are done inserting and reading back the document, we can close the connection to the server. If we don’t do this, the Node.js program will not exit, because the connection object is waiting to be used and listening to a socket.
...
      client.close();
...

Let’s put all this together in a function called testWithCallbacks(). We will soon also try a different way of using the Node.js driver, with async/await. Also, as is customary, let’s pass a callback function to this function, which we will call from testWithCallbacks() once all the operations are completed. Then, if there are any errors, these can be passed to the callback function.

Let’s first declare this function:
...
function testWithCallbacks(callback) {
  console.log(' --- testWithCallbacks ---');
  ...
}
...
And within each callback as a result of each of the operations, on an error, we need to do the following:
  • Close the connection to the server

  • Call the callback

  • Return from the call, so that no more operations are performed

We also need to do the same when all operations are completed. The pattern of the error handling is like this:
...
    if (err) {
      client.close();
      callback(err);
      return;
    }
...
Let’s also introduce a call to the testWithCallbacks() function from the main section, supply it a callback to receive any error, and print it if any.
...
testWithCallbacks(function(err) {
  if (err) {
    console.log(err);
  }
});
...

With all the error handling and callbacks introduced, the final code in the trymongo.js file is shown in Listing 6-1.

Note

Although no effort has been spared to ensure that all code listings are accurate, there may be typos or even corrections that did not make it to the book before it went to press. So, always rely on the GitHub repository ( https://github.com/vasansr/pro-mern-stack-2 ) as the tested and up-to-date source for all code listings, especially if something does not work as expected.

const { MongoClient } = require('mongodb');
const url = 'mongodb://localhost/issuetracker';
// Atlas URL  - replace UUU with user, PPP with password, XXX with hostname
// const url = 'mongodb+srv://UUU:[email protected]/issuetracker?retryWrites=true';
// mLab URL - replace UUU with user, PPP with password, XXX with hostname
// const url = 'mongodb://UUU:[email protected]:33533/issuetracker';
function testWithCallbacks(callback) {
  console.log(' --- testWithCallbacks ---');
  const client = new MongoClient(url, { useNewUrlParser: true });
  client.connect(function(err, client) {
    if (err) {
      callback(err);
      return;
    }
    console.log('Connected to MongoDB');
    const db = client.db();
    const collection = db.collection('employees');
    const employee = { id: 1, name: 'A. Callback', age: 23 };
    collection.insertOne(employee, function(err, result) {
      if (err) {
        client.close();
        callback(err);
        return;
      }
      console.log('Result of insert: ', result.insertedId);
      collection.find({ _id: result.insertedId})
        .toArray(function(err, docs) {
        if (err) {
          client.close();
          callback(err);
          return;
        }
        console.log('Result of find: ', docs);
        client.close();
        callback(err);
      });
    });
  });
}
testWithCallbacks(function(err) {
  if (err) {
    console.log(err);
  }
});
Listing 6-1

trymongo.js: Using Node.js driver, Using the Callbacks Paradigm

Let’s clean up the collection before we test this. We could open another command shell, run the mongo shell in it, and execute db.employees.remove({}). But the mongo shell has a command line way of executing a simple command using the --eval command line option. Let’s do that instead and pass the database name to connect to; otherwise, the command will be executed on the default database test. For the local installation, the command will look like this:
$ mongo issuetracker --eval "db.employees.remove({})"
If you are using a remote server from one of the hosting providers, instead of the database name, use the connection string including the database name as suggested by the hosting provider. For example, the Atlas command may look like this (replace the hostname, user, and password with your own):
$ mongo "mongodb+srv://cluster0-xxxxx.mongodb.net/issuetracker" --username atlasUser --password atlasPassword --eval "db.employees.remove({})"
Now, we are ready to test the trial program we just created. It can be executed like this:
$ node scripts/trymongo.js
This should result in output like this (you will see a different ObjectID, otherwise the output should be the same):
--- testWithCallbacks ---
Connected to MongoDB
Result of insert:
 5bbef955580a2c313d4052f6
Result of find:
 [ { _id: 5bbef955580a2c313d4052f6,
    id: 1,
    name: 'A. Callback',
    age: 23 } ]

As you probably felt yourself, the callback paradigm is a bit unwieldy. Its advantage is that it works in the older JavaScript version (ES5), and therefore in older versions of Node.js. But the callbacks are a bit too deeply nested, and the error handling makes for repetitive code. ES2015 introduced Promises, which are also supported by the Node.js MongoDB driver, and these were an improvement over callbacks. But in ES2017 and Node.js from version 7.6, full support for the async/await paradigm appeared, and this is the recommended and most convenient way to use the driver.

Let’s implement another function called testWithAsync() within trymongo.js that uses the async/await paradigm. All asynchronous calls with a callback can now be replaced by a call to the same method, but without supplying a callback. Using await before the method call will simulate a synchronous call by waiting for the call to complete and return the results. For example, instead of passing a callback to the connect() method, we can just wait for it to complete like this:
...
    await client.connect();
...
Right in the next line, we can do whatever needs to be done after the operation is completed, in this case, get a connection to the database:
...
    await client.connect();
    const db = client.db();
...
The same pattern can be used for the other asynchronous calls, with one change: the result of the call, which was originally the second argument of the callback, can directly be assigned to a variable like a return value from the function call. So, the result of insertOne() can be captured like this:
...
    const result = await collection.insertOne(employee);
...

Errors will be thrown and can be caught. We can place all the operations in a single try block and catch any error in one place (the catch block) rather than after each call. There is no need for the function to take a callback, because if the caller needs to wait for the result, an await can be added before the call to this function, and errors can be thrown.

The new function using await before each of the operations—connect(), insertOne(), and find()—is shown in Listing 6-2.
async function testWithAsync() {
  console.log(' --- testWithAsync ---');
  const client = new MongoClient(url, { useNewUrlParser: true });
  try {
    await client.connect();
    console.log('Connected to MongoDB');
    const db = client.db();
    const collection = db.collection('employees');
    const employee = { id: 2, name: 'B. Async', age: 16 };
    const result = await collection.insertOne(employee);
    console.log('Result of insert: ', result.insertedId);
    const docs = await collection.find({ _id: result.insertedId })
      .toArray();
    console.log('Result of find: ', docs);
  } catch(err) {
    console.log(err);
  } finally {
    client.close();
  }
}
Listing 6-2

trymongo.js, testWithAsync Function

Finally, let’s modify the main part of the program to call testWithAsync() within the callback that handles the return value from testWithCallbacks():
...
testWithCallbacks(function(err) {
  if (err) {
    console.log(err);
  }
  testWithAsync();
});
...
If you clear the collection using remove() as described previously and test these changes, you will see this result (the ObjectIDs that you see will be different than the ones shown here):
--- testWithCallbacks ---
Connected to MongoDB
Result of insert:
 5bbf25dcf50e97340be0f01f
Result of find:
 [ { _id: 5bbf25dcf50e97340be0f01f,
    id: 1,
    name: 'A. Callback',
    age: 23 } ]
--- testWithAsync ---
Connected to MongoDB
Result of insert:
 5bbf25dcf50e97340be0f020
Result of find:
 [ { _id: 5bbf25dcf50e97340be0f020,
    id: 2,
    name: 'B. Async',
    age: 16 } ]

A good way to test whether errors are being caught and displayed is by running the program again. There will be errors because we have a unique index on the field id, so MongoDB will throw a duplicate key violation. If you have dropped the collection after creating the index, you could run the createIndex() command to reinstate this index.
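If you do need to reinstate it, the command can be run from the mongo shell (assuming the unique index was on the id field, as the earlier duplicate key error indicates):
> db.employees.createIndex({ id: 1 }, { unique: true })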

As you can see, the async/await paradigm is much smaller in terms of code, as well as a lot clearer and easier to read. In fact, although we caught the error within this function, we didn’t have to do it. We could as well have let the caller handle it.

Given the benefits of the async/await paradigm, let’s use this in the Issue Tracker application when interacting with the database.

Schema Initialization

The mongo shell is not only an interactive shell, but is also a scripting environment. Using this, scripts can be written to perform various tasks such as schema initialization and migration. Because the mongo shell is in fact built on top of a JavaScript engine, the power of JavaScript is available in the scripts, just as in the shell itself.

One difference between the interactive and the non-interactive mode of working is that the non-interactive shell does not support non-JavaScript shortcuts, such as use <db> and show collections commands. The script has to be a regular JavaScript program adhering to the proper syntax.
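For instance, where the interactive shell would use use issuetracker, a script has to switch databases with the getSiblingDB() method instead:
// Equivalent of the interactive "use issuetracker" shortcut:
db = db.getSiblingDB('issuetracker');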

Let’s create a schema initialization script called init.mongo.js within the script directory. Since MongoDB does not enforce a schema, there is really no such thing as a schema initialization as you may do in relational databases, like creation of tables. The only thing that is really useful is the creation of indexes, which are one-time tasks. While we’re at it, let’s also initialize the database with some sample documents to ease testing. We will use the same database called issuetracker that we used to try out the mongo shell, to store all the collections relevant to the Issue Tracker application.

Let’s copy the array of issues from server.js and use the same array to initialize the collection using insertMany() on a collection called issues. But before that, let’s clear any existing issues by calling remove() with an empty filter (which will match all documents) on the same collection. Then, let’s create a few indexes on useful fields that we will use to search the collection.

Listing 6-3 shows the complete contents of the initialization script, init.mongo.js. There are comments in the beginning of the file that indicate how to run this script for different types of databases—local, Atlas, and mLab.
/*
 * Run using the mongo shell. For remote databases, ensure that the
 * connection string is supplied in the command line. For example:
 * localhost:
 *   mongo issuetracker scripts/init.mongo.js
 * Atlas:
 *   mongo mongodb+srv://user:[email protected]/issuetracker
 *     scripts/init.mongo.js
 * mLab:
 *   mongo mongodb://user:[email protected]:33533/issuetracker
 *     scripts/init.mongo.js
 */
db.issues.remove({});
const issuesDB = [
  {
    id: 1, status: 'New', owner: 'Ravan', effort: 5,
    created: new Date('2019-01-15'), due: undefined,
    title: 'Error in console when clicking Add',
  },
  {
    id: 2, status: 'Assigned', owner: 'Eddie', effort: 14,
    created: new Date('2019-01-16'), due: new Date('2019-02-01'),
    title: 'Missing bottom border on panel',
  },
];
db.issues.insertMany(issuesDB);
const count = db.issues.count();
print('Inserted', count, 'issues');
db.issues.createIndex({ id: 1 }, { unique: true });
db.issues.createIndex({ status: 1 });
db.issues.createIndex({ owner: 1 });
db.issues.createIndex({ created: 1 });
Listing 6-3

init.mongo.js: Schema Initialization

You should be able to run this script using the mongo shell, with the name of the file as an argument in the command line, if you are using the local installation of MongoDB like this:
$ mongo issuetracker scripts/init.mongo.js

For the other methods of using MongoDB, there are instructions as comments on the top of the script. In essence, the entire connection string has to be specified in the command line, including the username and password that you use to connect to the hosted service. Following the connection string, you can type the name of the script, scripts/init.mongo.js.

You can run this any time you wish to reset the database to its pristine state. You should see an output that indicates that two issues were inserted, among other things such as the MongoDB version and the shell version. Note that creating an index when one already exists has no effect, so it is safe to create the index multiple times.

Exercise: Schema Initialization

  1. 1.

    The same schema initialization could have been done using a Node.js script and the MongoDB driver. What are the pros and cons of each of these methods: using the mongo shell vs. the Node.js MongoDB driver?

     
  2. 2.

    Are there any other indexes that may be useful? Hint: What if we needed a search bar in the application? Read about MongoDB index types at https://docs.mongodb.com/manual/indexes/#index-types .

     

Answers are available at the end of the chapter.

Reading from MongoDB

In the previous section, you saw how to use the Node.js driver to perform basic CRUD tasks. With this knowledge, let’s now change the List API to read from the MongoDB database rather than the in-memory array of issues in the server. Since we’ve initialized the database with the same initial set of issues, while testing, you should see the same set of issues in the UI.

In the trial that we did for the driver, we used the connection to the database in a sequence of operations and closed it. In the application, instead, we will maintain the connection so that we can reuse it for many operations, which will be triggered from within API calls. So, we’ll need to store the connection to the database in a global variable. Let’s do that in addition to the import statement and other global variable declarations and call the global database connection variable db:
...
const url = 'mongodb://localhost/issuetracker';
// Atlas URL  - replace UUU with user, PPP with password, XXX with hostname
// const url = 'mongodb+srv://UUU:[email protected]/issuetracker?retryWrites=true';
// mLab URL - replace UUU with user, PPP with password, XXX with hostname
// const url = 'mongodb://UUU:[email protected]:33533/issuetracker';
let db;
...
Next, let’s write a function to connect to the database, which initializes this global variable. This is a minor variation of what we did in trymongo.js. Let’s not catch any errors in this function; instead, let the caller deal with them.
...
async function connectToDb() {
  const client = new MongoClient(url, { useNewUrlParser: true });
  await client.connect();
  console.log('Connected to MongoDB at', url);
  db = client.db();
}
...
Now, we have to change the setup of the server to first connect to the database and then start the Express application. Since connectToDb() is an async function, we can use await to wait for it to finish, then call app.listen(). But since await cannot be used in the main section of the program, we have to enclose it within an async function and execute that function immediately.
...
(async function () {
  await connectToDb();
  app.listen(3000, function () {
    console.log('App started on port 3000');
  });
})();
...
But we also have to deal with errors. So, let’s enclose the contents of this anonymous function within a try block and print any errors on the console in the catch block:
...
(async function () {
  try {
    ...
  } catch (err) {
    console.log('ERROR:', err);
  }
})();
...
Now that we have a connection to the database set up in the global variable called db, we can use it in the List API resolver issueList() to retrieve a list of issues by calling the find() method on the issues collection. We need to return an array of issues from this function, so let’s just use the toArray() function on the results of find() like this:
...
  const issues = await db.collection('issues').find({}).toArray();
...
The changes to server.js are shown in Listing 6-4.
...
const { Kind } = require('graphql/language');
const { MongoClient } = require('mongodb');
const url = 'mongodb://localhost/issuetracker';
// Atlas URL  - replace UUU with user, PPP with password, XXX with hostname
// const url = 'mongodb+srv://UUU:[email protected]/issuetracker?retryWrites=true';
// mLab URL - replace UUU with user, PPP with password, XXX with hostname
// const url = 'mongodb://UUU:[email protected]:33533/issuetracker';
let db;
let aboutMessage = "Issue Tracker API v1.0";
...
async function issueList() {
  const issues = await db.collection('issues').find({}).toArray();
  return issues;
}
...
async function connectToDb() {
  const client = new MongoClient(url, { useNewUrlParser: true });
  await client.connect();
  console.log('Connected to MongoDB at', url);
  db = client.db();
}
const server = new ApolloServer({
...
(async function () {
  try {
    await connectToDb();
    app.listen(3000, function () {
      console.log('App started on port 3000');
    });
  } catch (err) {
    console.log('ERROR:', err);
  }
})();
Listing 6-4

server.js: Changes for Reading the Issue List from MongoDB

Note

Note that we did not have to do anything special even though the resolver issueList() is now an async function, which does not immediately return a value. The graphql-tools library handles this automatically: a resolver can return a value directly or return a Promise (which is what an async function returns immediately). Both are acceptable return values for a resolver.
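To see both resolver shapes side by side, here is a standalone sketch (the names aboutSync, issueListAsync, and the fake collection are made up for illustration and are not part of the application's code):

```javascript
// Sketch: a resolver may return a plain value or a Promise.
function aboutSync() {
  return 'Issue Tracker API v1.0'; // immediate value
}

async function issueListAsync(collection) {
  // An async function returns a Promise right away; the GraphQL
  // executor awaits it before sending the response.
  return collection.find();
}

// Stand-in for a database collection, to keep the sketch self-contained.
const fakeCollection = { find: async () => [{ id: 1 }, { id: 2 }] };

console.log(aboutSync()); // Issue Tracker API v1.0
issueListAsync(fakeCollection).then(issues => console.log(issues.length)); // 2
```

Either function could be supplied as a resolver; the executor treats the resolved value of the Promise the same as a directly returned value.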

Since the issues from the database now contain an _id in addition to the id field, let’s include that in the GraphQL schema of the type Issue. Otherwise, clients who call the API will not be able to access this field. Let’s use ID as its GraphQL data type and make it mandatory. This change is shown in Listing 6-5.
...
type Issue {
  _id: ID!
  id: Int!
  ...
}
...
Listing 6-5

schema.graphql: Changes to add _id as a Field in Issue

Now, assuming that the server is still running (or that you have restarted the server and the compilation), refreshing the browser will show the two initial sets of issues listed in a table, as before. The UI itself will show no change. But to convince yourself that the data is indeed coming from the database, you could modify the documents in the collection using the mongo shell and the updateMany() method on the collection. If, for example, you update effort to 100 for all the documents and refresh the browser, you should see that the effort column indeed shows 100 for all the rows in the table.

Exercise: Reading from MongoDB

  1.

    We are saving the connection in a global variable. What happens when the connection is lost? Stop the MongoDB server and start it again to see what happens. Does the connection still work?

     
  2.

    Shut down the MongoDB server, wait for a minute or more, and then start the server again. Now, refresh the browser. What happens? Can you explain this? What if you wanted a longer period for the connection to work even if the database server is down? Hint: Look up the connection settings parameters at http://mongodb.github.io/node-mongodb-native/3.1/reference/connecting/connection-settings/ .

     
  3.

    We used toArray() to convert the list of issues into an array. What if the list is too big, say, a million documents? How would you deal with this? Hint: Look up the documentation for the MongoDB Node.js driver's Cursor at http://mongodb.github.io/node-mongodb-native/3.1/api/Cursor.html . Note that the find() method returns a Cursor.

     

Writing to MongoDB

In order to completely replace the in-memory database on the server, we’ll also need to change the Create API to use the MongoDB database. As you saw in the MongoDB CRUD Operations section, the way to create a new document is to use the insertOne() method on the collection.

We used the size of the in-memory array to generate the new document’s id field. We could do something similar here, using the count() method of the collection to get the next ID. But when multiple users are using the application, there is a small chance that a new document is created between the time we call the count() method and the time we call the insertOne() method. What we really need is a reliable way of generating a sequence of numbers that cannot give us duplicates, much like sequences in popular relational databases.
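The race can be seen even without a database. In this standalone sketch (an in-memory array and made-up names, purely for illustration), two concurrent requests both read the same length before either insert completes, so both compute the same ID:

```javascript
const issues = [];

async function addIssueRacy(title) {
  const id = issues.length + 1;              // both callers can read the same length
  await new Promise(r => setTimeout(r, 10)); // simulate database round-trip latency
  issues.push({ id, title });
}

// Two "simultaneous" requests: both compute id = 1 before either insert lands.
Promise.all([addIssueRacy('first'), addIssueRacy('second')])
  .then(() => console.log(issues.map(i => i.id))); // [ 1, 1 ] — duplicate IDs
```

An atomic increment on the database side avoids this window entirely, which is what the next section sets up.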

MongoDB does not provide such a method directly. But it does support an atomic update operation, which can return the result of the update. This method is called findOneAndUpdate(). Using this method, we can update a counter and return the updated value, but instead of using the $set operator, we can use the $inc operator, which increments the current value.

Let’s first create a collection with the counter that holds a value for the latest Issue ID generated. To make it a bit generic, let’s assume we may have other such counters, and use a collection with an ID set to the name of the counter and a value field called current holding the current value of the counter. In the future, we could add more counters to the same collection, and each counter would translate to one document.

To start, let’s modify the schema initialization script to include a collection called counters and populate that with one document for the counter for issues. Since there are insertions that create a few sample issues, we’ll need to initialize the counter’s value to the count of inserted documents. The changes are in init.mongo.js, and Listing 6-6 shows this file.
...
print('Inserted', count, 'issues');
db.counters.remove({ _id: 'issues' });
db.counters.insert({ _id: 'issues', current: count });
...
Listing 6-6

init.mongo.js: Initialize Counters for Issues

Let’s run the schema initialization script again to make this change take effect:
$ mongo issuetracker scripts/init.mongo.js

Now, a call to findOneAndUpdate() that increments the current field is guaranteed to return a unique value that is next in the sequence. Let’s create a function in server.js that does this, but in a generic manner: it will take the ID of the counter and return the next sequence number. In this function, all we have to do is call findOneAndUpdate(). It identifies the counter to use via the supplied ID, increments the field called current, and returns the new value. By default, findOneAndUpdate() returns the original document. To make it return the new, modified document instead, the option returnOriginal has to be set to false.

The arguments to the method findOneAndUpdate() are (a) the filter or match, for which we used _id, then (b) the update operation, for which we used a $inc operator with value 1, and finally, (c) the options for the operation. Here’s the code that will do the needful:
...
async function getNextSequence(name) {
  const result = await db.collection('counters').findOneAndUpdate(
    { _id: name },
    { $inc: { current: 1 } },
    { returnOriginal: false },
  );
  return result.value.current;
}
...

Note

The option for returning the current or new value is called differently in the Node.js driver and in the mongo shell. In the mongo shell, the option is called returnNewDocument and the default is false. In the Node.js driver, the option is called returnOriginal and the default is true. In both cases, the default behavior is to return the original, so the option must be specified to return the new document.
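The effect of the option can be sketched with an in-memory stand-in (findOneAndUpdateMock is made up; it only mimics the option’s behavior, not the real driver):

```javascript
// Mimics how returnOriginal affects what findOneAndUpdate hands back.
function findOneAndUpdateMock(doc, inc, { returnOriginal = true } = {}) {
  const original = { ...doc }; // snapshot before the update
  doc.current += inc;          // apply the $inc-style update
  return { value: returnOriginal ? original : doc };
}

const counter = { _id: 'issues', current: 2 };
console.log(findOneAndUpdateMock(counter, 1).value.current); // 2 (the original)
console.log(findOneAndUpdateMock(counter, 1,
  { returnOriginal: false }).value.current);                 // 4 (the updated value)
```

With the default, the caller would see the pre-increment value, which is why getNextSequence() must pass returnOriginal: false.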

Now, we can use this function to generate a new ID field and set it in the supplied issue object in the resolver issueAdd(). We can then write to the collection called issues using insertOne(), and then read back the newly created issue using findOne().
...
  issue.id = await getNextSequence('issues');
  const result = await db.collection('issues').insertOne(issue);
  const savedIssue = await db.collection('issues')
    .findOne({ _id: result.insertedId });
  return savedIssue;
...
Finally, we can get rid of the in-memory array of issues in the server. Including this change, the complete set of changes in server.js is represented in Listing 6-7.
...
// The in-memory array issuesDB and its initial sample issues are deleted:
// const issuesDB = [
//   {
//     id: 1, status: 'New', owner: 'Ravan', effort: 5,
//     ...
//   },
//   ...
// ];
...
async function getNextSequence(name) {
  const result = await db.collection('counters').findOneAndUpdate(
    { _id: name },
    { $inc: { current: 1 } },
    { returnOriginal: false },
  );
  return result.value.current;
}
async function issueAdd(_, { issue }) {
  const errors = [];
  ...
  issue.created = new Date();
  issue.id = await getNextSequence('issues');
  const result = await db.collection('issues').insertOne(issue);
  const savedIssue = await db.collection('issues')
    .findOne({ _id: result.insertedId });
  return savedIssue;
...
Listing 6-7

server.js: Changes for Create API to Use the Database

Testing this set of changes will show that new issues can be added, and even on a restart of the Node.js server, or the database server, the newly added issues are still there. As a cross-check, you could use the mongo shell to look at the contents of the collection after every change from the UI.

Exercise: Writing to MongoDB

  1.

    Could we have just added the _id to the passed-in object and returned that instead of doing a find() for the inserted object?

     

Answers are available at the end of the chapter.

Summary

In this chapter, you learned about the installation and other ways of getting access to an instance of a database in MongoDB. You saw how to use the mongo shell and the Node.js driver to access the basic operations in MongoDB: the CRUD operations. We then modified the Issue Tracker application to use some of these methods to read and write to the MongoDB database, thus making the issue list persistent.

I covered only the very basics of MongoDB, only the capabilities and features that will be useful to build the Issue Tracker application, which is a rather simple CRUD application. In reality, the capabilities of the database as well as the Node.js driver and the mongo shell are vast, and many more features of MongoDB may be required for a complex application. I encourage you to take a look at the MongoDB documentation ( https://docs.mongodb.com/manual/ ) and the Node.js driver documentation ( http://mongodb.github.io/node-mongodb-native/ ) to familiarize yourself with what else the database and the Node.js drivers are capable of.

Now that we have used the essentials of the MERN stack and have a working application, let’s take a break from implementing features and get a bit organized instead. Before the application gets any bigger and becomes unwieldy, let’s modularize the code and use tools to improve our productivity.

We’ll do this in the next chapter, by using Webpack, one of the best tools that can be used to modularize both the front-end and the back-end code.

Answers to Exercises

Exercise: MongoDB Basics

  1.

As per the mongo shell documentation, under "Access the mongo shell Help", you will find that there is a method called help() on many objects, including the cursor object. The way to get help on the cursor is db.collection.find().help().

    But since this is also a JavaScript shell like Node.js, pressing Tab will auto-complete and a double-Tab will show a list of possible completions. Thus, if you assign a cursor to a variable and press Tab twice after typing the variable name and a dot after that, the shell will list the possible completions, and that is a list of methods available on the cursor.

     

Exercise: MongoDB CRUD Operations

  1.

    This can be done using the $exists operator like this:

    > db.employees.find({ "name.middle": { $exists: true } })

     
  2.

    The filter specification is not a JSON document, because it is not a string. It is a regular JavaScript object, which is why you are able to skip the quotes around the property names. You will also be able to have real Date objects as field values, unlike a JSON string.

     
  3.
    The $unset operator in an update can be used to unset a field (which is actually different from setting it to null). Here is an example:
    > db.employees.update({ _id: ObjectId("57b1caea3475bb1784747ccb") },
        { $unset: { "name.middle": null } })

    Although we supplied null as the value for $unset, this value is ignored. It can be anything.

     
  4.

    The 1 indicates an ascending sort order for traversing the index. -1 is used to indicate a descending sort order. This is useful only for compound (aka composite) indexes, because a simple index on one field can be used to traverse the collection in both directions.

     

Exercise: Schema Initialization

  1.

    The advantage of using the Node.js driver is that there is one way of doing things across the application and the scripts, and the familiarity will help prevent errors. But running the program requires a proper Node.js environment, including npm modules installed, whereas the mongo shell script can be run from anywhere, provided the machine has the mongo shell installed.

     
  2.

    A search bar is quite helpful when searching for issues. A text index (an index based on the words) on the title field would be useful in this case. We’ll implement a text index toward the end of the book.

     

Exercise: Reading from MongoDB

  1.

    The connection object is in fact a connection pool. It automatically determines the best thing to do: reuse an existing TCP connection, reestablish a new connection when the connection is broken, etc. Using a global variable (at least, reusing the connection object) is the recommended usage.

     
  2.

    If the database is unavailable for a short period (less than 30 seconds), the driver retries and reconnects when the database is available again. If the database is unavailable for a longer period, the read throws an error. The driver is also unable to reestablish a connection when the database is restored. The application server needs to be restarted in this case.

    The default interval of 30 seconds can be changed using the connection settings reconnectTries or reconnectInterval.

     
  3.

    One option is to use limit() on the result to limit the return value to a maximum number of records. For example, find().limit(100) returns the first 100 documents. If you were to paginate the output in the UI, you could also use the skip() method to specify where to start the list.

    If, on the other hand, you think the client can handle large lists but you don’t want to expend that much memory in the server, you could deal with one document at a time using hasNext() and next() and stream the results back to the client.
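As a standalone sketch (a plain array stands in for a real Cursor, and page() is a made-up helper), skip/limit pagination amounts to the following slice arithmetic:

```javascript
// The real driver's find().skip(n).limit(m) selects documents the way
// this slice does, but server-side, without loading the whole collection.
function page(docs, pageNumber, pageSize) {
  const skip = (pageNumber - 1) * pageSize; // skip() equivalent
  return docs.slice(skip, skip + pageSize); // limit() equivalent
}

const docs = Array.from({ length: 10 }, (_, i) => ({ id: i + 1 }));
console.log(page(docs, 2, 3).map(d => d.id)); // [ 4, 5, 6 ]
```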

     

Exercise: Writing to MongoDB

  1.

    Adding the _id and returning the object passed in would have worked, so long as you know for a fact that the write was a success and the object was written to the database as is. In most cases, this would be true, but it’s good practice to get the results from the database, as that is the ultimate truth.

     