Chapter 18. Data Administration

This chapter covers how to administer your collections and databases. Generally, the things covered in this section are not daily tasks, but they can be critically important for your application’s performance, for instance:

  • Setting up authentication and user accounts

  • Creating indexes on a running system

  • “Preheating” a new server to allow it to come online quickly

  • Defragmenting data files

  • Preallocating new data files manually

Setting Up Authentication

One of the first priorities for systems administrators is to ensure their system is secure. The best way to handle security with MongoDB is to run it in a trusted environment, ensuring that only trusted machines are able to connect to the server. That said, MongoDB supports per-connection authentication, albeit with a fairly coarse-grained permissions scheme.

Note

There are more sophisticated security features in MongoDB Enterprise. See http://docs.mongodb.org/manual/security for the most up-to-date information about authentication and authorization.

Authentication Basics

Each database in a MongoDB instance can have any number of users. When security is enabled, only authenticated users of a database are able to perform read or write operations.

There are two special databases: users in the admin and local databases can perform operations on any database. A user that belongs to either one of these databases can be thought of as a superuser. After authenticating, admin users are able to read or write from any database and are able to perform certain admin-only commands, such as listDatabases or shutdown.

Before starting the database with security turned on, it’s important that at least one admin user has been added. Let’s run through a quick example, starting from a shell connected to a server without authentication turned on:

> use admin
switched to db admin
> db.addUser("root", "abcd");
{
    "user" : "root",
    "readOnly" : false,
    "pwd" : "1a0f1c3c3aa1d592f490a2addc559383"
}
> use test
switched to db test
> db.addUser("test_user", "efgh");
{
    "user" : "test_user",
    "readOnly" : false,
    "pwd" : "6076b96fc3fe6002c810268702646eec"
}
> db.addUser("read_user", "ijkl", true);
{
    "user" : "read_user",
    "readOnly" : true,
    "pwd" : "f497e180c9dc0655292fee5893c162f1"
}

Here we’ve added an admin user, root, and two users on the test database. One of those users, "read_user", has read-only permissions and cannot write to the database. From the shell, a read-only user is created by passing true as the third argument to addUser. To call addUser, you must have write permissions for the database in question; in this case we can call addUser on any database because we have not enabled security yet.

Note

The addUser method is useful for more than just adding new users: it can be used to change a user’s password or read-only status. Just call addUser with the username and a new password or read-only setting for the user.
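For example, assuming the new password below is just a placeholder, you could change test_user’s password or make read_user writable by calling addUser again:

> db.addUser("test_user", "newPassword");
> db.addUser("read_user", "ijkl", false);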

Now let’s restart the server, this time adding the --auth command-line option to enable security.
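A minimal sketch of such a restart (the dbpath shown here is an assumption; use whatever options you normally start the server with):

$ mongod --auth --dbpath /data/db

After enabling security, we can reconnect from the shell and try it: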

> use test
switched to db test
> db.test.find();
error: { "$err" : "unauthorized for db [test] lock type: -1 " }
> db.auth("read_user", "ijkl");
1
> db.test.find();
{ "_id" : ObjectId("4bb007f53e8424663ea6848a"), "x" : 1 }
> db.test.insert({"x" : 2});
unauthorized
> db.auth("test_user", "efgh");
1
> db.test.insert({"x": 2});
> db.test.find();
{ "_id" : ObjectId("4bb007f53e8424663ea6848a"), "x" : 1 }
{ "_id" : ObjectId("4bb0088cbe17157d7b9cac07"), "x" : 2 }
> show dbs
assert: assert failed : listDatabases failed:{
    "assertion" : "unauthorized for db [admin] lock type: 1
",
    "errmsg" : "db assertion failure",
    "ok" : 0
}
> use admin
switched to db admin
> db.auth("root", "abcd");
1
> show dbs
admin
local
test

When we first connect, we are unable to perform any operations (read or write) on the test database. After authenticating as the read_user user, however, we are able to perform a simple find. When we try to insert data, we are again met with a failure because of the lack of authorization. test_user, which was not created as read-only, is able to insert data normally. As a non-admin user, though, test_user is not able to list all the available databases using the show dbs helper. The final step is to authenticate as an admin user, root, who is able to perform operations on any database.

Setting Up Authentication

If authentication is enabled, clients must be logged in to perform any reads or writes. However, there is one oddity in MongoDB’s authentication scheme: before you create a user in the admin database, clients that are “local” to the server can perform reads and writes on the database.

Generally, this is not an issue: create your admin user and use authentication normally. The only exception is sharding. With sharding, the admin database is kept on the config servers, so shard mongods have no idea it even exists. Therefore, as far as they know, they are running with authentication enabled but no admin user. Thus, shards will allow a local client to read and write from them without authentication.

Hopefully this wouldn’t be an issue: optimally your network will be configured so that only mongos processes are accessible to clients. However, if you are worried about clients running locally on shards and connecting directly to them instead of going through the mongos, you may wish to add admin users to your shards.

Note that you do not want the sharded cluster to know about these admin users: it already has an admin database. The admin databases you’re creating on the shards are for your use only. To do this, connect to the primary of each shard and run the addUser() function:

> db.addUser("someUser", "theirPassword")

Make sure that the replica sets you create users on are already shards in the cluster. If you create an admin user and then try to add the mongods as a shard, the addShard command will not work (because the cluster already contains an admin database).

How Authentication Works

Users of a given database are stored as documents in its system.users collection. The structure of a user document is {"user" : username, "readOnly" : true, "pwd" : password hash}. The password hash is a hash based on the username and password chosen.

Knowing where and how user information is stored makes performing some common administration tasks trivial. For example, to remove a user, simply remove the user document from the system.users collection:

> db.auth("test_user", "efgh");
1
> db.system.users.remove({"user" : "test_user"});
> db.auth("test_user", "efgh");
0

When a user authenticates, the server keeps track of that authentication by tying it to the connection used for the authenticate command. This means that if a driver or tool is employing connection pooling or fails over to another node, authenticated users will need to reauthenticate on new connections. This should be handled invisibly by the driver.

Creating and Deleting Indexes

Chapter 5 covered what commands to run to create an index, but it didn’t go into the operational aspects of doing so. Creating an index is one of the most resource-intensive operations you can do on a database, so schedule index creations carefully.

Building an index requires MongoDB to find the indexed field (or lack thereof) in every document in the collection, then sort all the values found. As you might imagine, this becomes a very expensive task as your collection gets bigger. Thus, indexing should be done in a way that affects your production server as little as possible.

Creating an Index on a Standalone Server

On a standalone server, build the index in the background during an off-time. There isn’t much else you can do to minimize impact. To build an index in the background, run ensureIndex with the "background" : true option:

> db.foo.ensureIndex({"someField" : 1}, {"background" : true})

Any type of index can be built in the background.

A foreground index build takes less time than a background index build, but it locks the database for the duration of the process. Thus, no other operations can read or write to the database while a foreground index is building. Background indexes yield the write lock regularly to allow other operations to proceed. This means that they can take longer (much longer on write-heavy servers), but the server can serve other clients while building an index.

Creating an Index on a Replica Set

The easiest way to create an index on a replica set is to create it on the primary and wait for it to be replicated to the secondaries. On small collections, this should have minimal impact.

If you have a large collection, this can lead to the situation where all of your secondaries start building the index at the same time. Suddenly all of your secondaries will be unavailable for client reads and may fall behind in replication. Thus, for larger collections, this is the preferred technique:

  1. Shut down a secondary.

  2. Start it up as a standalone node, as described in Chapter 6.

  3. Build the index on that server.

  4. Reintroduce the member into the replica set.

  5. Repeat for each secondary.

When you have finished this process, only the primary should be left without an index. Now there are two options: you can either build the index on the primary in the background (which will put more strain on the primary) or you can step down the primary and then follow steps 1 through 4 to build the index on it as you did with the other members of the set. This involves a failover, which may be more or less preferable than adding load to the primary.

You can also use this isolate-and-build technique to build an index on a member of the set that isn’t configured to build indexes (one that has the "buildIndexes" : false option set): start it as a standalone member, build the index, and add it back into the set.
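As a rough sketch of steps 2 and 3 above (the port, data path, database, and index key are all assumptions): start the member without its --replSet option so it comes up as a standalone, then connect to it and build the index:

$ mongod --port 30000 --dbpath /data/db/rs_member_1

$ mongo --port 30000
> use myDB
switched to db myDB
> db.foo.ensureIndex({"someField" : 1})

When the build finishes, shut this mongod down and restart it with its usual replica set options to reintroduce it to the set.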

If you cannot use the rotation method for whatever reason, plan to build new indexes during an off time (at night, a holiday, a weekend, etc.).

Creating an Index on a Sharded Cluster

To create indexes on a sharded cluster, we want to follow the same procedure as described for replica sets and build the index on one shard at a time.

First, turn off the balancer. Then follow the procedure outlined in the previous section for each shard, treating it as an individual replica set. Finally, run ensureIndex through the mongos and turn the balancer back on again.
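For example, from a shell connected to a mongos, the balancer can be toggled with the sh helpers (the collection and key below are placeholders):

> sh.setBalancerState(false)    // stop the balancer before starting
// ...build the index on each shard, as described in the previous section...
> db.foo.ensureIndex({"someField" : 1})    // finally, run it through mongos
> sh.setBalancerState(true)     // turn the balancer back on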

This procedure is only required for adding an index to existing shards: new shards will pick up on the index when they start receiving chunks from a collection.

Removing Indexes

If an index is no longer necessary, you can remove it with the dropIndexes command and the index name. Query the system.indexes collection to figure out what the index name is, as even the autogenerated names vary somewhat from driver to driver:

> db.runCommand({"dropIndexes" : "foo", "index" : "alphabet"})
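If you are not sure what an index is named, a query like this against system.indexes will show it (the namespace here is just an example):

> db.system.indexes.find({"ns" : "test.foo"}, {"name" : 1, "key" : 1})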

You can drop all indexes on a collection by passing in "*" as the value for the "index" key:

> db.runCommand({"dropIndexes" : "foo", "index" : "*"})

This leaves the "_id" index. The only way to get rid of that one is to drop the whole collection. Removing all the documents in a collection (with remove) does not affect the indexes; they will be repopulated when new documents are inserted.

Beware of the OOM Killer

The Linux out-of-memory killer will kill processes that are using a lot of memory. Because of the way MongoDB uses memory, it is not usually an issue, but index creations are the one time it can be. If you are building an index and your mongod suddenly disappears, check /var/log/messages for notices from the OOM killer. Running a background index build or adding some swap space can prevent this. If you have administrative permissions on the machine, you may want to make MongoDB unkillable.

See The OOM Killer for more information.

Preheating Data

When you restart a machine or bring up a new server, it can take a while for MongoDB to get all the right data from disk into memory. If your performance constraints require that data be in RAM, it can be disastrous to bring a new server online and then let your application hammer it while it gradually pages in the data it needs.

There are a couple of ways to get your data into RAM before officially bringing the server online, to prevent it from messing up your application.

Note

Restarting MongoDB does not change what’s in RAM. RAM is managed by the OS, and the OS won’t evict data from memory until the space is needed for something else. Thus, if the mongod process needs to be restarted, it should not affect what data is in memory. (However, mongod will report low resident memory until it has a chance to ask the OS for the data it needs.)

Moving Databases into RAM

If you need a database in RAM, you can use the UNIX dd tool to load it before starting the mongod:

$ for file in /data/db/brains.*
> do
> dd if=$file of=/dev/null
> done

Replace brains with the name of the database you want to load.

Replacing /data/db/brains.* with /data/db/* will load the whole data directory (all databases) into RAM (assuming there’s enough room for all of them). If you load a database or set of databases into memory and it takes up more memory than you have, some of its data will fall back out of memory immediately. In this situation, you may want to use one of the techniques outlined in the next section to move more specific parts of your data into memory.

When you start the mongod, it will ask the operating system for the data files, and the operating system, knowing that the data files are already in memory, will be able to access them quickly.

However, this technique is only helpful if your entire database fits in memory. If it does not, you can do more fine-grained preheating using the following techniques.

Moving Collections into RAM

MongoDB has a command for preheating data called touch. Start mongod (perhaps on a different port or firewalled off from your application) and touch a collection to load it into memory:

> db.runCommand({"touch" : "logs", "data" : true, "index" : true})

This will load all the documents and indexes into memory for the logs collection. You can specify to only load documents or only load indexes. Once touch completes, you can allow your application to access MongoDB.
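For instance, if you only need the indexes warm, you could touch just those (a sketch using the same collection name as above):

> db.runCommand({"touch" : "logs", "data" : false, "index" : true})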

However, an entire collection (even just the indexes) can still be too much data. For example, your application might only require one index to be in memory, or only a small fraction of the documents. In that case, you’ll have to custom preheat the data.

Custom-Preheating

When you have more complex preheating needs, you may have to roll your own preheating scripts. Here are some common preheating requirements and how to deal with them:

Load a specific index

Let’s say we have an index such as {"friends" : 1, "date" : 1} that must be in RAM. We can load this index into memory by doing a covered query (see Using covered indexes):

> db.users.find({}, {"_id" : 0, "friends" : 1, "date" : 1}).
... hint({"friends" : 1, "date" : 1}).explain()

The explain forces mongod to iterate through all of the results for you. You must specify that you only want to return the indexed fields (the second argument to find) or the query will end up loading all the documents into memory, too (which you may want, but it’s good to be aware of). Note that this will always load the index and documents into memory for indexes that cannot be covered, such as multikey indexes.

Load recently updated documents

If you have an index on a date field that you update when you update the document, you can query on recent dates to load recent documents.

If you don’t have an index on the date field, this query will end up loading all documents in the collection into memory, so don’t bother. If you don’t have a date field, you might be able to use the "_id" field if you’re mostly concerned with recent inserts (see below).
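A sketch of such a query, assuming the collection has an indexed lastModified date field (the collection and field names are assumptions):

> weekAgo = new Date(new Date().getTime() - 7*24*60*60*1000)
> db.logs.find({"lastModified" : {"$gt" : weekAgo}}).explain()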

Load recently created documents

If you are using ObjectIds for your "_id" field, you can use the fact that recently created documents contain a timestamp to query for them. For example, suppose we wanted to find all documents created in the last week. We could create an "_id" that was less than every ObjectId created in the last week:

> lastWeek = (new Date(year, month, day)).getTime()/1000
1348113600

Replace year, month, and day appropriately; this gives you the date in seconds. Now we need to get an ObjectId from this time. First, convert it into a hex string, then append 16 zeros to it:

> hexSecs = lastWeek.toString(16)
505a94c0
> minId = ObjectId(hexSecs+"0000000000000000")
ObjectId("505a94c00000000000000000")

Now we just have to query for it:

> db.logs.find({"_id" : {"$gt" : minId}}).explain()

This will load all the docs (and some right-hand branches of the "_id" index) from the last week.

Replay application usage

MongoDB has a facility for recording and replaying traffic called the diaglog. Enabling the diaglog incurs a performance penalty, so it is best to use it temporarily to gather a “representative” slice of traffic. To gather traffic, run the following command in the mongo shell:

> db.adminCommand({"diagLogging" : 2})

The 2 option means “capture reads.” The 1 option will capture writes and the 3 option captures both (0 is the default: off). You probably don’t want to capture writes because you don’t want extra writes applied to your new member when you replay the diaglog.

Now let the diaglog record operations by letting mongod run for however long you want while sending it traffic. Reads will be stored in the diaglog file in the data directory. Reset diagLogging to 0 when you’re done:

> db.adminCommand({"diagLogging" : 0})

To use your diaglog files, start up your new server and, from the server where the diaglog files live, run:

$ nc hostname 27017 < /data/db/diaglog* | hexdump -c

Replace the hostname, port, and data directory, if necessary. This sends the recorded operations to hostname:27017 as a series of normal queries.

Note that the diaglog will capture the command turning on the diaglog, so you’ll have to log into the server and turn it off when you’re done replaying the diaglog (you also might want to delete the diaglog files it generates from the replay).

These techniques can be combined: you could load a couple of indexes while replaying the diaglog, for example. You can also run them all at the same time if you aren’t bottlenecked on disk IO, either through multiple shells or the startParallelShell command (if the shell is local to the mongod):

> p1 = startParallelShell("db.foo.find({}, {x:1}).hint({x:1}).explain()", port)
> p2 = startParallelShell("db.foo.find({}, {y:1}).hint({y:1}).explain()", port)
> p3 = startParallelShell("db.foo.find({}, {z:1}).hint({z:1}).explain()", port)

Replace port with the port on which mongod is running.

Compacting Data

MongoDB uses a lot of disk space. Sometimes, if you have deleted or updated a lot of data, you’ll end up with collection fragmentation. Fragmentation occurs when your data files have a lot of empty space that MongoDB can’t reuse because the individual chunks of free space are too small. In this case, you’ll see messages like this in the log:

Fri Oct  7 06:15:03 [conn2] info DFM::findAll(): extent 0:3000 was empty, 
    skipping ahead. ns:bar.foo

This message is, in and of itself, harmless. However, it means that an entire extent had no documents in it. To get rid of empty extents and repack collections efficiently, use the compact command:

> db.runCommand({"compact" : "collName"})

Compaction is very resource-intensive: you should not plan to do a compaction on a mongod that’s serving clients. Thus, the recommended procedure is similar to that of building indexes: compact data on each of the secondaries, then step down the primary and run the final compaction on it.
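Stepping down the primary can be done from the shell; the number of seconds it should stay out of the election afterward is up to you (the value here is arbitrary):

> rs.stepDown(300)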

When you run a compact on a secondary it will drop into recovering state, which means that it will return errors if sent read requests and it cannot be used as a sync source. When the compaction is finished, it will return to secondary state.

Compaction fits documents as closely as it can, as though the padding factor was 1. If you need a higher padding factor for the collection, you can specify it as an argument to compact:

> db.runCommand({"compact" : "collName", "paddingFactor" : 1.5})

You can specify a padding factor between 1 and 4. This does not permanently affect the padding factor, just how MongoDB rearranges documents. Once the compaction is finished, the padding factor will go back to whatever it was originally.

Compacting does not decrease the amount of disk space a collection uses: it just puts all of the documents at the “beginning” of a collection, on the assumption that the collection will expand again to use the available space. Thus, compaction is only a brief reprieve if you are running out of disk space: it will not decrease the amount of disk space MongoDB is using, although it may make MongoDB not need to allocate new space for longer.

You can reclaim disk space by running a repair. Repair makes a full copy of the data, so you must have free space equal to the size of your current data files. This is often annoying, as the most common reason to do a repair is that the machine is running low on disk space. However, if you can mount another disk, you can specify a repair path, which is the directory (your newly mounted drive) that repair should use for the new copy of the data.

Since repair makes a totally clean copy of your data, you can interrupt it at any time with no effect on your original data set. If you run into a problem in the middle of a repair, you can delete the temporary repair files without affecting your actual data files.

To repair, start mongod with the --repair option (and --repairpath, if desired).
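For example, assuming a freshly mounted drive at /media/external (both paths here are assumptions):

$ mongod --dbpath /data/db --repair --repairpath /media/external/mongo_repair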

You can run repair on a single database in the shell by calling db.repairDatabase().

Moving Collections

You can rename a collection using the renameCollection command. This cannot move collections between databases, but it can change the collection’s name. This operation is almost instantaneous, regardless of the size of the collection being renamed. On busy systems, it can take a few nerve-wracking seconds, but it can be performed in production with no performance penalty:

> db.sourceColl.renameCollection("newName")

You can optionally pass a second argument: what to do with the newName collection if it already exists. The options are true (drop it) or false (the default: error out if it exists).
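For example, to drop any existing collection named newName as part of the rename:

> db.sourceColl.renameCollection("newName", true)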

To move a collection to another database, you must either dump/restore it or manually copy the documents (do a find and iterate over the results, inserting them into the new database).
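A minimal sketch of the manual copy, assuming the target database is named otherDB and has room for the data:

> db.sourceColl.find().forEach(function(doc) {
...     db.getSiblingDB("otherDB").sourceColl.insert(doc);
... })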

You can move a collection to a different mongod using the cloneCollection command:

> db.runCommand({"cloneCollection" : "collName", "from" : "hostname:27017"})

You cannot use cloneCollection to move a collection within a mongod: it can only move collections between servers.

Preallocating Data Files

If you know that your mongod will need certain data files, you can run the following script to preallocate them before your application goes online. This can be especially helpful for the oplog (which you know will be a certain size) and any databases that you know will be a given size, at least for a while:

#!/bin/bash

# Make sure a db name and file count were passed in
if test $# -lt 2 || test $# -gt 3
then
    echo "usage: $0 <db> <number-of-files>"
    exit 1
fi

db=$1
num=$2

# {0..$num} does not expand a variable in bash, so use seq instead
for i in $(seq 0 $num)
do
    echo "Preallocating $db.$i"
    # Fill each data file (just under 2GB) with zeros
    head -c 2146435072 /dev/zero > $db.$i
done

Store this in a file (say, preallocate), and make the file executable. Go to your data directory and allocate the files that you need:

$ # create test.0-test.100
$ preallocate test 100
$
$ # create local.0-local.4
$ preallocate local 4

Once you start the database and it accesses the datafiles for the first time, you cannot delete any of the data files. For example, if you allocated data files test.0 through test.100 and then start the database and realize that you only need test.0 through test.20, you should not delete test.21-test.100. Once MongoDB is aware of them, it will be unhappy if they go away.
