Chapter 2. Data Modeling with MongoDB

Data modeling is an essential step in designing an application, because it defines the requirements that the database must satisfy. Those requirements come directly from the understanding of the data that you acquire during the modeling process.

As described previously, this process, regardless of the chosen data model, is commonly divided into two phases: one that stays very close to the user's view, and another that translates this view into a conceptual schema. In relational database modeling, the main challenge is to build a robust database from these two phases, one that can be updated without any impact throughout the application's lifecycle.

A big advantage of NoSQL databases over relational ones is that they are more flexible at this point: a schemaless model can, in theory, reduce the impact on the user's view when the data model has to change.

Despite the flexibility NoSQL offers, it is important to know in advance how we will use the data in order to model a NoSQL database. It is not a good idea to leave the format of the persisted data unplanned, even in a NoSQL database. Moreover, at first sight, this is the point where database administrators who are used to the relational world become most uncomfortable.

Relational database standards, such as SQL, brought us a sense of security and stability by setting up rules, norms, and criteria. On the other hand, we will dare to state that this security distanced database designers from the domain from which the data to be stored is drawn.

The same thing happened with application developers. There is a notable divergence of interests between them and database administrators, especially regarding data models.

In practice, NoSQL databases require database professionals to move closer to the applications, and developers to move closer to the databases.

For that reason, even if you are a data modeler/designer or a database administrator, don't be scared if from now on we address subjects that are outside your comfort zone. Be prepared to start using words that are common from the application developer's point of view, and add them to your vocabulary. This chapter presents the MongoDB data model along with the main concepts and structures available for developing and maintaining this model.

This chapter will cover the following:

  • Introducing your documents and collections
  • The document's characteristics and structure
  • Showing the document's design and patterns

Introducing documents and collections

The basic unit of data in MongoDB is the document. Documents in MongoDB are represented in JavaScript Object Notation (JSON).

Collections are groups of documents. By analogy, a collection is similar to a table in the relational model, and a document is similar to a record in that table. Finally, collections belong to a database in MongoDB.

The documents are serialized on disk in a format known as Binary JSON (BSON), a binary representation of a JSON document.

An example of a document is:

{
   "_id": 123456,
   "firstName": "John",
   "lastName": "Clay",
   "age": 25,
   "address": {
      "streetAddress": "131 GEN. Almério de Moura Street",
      "city": "Rio de Janeiro",
      "state": "RJ",
      "postalCode": "20921060"
   },
   "phoneNumber":[
      {
         "type": "home",
         "number": "+5521 2222-3333"
      },
      {
         "type": "mobile",
         "number": "+5521 9888-7777"
      }
   ]
}

Unlike the relational model, where you must declare a table structure, a collection doesn't enforce a certain structure for a document. It is possible that a collection contains documents with completely different structures.

We can have, for instance, in the same users collection:

{
   "_id": "123456",
   "username": "johnclay",
   "age": 25,
   "friends":[
      {"username": "joelsant"},
      {"username": "adilsonbat"}
   ],
   "active": true,
   "gender": "male"
}

We can also have:

{
   "_id": "654321",
   "username": "santymonty",
   "age": 25,
   "active": true,
   "gender": "male",
   "eyeColor": "brown"
}
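To make this flexibility concrete, here is a minimal Python sketch that parses the two documents above with the standard json module and compares their field names. No MongoDB server is involved, and the variable names are only illustrative.

```python
import json

# The two documents from the users collection shown above.
john = json.loads("""
{
   "_id": "123456",
   "username": "johnclay",
   "age": 25,
   "friends": [
      {"username": "joelsant"},
      {"username": "adilsonbat"}
   ],
   "active": true,
   "gender": "male"
}
""")

santy = json.loads("""
{
   "_id": "654321",
   "username": "santymonty",
   "age": 25,
   "active": true,
   "gender": "male",
   "eyeColor": "brown"
}
""")

# Fields present in both documents, and fields unique to each one.
shared_fields = set(john) & set(santy)
only_in_john = set(john) - set(santy)
only_in_santy = set(santy) - set(john)

print(sorted(shared_fields))
print(sorted(only_in_john))   # ['friends']
print(sorted(only_in_santy))  # ['eyeColor']
```

Both documents could live in the same collection even though one has a friends field and the other has an eyeColor field.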

In addition, another interesting feature of MongoDB is that documents represent more than just data. Basically, all user interactions with MongoDB are made through documents. Besides storing data, documents are a means to:

  • Define what data can be read, written, and/or updated in queries
  • Define which fields will be updated
  • Create indexes
  • Configure replication
  • Query the information from the database
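The idea that queries, updates, and index definitions are themselves documents can be sketched in plain Python. The dictionaries below mirror the shape of such documents; the matches and apply_set helpers are hypothetical toys written for this illustration, and they are not how MongoDB evaluates queries internally.

```python
# A stored document, a query filter, an update, and an index definition
# are all expressed as plain documents.
user = {"_id": "123456", "username": "johnclay", "age": 25, "active": True}

query_filter = {"username": "johnclay"}   # which documents to select
update_spec = {"$set": {"age": 26}}       # which fields to change
index_spec = {"username": 1}              # index on username, ascending


def matches(document, query):
    """Toy matcher: True when every field in the query equals the document's value."""
    return all(document.get(field) == value for field, value in query.items())


def apply_set(document, update):
    """Toy update: return a copy of the document with the $set fields applied."""
    updated = dict(document)
    updated.update(update.get("$set", {}))
    return updated


if matches(user, query_filter):
    user = apply_set(user, update_spec)

print(user["age"])
```

With a real driver, documents shaped like query_filter, update_spec, and index_spec would be passed to the server instead of being interpreted locally.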

Before we go deep into the technical details of documents, let's explore their structure.

JSON

JSON is an open-standard text format for representing data, and it is well suited to data interchange. To explore the format further, see ECMA-404, The JSON Data Interchange Standard, where it is fully described.

Note

JSON is described by two standards: ECMA-404 and RFC 7159 (since superseded by RFC 8259). The first one focuses on the JSON grammar and syntax, while the second adds semantic and security considerations.

As the name suggests, JSON arose from the JavaScript language. It came about as a solution for transferring object state between the web server and the browser. Although it is derived from JavaScript, generators and parsers for JSON are available in almost all popular programming languages, such as C, Java, and Python.

The JSON format is also considered friendly and human-readable. JSON is platform-independent, and its specification is based on two data structures:

  • A set or group of key/value pairs
  • An ordered list of values

To make this concrete, let's start with objects. An object is an unordered collection of key/value pairs, represented by the following pattern:

{
   "key" : "value"
}

An ordered list of values, called an array, is represented as follows:

["value1", "value2", "value3"]

In the JSON specification, a value can be:

  • A string delimited by double quotes (" ")
  • A number, with or without a sign, in decimal notation (base 10). The number can have a fractional part, introduced by a period (.), and an exponential part, introduced by e or E
  • Boolean values (true or false)
  • A null value
  • Another object
  • Another array
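As a quick illustration of the kinds of value listed above, the following Python sketch parses one JSON object containing each of them and prints the Python type that the standard json module produces. The key names are made up for this example.

```python
import json

text = """
{
   "string": "text",
   "integer": -42,
   "fraction": 1.8,
   "exponent": 2.5e3,
   "boolean": true,
   "nothing": null,
   "object": {"key": "value"},
   "array": ["value1", "value2", "value3"]
}
"""

values = json.loads(text)

# Show which Python type each JSON value is mapped to.
for key, value in values.items():
    print(key, type(value).__name__, value)
```

Note that JSON itself has a single number type; the split between int and float is a choice made by the Python parser.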

The following diagram shows us the JSON value structure:

[Figure: JSON value structure]

Here is an example of JSON code that describes a person:

{
   "name" : "Han",
   "lastname" : "Solo",
   "position" : "Captain of the Millennium Falcon",
   "species" : "human",
   "gender":"male",
   "height" : 1.8
}

BSON

BSON stands for Binary JSON; in other words, it is a binary-encoded serialization of JSON-like documents.

Note

If you want to learn more about BSON, take a look at the BSON specification at http://bsonspec.org/.

Compared with other binary formats, BSON has the advantage of being more flexible. It is also lightweight, a feature that is very important for data transport on the Web.

The BSON format was designed to be easily traversable and to be encoded and decoded very efficiently in most programming languages, because it uses C data types. This is the reason why BSON was chosen as the data format for MongoDB disk persistence.
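To give a feel for the binary layout, here is a minimal Python sketch that hand-encodes a document containing only string fields, following the element layout in the BSON specification. The function name encode_string_document is hypothetical, and the sketch deliberately ignores every other BSON type; a real application would rely on the BSON library that ships with its MongoDB driver.

```python
import struct


def encode_string_document(fields):
    """Encode a flat dict of str -> str as a BSON document (string elements only)."""
    body = b""
    for name, value in fields.items():
        encoded_value = value.encode("utf-8") + b"\x00"
        body += b"\x02"                                # element type 0x02: UTF-8 string
        body += name.encode("utf-8") + b"\x00"         # field name as a C string
        body += struct.pack("<i", len(encoded_value))  # int32 length, including the trailing NUL
        body += encoded_value
    body += b"\x00"                                    # document terminator
    # The leading int32 is the total size, counting its own four bytes.
    return struct.pack("<i", len(body) + 4) + body


encoded = encode_string_document({"hello": "world"})
print(encoded)
print(len(encoded))  # 22
```

Every length is stored up front as a little-endian integer, which is what lets a reader skip over elements without parsing them.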

The types of data representation in BSON are:

  • String UTF-8 (string)
  • Integer 32-bit (int32)
  • Integer 64-bit (int64)
  • Floating point (double)
  • Document (document)
  • Array (document)
  • Binary data (binary)
  • Boolean false (x00 or byte 0000 0000)
  • Boolean true (x01 or byte 0000 0001)
  • UTC datetime (int64): the int64 is UTC milliseconds since the Unix epoch
  • Timestamp (int64): the special internal type used by MongoDB replication and sharding; the first 4 bytes are an increment, and the last 4 are a timestamp
  • Null value ()
  • Regular expression (cstring)
  • JavaScript code (string)
  • JavaScript code w/scope (code_w_s)
  • Min key (): the special type that compares lower than all other possible BSON element values
  • Max key (): the special type that compares higher than all other possible BSON element values
  • ObjectId (byte*12)
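
Two of the types above are easy to confuse, so here is a small Python sketch of how their 8-byte payloads could be built with the standard struct module. It assumes the little-endian layout used by BSON; the sample date and the increment value are arbitrary, and this is an illustration rather than driver code.

```python
import struct
from datetime import datetime, timedelta, timezone

# UTC datetime: a signed 64-bit count of milliseconds since the Unix epoch.
epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
moment = datetime(2015, 6, 1, 12, 0, 0, tzinfo=timezone.utc)
millis = (moment - epoch) // timedelta(milliseconds=1)
datetime_bytes = struct.pack("<q", millis)

# Timestamp: the first 4 bytes are an increment, the last 4 are seconds since the epoch.
increment = 1
seconds = millis // 1000
timestamp_bytes = struct.pack("<II", increment, seconds)

# Read back as one 64-bit little-endian integer, the seconds occupy the high-order half.
as_int64 = struct.unpack("<Q", timestamp_bytes)[0]

print(len(datetime_bytes), len(timestamp_bytes))
print(as_int64 == (seconds << 32) | increment)
```

Both payloads are 8 bytes long, but only the UTC datetime is meant for application data; the timestamp exists for MongoDB's internal use.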