© Charles Bell 2018
Charles BellIntroducing the MySQL 8 Document Storehttps://doi.org/10.1007/978-1-4842-2725-1_1

1. Introducing MySQL 8—A New Beginning

Charles Bell1 
(1)
Warsaw, Virginia, USA
 

It is a testament to the dedication of the Oracle MySQL engineers (and Oracle itself) that MySQL continues to improve with new features. The drive within the MySQL engineering division is to continue to develop disruptive database technologies for the Internet. Oracle has not only fostered this aggressiveness but has continued to live up to its promise to invest in and expand their MySQL business. The newest version, MySQL 8, proves conclusively that Oracle has fulfilled the promise to ensure MySQL will remain the world’s most popular open source database system.

Previous versions of MySQL have added some new and interesting features since MySQL 5.01 making MySQL a better product. Although the features have been well received and used to solve a lot of problems, the changes were largely evolutionary improvements rather than revolutionary changes.

This tendency is not unique to MySQL nor is it unusual in a stable, mature product. That doesn’t mean evolutionary development is bad—it isn’t. However, given that several competitive technologies have emerged, the MySQL engineers realized they must reach higher and further if they are to continue to dominate the industry.

Thus, this new release of MySQL breaks many of the molds of previous versions adding new, revolutionary features that change the firmament of how some will use MySQL. Indeed, the version number alone has jumped from 5.x to 8.02 signifying the jump in technological sophistication and the break from continuous development of the 5.x codebase, which lasted for over 13 years.

The changes to MySQL 8.0 include changes to existing features as well as some new, game changing features. This book examines one of the most important and newest features: the MySQL Document Store. However, there are other equally as important features such as Group Replication and the InnoDB Cluster. Although I focus on the document store, I will also see how these other features can be leveraged to take your MySQL installation into the future.

MySQL—What Does it Mean?

The name MySQL is a combination of a proper name and an acronym. SQL is structured query language. The My part isn’t a possessive form—it is a name. In this case, My is the name of the founder’s daughter. As for pronunciation, MySQL experts pronounce it “My-S-Q-L”—not “my sequel.”

In this chapter, I examine some of the new features of MySQL 8 including a short introduction to some of the features that were emerging technologies from previous versions, new features unique to MySQL 8, and those revolutionary features that make MySQL 8 the greatest MySQL release to date.

Note

This book is based on the MySQL 8.0.11 release with a focus on the document store. There are many more new features than those that are listed in this chapter. Be sure to consult the latest MySQL online MySQL reference manual ( https://dev.mysql.com/doc/refman/8.0/en/ ) for a complete list of the new, updated, and removed features.3

The new features have a great deal of sophistication. As you will see, some of the features are designed to work together and others are designed as add-ons. Rather than explain every minor detail or list features and benefits, the following sections present the basics of the various features available today in MySQL 8 so that you can get an idea of what is available. You will also see just how far the new version has advanced beyond the traditional MySQL mechanism of storage and retrieval.

Let’s begin with a look at some of the features that were part of earlier releases but are now refined and more fully integrated into the server.

Old Features New Again

The first category of features includes those under development in MySQL 5.7 either as a separate, experimental development project; a plugin; or as a planned feature for a later stable release. Thus, these features had already been released in some limited form. Most were considered “development releases” and were accompanied with a disclaimer that strongly suggested they not be used in a production environment. Some had been included in the latest release candidate (RC) versions of the server.

To be more precise, Oracle released these features as early releases so that systems and database administrators, information technology architects, and other planners could try out the features and provide feedback to help the feature mature. It also allows customers to adapt the technologies early in development environments in case the features required changes to the infrastructure or applications.

What is a Plugin?

Plugins are means that add functionality to the server without having to compile and rebuild the server proper to incorporate the new features. Plugin technology has been around for a long time. In fact, MySQL originally supported pluggable storage engines that allowed you to add and remove storage engine options on the fly. The MySQL plugin technology has evolved since those days, but the concept is the same. As long as the plugin is compatible with the server version, you can download MySQL plugins from Oracle and install them on your server for immediate use.

Plugins are also a convenient way for Oracle to release new features into existing, stable releases. For example, new features, such as Group Replication, have been introduced as plugins (but are included in the latest release). Even if a plugin is released as a development release (think early beta), you could still use it with the compatible GA (generally available) release of the server. This allows Oracle to produce features much more quickly than having to bundle them with a major server release. In the case of Group Replication and similar technologies, this has saved Oracle years of development work by making the features available to users in near record time.

There are several features that have evolved in the MySQL 5.7 code base. The following are some of the key features that I explore in this book. These include the JSON data type and the MySQL Shell.

JSON Data Type

As of MySQL version 5.7.8, MySQL supports a native JSON data type that enables efficient access to data in JSON documents in a table row. Thus, you can have columns in your table of the JSON data type. JSON stands for JavaScript Object Notation.4 The new JSON data type is a key component to using MySQL as a document store. In short, JSON is a markup language used to exchange data. Not only is it human readable, it can be used directly in your applications to store and retrieve data to and from other applications, servers, and even MySQL.

Note

I give a very brief overview of the JSON data type and JSON documents in this section. I give an in-depth look at JSON in Chapter 3.

In fact, JSON looks familiar to programmers because it resembles other markup schemes. JSON is also very simple in that it supports only two types of structures: 1) a collection containing pairs (name, value), and 2) an ordered list (or array). Of course, you can also mix and match the structures in an object. When we create a JSON object, we call it a JSON document.

The JSON data type, unlike the normal data types in MySQL, allows you to store JSON formatted objects (documents) in a column for a row. You can have more than one JSON column (field) in a single table. Although you could do this with TEXT or BLOB fields (and many people do), there is no facility built into MySQL to interact with the data in TEXT and BLOB fields. Thus, manipulation of the data is largely application dependent. In addition, the data is normally structured such that every row has the same “format” for the column. Storing data in TEXT and BLOB fields is not new and has been done for years.

This could work by using a single string or even a binary representation of data and storing it in the TEXT or BLOB field. If the data is small enough, you could store it in a VARCHAR and similar string column. To store and retrieve data in this manner, you have to encode then decode the data—something that could be tedious—especially if you're trying to ingest data from someone else.

With the JSON data type , you don't have to write specialized code to store and retrieve data. This is because JSON documents are well understood and many programming environments and scripting languages support it natively. Think of JSON as an outgrowth or extension of what XML documents were supposed to be. That is, they offer a flexible way to store data that may differ from one application to another. JSON allows you to store the data that you have at the time. Unlike a typical database table, you don't have to worry about default values (they’re not allowed) or whether you have enough columns or even master/detail relationships to normalize and store all of the data in a nice, neat, structured package.

Let's take a look at a simple JSON document that we can store in MySQL. Let's say we have a contact list in which each contact may or may not have an address on file, may or may not have an email, multiple phone numbers, and so forth. If you were to create a typical database table to store this information, you may go so far as to store a lot of empty columns for entries that you only have a name and a single phone number.

In fact, we can add new data items any time we want without having to alter the underlining table structure. For example, if you find you later need to add a Skype Id to some of the records, you can do that in your code adding the key for those entries you want without having to go back and change any existing data. The only catch is that your code for reading the data will have to change to test for the existence of the key before accessing it. I show an example of this in Chapters 8 and 9.

Let’s consider an example contact list that contains several people who perform a service for me who live in my area. All I need to store is their name and phone number. Sometimes I only know (or care to store) their first name. I don't need their address because I never send them anything and they're just down the street after all. Listing 1-1 demonstrates what some of the entries could look like. I chose to demonstrate what JSON looks like by using SQL INSERT statements so that you can see one way unstructured data can be inserted in our database.

INSERT INTO rolodex.contacts (contact_info) VALUES ('
{
  "name": "Allen",
  "phones": [
    {
      "work": "212-555-1212"
    }
  ]
}
');
INSERT INTO rolodex.contacts (contact_info) VALUES ('
{ "name": {
    "first": "Joe",
    "last": "Wheelerton"
  },
  "phones": [
    {
      "work": "212-555-1213"
    },
    {
      "home": "212-555-1253"
    }
  ],
  "address": {
      "street": "123 main",
      "city": "oxnard",
      "state": "ca",
      "zip": "90125"
  },
  "notes": "Excellent car detailer. Referrals get $20 off next detail!"
}
');
Listing 1-1

Example of JSON Documents

Note that I used a bit of formatting with newlines and whitespace to make the JSON easier to read. However, that is not necessary. Indeed, if we query a table with JSON data as in the rows in Listing 1-1, we would see the data would display a bit differently. Listing 1-2 shows the output of a typical SELECT query.

mysql> SELECT * FROM rolodex.contacts G
*************************** 1. row ***************************
          id: 1
contact_info: {"name": "Allen", "phones": [{"work": "212-555-1212"}]}
*************************** 2. row ***************************
          id: 2
contact_info: {"name": {"last": "Wheelerton", "first": "Joe"}, "notes": "Excellent car detailer. Referrals get $20 off next detail!", "phones": [{"work": "212-555-1213"}, {"home": "212-555-1253"}], "address": {"zip": "90125", "city": "oxnard", "state": "ca", "street": "123 main"}}
2 rows in set (0.00 sec)
Listing 1-2

SELECT with JSON Columns

That isn’t very easy to read, is it? Not to worry because your applications can ingest this data easily (those languages that support JSON) so it doesn’t matter so much.

If you want to experiment with this example, you will need to create the structure and data. In this case, you will need a schema (think database) and collection (think table). The following are the SQL statements that you need to create the schema and collection. However, you normally would not use SQL statements with the document store, but you can since the underlining storage for a collection in MySQL is a specially formed table shown in the following.

CREATE DATABASE `rolodex`;
CREATE TABLE `rolodex`.`contacts` (
   `id` INT NOT NULL AUTO_INCREMENT,
   `contact_info` json DEFAULT NULL,
 PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;

The JSON data type enables you to build flexibility into your data storage through the support built into MySQL for working with the JSON documents as well as additional facilities to enable interaction with JSON through the MySQL Shell, X Plugin, and X Protocol. Let’s look at the MySQL Shell.

MySQL Shell

The MySQL Shell is another feature that was added during the MySQL 5.7 timeframe. In this case, it was in the form of a new, separate product. The MySQL Shell is the next generation of command-line client for MySQL. Not only can you execute traditional SQL commands, you can also interact with the server using one of several programming languages including Python and JavaScript. Furthermore, if you also have the X Plugin installed, you can use MySQL Shell to work with both traditional relational data as well as JSON documents. How cool is that?

Tip

You can download the MySQL Shell from http://dev.mysql.com/downloads/shell/ .

If you’re thinking, “It is about time!” that Oracle has made a new MySQL client, you’re not alone. The MySQL Shell represents a bold new way to interact with MySQL. There are many options and even different ways to configure and use the shell. And although we will see more about the shell in Chapter 4, let’s see how to use the shell to execute the same query shown previously. Figure 1-1 shows a snapshot of the new MySQL Shell. Note that it provides a very familiar interface albeit a bit more modern and far more powerful.
../images/432285_1_En_1_Chapter/432285_1_En_1_Fig1_HTML.jpg
Figure 1-1

The MySQL Shell

Listing 1-3 shows how to start the shell and execute a SELECT statement displaying the results. Note that the command used to launch the shell. In this case, we specify that we want to use the shell in a manner that resembles the old client in SQL mode (--sql).

$ mysqlsh -uroot --sql
Creating a session to 'root@localhost'
Enter password:
Your MySQL connection id is 281 (X protocol)
Server version: 8.0.11 MySQL Community Server (GPL)
No default schema selected; type use <schema> to set one.
MySQL Shell 8.0.11
Copyright (c) 2016, 2018, Oracle and/or its affiliates. All rights reserved.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.
Type 'help' or '?' for help; 'quit' to exit.
 MySQL  localhost:33060+ ssl  SQL > SELECT * FROM rolodex.contacts G
*************************** 1. row ***************************
doc: {"_id": "9801A79DE093991311E7FFCB243C3451", "name": {"first": "Allen"}, "phones": [{"work": "212-555-1212"}]}
_id: 9801A79DE093991311E7FFCB243C3451
*************************** 2. row ***************************
doc: {"_id": "9801A79DE0939E0411E7FFCB243DCDE3", "name": {"last": "Wheelerton", "first": "Joe"}, "notes": "Excellent car detailer. Referrals get $20 off next detail!", "phones": [{"work": "212-555-1213"}, {"home": "212-555-1253"}], "address": {"zip": "90125", "city": "oxnard", "state": "ca", "street": "123 main"}}
_id: 9801A79DE0939E0411E7FFCB243DCDE3
2 rows in set (0.00 sec)
 MySQL  localhost:33060+ ssl  SQL > exit
Bye!
Listing 1-3

Querying JSON data in the MySQL Shell

Note

These examples are executed with a server that has the X Plugin installed and enabled. Chapter 2 demonstrates how to do this.

Although that is indeed nice, it is not so different than the old client. What makes the shell really powerful is you can use a scripting language to process the data. Listing 1-4 shows how to launch the shell in Python mode (--python) and execute Python code to retrieve the same result set. I also demonstrate a nice option that allows us to improve the JSON output format (--json=pretty). Aha, so now we see that there is a nicer way to see JSON in results! This option does tend to be rather verbose. I’ve suppressed some of the more verbose output for clarity.

$ mysqlsh -uroot --python --json=pretty
...
 MySQL  localhost:33060+ ssl  Py > use rolodex
 MySQL  localhost:33060+ ssl  rolodex  Py > contacts = db.get_collection("contacts")
 MySQL  localhost:33060+ ssl  rolodex  Py > contacts.find()
{
    "documents": [
        {
            "_id": "9801A79DE093991311E7FFCB243C3451",
            "name": {
                "first": "Allen"
            },
            "phones": [
                {
                    "work": "212-555-1212"
                }
            ]
        },
        {
            "_id": "9801A79DE0939E0411E7FFCB243DCDE3",
            "address": {
                "city": "oxnard",
                "state": "ca",
                "street": "123 main",
                "zip": "90125"
            },
            "name": {
                "first": "Joe",
                "last": "Wheelerton"
            },
            "notes": "Excellent car detailer. Referrals get $20 off next detail!",
            "phones": [
                {
                    "work": "212-555-1213"
                },
                {
                    "home": "212-555-1253"
                }
            ]
        }
    ],
    "executionTime": "0.00 sec",
    "warningCount": 0,
    "warnings": []
}
 MySQL  localhost:33060+ ssl  rolodex  Py > exit
Bye!
Listing 1-4

Using the MySQL Shell with Python

Ok, now we’re starting to see how much the shell changes our MySQL experience. Note that the output is formatted to make it read better and the commands we used were quite a bit different than the SQL commands previously. If you’re thinking, that looks like application code, you’re on the right track! We’ll see more about the MySQL Shell in Chapter 4. Let’s now discover what makes the shell powerful by examining the new X Plugin and X Protocol.

X Plugin, X Protocol, and X DevAPI

MySQL has introduced a new protocol and API to work with JSON documents. Along with supporting the JSON data type, we have three technologies prefixed with the simple name “X”: the X Plugin, X Protocol, and X DevAPI. The X Plugin is a plugin that enables the X Protocol. The X Protocol is designed to communicate with the server using the X DevAPI. The X DevAPI is an application programming interface that (among other things) permits you to develop NoSQL solutions for MySQL and use MySQL as a document store.

I Know SQL, But What Is NoSQL?

If you have worked with relational databases systems, you are no doubt very familiar with SQL (structured query language) in which we use special statements (commands) to interact with the data. In fact, most database systems have their own version of SQL that includes commands to manipulating the data (DML; data manipulation language) as well as defining the objects to store data (DDL; data definition language) and even administrative commands to manage the server.5

That is, you get result sets that have to use commands to search for the data then convert results into internal programming structures making the data seem like an auxiliary component rather than an integral part of the solution. NoSQL interfaces break this mold by allowing you to use APIs (application programming interfaces) to work with the data. More specific, you use programming interfaces rather than command based interfaces.

It is unfortunate that NoSQL can mean a number of things depending on your perspective including “non-SQL,” “not only SQL,” or “nonrelational.” But they all refer to the fact that the mechanism you’re using is not using a command based interface and most uses of the term indicate you’re using a programming interface. For MySQL 8, access to JSON documents can be either through SQL or NoSQL using the X Protocol and X DevAPI through the X Plugin.

The X Plugin is a great example of how Oracle makes use of the plugin technology to enable new features. In this case, the X Plugin is a gateway from within the server to allow communication using the X Protocol. The MySQL X Plugin comes with the server, and is enabled by default. If you have an older release of MySQL Server, you can use the MySQL Shell to enable the plugin with the following command.

$ mysqlsh -u root -h localhost --mysql --dba enableXProtocol
Creating a Classic session to 'root@localhost'
Enter password:
Your MySQL connection id is 527
Server version: 8.0.11 MySQL Community Server (GPL)
No default schema selected; type use <schema> to set one.
enableXProtocol: X Protocol plugin is already enabled and listening for connections on port 33060

Any client (not just the MySQL Shell) that supports the X Protocol can use the associated X DevAPI to use MySQL as a document store. In fact, the X Protocol is designed to expose the ACID (atomicity, consistency, isolation, and durability) compliant storage abilities of MySQL as a document store enabling you to execute Create, Read, Update, and Delete (CRUD) operations against JSON documents. The X Protocol also supports the normal SQL interface to MySQL so you can build your applications to use both SQL and NoSQL interfaces!

You may have wondered how the shell and the plugin interact with the server. Figure 1-2 demonstrates how the components are “stacked.”
../images/432285_1_En_1_Chapter/432285_1_En_1_Fig2_HTML.jpg
Figure 1-2

X Protocol stack

Note that the shell permits the use of the X DevAPI, which is communicated over the wire to the server via the X Plugin. Thus, the X Plugin is an enabling technology with the real power consisting of X Protocol and X DevAPI.

Now that we’ve seen the technologies that enable using MySQL as a document store, let’s look at how the InnoDB storage engine has changed in recent releases.

InnoDB Improvements

Since MySQL 5.6, InnoDB has been the flagship storage engine (and the default engine) for MySQL. Oracle has slowly evolved away from the multiple storage engine model focusing on what a modern database server should do—support transactional storage mechanisms. InnoDB is the answer to that requirement and much more.

What is a Storage Engine?

A storage engine is a mechanism to store data in various ways. For example, there is a storage engine that allows you to interact with comma separated values (text) files (CSV), another that is optimized for writing log files (Archive), one that stores data in memory only (Memory), and even one that doesn’t store anything at all (Blackhole). You can use them with your tables by using the ENGINE= table option. Along with InnoDB, the MySQL server ships with the Archive, Blackhole, CSV, Memory, MyISAM storage engines. The InnoDB storage engine is the only one that supports transactions. For more information about the other storage engines including the features of each and how they are used, see the “Alternative Storage Engines” section in the online MySQL reference manual.

In the early days, InnoDB was a separate company and thus a separate product that was neither part of MySQL nor was it owned by MySQL AB (the original owner of MySQL now fully owned by Oracle). Eventually, Oracle came to own both InnoDB and MySQL so it made sense to combine the two efforts because they have mutually inclusive goals. Although there still is a separate InnoDB engineering team, they are fully integrated with the core server development team.

This tight integration has led to many improvements in InnoDB including a host of performance enhancements. This is readily apparent in how InnoDB continues to evolve with those refinements.

The list of refinements has grown since the 5.6 releases and although most of the improvements are rather subtle in the sense you won’t notice them (except through better performance and reliability, which are not to be taken lightly), most show a dedication to making InnoDB the best transactional storage mechanism and through extension MySQL a strong transactional database system. The following list a number of the more interesting improvements to InnoDB that you will find in MySQL 8. Some of these may seem to be very deep into the depths of the code, but those who have optimized or otherwise tuned their InnoDB installation may need to take note of these when planning to move to MySQL 8. What is not listed here are dozens of minor improvements in reliability and performance.
  • Crash recovery: Should the index tree become corrupt, InnoDB writes a corruption flag to the redo log. This makes the corruption flag crash safe (it is not lost on a forced restart). Likewise, InnoDB also writes an in-memory corruption flag on each checkpoint. When crash recovery is initiated, InnoDB can read the flags and use them to adjust recovery operations.

  • InnoDB memcached Plugin: Has been improved by permitting fetching of multiple (key, value) pairs in a single memcached query.

  • Deadlock detection: There are several new options, but the most promising includes an option to dynamically configure deadlock detection (innodb_deadlock_detect). This could permit additional tuning control for high usage systems in which deadlock detection is a detriment to performance.

  • New INFORMATION_SCHEMA views: There are new views for InnoDB, which includes the following:
    • INNODB_CACHED_INDEXES is used to discover the number of index pages cached in the InnoDB buffer pool for each index.

    • INNODB_TABLESPACES_BRIEF is used to see the space, name, path, flags, and space type for tablespaces.

  • AUTO_INCREMENT: There are several minor improvements with auto-increment fields including the following:
    • The current maximum auto-increment value is now persistent across server restarts.

    • A restart no longer cancels the effect of the AUTO_INCREMENT = N table option.

    • A server restart immediately following a ROLLBACK operation no longer results in the reuse of auto-increment values that were allocated to the rolled-back transaction.

    • Setting an AUTO_INCREMENT column value to a value larger than the current maximum is persisted and later new values (say after a restart) start with the new, larger value.

  • Temporary tables: All temporary tables are now created in the shared temporary tablespace named ibtmp1.

Although this list seems focused on minor improvements, some of these are very important to system administrators looking for help tuning and planning their database server installations. If you would like to know more about any of these improvements or see a list of all the latest changes, see the online MySQL reference manual.6

I also should note that this list is likely to grow as MySQL 8 matures and new features are added. Indeed, the InnoDB Cluster is one such new feature that we discuss in the section "InnoDB Cluster."

The next section describes those features that have been added to and are unique to MySQL 8.

New Features

Aside from those features that have been in development during the 5.7 server releases, there are features that are unique to MySQL 8. That is to say, they are not currently (or even likely to be incorporated) in the older releases. Part of this is because of how much the server code base was changed to accommodate the new features. Those new features available in MySQL 8.0 include the new data dictionary and a new account management system.

Note

Some features are available as a separate download as a plugin that you can install and may be released separately with a different rating than the server. Some, such as Group Replication, can also be used with MySQL 5.7.

Data Dictionary

If you have ever worked with MySQL trying to get information about the objects contained in the databases; either to discover which objects are there, searching for objects with a specific name prefix, or trying to discover which indexes exist, chances are you have had to access the numerous tables in the mysql database or you’ve had to navigate the views in INFORMATION_SCHEMA.

Although this has been the default for many years, there are a number of problems with this mechanism. Most notable, there is no easy way to find things (you have to “learn” where things are and then how to search them). More important, because the data was in nontransactional tables (and metadata files), the mechanisms were not transactional and, by extension, not crash safe.

Indeed, many a MySQL DBA has earned their salary by recovering data in the mysql database, fixing corrupt or missing .frm files, and a host of other small plagues that can visit a large MySQL installation. Happily, those days are gone with the addition of the data dictionary!

What’s An Frm File?

If you examine the data directory of a MySQL installation for version 5.7 and earlier, you will see a folder named data that contains subfolders named for each database created. In these folders, you will see files named with the table names and a file extension of .frm. Many MySQL developers call these files “FRM files.” The file is a specially formatted binary file that describes the table's format (definition). Thus, a table named table1 in database1 has an FRM file named /data/database1/table1.frm.

Sadly, because FRM files are binary files, they are not readable by normal means. In fact, the format has been a mystery for many years (it uses a layout called Unireg). Because the FRM files contain the metadata for the table, all the column definitions and table options, including index definitions, are stored in the file. This means it should be possible to extract the data needed to reconstruct the CREATE TABLE statement from a FRM file. Unfortunately, given the interface and uniqueness of Unireg, it is not easy to parse these files for the information.

Fortunately, you can decipher the FRM files via a Python utility that is part of the MySQL Utilities product. If you need to read an FRM file to recover a table, see the online MySQL Utilities documentation for more details: http://dev.mysql.com/doc/mysql-utilities/1.6/en/utils-task-get-structure.html .

What you may find curious and even a bit strange is the fact that the data dictionary implementation is hidden and very much behind the scenes. That is, data dictionary tables are invisible and cannot be accessed directly. You won’t find the data dictionary tables easily (although it is possible if you look hard enough). This was done primarily to make the data dictionary crash safe and something you don’t have to manage. Fortunately, you can access the information stored in the data dictionary via the INFORMATION_SCHEMA database and even the SHOW commands. The mysql database still exists, but it mainly contains extra information such as time zones, help, and similar nonvital information.

Tip

The data dictionary is one of the key factors that you must understand when planning any upgrades from older versions of MySQL. I examine a number of these issues in Chapter 10.

For more information about the data dictionary, see the section "MySQL Data Dictionary" in the online MySQL reference manual.

Adding the data dictionary has finally made possible a number of features that many have wanted to implement for some time. One of the newest is a change in account management.

Account Management

If you have ever managed a MySQL database server (or many servers), chances are you have encountered a situation where you need to assign the same privileges to a group of users. For example, your server may support several applications or databases with sets (groups) of users that have specific rights to database objects. In most cases, savvy database administrators (DBAs) make a copy of the user privileges (often in the form of GRANT statements) so that they can reuse them when they need to create another user with the same privileges.

Although the MySQL Utilities product has a Python utility to help manage this tedium (see “mysqluserclone” in http://dev.mysql.com/doc/mysql-utilities/1.6/en/ ), having to create dozens of different “types” of users can be quite a challenge. What is really needed is a way to create a role and tailor the privileges to the role then grant the role to users. Fortunately, with the advent of the data dictionary, supporting roles in MySQL has become a reality in MySQL 8!

Roles can be created, dropped, privileges granted or revoked. We also can grant or revoke roles to/from users. Roles finally make the tedium of managing user accounts on MySQL much easier. For more information about roles, see Using Roles in the online MySQL reference manual.

There also have been changes in the SSL (secure sockets layer) support in the server.

Removed Options, Variables, and Features

The first thing you may notice about MySQL 8 is a host of small changes to startup options, variables, and so forth. Fortunately, most of these are related to supporting the newest features and the removal of old and obsolete settings. Also, many of those options, variables, and features marked as deprecated in MySQL 5.7 (and prior) are officially removed in MySQL 8. Some of the more familiar items removed in MySQL 8 include the following.
  • --bootstrap: was used to control how the server started and was typically used to create the MySQL privilege tables without having to start a full MySQL server.

  • --innodb_file_format_* : was used to configure the file format for the InnoDB storage engine.

  • --partition and --skip partition: was used to control user-defined partitioning support in the MySQL server.

One of the consequences of the new data dictionary is removal of the need for .frm files (FRM). Because the data dictionary contains all of the information about every object in all of the databases hosted in a reliable, recoverable storage mechanism, there is no longer a need to store such information in a separate file. Those of us who have often fought with or otherwise had the unique frustration to attempt to repair a server whose FRM files were lost or corrupt, the removal of the FRM files is a long overdue and most welcome omission.

For those using SSL, one area that may be of concern is the removal of some of the SSL options and the introduction of a new authentication plugin (caching_sha2_password) to improve secure connections. The new authentication plugin was introduced in release 8.0.4. Most installation packages give you the option to choose the older authentication method should you require it, but it is strongly recommended that you use the new authentication plugin.

Error codes are another area where you will see some changes. Many error codes were changed in the latest release including the removal of dozens of lesser known (used) error codes. If your applications use the MySQL server error codes, you should check the documentation to ensure the error codes have not changed or been removed.

There were also many minor items removed including the mysql_plugin utility,7 the embedded server (libmysqld), the generic partition engine (InnoDB now has native partitioning), the mysql_install_db script (this has been replaced with the --initialize option), and more.

As I mentioned in the previous sections, the list of features that were removed in MySQL 8 will likely grow as more features become mature and are added. If you have defined tuning procedures, stored procedures, DevOps,8 or other mechanisms that use or interact with options and variables, you should carefully examine the entry in the MySQL 8 documentation to ensure you can modify your tools.

Tip

See http://dev.mysql.com/doc/refman/8.0/en/added-removed-variables-options.html for a complete list of features to be removed in MySQL 8.

Paradigm Shifting Features

When the MySQL engineers and product management teams decided to develop ground breaking high availability features and a new way to store unstructured data, they knew they were on to something that would change the MySQL world in dramatic fashion.

In this section, we look at two high availability features that are poised to change MySQL high availability in a new and dramatic way. We will also see how the new structured storage mechanism will change what you can store and indeed how you can interact with MySQL to store data for applications where data can change allowing your application to adapt without having to rebuild the storage layers.

Let’s begin with the high availability solutions.

Group Replication

If you have used MySQL replication, you are no doubt very familiar with how to leverage it when building high availability solutions. Indeed, it is likely you have discovered a host of ways to improve availability in your applications with MySQL replication.

What Is Replication? And How Does it Work?

MySQL replication is an easy-to-use feature and yet a complex and major component of the MySQL server. This section presents a bird’s-eye view of replication for the purpose of explaining how it works and how to set up a simple replication topology. For more information about replication and its many features and commands, see the online MySQL reference manual (http://dev.mysql.com/doc/refman/8.0/en/replication.html).

Replication requires two or more servers. One server must be designated as the origin or master. The master role means all data changes (writes) to the data are sent to the master and only the master. All other servers in the topology maintain a copy of the master data and are by design and requirement read-only servers. Thus, when your sensors send data for storage, they send it to the master. Applications you write to use the sensor data can read it from the slaves.

The copy mechanism works using a technology called the binary log that stores the changes in a special format, thereby keeping a record of all the changes. These changes are then shipped to the slaves and executed there. Thus, once the slave executes the changes (called events), the slave has an exact copy of the data.

The master maintains a binary log of the changes, and the slave maintains a copy of that binary log called the relay log. When a slave requests data changes from the master, it reads the events from the master and writes them to its relay log; then another thread in the slave executes those events from the relay log. As you can imagine, there is a slight delay from the time a change is made on the master to the time it is made on the slave. Fortunately, this delay is almost unnoticeable except in topologies with high traffic (lots of changes).

Moreover, it has become apparent that the more your high availability needs and your solution expands (grows in sophistication), the more you need to employ better ways to manage the loss of nodes, data integrity, and general maintenance of the clusters (groups of servers replicating data—sometimes called replicasets). In fact, most high availability solutions have outgrown the base master and slaves topology evolving into tiers consisting of clusters of servers. Some have replicated a portion of the data for faster throughput and for compartmental storage. All of these have led many to discover they need more from MySQL replication. Oracle has answered these needs and more with Group Replication.

Group Replication was released as GA in December 2016 and is bundled with the server in the form of a plugin. Although it is a GA release, I list it here as a paradigm-shifting feature because of the promise it provides for allowing MySQL high availability to grow well beyond the confines of the original MySQL replication feature and thus empower MySQL 8 to become an important component in high availability database solutions.

Note

I touch on only the very basics of Group Replication to give you an idea of its complexity and benefits. A deeper dive into using Group Replication and its implementation is beyond the scope of this book.

Group Replication makes the topology eventually synchronous replication (among the nodes belonging to the same group) a reality, whereas the existing MySQL Replication feature is asynchronous (or at most semi-synchronous). Therefore, better high availability guaranties can be provided, because transactions are delivered to all members in the same order (despite being applied at its own pace in each member after being accepted).

Group Replication does this via a distributed state machine with strong coordination among the servers assigned to a group. This communication allows the servers to coordinate replication automatically within the group. More specific, groups maintain membership so that the data replication among the servers is always consistent at any point in time. Even if servers are removed from the group, when they are added, the consistency is initiated automatically. Further, there is also a failure detection mechanism for servers that go offline or become unreachable. Figure 1-3 shows how you would use Group Replication with our applications to achieve high availability.
../images/432285_1_En_1_Chapter/432285_1_En_1_Fig3_HTML.jpg
Figure 1-3

Using Group Replication with applications for high availability (Courtesy of Oracle)

Note that Group Replication can be used with the MySQL Router to allow your applications to have a layer of isolation from the cluster. We will see a bit about the router when we examine the InnoDB Cluster.

Another important distinction between Group Replication and standard replication is that all of the servers in the group can participate in updating the data with conflicts resolved automatically. Yes, you no longer have to carefully craft your application to send writes (updates) to a specific server! However, you can configure Group Replication to allow updates by only one server (called the primary) with the other servers acting as secondary servers or as a backup (for failover).

All of these capabilities and more are made possible using three specific technologies built into Group Replication: group membership, failure detection, and fault tolerance.9
  • Group membership: This manages whether servers are active (online) and participating in the group. Also, ensures every server in the group has a consistent view of the membership set. That is, every server knows the complete list of servers in the group. When servers are added to the group, the group membership service reconfigures the membership automatically.

  • Failure detection: A mechanism that is able to find and report which servers are offline (unreachable) and assumed to be dead. The failure detector is a distributed service that allows all servers in the group to test the condition of the presumed dead server and in that way, the group decides if a server is unreachable (dead). This allows the group to reconfigure automatically by coordinating the process of excluding the failed server.

  • Fault tolerance: This service uses an implementation of the Paxos10 distributed algorithm to provide distributed coordination among the servers. In short, the algorithm allows for automatic promotion of roles within the group to ensure the group remains consistent (data is consistent and available) even if a server (or several) fail or leave the group. As with similar fault tolerance mechanisms, the number of failures (servers that fail) is limited. Currently, Group Replication fault tolerance is defined as n = 2f + 1, where n is the number of servers needed to tolerate f failures. For example, if you want to tolerate up to 5 servers failing, you need at least 11 servers in the group.

Although Group Replication is a plugin, it is bundled with the server installation today with MySQL 5.7 (starting with the 5.7.17 release) as well as MySQL 8.11

Tip

To learn more about the internal mechanisms, designs, implementation as well as how to setup and use Group Replication, see the developer documentation at http://mysqlhighavailability.com/mysqlha/gr/doc/index.html .

Rather than demonstrate Group Replication by itself, we will see just how powerful this feature is when we explore another new feature named InnoDB Cluster in the following section. As you will see in the demonstration of InnoDB Cluster, Group Replication is easy to use and when part of InnoDB Cluster, both technologies change the way we use MySQL replication in a most dramatic way.

InnoDB Cluster

Another new and emerging feature is called InnoDB Cluster. It is designed to make high availability easier to setup, use, and maintain. InnoDB Cluster works with the X AdminAPI via the MySQL Shell and the Admin API, Group Replication, and the MySQL Router12 to take high availability and read scalability to a new level. That is, it combines new features in InnoDB for cloning data with Group Replication and the MySQL Shell and MySQL Router to provide a new way to setup and manage high availability.

Note

The AdminAPI is a special API available via the MySQL Shell for configuring and interacting with InnoDB Cluster. Therefore, the Admin API has features designed to make working with InnoDB Cluster easier.

In this use case, the cluster is setup with a single primary (think master in standard replication parlance), which is the target for all write (updates). Multiple secondary servers (slaves) maintain replicas of the data, which can be read from and thus enable reading data without burdening the primary thus enabling read out scalability (but all servers participate in consensus and coordination). The incorporation of Group Replication means the cluster is fault tolerant and group membership is managed automatically. The MySQL router caches the metadata of the InnoDB Cluster and performs high availability routing to the MySQL server instances making it easier to write applications to interact with the cluster.

You may wonder what makes this different from a readout scalability setup with standard replication. At a high level, it may seem that the solutions are solving the same use case. However, with InnoDB Cluster, you can create, deploy, and configure servers in your cluster from the MySQL Shell providing a complete high availability solution that can be managed easily. That is, you can use the InnoDB Cluster AdminAPI via the shell to create and administer an InnoDB Cluster programmatically using either JavaScript or Python.

Let us now see these new technologies in action. What follows is a demonstration of deploying three servers, configuring them as a cluster via Group Replication using JavaScript commands in the new MySQL Shell. Although that sounds like a lot of effort, it really isn’t and in fact is really easy.

Note

The following commands were run using InnoDB Cluster on a system with MySQL 8.0.11, InnoDB Cluster, and MySQL Router installed.

Let’s begin by starting the shell and deploying three servers using the AdminAPI. In this case, we will use the deploySandboxInstance() method in the dba object to create new instances for each server. All of these will run on our localhost. Listing 1-5 demonstrates how to deploy three servers. I highlight the commands used to help identify the commands from the messages.

$ mysqlsh
MySQL Shell 8.0.11
Copyright (c) 2016, 2018, Oracle and/or its affiliates. All rights reserved.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.
Type 'help' or '?' for help; 'quit' to exit.
 MySQL  JS > dba.deploySandboxInstance(3307)
A new MySQL sandbox instance will be created on this host in
/Users/cbell/mysql-sandboxes/3307
Please enter a MySQL root password for the new instance:
Deploying new MySQL instance...
Instance localhost:3307 successfully deployed and started.
Use shell.connect('root@localhost:3307'); to connect to the instance.
 MySQL  JS > dba.deploySandboxInstance(3308)
A new MySQL sandbox instance will be created on this host in
/Users/cbell/mysql-sandboxes/3308
Please enter a MySQL root password for the new instance:
Deploying new MySQL instance...
Instance localhost:3308 successfully deployed and started.
Use shell.connect('root@localhost:3308'); to connect to the instance.
 MySQL  JS > dba.deploySandboxInstance(3309)
A new MySQL sandbox instance will be created on this host in
/Users/cbell/mysql-sandboxes/3309
Please enter a MySQL root password for the new instance:
Deploying new MySQL instance...
Instance localhost:3309 successfully deployed and started.
Use shell.connect('root@localhost:3309'); to connect to the instance.
 MySQL  JS >
Listing 1-5

Creating Local Server Instances

Note that the text explains that we are using a sandbox , which is a term applied to running servers on the localhost in a special directory: the mysql-sandboxes folder in the user home. In particular in this case, we use /Users/cbell/mysql-sandboxes. Note that we now have three servers running on ports 3307, 3308, and 3309. Note also that the shell will prompt you for the new password.

Tip

JavaScript is case sensitive so make sure you use the correct spelling for variables, objects, and methods. That is, a variable named abc is not the same variable named Abc.

The next thing we need to do is setup a new cluster. We do this with the createCluster() method in the dba object. But first, we must connect to the server we want to make our primary server. Listing 1-6 demonstrates how to create the cluster. Note that this is a continuation of our shell session and demonstrates how to create a new cluster.

MySQL  JS > connect root@localhost:3307
Creating a session to 'root@localhost:3307'
Enter password:
Your MySQL connection id is 12
Server version: 8.0.11 MySQL Community Server (GPL)
No default schema selected; type use <schema> to set one.
 MySQL  localhost:3307 ssl  JS > my_cluster = dba.createCluster('my_cluster')
A new InnoDB cluster will be created on instance 'root@localhost:3307'.
Creating InnoDB cluster 'my_cluster' on 'root@localhost:3307'...
Adding Seed Instance...
Cluster successfully created. Use Cluster.addInstance() to add MySQL instances.
At least 3 instances are needed for the cluster to be able to withstand up to
one server failure.
<Cluster:my_cluster>
Listing 1-6

Creating a Cluster in InnoDB Cluster

Note that we named the cluster my_cluster and used a variable of the same name to store the object returned from the createCluster() method. Note that the first server we connected has become the primary (master).

Next, we add the other two server instances to complete the cluster using the addInstance() of our new my_cluster object. These servers automatically become secondary servers (slaves) in the group. Listing 1-7 shows how to add the instances to the cluster.

MySQL  localhost:3307 ssl  JS > my_cluster = dba.getCluster('my_cluster')
<Cluster:my_cluster>
MySQL  localhost:3307 ssl  JS > my_cluster.addInstance('root@localhost:3308')
A new instance will be added to the InnoDB cluster. Depending on the amount of data on the cluster this might take from a few seconds to several hours.
Please provide the password for 'root@localhost:3308':
Adding instance to the cluster ...
The instance 'root@localhost:3308' was successfully added to the cluster.
 MySQL  localhost:3307 ssl  JS > my_cluster.addInstance('root@localhost:3309')
A new instance will be added to the InnoDB cluster. Depending on the amount of data on the cluster this might take from a few seconds to several hours.
Please provide the password for 'root@localhost:3309':
Adding instance to the cluster ...
The instance 'root@localhost:3309' was successfully added to the cluster.
Listing 1-7

Adding Instances to the Cluster

Once the cluster is created and the instances are added, we can get the status of the cluster using the status() method of our my_cluster object as shown in Listing 1-8.

MySQL  localhost:3307 ssl  JS > my_cluster.status()
{
    "clusterName": "my_cluster",
    "defaultReplicaSet": {
        "name": "default",
        "primary": "localhost:3307",
        "ssl": "REQUIRED",
        "status": "OK",
        "statusText": "Cluster is ONLINE and can tolerate up to ONE failure.",
        "topology": {
            "localhost:3307": {
                "address": "localhost:3307",
                "mode": "R/W",
                "readReplicas": {},
                "role": "HA",
                "status": "ONLINE"
            },
            "localhost:3308": {
                "address": "localhost:3308",
                "mode": "R/O",
                "readReplicas": {},
                "role": "HA",
                "status": "ONLINE"
            },
            "localhost:3309": {
                "address": "localhost:3309",
                "mode": "R/O",
                "readReplicas": {},
                "role": "HA",
                "status": "ONLINE"
            }
        }
    }
}
 MySQL  localhost:3307 ssl  JS > exit
Bye!
Listing 1-8

Getting the Status of the Cluster

At this point, we’ve seen how InnoDB Cluster can setup servers and add them to the group. What you do not see behind the scenes is all of the Group Replication mechanisms—you get them for free! How cool is that?

Now that we have a cluster, there is one more thing we need to do to enable applications to use the fault tolerance features of Group Replication. That is, we need to be able to connect to the cluster and interact with MySQL even if one of the servers fails. Note that because we only have three servers, we can only tolerate one failure. For example, solving for f in the number of faults tolerated by Group Replication, we get 3 = 2f + 1 or f = 1.

We must now use MySQL Router to manage the connections for our application. Although we don’t have an application to demonstrate, we can see this in action using the shell. Now let’s see how easy it is to set up the router. Listing 1-9 shows how to start the router in bootstrap mode. Note that by connecting to the cluster, the router automatically gets the members of the group. Recall from the previous section, this is one of the tenets of Group Replication via the membership service.

& mysqlrouter --bootstrap localhost:3307 --user=cbell
Please enter MySQL password for root:
Bootstrapping system MySQL Router instance...
MySQL Router  has now been configured for the InnoDB cluster 'my_cluster'.
The following connection information can be used to connect to the cluster.
Classic MySQL protocol connections to cluster 'my_cluster':
- Read/Write Connections: localhost:6446
- Read/Only Connections: localhost:6447
X protocol connections to cluster 'my_cluster':
- Read/Write Connections: localhost:64460
- Read/Only Connections: localhost:64470
& mysqlrouter &
Listing 1-9

Setting Up the MySQL Router

Okay, now we have the router running. Our applications can use the features of the router to automatically reroute our application connections should something happen to one of the servers in the cluster.

Let’s see a short demonstration of this feature. In this case, we will use the shell to connect to the cluster via the router on port 6446 as shown in Listing 1-9. We use this port because the router is used to forward connections automatically. That is, if the server we’re connected to goes down—for instance the one on port 3307—we do not have to restart our application to reconnect to a server on another port. Thus, the router, routes the communications for us. Let’s see this in action.

Listing 1-10 demonstrates connecting to the cluster via the router. We switch to SQL mode in the shell and use an SQL command to see the port of the server where we’re connected. We then switch back to JavaScript and use the AdminAPI to kill the instance. We then attempt to issue the SQL command again and now notice that, once the shell has automatically reconnected, we are now connected to another server. Cool!

$ mysqlsh --uri root@localhost:6446 --sql
Creating a session to 'root@localhost:6446'
Enter password:
Your MySQL connection id is 47
Server version: 8.0.11 MySQL Community Server (GPL)
No default schema selected; type use <schema> to set one.
MySQL Shell 8.0.11
Copyright (c) 2016, 2018, Oracle and/or its affiliates. All rights reserved.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.
Type 'help' or '?' for help; 'quit' to exit.
 MySQL  localhost:6446 ssl  SQL > SELECT @@port;
+--------+
| @@port |
+--------+
|   3307 |
+--------+
1 row in set (0.00 sec)
 MySQL  localhost:6446 ssl  SQL > js
Switching to JavaScript mode...
 MySQL  localhost:6446 ssl  JS > dba.killSandboxInstance(3307)
The MySQL sandbox instance on this host in
/Users/cbell/mysql-sandboxes/3307 will be killed
Killing MySQL instance...
Instance localhost:3307 successfully killed.
 MySQL  localhost:6446 ssl  JS > sql
Switching to SQL mode... Commands end with ;
 MySQL  localhost:6446 ssl  SQL > SELECT @@port;
ERROR: 2006 (HY000): MySQL server has gone away
The global session got disconnected.
Attempting to reconnect to 'root@localhost:6446'..
The global session was successfully reconnected.
 MySQL  localhost:6446 ssl  SQL > SELECT @@port;
+--------+
| @@port |
+--------+
|   3308 |
+--------+
1 row in set (0.00 sec)
 MySQL  localhost:6446 ssl  SQL > quit
Bye!
Listing 1-10

Fault Tolerance Demonstration

Note that although the shell had lost the connection it automatically reconnected so that we can retry the command. Very nice.

Finally, let’s discover how to put the instance that failed back into service. In this case, we simulate recovering a downed server adding it back to the cluster where Group Replication ensures that the new server becomes consistent by applying any missing transactions. Listing 1-11 shows the commands you can use to recover the server.

$ mysqlsh --uri root@localhost:6446
 MySQL  localhost:6446 ssl  JS > dba.startSandboxInstance(3307)
The MySQL sandbox instance on this host in
/Users/cbell/mysql-sandboxes/3307 will be started
Starting MySQL instance...
Instance localhost:3307 successfully started.
 MySQL  localhost:6446 ssl  JS > my_cluster = dba.getCluster('my_cluster')
<Cluster:my_cluster>
MySQL  localhost:6446 ssl  JS > my_cluster.rejoinInstance('root@localhost:3307')
Rejoining the instance to the InnoDB cluster. Depending on the original
problem that made the instance unavailable, the rejoin operation might not be
successful and further manual steps will be needed to fix the underlying
problem.
Please monitor the output of the rejoin operation and take necessary action if the instance cannot rejoin.
Please provide the password for 'root@localhost:3307':
Rejoining instance to the cluster ...
The instance 'root@localhost:3307' was successfully rejoined on the cluster.
The instance 'localhost:3307' was successfully added to the MySQL Cluster.
 MySQL  localhost:6446 ssl  JS > q
Bye!
Listing 1-11

Recovering a Lost Server

It is clear that using the shell to setup and manage a cluster is a lot easier than setting up and managing a standard Group Replication setup. In particular, you don’t have to manually configure replication! Better still, should a server fail, you don’t have to worry about reconfiguring your application or the topology to ensure the solution remains viable—InnoDB Cluster does this automatically for you.

To learn more about InnoDB Cluster, see the online documentation at https://dev.mysql.com/doc/mysql-innodb-cluster/en/ .

Summary

MySQL has come a long way since the days when developers downloaded the code, modified it, and put it into use on their rapidly developed platforms. As one who has watched and participated in its evolution, it is with some pride that I look back on the bad old days and see just how far MySQL has come.

The journey hasn’t been easy. The engineering team alone has weathered two acquisitions (Sun Microsystems and Oracle) in rapid succession and a host of smaller team development and minor personnel changes. Through all of this, the engineering team continued to improve features and add new technologies remaining dedicated to making MySQL the best possible solution.

Users also have grown in how they use MySQL from stand alone, single database server installations to massive high availability server farms. Through all of this, the MySQL product has remained poised for something greater. Now, with MySQL 8.0, Oracle has shown its hand and it’s loaded with top-notch technologies. Indeed, the MySQL world is poised to discover new ways to leverage MySQL in a yet unknown variety of methods. I am certain by the time you read this book you will have your own ideas of how to revamp your use of MySQL.

In this chapter, we explored some of the highlights of the new MySQL server version 8.0. We discovered those features originally introduced in earlier versions that have been adapted to the new paradigm that is version 8.0, features that are new, and those new features that are truly revolutionary such as the document store, Group Replication, and InnoDB Cluster.

In Chapter 2, I take a short detour into a brief primer on installing and using MySQL. If you have not used MySQL before or any form of a relational database system, Chapter 2 will prepare you for how MySQL works in the more traditional manner via SQL commands. If you have been using older versions of MySQL, you may still want to skim the chapter to learn how to install and configure MySQL 8 for use with the document store. I discuss more about the MySQL Shell in Chapter 4 and upgrading to MySQL 8 in Chapter 10.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
3.139.67.5