Chapter 1. MySQL Architecture

MySQL’s architectural characteristics make it useful for a wide range of purposes. Although it is not perfect, it is flexible enough to work well in both small and large environments, ranging from a personal website up to large-scale enterprise applications. To get the most from MySQL, you need to understand its design so that you can work with it, not against it.

This chapter provides a high-level overview of the MySQL server architecture, the major differences between the storage engines, and why those differences are important. We’ve tried to explain MySQL by simplifying the details and showing examples. This discussion will be useful for those new to database servers as well as readers who are experts with other database servers.

MySQL’s Logical Architecture

A good mental picture of how MySQL’s components work together will help you understand the server. Figure 1-1 shows a logical view of MySQL’s architecture.

The topmost layer contains the services that aren’t unique to MySQL. They’re services most network-based client/server tools or servers need: connection handling, authentication, security, and so forth.

Figure 1-1. A logical view of the MySQL server architecture

The second layer is where things get interesting. Much of MySQL’s brains are here, including the code for query parsing, analysis, optimization, caching, and all the built-in functions (e.g., dates, times, math, and encryption). Any functionality provided across storage engines lives at this level: stored procedures, triggers, and views, for example.

The third layer contains the storage engines. They are responsible for storing and retrieving all data stored “in” MySQL. Like the various file systems available for GNU/Linux, each storage engine has its own benefits and drawbacks. The server communicates with them through the storage engine API. This interface hides differences between storage engines and makes them largely transparent at the query layer. The API contains a couple of dozen low-level functions that perform operations such as “begin a transaction” or “fetch the row that has this primary key.” The storage engines don’t parse SQL1 or communicate with each other; they simply respond to requests from the server.

Connection Management and Security

By default, each client connection gets its own thread within the server process. The connection’s queries execute within that single thread, which in turn resides on one core or CPU. The server caches threads, so they don’t need to be created and destroyed for each new connection.2

When clients (applications) connect to the MySQL server, the server needs to authenticate them. Authentication is based on username, originating host, and password. X.509 certificates can also be used across a TLS (Transport Layer Security) connection. Once a client has connected, the server verifies whether the client has privileges for each query it issues (e.g., whether the client is allowed to issue a SELECT statement that accesses the Country table in the world database).
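
For example, an account that can connect only from one application subnet and read a single table might be set up as follows; the username, network range, and password here are hypothetical:

mysql> CREATE USER 'app_ro'@'10.0.%' IDENTIFIED BY 'example-password';
mysql> GRANT SELECT ON world.Country TO 'app_ro'@'10.0.%';
mysql> ALTER USER 'app_ro'@'10.0.%' REQUIRE X509;  -- require a client X.509 certificate over TLS

With this grant in place, the server will reject queries from this account against any other table.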

Optimization and Execution

MySQL parses queries to create an internal structure (the parse tree), and then applies a variety of optimizations. These can include rewriting the query, determining the order in which it will read tables, choosing which indexes to use, and so on. You can pass hints to the optimizer through special keywords in the query, affecting its decision-making process. You can also ask the server to explain various aspects of optimization. This lets you know what decisions the server is making and gives you a reference point for reworking queries, schemas, and settings to make everything run as efficiently as possible.
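
For example, EXPLAIN shows the plan the optimizer has chosen, and an index hint nudges its decision. This sketch assumes the world sample database mentioned above, whose City table has an index named CountryCode:

mysql> EXPLAIN SELECT Name FROM world.City WHERE CountryCode = 'IND';
mysql> SELECT Name FROM world.City FORCE INDEX (CountryCode) WHERE CountryCode = 'IND';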

The optimizer does not really care what storage engine a particular table uses, but the storage engine does affect how the server optimizes the query. The optimizer asks the storage engine about some of its capabilities and the cost of certain operations, and for statistics on the table data. For instance, some storage engines support index types that can be helpful to certain queries. You can read more about indexing and schema optimization in Chapter 8.

In older versions, MySQL used an internal query cache to see if it could serve query results from there. However, as concurrency increased, the query cache became a notorious bottleneck. As of MySQL 5.7.20 the query cache was officially deprecated as a MySQL feature, and in the 8.0 release it was fully removed. Even though the query cache is no longer a core part of the MySQL server, caching frequently served result sets is still a good practice. In Chapter 5 we will go over ways you can implement such a cache using different technologies.

Concurrency Control

Anytime more than one query needs to change data at the same time, the problem of concurrency control arises. For our purposes in this chapter, MySQL has to do this at two levels: the server level and the storage engine level. Concurrency control is a big topic to which a large body of theoretical literature is devoted, so we will just give you a simplified overview of how MySQL deals with concurrent readers and writers, so you have the context you need for the rest of this chapter.

To illustrate how MySQL handles concurrent work on the same set of data, we will use a traditional spreadsheet file as an example. A spreadsheet consists of rows and columns, much like a database table. Assume the file is on your laptop and only you have access to it: there are no potential conflicts, because only you can make changes to the file. Now imagine you need to collaborate with a coworker on that spreadsheet, which now lives on a shared server that both of you can access. What happens when both of you need to make changes to this file at the same time? What if an entire team of people is actively trying to edit, add, and remove cells from this spreadsheet? We could say that they should take turns making changes, but that is not efficient. We need an approach that allows concurrent access to a high-volume spreadsheet.

Read/Write Locks

Reading from the spreadsheet isn’t as troublesome. There’s nothing wrong with multiple clients reading the same file simultaneously; because they aren’t making changes, nothing is likely to go wrong. What happens if someone tries to delete cell number A25 while others are reading the spreadsheet? It depends, but a reader could come away with a corrupted or inconsistent view of the data. So, to be safe, even reading from a spreadsheet requires special care.

If you think of the spreadsheet as a database table, it’s easy to see that the problem is the same in this context. In many ways, a spreadsheet is really just a simple database table. Modifying rows in a database table is very similar to removing or changing the content of cells in a spreadsheet file.

The solution to this classic problem of concurrency control is rather simple. Systems that deal with concurrent read/write access typically implement a locking system that consists of two lock types. These locks are usually known as shared locks and exclusive locks, or read locks and write locks.

Without worrying about the actual locking mechanism, we can describe the concept as follows. Read locks on a resource are shared, or mutually non-blocking: many clients can read from a resource at the same time and not interfere with each other. Write locks, on the other hand, are exclusive—i.e., they block both read locks and other write locks—because the only safe policy is to have a single client writing to the resource at a given time and to prevent all reads when a client is writing.

In the database world, locking happens all the time: MySQL has to prevent one client from reading a piece of data while another is changing it. It performs this lock management internally in a way that is transparent much of the time.

Lock Granularity

One way to improve the concurrency of a shared resource is to be more selective about what you lock. Rather than locking the entire resource, lock only the part that contains the data you need to change. Better yet, lock only the exact piece of data you plan to change. Minimizing the amount of data that you lock at any one time lets changes to a given resource occur simultaneously, as long as they don’t conflict with each other.

Unfortunately, locks are not free; they consume resources. Every lock operation—getting a lock, checking to see whether a lock is free, releasing a lock, and so on—has overhead. If the system spends too much time managing locks instead of storing and retrieving data, performance can suffer.

A locking strategy is a compromise between lock overhead and data safety, and that compromise affects performance. Most commercial database servers don’t give you much choice: you get what is known as row-level locking in your tables, with a variety of often complex ways to give good performance with many locks.

MySQL, on the other hand, does offer choices. Its storage engines can implement their own locking policies and lock granularities. Lock management is a very important decision in storage engine design; fixing the granularity at a certain level can improve performance for certain uses, yet make that engine less suited for other purposes. Because MySQL offers multiple storage engines, it doesn’t require a single general-purpose solution. Let’s have a look at the two most important lock strategies.

Table locks

The most basic locking strategy available in MySQL, and the one with the lowest overhead, is table locks. A table lock is analogous to the spreadsheet locks described earlier: it locks the entire table. When a client wishes to write to a table (insert, delete, update, etc.), it acquires a write lock. This keeps all other read and write operations at bay. When nobody is writing, readers can obtain read locks, which don’t conflict with other read locks.

Table locks have variations for improved performance in specific situations. For example, READ LOCAL table locks allow some types of concurrent write operations. Write locks also have a higher priority than read locks, so a request for a write lock will advance to the front of the lock queue even if readers are already in the queue (write locks can advance past read locks in the queue, but read locks cannot advance past write locks).
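
As a sketch of these semantics, here is the server-level LOCK TABLES command (covered later in this chapter) applied to the Country table from earlier; this is purely for illustration, and the row being updated is hypothetical:

mysql> LOCK TABLES Country READ;    -- shared: other sessions can still acquire read locks
mysql> SELECT COUNT(*) FROM Country;
mysql> UNLOCK TABLES;

mysql> LOCK TABLES Country WRITE;   -- exclusive: blocks all other readers and writers
mysql> UPDATE Country SET Population = Population + 1 WHERE Code = 'ZZZ';
mysql> UNLOCK TABLES;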

Although storage engines can manage their own locks, MySQL itself also uses a variety of locks that are effectively table-level for various purposes. For instance, the server uses a table-level lock for statements such as ALTER TABLE, regardless of the storage engine.3

Row locks

The locking style that offers the greatest concurrency (and carries the greatest overhead) is the use of row locks. Going back to the spreadsheet analogy, row locks would be the same as locking just one row in the spreadsheet. This strategy allows multiple people to edit different rows concurrently without blocking one another. It enables the server to accept more concurrent writes, but the cost is more overhead: the server must keep track of who holds each row lock, how long it has been held, and what kind of row lock it is, and it must clean up locks when they are no longer needed.

Row locks are implemented in the storage engine, not the server. The server is mostly4 unaware of locks implemented in the storage engines, and as you’ll see later in this chapter and throughout the book, the storage engines all implement locking in their own ways.

Transactions

You can’t examine the more advanced features of a database system for very long before transactions enter the mix. A transaction is a group of SQL queries that are treated atomically, as a single unit of work. If the database engine can apply the entire group of queries to a database, it does so, but if any of them can’t be done because of a crash or other reason, none of them is applied. It’s all or nothing.

Little of this section is specific to MySQL. If you’re already familiar with ACID transactions, feel free to skip ahead to “Transactions in MySQL”.

A banking application is the classic example of why transactions are necessary. Imagine a bank’s database with two tables: checking and savings. To move $200 from Jane’s checking account to her savings account, you need to perform at least three steps:

  1. Make sure her checking account balance is greater than $200.
  2. Subtract $200 from her checking account balance.
  3. Add $200 to her savings account balance.

The entire operation should be wrapped in a transaction so that if any one of the steps fails, any completed steps can be rolled back.

You start a transaction with the START TRANSACTION statement and then either make its changes permanent with COMMIT or discard the changes with ROLLBACK. So, the SQL for our sample transaction might look like this:

1 START TRANSACTION;
2 SELECT balance FROM checking WHERE customer_id = 10233276;
3 UPDATE checking SET balance = balance - 200.00 WHERE customer_id = 10233276;
4 UPDATE savings SET balance = balance + 200.00 WHERE customer_id = 10233276;
5 COMMIT;

But transactions alone aren’t the whole story. What happens if the database server crashes while performing line 4? Who knows? The customer probably just lost $200. And what if another process comes along between lines 3 and 4 and removes the entire checking account balance? The bank has given the customer a $200 credit without even knowing it.

Transactions aren’t enough unless the system passes the ACID test. ACID stands for Atomicity, Consistency, Isolation, and Durability. These are tightly related criteria that a well-behaved transaction processing system must meet:

Atomicity
A transaction must function as a single indivisible unit of work so that the entire transaction is either applied or rolled back. When transactions are atomic, there is no such thing as a partially completed transaction: it’s all or nothing.
Consistency
The database should always move from one consistent state to the next. In our example, consistency ensures that a crash between lines 3 and 4 doesn’t result in $200 disappearing from the checking account. Because the transaction is never committed, none of the transaction’s changes are ever reflected in the database.
Isolation
The results of a transaction are usually invisible to other transactions until the transaction is complete. This ensures that if a bank account summary runs after line 3 but before line 4 in our example, it will still see the $200 in the checking account. When we discuss isolation levels, you’ll understand why we said usually invisible.
Durability
Once committed, a transaction’s changes are permanent. This means the changes must be recorded such that data won’t be lost in a system crash. Durability is a slightly fuzzy concept, however, because there are actually many levels. Some durability strategies provide a stronger safety guarantee than others, and nothing is ever 100% durable (if the database itself were truly durable, then how could backups increase durability?). We discuss what durability really means in MySQL in Chapter 6.

ACID transactions, and the guarantees provided through them in the InnoDB engine specifically, are among the strongest and most mature features in MySQL. While they come with certain throughput trade-offs, when applied appropriately they can save you from implementing a lot of complex logic in the application layer.

Isolation Levels

Isolation is more complex than it looks. The ANSI SQL standard defines four isolation levels. If you are new to the world of databases, we highly recommend you get familiar with the general ANSI SQL standard5 before coming back to read about the specific MySQL implementation. The goal of this standard is to define the rules for which changes are and aren’t visible inside and outside a transaction. Lower isolation levels typically allow higher concurrency and have lower overhead.

Note

Each storage engine implements isolation levels slightly differently, and they don’t necessarily match what you might expect if you’re used to another database product (thus, we won’t go into exhaustive detail in this section). You should read the manuals for whichever storage engines you decide to use.

Let’s take a quick look at the four isolation levels:

READ UNCOMMITTED
In the READ UNCOMMITTED isolation level, transactions can view the results of uncommitted transactions. At this level, many problems can occur unless you really, really know what you are doing and have a good reason for doing it. This level is rarely used in practice, because its performance isn’t much better than the other levels, which have many advantages. Reading uncommitted data is also known as a dirty read.
READ COMMITTED
The default isolation level for most database systems (but not MySQL!) is READ COMMITTED. It satisfies the simple definition of isolation used earlier: a transaction will see only those changes made by transactions that were already committed when it began, and its changes won’t be visible to others until it has committed. This level still allows what’s known as a non-repeatable read. This means you can run the same statement twice and see different data.
REPEATABLE READ
REPEATABLE READ solves the problems that READ UNCOMMITTED allows. It guarantees that any rows a transaction reads will “look the same” in subsequent reads within the same transaction, but in theory it still allows another tricky problem: phantom reads. Simply put, a phantom read can happen when you select some range of rows, another transaction inserts a new row into the range, and then you select the same range again; you will then see the new “phantom” row. InnoDB and XtraDB solve the phantom read problem with multiversion concurrency control, which we explain later in this chapter. REPEATABLE READ is MySQL’s default transaction isolation level.
SERIALIZABLE
The highest level of isolation, SERIALIZABLE, solves the phantom read problem by forcing transactions to be ordered so that they can’t possibly conflict. In a nutshell, SERIALIZABLE places a lock on every row it reads. At this level, a lot of timeouts and lock contention can occur. We’ve rarely seen people use this isolation level, but your application’s needs might force you to accept the decreased concurrency in favor of the data stability that results.

Table 1-1 summarizes the various isolation levels and the drawbacks associated with each one.

Table 1-1. ANSI SQL isolation levels
Isolation level    Dirty reads possible   Non-repeatable reads possible   Phantom reads possible   Locking reads
READ UNCOMMITTED   Yes                    Yes                             Yes                      No
READ COMMITTED     No                     Yes                             Yes                      No
REPEATABLE READ    No                     No                              Yes                      No
SERIALIZABLE       No                     No                              No                       Yes
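
To make the non-repeatable read row in Table 1-1 concrete, here is a sketch of two sessions operating on the checking table from the banking example, assuming the account starts with a balance of 200.00:

-- Session A
mysql> SET SESSION TRANSACTION ISOLATION LEVEL READ COMMITTED;
mysql> START TRANSACTION;
mysql> SELECT balance FROM checking WHERE customer_id = 10233276;  -- returns 200.00

-- Session B, meanwhile, commits a change
mysql> UPDATE checking SET balance = 0 WHERE customer_id = 10233276;

-- Session A runs the same statement again
mysql> SELECT balance FROM checking WHERE customer_id = 10233276;  -- now returns 0
mysql> COMMIT;

Under REPEATABLE READ, the second SELECT in session A would still return 200.00.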

Deadlocks

A deadlock is when two or more transactions are mutually holding and requesting locks on the same resources, creating a cycle of dependencies. Deadlocks occur when transactions try to lock resources in a different order. They can happen whenever multiple transactions lock the same resources. For example, consider these two transactions running against the StockPrice table:

Example 1-1. First Transaction

START TRANSACTION;
UPDATE StockPrice SET close = 45.50 WHERE stock_id = 4 AND date = '2020-05-01';
UPDATE StockPrice SET close = 19.80 WHERE stock_id = 3 AND date = '2020-05-02';
COMMIT;

Example 1-2. Second Transaction

START TRANSACTION;
UPDATE StockPrice SET high = 20.12 WHERE stock_id = 3 AND date = '2020-05-02';
UPDATE StockPrice SET high = 47.20 WHERE stock_id = 4 AND date = '2020-05-01';
COMMIT;

Each transaction will execute its first query and update a row of data, locking it in the process. Each transaction will then attempt to update its second row, only to find that it is already locked. The two transactions will wait forever for each other to complete, unless something intervenes to break the deadlock.

To combat this problem, database systems implement various forms of deadlock detection and timeouts. The more sophisticated systems, such as the InnoDB storage engine, will notice circular dependencies and return an error instantly. This can be a good thing—otherwise, deadlocks would manifest themselves as very slow queries. Others will give up after the query exceeds a lock wait timeout, which is not always good. The way InnoDB currently handles deadlocks is to roll back the transaction that has the fewest exclusive row locks (an approximate metric for which transaction will be the easiest to roll back).
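
When InnoDB detects the cycle, the losing transaction is rolled back and its session receives an error; the surviving transaction proceeds. The failed statement looks something like this:

mysql> UPDATE StockPrice SET high = 47.20 WHERE stock_id = 4 AND date = '2020-05-01';
ERROR 1213 (40001): Deadlock found when trying to get lock; try restarting transaction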

Lock behavior and order are storage engine–specific, so some storage engines might deadlock on a certain sequence of statements even though others won’t. Deadlocks have a dual nature: some are unavoidable because of true data conflicts, and some are caused by how a storage engine works.

Once they occur, deadlocks cannot be broken without rolling back one of the transactions, either partially or wholly. They are a fact of life in transactional systems, and your applications should be designed to handle them. Many applications can simply retry their transactions from the beginning.

Transaction Logging

Transaction logging helps make transactions more efficient. Instead of updating the tables on disk each time a change occurs, the storage engine can change its in-memory copy of the data. This is very fast. The storage engine can then write a record of the change to the transaction log, which is on disk and therefore durable. This is also a relatively fast operation, because appending log events involves sequential I/O in one small area of the disk instead of random I/O in many places. Then, at some later time, a process can update the table on disk. Thus, most storage engines that use this technique (known as write-ahead logging) end up writing the changes to disk twice.

If there’s a crash after the update is written to the transaction log but before the changes are made to the data itself, the storage engine can still recover the changes upon restart. The recovery method varies between storage engines.
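
In InnoDB, for instance, the redo log plays this write-ahead role, and a server variable controls how aggressively the log is flushed to disk at commit; consult the manual for the exact semantics in your version:

mysql> SHOW VARIABLES LIKE 'innodb_flush_log_at_trx_commit';
-- 1 (the default) flushes the log to disk at every commit: the most durable setting.
-- 0 and 2 defer flushing for more throughput, at the cost of losing up to roughly
-- a second of committed transactions in a crash.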

Transactions in MySQL

Storage engines are the software that drives how data is stored on and retrieved from disk. While MySQL has traditionally offered a number of storage engines that support transactions, InnoDB is now the gold standard and the recommended engine to use. The transaction primitives described here are based on transactions in the InnoDB engine.

AUTOCOMMIT

By default, a single INSERT, UPDATE, or DELETE statement is implicitly wrapped in a transaction and committed immediately. This is known as AUTOCOMMIT mode. By disabling this mode, you can execute a series of statements within a transaction and, at its conclusion, COMMIT or ROLLBACK them.

You can enable or disable the AUTOCOMMIT variable for the current connection by using a SET command. The values 1 and ON are equivalent, as are 0 and OFF. When you run with AUTOCOMMIT=0, you are always in a transaction until you issue a COMMIT or ROLLBACK; MySQL then starts a new transaction immediately. Additionally, with AUTOCOMMIT enabled, you can begin a multistatement transaction by using the keyword BEGIN or START TRANSACTION. Changing the value of AUTOCOMMIT has no effect on nontransactional tables, which have no notion of committing or rolling back changes.
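
Here is a short sketch of toggling AUTOCOMMIT for a session, reusing the checking table from the earlier banking example:

mysql> SET AUTOCOMMIT = 0;  -- OFF also works; statements now accumulate in an open transaction
mysql> UPDATE checking SET balance = balance - 10 WHERE customer_id = 10233276;
mysql> ROLLBACK;            -- the update is discarded
mysql> SET AUTOCOMMIT = 1;  -- back to committing every statement immediately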

Certain commands, when issued during an open transaction, cause MySQL to commit the transaction before they execute. These are typically Data Definition Language (DDL) commands that make significant changes, such as ALTER TABLE, but LOCK TABLES and some other statements also have this effect. Check your version’s documentation for the full list of commands that automatically commit a transaction.

MySQL lets you set the isolation level using the SET TRANSACTION ISOLATION LEVEL command, which takes effect when the next transaction starts. You can set the isolation level for the whole server in the configuration file, or just for your session:

mysql> SET SESSION TRANSACTION ISOLATION LEVEL READ COMMITTED;

MySQL recognizes all four ANSI standard isolation levels, and InnoDB supports all of them.

Mixing storage engines in transactions

MySQL doesn’t manage transactions at the server level. Instead, the underlying storage engines implement transactions themselves. This means you can’t reliably mix different engines in a single transaction.

If you mix transactional and nontransactional tables (for instance, InnoDB and MyISAM tables) in a transaction, the transaction will work properly if all goes well.

However, if a rollback is required, the changes to the nontransactional table can’t be undone. This leaves the database in an inconsistent state from which it might be difficult to recover, and it renders the entire point of transactions moot. This is why it is really important to pick the right storage engine for each table and to avoid mixing storage engines in your application logic at all costs.

MySQL will not usually warn you or raise errors if you do transactional operations on a nontransactional table. Sometimes rolling back a transaction will generate the warning “Some non-transactional changed tables couldn’t be rolled back,” but most of the time, you’ll have no indication you’re working with nontransactional tables.

Warning

It is best practice to not mix storage engines in your application. Failed transactions can lead to inconsistent results as some parts can roll back and others cannot.

Implicit and explicit locking

InnoDB uses a two-phase locking protocol. It can acquire locks at any time during a transaction, but it does not release them until a COMMIT or ROLLBACK. It releases all the locks at the same time. The locking mechanisms described earlier are all implicit. InnoDB handles locks automatically, according to your isolation level.

However, InnoDB also supports explicit locking, which the SQL standard does not mention at all:6

SELECT ... FOR SHARE
SELECT ... FOR UPDATE
(SELECT ... FOR SHARE is a MySQL 8.0 feature that replaces the SELECT ... LOCK IN SHARE MODE of previous versions.)
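
For example, the classic pattern for a read-modify-write on the checking table from earlier locks the row up front so that no other transaction can change it in between:

mysql> START TRANSACTION;
mysql> SELECT balance FROM checking WHERE customer_id = 10233276 FOR UPDATE;  -- exclusive row lock
mysql> UPDATE checking SET balance = balance - 200.00 WHERE customer_id = 10233276;
mysql> COMMIT;  -- all locks are released here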

MySQL also supports the LOCK TABLES and UNLOCK TABLES commands, which are implemented in the server, not in the storage engines. These have their uses, but they are not a substitute for transactions. If you need transactions, use a transactional storage engine. Because InnoDB supports row-level locking, LOCK TABLES is unnecessary.

Tip

The interaction between LOCK TABLES and transactions is complex, and there are unexpected behaviors in some server versions. Therefore, we recommend that you never use LOCK TABLES unless you are in a transaction and AUTOCOMMIT is disabled, no matter what storage engine you are using.

Multiversion Concurrency Control

Most of MySQL’s transactional storage engines don’t use a simple row-locking mechanism. Instead, they use row-level locking in conjunction with a technique for increasing concurrency known as multiversion concurrency control (MVCC). MVCC is not unique to MySQL: Oracle, PostgreSQL, and some other database systems use it too, although there are significant differences because there is no standard for how MVCC should work.

You can think of MVCC as a twist on row-level locking; it avoids the need for locking at all in many cases and can have much lower overhead. Depending on how it is implemented, it can allow nonlocking reads, while locking only the necessary rows during write operations.

MVCC works by keeping a snapshot of the data as it existed at some point in time. This means transactions can see a consistent view of the data, no matter how long they run. It also means different transactions can see different data in the same tables at the same time! If you’ve never experienced this before, it might be confusing, but it will become easier to understand with familiarity.
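
As a sketch of this effect, consider two sessions operating on the StockPrice table from the deadlock example, at InnoDB’s default REPEATABLE READ level (the row counts here are hypothetical):

-- Session A
mysql> START TRANSACTION;
mysql> SELECT COUNT(*) FROM StockPrice;  -- say it returns 1000

-- Session B inserts and commits a new row
mysql> INSERT INTO StockPrice (stock_id, date, close) VALUES (5, '2020-05-03', 12.00);

-- Session A still reads from its consistent snapshot
mysql> SELECT COUNT(*) FROM StockPrice;  -- still 1000
mysql> COMMIT;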

Each storage engine implements MVCC differently. Some of the variations include optimistic and pessimistic concurrency control. We’ll illustrate one way MVCC works by explaining a simplified version of InnoDB’s behavior.

InnoDB implements MVCC by storing with each row two additional, hidden values that record when the row was created and when it was expired (or deleted). Rather than storing the actual times at which these events occurred, the row stores the system version number at the time each event occurred. This is a number that increments each time a transaction begins. Each transaction keeps its own record of the current system version, as of the time it began. Each query has to check each row’s version numbers against the transaction’s version. Let’s see how this applies to particular operations when the transaction isolation level is set to REPEATABLE READ:

SELECT
InnoDB must examine each row to ensure that it meets two criteria: (1) InnoDB must find a version of the row that is at least as old as the transaction (i.e., its version must be less than or equal to the transaction’s version), which ensures that either the row existed before the transaction began or the transaction created or altered the row; and (2) the row’s deletion version must be undefined or greater than the transaction’s version, which ensures that the row wasn’t deleted before the transaction began. Rows that pass both tests may be returned as the query’s result.
INSERT
InnoDB records the current system version number with the new row.
DELETE
InnoDB records the current system version number as the row’s deletion ID.
UPDATE
InnoDB writes a new copy of the row, using the system version number for the new row’s version. It also writes the system version number as the old row’s deletion version.

The result of all this extra record keeping is that most read queries never acquire locks. They simply read data as fast as they can, making sure to select only rows that meet the criteria. The drawbacks are that the storage engine has to store more data with each row, do more work when examining rows, and handle some additional housekeeping operations.

MVCC works only with the REPEATABLE READ and READ COMMITTED isolation levels. READ UNCOMMITTED isn’t MVCC-compatible7 because queries don’t read the row version that’s appropriate for their transaction version; they read the newest version, no matter what. SERIALIZABLE isn’t MVCC-compatible because reads lock every row they return.

Replication

MySQL is designed to accept writes on one node at any given time. This has advantages in managing consistency but leads to trade-offs when you need the data written to multiple servers or multiple locations. MySQL offers a native way to distribute the writes that one node takes to additional nodes, referred to as replication. In MySQL, replication works on a “pull” model, meaning replicas periodically query the source host for the latest binary log position and pull binary logs as needed. Figure 1-2 shows a simple example of this setup, usually called a topology tree, with multiple MySQL servers in a source and replica arrangement.

Figure 1-2. A simplified view of a MySQL server replication topology

For any data you run in production, you should use replication and have at least three replicas in addition to the source, ideally distributed across different locations (in cloud-hosted environments, known as regions) for disaster recovery planning.

Over the years, replication in MySQL has gained more sophistication, with global transaction identifiers, multisource replication, parallel replication on replicas, and semisynchronous replication being some of the major updates. We will cover replication in a lot more detail in Chapter 9.
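
As a rough sketch, pointing a replica at its source looks like this in MySQL 8.0.23 and newer (older versions use CHANGE MASTER TO and START SLAVE; the host and credentials here are hypothetical):

mysql> CHANGE REPLICATION SOURCE TO
    ->   SOURCE_HOST = 'source.example.com',
    ->   SOURCE_USER = 'repl',
    ->   SOURCE_PASSWORD = 'example-password',
    ->   SOURCE_AUTO_POSITION = 1;  -- position via global transaction identifiers
mysql> START REPLICA;
mysql> SHOW REPLICA STATUS\G        -- verify the replica is pulling and applying binlogs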

MySQL’s Storage Engines

This section gives an overview of MySQL’s storage engines. The storage engine is the layer in MySQL that handles the persistence of your data. The different engines in MySQL mean you can choose different behaviors based on your application requirements; we will talk about choosing engines in the next chapter. Since engine support and how MySQL stores information about tables changed dramatically between 5.7 and 8.0, we will refer explicitly to version names as we go. Even this book, though, isn’t a complete source of documentation; you should read the MySQL manuals for the storage engines you decide to use.

In versions before 8.0, MySQL stored each database (also called a schema) as a subdirectory of its data directory in the underlying filesystem. When you created a table, MySQL stored the table definition in an .frm file and the data in an .ibd file. Both files are named after the table they represent. Thus, when you create a table named MyTable, MySQL stores the table definition in MyTable.frm and the data in MyTable.ibd. If you used partitions, you would also see MyTable.par.

In version 8.0, MySQL redesigned table metadata into a data dictionary that is included with a table’s .ibd file. This makes information on the table structure support transactions and atomic data definition changes. Instead of relying only on information_schema for retrieving table definitions and metadata during operations, we are introduced to the dictionary object cache, an LRU-based in-memory cache of partition definitions, table definitions, stored program definitions, and charset and collation information. This major change in how the server accesses metadata about tables reduces I/O and is especially efficient if a subset of tables sees most of the activity and is therefore in the cache most often. The .frm files are replaced with serialized dictionary information (.sdi) stored with each table.

Now that we have covered how table metadata and file structure have changed between the recent major versions, we’ll go into an overview of the most recommended and most used native storage engine in MySQL: InnoDB.

The InnoDB Engine

InnoDB is the default transactional storage engine for MySQL and the most important and broadly useful engine overall. It was designed for processing many short-lived transactions that usually complete rather than being rolled back. Its performance and automatic crash recovery make it popular for nontransactional storage needs, too. If you want to study storage engines, it is well worth your time to study InnoDB in depth to learn as much as you can about it, rather than studying all storage engines equally.

Note

It is best practice to use the InnoDB storage engine as the default engine for any application.

InnoDB is the default MySQL general-purpose storage engine. By default, InnoDB stores its data in a series of data files that are collectively known as a tablespace. A tablespace is essentially a black box that InnoDB manages all by itself.

InnoDB uses MVCC to achieve high concurrency, and it implements all four SQL standard isolation levels. It defaults to the REPEATABLE READ isolation level, and it has a next-key locking strategy that prevents phantom reads in this isolation level: rather than locking only the rows you’ve touched in a query, InnoDB locks gaps in the index structure as well, preventing phantoms from being inserted.

InnoDB tables are built on a clustered index, which we will cover in detail in Chapter 8 when we discuss schema design. InnoDB’s index structures are very different from those of most other MySQL storage engines. As a result, it provides very fast primary key lookups. However, secondary indexes (indexes that aren’t the primary key) contain the primary key columns, so if your primary key is large, other indexes will also be large. You should strive for a small primary key if you’ll have many indexes on a table.

InnoDB has a variety of internal optimizations. These include predictive read-ahead for prefetching data from disk, an adaptive hash index that automatically builds hash indexes in memory for very fast lookups, and an insert buffer to speed inserts. We cover these later in this book.

InnoDB’s behavior is very intricate, and we highly recommend reading the “InnoDB Locking and Transaction Model” section of the MySQL manual if you’re using InnoDB. There are many subtleties you should be aware of before building an application with InnoDB, because of its MVCC architecture. Working with a storage engine that maintains consistent views of the data for all users, even when some users are changing data, can be complex.

As a transactional storage engine, InnoDB supports truly “hot” online backups through a variety of mechanisms, including Oracle’s proprietary MySQL Enterprise Backup and the open source Percona XtraBackup.

Beginning with MySQL 5.6, InnoDB introduced online DDL, which at first had limited use cases that expanded in the 5.7 and 8.0 releases. In-place schema changes allow specific table changes without a full table lock and without using external tools, which greatly improves the operability of MySQL InnoDB tables. We will cover options for online schema changes, both native and external tools, in Chapter 11.

JSON document support

One of the most notable additions to MySQL in 5.7 and 8.0 is extensive support for a JSON data type that goes further than simple BLOB storage. Long requested, and for a long while a feature of other relational database offerings such as PostgreSQL, this rich data type makes a class of business needs a first-class citizen in MySQL and brings the guarantees of the InnoDB storage engine to a fast-growing class of data modeling.

First introduced to InnoDB as part of the 5.7 release, the JSON type arrived with automatic validation of JSON documents and optimized storage that allows quick read access, a significant improvement over the trade-offs of the old-style BLOB storage that engineers used to resort to for JSON documents.

Along with the new datatype support, InnoDB also introduced SQL functions to support rich operations on JSON documents.

A further improvement in MySQL 8.0.17 adds the ability to define multivalued indexes on JSON arrays. This feature can further speed up read queries against JSON types by matching common access patterns to functions that can map the JSON document values.
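
Putting these pieces together, here is a hedged sketch; the table and document shape are hypothetical:

mysql> CREATE TABLE events (
    ->   id  BIGINT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
    ->   doc JSON NOT NULL
    -> );
mysql> INSERT INTO events (doc)
    ->   VALUES ('{"type": "login", "tags": ["web", "mobile"]}');  -- invalid JSON is rejected
mysql> SELECT doc->>'$.type' FROM events;                          -- JSON path extraction
mysql> ALTER TABLE events
    ->   ADD INDEX tags_idx ( (CAST(doc->'$.tags' AS CHAR(32) ARRAY)) );  -- multivalued index, 8.0.17+
mysql> SELECT id FROM events WHERE 'web' MEMBER OF (doc->'$.tags');       -- can use tags_idx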

Other Built-in MySQL Engines

MySQL has a variety of special-purpose storage engines. Many of them are deprecated in newer versions, for various reasons. Some of these are still available on the server, but must be enabled specially.

The MyISAM engine

MyISAM was MySQL’s first storage engine and remained the default until the release of MySQL 5.6 in 2013, when InnoDB replaced it as the default. MyISAM does not support transactions or row-level locks, which makes for a much simpler concurrency model (only one write can ever happen at a given time) but much more limited use in modern applications. In MySQL 8.0, the server’s own system tables, which had long used MyISAM, were moved to InnoDB, although the MyISAM engine itself remains available.

The Archive engine

The Archive engine supports only INSERT and SELECT queries, and it did not support indexes until MySQL 5.1. It causes much less disk I/O than MyISAM, because it buffers data writes and compresses each row with zlib as it’s inserted. However, each SELECT query requires a full table scan. Archive tables are thus best for logging and data acquisition, where analysis tends to scan an entire table, or where you want fast INSERT queries.

Archive supports row-level locking and a special buffer system for high-concurrency inserts. It gives consistent reads by stopping a SELECT after it has retrieved the number of rows that existed in the table when the query began. It also makes bulk inserts invisible until they’re complete. These features emulate some aspects of transactional and MVCC behaviors, but Archive is not a transactional storage engine. It is simply a storage engine that’s optimized for high-speed inserting and compressed storage.

The Blackhole engine

The Blackhole engine has no storage mechanism at all. It discards every INSERT instead of storing it. However, the server writes queries against Blackhole tables to its logs, so they can be replicated or simply kept in the log. That makes the Blackhole engine popular for fancy replication setups and audit logging, although we’ve seen enough problems caused by such setups that we don’t recommend them.

The CSV engine

The CSV engine can treat comma-separated values (CSV) files as tables, but it does not support indexes on them. This engine lets you copy files into and out of the database while the server is running. If you export a CSV file from a spreadsheet and save it in the MySQL server’s data directory, the server can read it immediately. Similarly, if you write data to a CSV table, an external program can read it right away. CSV tables are thus useful as a data interchange format.

The Federated engine

This storage engine is sort of a proxy to other servers. It opens a client connection to another server and executes queries against a table there, retrieving and sending rows as needed. It was originally marketed as a competitor to features supported in many enterprise-grade proprietary database servers, such as Microsoft SQL Server and Oracle, but that was always a stretch, to say the least. Although it seemed to enable a lot of flexibility and neat tricks, it has proven to be a source of many problems and is disabled by default. A successor to it, FederatedX, is available in MariaDB.

The Memory engine

Memory tables (formerly called HEAP tables) are useful when you need fast access to data that either never changes or doesn’t need to persist after a restart. Memory tables can be up to an order of magnitude faster than InnoDB tables. All of their data is stored in memory, so queries don’t have to wait for disk I/O. The table structure of a Memory table persists across a server restart, but no data survives.

Here are some good uses for Memory tables:

  • For “lookup” or “mapping” tables, such as a table that maps postal codes to state names
  • For caching the results of periodically aggregated data
  • For intermediate results when analyzing data

Memory tables support HASH indexes, which are very fast for lookup queries. Although Memory tables are very fast, they often don’t work well as a general-purpose replacement for disk-based tables. They use table-level locking, which gives low write concurrency. They do not support TEXT or BLOB column types, and they support only fixed-size rows, so they really store VARCHARs as CHARs, which can waste memory. (Some of these limitations are lifted in Percona Server.)
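
For instance, a minimal lookup table along the lines of the postal code example above might look like this (the table and column names are hypothetical):

mysql> CREATE TABLE zip_to_state (
    ->   zip   CHAR(5) NOT NULL,
    ->   state CHAR(2) NOT NULL,
    ->   PRIMARY KEY (zip) USING HASH   -- HASH is the Memory engine's default index type
    -> ) ENGINE = MEMORY;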

MySQL uses the Memory engine internally while processing queries that require a temporary table to hold intermediate results. If the intermediate result becomes too large for a Memory table, or has TEXT or BLOB columns, MySQL will convert it to a MyISAM table on disk, and in 8.0 it will even convert it to the safer InnoDB engine. We say more about this in Chapters 7 and 12.

People often confuse Memory tables with temporary tables, which are ephemeral tables created with CREATE TEMPORARY TABLE. Temporary tables can use any storage engine; they are not the same thing as tables that use the Memory storage engine. Temporary tables are visible only to a single connection and disappear entirely when the connection closes.

The Merge storage engine

The Merge engine is a variation of MyISAM. A Merge table is the combination of several identical MyISAM tables into one virtual table. This can be useful when you use MySQL in logging and data warehousing applications, but it has been deprecated in favor of partitioning (see Chapter 7).

The NDB Cluster engine

MySQL AB acquired the NDB database from Sony Ericsson in 2003 and built the NDB Cluster storage engine as an interface between the SQL used in MySQL and the native NDB protocol. The combination of the MySQL server, the NDB Cluster storage engine, and the distributed, shared-nothing, fault-tolerant, highly available NDB database is known as MySQL Cluster.

Now that we have covered the storage engines included in MySQL’s releases, you should know that in Chapter 2 we will cover notable third-party engines and how to choose among all these options depending on your use case.

Summary

MySQL has a layered architecture, with server-wide services and query execution on top and storage engines underneath. Although there are many different plugin APIs, the storage engine API is the most important. If you understand that MySQL executes queries by handing rows back and forth across the storage engine API, you’ve grasped one of the core fundamentals of the server’s architecture.

In the past few major releases, MySQL has settled on InnoDB as its primary development focus and has moved even its internal bookkeeping around table metadata, authentication, and authorization to InnoDB after years in MyISAM. This increased investment from Oracle in the InnoDB engine has led to major improvements such as atomic DDL, more robust online DDL, better resilience to crashes, and better operability for security-minded deployments.

1 One exception is InnoDB, which does parse foreign key definitions, because the MySQL server doesn’t yet implement them itself.

2 MySQL 5.5 and newer versions support an API that can accept thread-pooling plugins, though they are not commonly used. The common practice is to do thread pooling at the access layer, which we discuss in Chapter 5.

3 In Chapter 13 we will cover more schema change techniques, including native online ALTER statements and tools that provide online, lock-free schema changes.

4 There are metadata locks, which are used when dealing with table name changes or changing schemas, and in 8.0 we are introduced to “application-level locking functions.” But in the course of run-of-the-mill data changes, internal locking is left to the InnoDB engine.

5 Read Adrian Colyer’s summary of ANSI SQL isolation levels at https://blog.acolyer.org/2016/02/24/a-critique-of-ansi-sql-isolation-levels/ and Kyle Kingsbury’s explanation of consistency models at http://jepsen.io/consistency.

6 These locking hints are frequently abused and should usually be avoided.

7 There is no formal standard that defines MVCC, so different engines and databases implement it very differently, and no one can say any of them is wrong.
