The general query log

All statements sent to MariaDB are logged in the general query log or general log. They are written in the same order they were received. This order is never identical to the order of execution on multithread environments (because statements often need to wait for a lock to be released). Connections and disconnections are also written to the general log.

The general log is often suitable for finding problems that are caused by the application's bugs.

The general query log depends on the binary log format. This log will be described in Chapter 8, Backup and Disaster Recovery. While the general log is designed to be read by humans, the binary log is read by programs. A human can read its contents using the mysqlbinlog tool. This tool tracks the changes to databases and does this in different formats such as STATEMENT (SQL statements are logged), ROW (binary data is logged), or MIXED (both methods can be used). The reason for this will be clear in the later chapters, but it is important to remember that the general log only works properly if the binary log uses the STATEMENT format. If the MIXED format is used, some statements will not be logged.

The general log can be enabled with the --general_log startup option and disabled with --general_log=0. By default, it is disabled. It can also be enabled or disabled at runtime using the general_log dynamic system variable.

If the general log is enabled, by default it is written to a file. The default filename is the server's hostname with a .log extension. The default path for that file is the data directory. So, specify a different filename, and optionally a different path, to use the --general_log_file=filename option. In a replication environment, assigning identical names to log files for all servers is good practice.

The --log-output option determines the destination of both the error log and the general query log (which will be discussed later in this chapter). Note that this variable affects both these logs. It has three allowed values that can be combined in a variety of ways. The values are FILE (the logs are recorded into files), TABLE (the logs are written into a table in the MySQL database), and NONE (logging is suppressed). To log the errors in both a file and a system table, the syntax is as follows:

--log_output=FILE,TABLE

If NONE is specified, other values are ignored.

A system variable called log_output also exists. It is dynamic, which means that the destination of the error log and the general query log can be changed at runtime. It only exists at the global level, so it is not possible to change the destination of the log for the current session only.

Sometimes, a superuser may want to disable the logging of his/her queries. The most common reason to do this is to change a password (having a clearly written password in a text file or a table makes it less secure). However, whatever the reason, users with the SUPER privilege can disable the general query log and the slow query log for the current session by setting the sql_log_off server system variable to 1, shown as follows:

SET @@session.sql_log_off = 1;

This variable exists at both the session and global levels, so it can also be used to temporarily disable logging of all the queries, if it is necessary for some reason.

The file format of the general query log

The general query log starts with three lines shown as follows:

/usr/local/mysql/bin/mysqld, Version: 10.0.7-MariaDB-log (MariaDB Server). started with:
Tcp port: 0  Unix socket: (null)
Time                 Id Command    Argument

After general information about the server, you see the headers for four columns. The contents of the following lines are aligned with these headers.

These lines are rewritten at each server restart and each time the logs are flushed. (Logs' flushing is explained later in this chapter.)

The following is a sample entry:

140103 18:14:47      4  Query    SHOW TABLES

The Time column contains a date, in the YYMMDD format, and a time, in the HH:MM:SS human-readable format. In some cases, these values can be missing.

The Id column contains the connection's ID. This is the same value used in several information_schema and performance_schema tables and returned by the CONNECTION_ID() function.

The Command argument indicates what kind of action the user performed. Possible values are: Connect, Init DB, Query, and Quit.

The contents of the Argument column depend on the value of Command. In the previous example, since the user sent a statement, the Argument column contains the text of such a statement.

Let's see another example shown as follows:

140103 18:14:31        4 Connect    root@localhost as anonymous on

In the preceding statement, the Connect value means that a connection is established. The Argument value indicates the account used.

Consider the following statement:

140103 18:14:42        4 Init DB    test

In the preceding statement, the Init DB value means that the user selected a default database (typically via the USE statement). The Argument indicates the name of the selected database.

Consider the following statement:

140103 18:14:57        4 Quit

In the preceding statement, the Quit value means that the user closed the connection. If a Quit action has not been performed by a connection, it could mean the connection is still open, or another thread killed the connection.

The general_log table

As explained previously, the general query log can be written in the form of a table called general_log in the MySQL database. Its columns are as follows:

  • event_time: This column refers to the Time column in the file
  • user_host: This column contains the account that executed a statement
  • thread_id: This column refers to the Id column in the file
  • server_id: This column contains the ID of the server needed to set up the replication
  • command_type: This column refers to the Command column in the file
  • argument: This column refers to the Argument column in the file

A sample row from the table is shown as follows:

MariaDB [mysql]> SELECT * FROM general_log ORDER BY event_time DESC LIMIT 1 G
*************************** 1. row ***************************
  event_time: 2014-06-17 08:34:25.864270
   user_host: root[root] @ localhost []
   thread_id: 4
   server_id: 1
command_type: Query
    argument: SET @@session.sql_log_off = 1
1 row in set (0.01 sec)

The preceding example from the table has the following characteristics that also apply to the slow_log table.

The log tables have several restrictions. The most important restriction is that these tables cannot be modified by the user because DML statements that involve log tables produce an error. Only a few operations are allowed. Also, such tables cannot be locked by the user. A FLUSH TABLES WITH READ LOCK option can be safely used to lock all other tables. A LOCK TABLES option on those tables will produce an error.

The general_log table is a CSV table (that stands for comma separated values). This table allows users to open the data file (general_log.CSV present in the data directory) with external programs because CSV is a widely supported format. However, this slows down SQL queries because CSV tables do not support indexes.

It is possible to convert the table to the MyISAM format. This will be faster, but still sensibly slower than file-based logging. No other storage engine can be used for this table. Note that before you change the storage engine, it is necessary to disable the log. The following code can be used for this purpose:

MariaDB [(none)]> SET GLOBAL general_log = 0;
Query OK, 0 rows affected (0.06 sec)
MariaDB [(none)]> ALTER TABLE mysql.general_log ENGINE = MyISAM;
Query OK, 12 rows affected (0.28 sec)              
Records: 12  Duplicates: 0  Warnings: 0
MariaDB [(none)]> SET GLOBAL general_log = 1;
Query OK, 0 rows affected (0.00 sec)

Log tables can periodically be emptied with the TRUNCATE TABLE command if you do not care about the entry of the old log. However, since DML statements are not supported, there is no way to delete only a portion of the rows from the table. Thus, usually, a better way to prevent the log tables from growing too much is to rotate them. This operation is easy and relatively fast because the RENAME TABLE command is supported. All these operations require that the general log be temporarily disabled, as shown in the previous example. After you rename a log table, it doesn't have the restrictions of a log table anymore.

The CHECK TABLE, OPTIMIZE TABLE, and REPAIR TABLE commands are also supported. Since a repair operation can be slow, if a table is damaged, it would be a good idea to rename it and create a new one, before you start the recovery. In this way, queries will still be logged to the process.

Since the ALTER TABLE command works (with the general log disabled), it is possible to add one or more indexes to a table, after converting it to MyISAM. This will make the SELECT operations much faster. However, this will also slow down insertions, which can be a major problem. However, if you rotate the general log tables, you may choose to add indexes to the historical tables. Depending on the size of the new historical table, this operation might be very slow, and the required space might be a problem for us. However, the queries will also be much faster.

Debugging examples with the general query log

The following two examples use SQL tables to show the cross-platform code that can be executed within MariaDB. This does not mean that tables are the best way to store logs. The default destination is FILE, and each DBA will decide whether changing the default destination makes sense for this particular case. As discussed previously, The error log section shows how to investigate a log file using a Linux command line.

As a first example, let's suppose you just realized that someone dropped a table named orders history, which was still in use. You also know that this deletion happened no more than one week ago. The first thing to do is to restore the table, and Chapter 8, Backup and Disaster Recovery, shows you how to do this. However, the table could be dropped again, and you don't want this to happen. There is probably a permissions problem or a bug in the application. To find out the exact problem, you need to know who dropped the table, and probably when this was done. The SQL table to do this is quite simple, as shown in the following code snippet:

SELECT user_host, event_time, argument
    FROM mysql.general_log
    WHERE
        event_time >= NOW() - INTERVAL 1 WEEK
        AND command_type = 'Query'
        AND argument LIKE '%drop%table%orders\_history%'
    ORDER BY event_time DESC
    LIMIT 20;

In the previous code snippet, the first condition specifies that event_time must not be older than one week. The second condition does not add anything (as the third condition will be sufficient), except that it should filter out many records in a fast way. The third condition is the most interesting. You do not know the exact string that was received by the server; for example, it could contain spaces, or the database name, or back ticks around the table name. With the % sign, this is not a problem—each query that contains the tree terms you indicated is returned. Also, note that LIKE is case insensitive.

You could obtain a lot of results and, theoretically, it would be weird but still possible. For this reason, you need to order the results: the last row is the statement that most recently dropped the table. And, while you may be interested in seeing who executed similar statements and when, we will limit the results to 20 rows.

The second example follows up the first one: we found that the connection with the ID 100 executed the DROP TABLE statement this morning. Now we want to know why it did this. Probably, the connection also executed other statements, and so, we want to know its history. There are other cases when you may want to know a connection's history. For example, sometimes it may be the easiest way to debug a web application that has a bug in its SQL statements. In any case, getting the log records that involve that ID is quite easy, shown as follows:

SELECT event_time, command_type, argument
    FROM mysql.general_log
    WHERE thread_id = 100
    AND event_time > NOW() - INTERVAL 1 DAY
    ORDER BY event_time G

These are the general log's most commonly used use cases. Of course, since they provide detailed information that can be read in a flexible way, many other use cases are possible. For example, all clients should close their connections with the MariaDB server as soon as they are no longer needed, in order to free memory. However, the problem here is MariaDB maintains some per-thread buffers and the temporary MEMORY tables until a connection is closed. Having multiple open connections can be a waste of memory. The general log can be used to see which connection did not issue a Quit command, that is, to determine those clients who did not close the connection properly. Note that if a connection is now closed, but it never executed a Quit command, then it probably has been terminated by the root user with the KILL statement. However, usually most of these connections expire because they exceed the timeout that has been set (using the @wait_timeout server variable). In other words, the application that created them did not close them, so they have been inactive for a longer period of time, wasting some memory. For example, the execution of the following query will be slow:

CREATE TABLE tmp.g_log ENGINE = MEMORY
    SELECT thread_id, command_type, COUNT(*)
        FROM mysql.general_log
        WHERE event_time > NOW() - INTERVAL 1 DAY
        AND command_type IN ('Connect', 'Quit')
        GROUP BY thread_id, command_type
        ORDER BY thread_id, command_type;

The result of this query is used to create a new table. The execution will be faster if you create indexes for the table; however, you only need to query it twice, so it would be a waste of time in this case. When querying such tables, you will obtain an output shown as follows:

+-----------+--------------+----------+
| thread_id | command_type | COUNT(*) |
+-----------+--------------+----------+
|        10 | Connect      |        1 |
|        10 | Quit         |        1 |
|        11 | Connect      |        1 |
|        12 | Connect      |        1 |
|        13 | Connect      |        1 |
|        13 | Quit         |        1 |
|        14 | Connect      |        1 |
|        14 | Quit         |        1 |
|        15 | Connect      |        1 |
|        15 | Quit         |        1 |
+-----------+--------------+----------+
23 rows in set (0.01 sec)

This output is very easy to inspect. The rows are sorted by thread_id; if a Quit command is missing for a certain thread_id, or if it has a lower COUNT(*) than the matching Connect command, then that thread has not been properly closed. Unless the server is restarted, the thread IDs are hardly reused within the same day, so you can expect the value to be 1 or to be missing. In the previous example, the 11 and 12 threads IDs never issued a Quit command. So, we want to know who these threads belonged to.

The following query returns the accounts used by these threads:

SELECT m.thread_id, m.user_host
    FROM mysql.general_log m LEFT JOIN tmp.g_log g
    ON m.thread_id = g.thread_id
    AND g.command_type = 'Quit'
    WHERE m.event_time > NOW() - INTERVAL 1 HOUR
    AND g.thread_id IS NULL;

In the previous example, the JOIN command returns the rows in the general log that do not have a matching Quit row in the tmp.g_log table we previously created. The time limit here is very important because you do not want to use rows that have been written before the last server restart. For each row we obtain, we see the used account:

+-----------+------------------------------+
| thread_id | user_host                    |
+-----------+------------------------------+
|        11 | [arancia] @ localhost []     |
|        12 | [arancia] @ localhost []     |
+-----------+------------------------------+

Now, you know which user has not closed his connections. If it is used by an application, you can ask its developers to fix the problem.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
3.149.236.27