Administrators familiar with the Linux /proc
filesystem know that it is a valuable source for both device status and performance information. The meminfo
entry in this directory provides copious data regarding the state and contents of the memory in our server.
We care about this as DBAs because file cache and write buffering can drastically affect disk I/O. We are not especially interested in analyzing PostgreSQL's memory usage itself; at the time of writing this book, current recommendations suggest that PostgreSQL's performance doesn't improve much after shared buffers reach 8 GB. However, when it comes to client connections, inode caches, and dirty page flushing, this information is more than relevant.
On a modern Linux kernel, there are over 40 different lines of information in /proc/meminfo
. Much of this data is not exceptionally useful to a DBA, so this recipe will focus on important fields only.
We will be using the watch
and grep
commands in this recipe. It is a good idea to experiment with them and, perhaps, skim the man
pages before continuing.
Follow these steps to capture an interesting memory status from /proc/meminfo
:
grep -A3 MemTotal /proc/meminfo
grep -A1 Dirty /proc/meminfo
grep -A1 Active /proc/meminfo
grep Swap /proc/meminfo
The first command we execute is nothing but a basic summary of the current memory state. For a test system with 2 GB of RAM running PostgreSQL, it would resemble this:
MemTotal:        2050908 kB
MemFree:          840088 kB
Buffers:            9288 kB
Cached:          1102228 kB
This output is similar to what we would learn using the free
command. The MemTotal
row should speak for itself, as it is the total size of the memory in the system. The MemFree
row is the total amount of completely unallocated system memory, not including buffers or cache. The Buffers
row in this context is mostly related to internal kernel bookkeeping, so we can ignore it. If we examine the value reported by the Cached
row, we can see that over 1 GB of data is cached in memory.
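To put these numbers in perspective, a short awk one-liner can report cached data as a percentage of total RAM. This is only a sketch for exploration, not one of the recipe's required steps; the field names come from /proc/meminfo, while the percentage math and output format are our own:

```shell
# Sketch: report cached file data as a percentage of total RAM.
# Matches the MemTotal and Cached fields from /proc/meminfo.
awk '/^MemTotal:/ {total=$2}
     /^Cached:/   {cached=$2}
     END {printf "Cached: %d kB (%.1f%% of RAM)\n", cached, cached*100/total}' \
    /proc/meminfo
```

On the 2 GB test system above, this would report roughly half of RAM devoted to cache.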
The second command outlines dirty memory. Dirty memory, in this case, is the memory that is modified and awaiting synchronization to disk. On the same 2 GB test system, a long pgbench
test might produce results like this:
Dirty:             29184 kB
Writeback:            40 kB
As we've said, the Dirty
row details how much memory is waiting to be written to disk. On systems with very large amounts of RAM, this value can indicate that too much RAM is dirty. The consequences of this can include long query execution times or system stalls if the underlying storage is unable to quickly absorb that many disk writes. In practice, this should rarely be larger than the size of the disk controller's write cache.
However, what about the Writeback
row? This field details how much of the dirty memory is currently being written to disk. When storage is overwhelmed, the amount reflected here will rise, as the write-back buffer fills with more write requests. This is a definite sign that the system has encountered far more writes than it was designed to handle. In this essence, each of these fields is a warning sign that the application must be modified to reduce write workload or the database needs faster storage with a bigger write cache.
With our next command, we examine the contents of the cache itself. Still using our 2 GB test system, the cache looks like this:
Active:          1105760 kB
Inactive:          32764 kB
Active(anon):     207696 kB
Inactive(anon):     9340 kB
Active(file):     898064 kB
Inactive(file):    23424 kB
We won't get into too much detail regarding how the kernel actually works, but we will note that all the fields named Inactive
are something of a misnomer. Any time something is loaded into cache, it first gets included in the Inactive
list. Based on the subsequent amount and timing of requests for this data, it might be promoted into the Active
set. Once it is in that list, various aging algorithms might eventually return it to the inactive list. Inactive cache data is always a candidate for replacement by more important data.
In the context of PostgreSQL, we need to pay attention to the Active(file)
entry. This is the amount of cached data that originated in disk pages. Disk reads are expensive, and since databases constantly process data from disk, this cache is very important to us. We want as many disk pages as possible to be in the Active(file)
list, but this doesn't mean we discount Inactive(file)
. Remember, inactive cache is still in memory and eligible for database use; it simply hasn't been promoted to the active list. Thus, we want the total amount of file cache to be as high as possible, so that as many database disk reads as possible are served directly from memory.
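Since it is the combined total that matters, a small awk sketch (again, illustrative only) can sum the two file-cache fields for us:

```shell
# Sketch: sum active and inactive file cache to see the total amount
# of cached disk pages available to the database.
awk '/^Active\(file\):/   {active=$2}
     /^Inactive\(file\):/ {inactive=$2}
     END {printf "File cache: %d kB (%d active, %d inactive)\n",
            active+inactive, active, inactive}' /proc/meminfo
```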
We include Active(anon)
and Inactive(anon)
for one reason: database clients. Temporary data allocated to database clients is often assigned to anonymous cache. This is good for the client program, but with enough of these, we lose valuable memory from use as disk cache. One remedy for this is to buy more memory, but another more scalable solution is to utilize database connection pooling. This book includes a chapter specifically dedicated to optimizing the connection count, as this helps preserve memory for data caching.
The last extract we obtain from /proc/meminfo
is related to swap usage and looks like this:
SwapCached:            0 kB
SwapTotal:       2093052 kB
SwapFree:        2093052 kB
Again, we can get this kind of data using the free
command as well. We mainly include it here in case any readers want to search for all of these fields with a single command for monitoring purposes.
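For example, a single grep with an alternation pattern can capture every field we have discussed. Treat the pattern as a starting point; it also matches a few closely related fields, such as WritebackTmp, which can be trimmed out if unwanted:

```shell
# Sketch: capture every field discussed in this recipe with one command.
# The anchored alternation also picks up related fields like WritebackTmp;
# tighten the list if that is not desired.
grep -E '^(MemTotal|MemFree|Buffers|Cached|Dirty|Writeback|Active|Inactive|Swap)' \
    /proc/meminfo
```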
The watch
utility will execute any command and its arguments until it is canceled with Ctrl + C. Instead of using those grep
statements every time we want to see interesting fields in the /proc/meminfo
file, we can simply use watch
. For example, to observe the state of dirty buffers waiting to be committed to disk, we can use the following command:
watch -n 5 grep -A1 Dirty /proc/meminfo
The Linux kernel documentation is somewhat verbose. Nonetheless, more technically adept readers can find much more information regarding /proc/meminfo
at https://www.kernel.org/doc/Documentation/filesystems/proc.txt.