Chapter 3 demonstrated how log events can be captured and how helper plugins such as parsers come into play. But capturing data is only of value if we can do something meaningful with it, such as delivering the events, suitably formatted, to an endpoint where they can be used (for example, storing them in a log analytics engine or sending a message to an operations [Ops] team to investigate). This chapter shows how Fluentd enables us to do that. We look at how Fluentd output plugins can be used, starting with files, and then at how Fluentd works with MongoDB and with collaboration/social tools such as Slack for rapid notifications.
This chapter will continue to use the LogSimulator, and we will also use a couple of other tools, such as MongoDB and Slack. As before, complete configurations are available in the download pack from Manning or via the GitHub repository, allowing us to focus on the configuration of the relevant plugin(s). Installation steps for MongoDB and Slack are covered in appendix A.
Compared to the tail (file input) plugin, we are less likely to use the file output plugin, as typically we will want to output to a tool that allows us to query, analyze, and visualize the events. There will, of course, be genuine cases where file output is needed in production. File output is also one of the best stepping-stones to something more advanced, as it makes it easy to see the outcomes and the impact of various plugins, such as parsers and filters. Logging important events to file also lends itself to easily archiving the log events for future reference if necessary (e.g., audit log events to support legal requirements). To that end, we will look at the file output before moving on to more sophisticated outputs.
With the file output (and, by extension, any output that involves directly or indirectly writing physical storage), we need to consider several factors:
Where can we write to in the file system, as dictated by storage capacity and permissions?
Does that location have enough capacity (both allocated and physical capacity)?
Is there latency on data access (NAS and SAN devices are accessed through networks)?
While infrastructure performance isn’t likely to impact development work, it is extremely important in preproduction (e.g., performance testing environments) and production environments. Device performance matters particularly for the file plugin; other output plugins are likely to use services that include logic to optimize I/O (e.g., database caching, optimization of allocated file space). With output plugins, we have likely consolidated multiple sources of log events, so we could end up with a configuration in which Fluentd writes all the inputs to one file or location. The physical performance considerations can be mitigated using buffers (as we will soon see) and caching.
Let’s start with a relatively basic configuration for Fluentd. In all the previous chapters’ examples, we have seen the content written to the console. Now, rather than the console, we will simply push everything to a file. To do this, we need a new match directive in the configuration, but we’ll carry on using the file source for log events.
To illustrate that an output plugin can handle multiple inputs within the configuration, we have included the self-monitoring source configuration illustrated in the previous chapter, in addition to a log file source. To control the frequency of log events generated by Fluentd’s self-monitoring, we can define another attribute called emit_interval, which takes a duration value, for example, 10s (10 seconds). The value provided by emit_interval is the time between log events being generated by Fluentd. Self-monitoring can include details like how many events have been processed, how many worker processes are managed, and so on.
At a minimum, the file output plugin simply requires the @type attribute to be defined and a path pointing to a location for the output using the path attribute. In the following listing, we can see the relevant parts of our Chapter4/Fluentd/rotating-file-read-file-out.conf file. The outcome of this configuration may surprise you, but let’s see what happens.
<source>
  @type monitor_agent
  bind 0.0.0.0
  port 24220
  @id in_monitor_agent
  include_config true
  tag self
  emit_interval 10s
</source>

<match *>
  @type file     ❶
  path ./Chapter4/fluentd-file-output     ❷
</match>
❶ Changes the plugin type to file
❷ The location of the file to be written
The outcome of using this new match directive can be seen if the LogSimulator and Fluentd are run with the following commands:
fluentd -c ./Chapter4/Fluentd/rotating-file-read-file-out.conf
groovy LogSimulator.groovy ./Chapter4/SimulatorConfig/jul-log-output2.properties ./TestData/medium-source.txt
The easy presumption would be that all the content gets written to a file called fluentd-file-output. However, what has happened is that a folder is created using the last part of the path as its name (i.e., fluentd-file-output), and you will see two files in that folder: a buffer file with a semi-random name (to differentiate the different buffer files) and a metadata file with the same base name. What Fluentd has done is implicitly make use of a buffering mechanism. The adoption of a buffer with default options is not unusual in output plugins; some do forgo the use of buffering (the stdout plugin, for example).
Buffering, as you may recall from chapter 1, is a Fluentd helper plugin. Output plugins need to be aware of the impact they can have on I/O performance. As a result, most output plugins use a buffer plugin that can behave synchronously or asynchronously.
The synchronous approach means that as log events are collected into chunks, as soon as a chunk is full, it is written to storage. The asynchronous approach utilizes an additional queue stage. The queue stage interacting with the output channel is executed in a separate thread, so the filling of chunks shouldn’t be impacted by any I/O performance factors.
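The asynchronous model can be sketched in a few lines of Ruby (a conceptual illustration only, not Fluentd’s actual implementation): the main thread fills chunks, while a separate writer thread drains a queue, so slow I/O does not hold up event collection.

```ruby
# Conceptual sketch of asynchronous buffering (not Fluentd source code).
# Events accumulate into a chunk; full chunks are queued for a writer
# thread, which performs the (potentially slow) I/O independently.
queue = Queue.new
written = []   # stands in for the output destination

writer = Thread.new do
  while (chunk = queue.pop)   # pop blocks until a chunk (or nil) arrives
    written << chunk          # stand-in for the physical write step
  end
end

chunk = []
(1..7).each do |event|
  chunk << event
  if chunk.size == 3          # analogous to chunk_limit_records 3
    queue << chunk
    chunk = []
  end
end
queue << chunk unless chunk.empty?  # analogous to flush_at_shutdown
queue << nil                        # tell the writer to finish
writer.join
p written                           # => [[1, 2, 3], [4, 5, 6], [7]]
```

Note that the final chunk holds only one event: a partially filled chunk still gets flushed at the end, which is exactly the behavior flush_at_shutdown and flush_interval exist to guarantee in Fluentd.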
The previous example did not explicitly define a buffer; the output plugin applied a default, using a file buffer. This makes more sense when you realize that the file output plugin supports compressing the output file using gzip, which becomes more effective the more content you compress at once.
In figure 4.1, we have numbered the steps. As the arrows’ varying paths indicate, steps in the life cycle can be bypassed. All log events start in step 1, but if no buffering is in use, the process immediately moves to step 5, where the physical I/O operation occurs, and then we progress to step 6. If there is an error, step 6 can send the logic back to the preceding step to try again. This is very much dependent upon the plugin implementation but is common to plugins such as database plugins. If the retries fail or the plugin doesn’t support the concept, some plugins support the idea of a secondary plugin to be specified.
A secondary plugin is another output plugin that can be called (step 7). Typically, a secondary plugin would be as simple as possible to minimize the chance of a problem so that the log events could be recovered later. For example, suppose the output plugin called a remote service from the Fluentd node (e.g., on a different network, in a separate server cluster, or even a data center). In that case, the secondary plugin could be a simple file output to a local storage device.
NOTE We would always recommend that a secondary output be implemented with the fewest dependencies on software and infrastructure. Needing to fall back to a secondary plugin strongly suggests broader potential problems, so the simpler it is and the less it depends on other factors, the more likely the output won’t be disrupted. Files support this approach very well.
If buffering has been configured, then steps 1 through 3 would be performed. But then the following action would depend upon whether the buffering was asynchronous. If it was synchronous, then the process would jump to step 5, and we would follow the same steps described. For asynchronous buffering, the chunk of logs goes into a separate process managing a queue of chunks to be written. Step 4 represents the buffer operating asynchronously. As the chunks fill, they are put into a queue structure, waiting for the output mechanism to take each chunk to output the content. This means that the following log event to be processed is not held up by the I/O activity of steps 5 onward.
Out of the box, Fluentd provides two buffer types: memory (buf_memory) and file (buf_file).
As you may have realized, the path is used as the folder location to hold its buffered content and contains both the content and a metadata file. Using file I/O for the buffer doesn’t give much of a performance boost in terms of the storage device unless you establish a RAM disk (aka, RAM drive). A file-based buffer still provides some benefits; the way the file is used is optimized (keeping files open, etc.). It also acts as a staging area to accumulate content before applying compression (as noted earlier, the more data involved in a zip compression, the greater the compression possible). In addition, the logged content won’t be lost as a result of some form of process or hardware failure, and the buffer can be picked back up when the server and/or Fluentd restart.
NOTE RAM drives work by allocating a chunk of memory for storage and then telling the OS’s file system that it is an additional storage device. Applications that use this storage believe they are writing to a physical device like a disk, but the content is actually written to memory. More information can be found at www.techopedia.com/definition/2801/ram-disk.
With buffers being involved in many plugins, by default or explicitly, we should look at how to configure buffer behaviors: when events move from the buffer to the output destination, how frequently such actions occur, and how these settings impact performance. The life cycle illustrated in figure 4.1 provides clues as to the configuration possibilities.
As figure 4.1 shows, the buffer’s core construct is the idea of a chunk. The way we configure chunks, aside from synchronous and asynchronous, will influence the performance. Chunks can be controlled through the allocation of storage space (allowing for a reserved piece of contiguous memory or disk to be used) or through a period. For example, all events during a period go into a single chunk, or a chunk will continue to fill with events until a specific number of log events or the chunk reaches a certain size. Separation of the I/O from the chunk filling is beneficial if there is a need to provide connection retries for the I/O, as might be the case with shared services or network remote services, such as databases.
In both approaches, it is possible through the configuration to set attributes so that log events don’t end up lingering in the buffer because a threshold is never fully met. Which approach to adopt will be influenced by the behavior of your log event source and the tradeoff of performance against resources available (e.g., memory), as well as the amount of acceptable latency in the log events moving downstream.
Personally, I tend to use size constraints, which provide predictable system behavior; this may reflect my Java background and a preference not to start unduly tuning the virtual machine.
Table 4.1 shows the majority of the possible controls on a buffer. Where the control can have some subtlety in how it can behave, we’ve included more elaboration.
With this understanding of buffer behavior, we can amend the configuration (or use the prepared one). As we recommend flushing on shutdown, we should set this to true (flush_at_shutdown true). As we want to quickly see the impact of the changes, let’s set the maximum number of records to 10 (chunk_limit_records 10) and the maximum time before flushing a chunk to 30 seconds (flush_interval 30). Otherwise, if we have between 1 and 9 log events in the buffer, they’ll never get flushed if the sources stop creating log events. Finally, we have added an extra layer of protection in our configuration by imposing a time-out on the buffer write process. We can see this in the following listing.
<match *>
  @type file
  path ./Chapter4/fluentd-file-output
  <buffer>
    flush_at_shutdown true     ❶
    delayed_commit_timeout 10
    chunk_limit_records 10     ❷
    flush_interval 30     ❸
  </buffer>
</match>
❶ By default, the file buffer doesn’t flush on shutdown, as events won’t be lost by stopping the Fluentd instance. However, it is desirable to see all events completed at shutdown. There is the risk that a configuration change will mean the file buffer isn’t picked up on restart, resulting in log events effectively being in limbo.
❷ As we understand our log content and want to see things happen quickly, we use the number of records rather than capacity to control the chunk size, which indirectly influences how soon events are moved from the buffer to the output destination.
❸ As the buildup of self-monitoring events will be a lot slower than our file source, forcing a flush on time as well will ensure we can see these events come through to the output at a reasonable frequency.
To run this scenario, let’s reset (delete the structured-rolling-log.*
and rotating-file-read.pos_file
files and the fluentd-file-output
folder) and run again, using each of these commands in a separate shell:
fluentd -c Chapter4/Fluentd/rotating-file-read-file-out2.conf
groovy LogSimulator.groovy Chapter4/SimulatorConfig/jul-log-file2.properties ./TestData/medium-source.txt
Do not shut down Fluentd once the log simulator has completed. We will see that the folder fluentd-file-output
is still created with both buffer files as before. But at the same time, we will see files with the naming of fluentd-file-output.<date>_<incrementing number>.log
(e.g., fluentd-file-output.20200505_12.log
). Open any one of these files, and you will see 10 lines of log data. You will notice that the log data is formatted as a date timestamp, tag name, and then the payload body, reflecting the standard composition of a log event. If you scan through the files, you will find occurrences where the tag is not simpleFile but self. This reflects that we have kept the self-monitoring source running alongside our simpleFile source, which is tracking the rotating log files.
Finally, close down Fluentd gracefully. In Windows, the easiest way to do this is in the shell: press CTRL-c
once (and only once), and respond yes to the shutdown prompt (in Linux, the interrupt events can be used). Once we can see that Fluentd has shut down in the console, look for the last log file and examine it. There is an element of timing involved, but if you inspect the last file, chances are it will have less than 10 records in it, as the buffer will have flushed to the output file whatever log events it had at shutdown.
The use of buffers also allows Fluentd to provide a retry mechanism. In the event of issues like transient network drops, we can tell the buffer to retry when it recognizes a problem rather than simply losing the log events. For retries to work without creating new problems, we need to define controls that tell the buffer how long or how many times to retry before abandoning data. In addition, we can define how long to wait before retrying. We can stipulate retrying forever (setting the attribute retry_forever to true), but we recommend using such an option with great care.
There are two ways for retry to work using the retry_type attribute—through either a fixed interval (periodic
) or through an exponential backoff (exponential_backoff
). The exponential backoff is the default model, and each retry attempt that fails results in the retry delay doubling. For example, if the retry interval was 1 second, the second retry would be 2 seconds, the third retry 4 seconds, and so on. We can control the initial or repeated wait period between retries by defining retry_wait
with a numeric value representing seconds (e.g., 1 for 1 second and 60 for 1 minute).
Unless we want to retry forever, we need to provide a means to determine whether or not to keep retrying. We can control this by time or by count: either set a maximum period to retry writing each chunk (retry_timeout) or a maximum number of retry attempts (retry_max_times).
For the backoff approach, we can control how quickly the interval grows using retry_exponential_backoff_base (the base of the exponential, which defaults to 2) and cap how large the wait between attempts can become with retry_max_interval.
Suppose we wanted to configure the buffer retry to be an exponential backoff starting at 3 seconds. With a maximum of 10 attempts, we could end up with a peak retry interval of nearly 26 minutes (3 × 2⁹ = 1,536 seconds). The attributes we would need to configure are retry_type, retry_wait, and retry_max_times.
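As a sketch, the buffer section for this scenario might look like the following (the attribute names come from Fluentd’s buffer configuration; the values are just this scenario’s numbers):

```
<buffer>
  retry_type exponential_backoff
  retry_wait 3
  retry_max_times 10
</buffer>
```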
The important thing with the exponential backoff is to ensure you’re aware of the possible total time. Once that exponential curve gets going, the time extends very quickly. In this example, the first four retries would happen inside a minute, but the intervals really start to stretch out after that.
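We can check those numbers directly (a hypothetical helper, not a Fluentd API; it assumes the default backoff base of 2):

```ruby
# Wait before retry n (0-indexed) = retry_wait * base**n
def backoff_intervals(retry_wait, base, max_times)
  (0...max_times).map { |n| retry_wait * base**n }
end

intervals = backoff_intervals(3, 2, 10)
p intervals              # => [3, 6, 12, 24, 48, 96, 192, 384, 768, 1536]
p intervals.last / 60.0  # final wait: 25.6 minutes
```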
You’ve been asked to help the team better understand buffering. There is an agreement that an existing configuration should be altered to help this. Copy the configuration file /Chapter4/Fluentd/rotating-file-read-file-out2.conf
and modify it so that the configuration of the buffer chunks is based on the size of 500 bytes (see appendix B for how to express storage sizes).
Run the modified configuration to show the impact on the output files.
As part of the discussion, one of the questions that came up is if the output plugin is subject to intermittent network problems, what options do we have to prevent the loss of any log information?
The following listing shows the buffer configuration that would be included in the result.
<match *>
  @type file
  @id bufferedFileOut
  path ./Chapter4/fluentd-file-output
  <buffer>     ❶
    delayed_commit_timeout 10
    flush_at_shutdown true
    chunk_limit_size 500     ❷
    flush_interval 30
  </buffer>
</match>
❶ Note that we’ve retained the delay and flush time, so if the buffer stops filling, it will get forced out anyway.
❷ Size-based constraint rather than time-based
A complete configuration file is provided at Chapter4/ExerciseResults/rotating-file-read-file-out2-Answer.conf. The rate of change to the log files is likely to appear different, but the same content will be there. To address the question of mitigating the risk of log loss, several options could be applied:
Configure the retry and backoff parameters on the buffer to retry storing the events rather than losing the information.
Use the capability of defining a secondary log mechanism, such as a local file, so the events are not lost. Provide a means for the logs to be injected into the Kafka stream at a later date. This could even be an additional source.
How we structure the output of the log events is as important as how we apply structure to the input. Not surprisingly, a formatter plugin can be included in the output plugin. With a formatter plugin, it’s reasonable to expect several prebuilt formatters. Let’s look at the out-of-the-box formatters typically encountered; the complete set of formatters are detailed in appendix C.
This is probably the simplest formatter available and has already been implicitly used. The formatter works using a delimiter between the time, tag, and record values. By default, the delimiter is a tab character. This can be changed only to a comma or space character, using the values comma or space for the delimiter attribute; for example, delimiter comma. Which fields are output can also be controlled with Boolean-based attributes:
output_tag—This takes a true or false value to determine whether the tag(s) are included in the line (by default, the second value in the line).
output_time—This takes true or false to define whether the time is included (by default, the first value in the line). You may wish to omit the time if you have it incorporated into the core event record.
time_format—This can be used to define the way the date and time are formatted in the output. If undefined and the timekey is set, then the timekey will inform the structure of the format.
NOTE If the formatter does not recognize the attribute value set, it will ignore the value provided and use the default value.
This formatter treats the log event record as a JSON payload on a single line. The timestamp and tags associated with the event are discarded.
As with the out_file formatter and the parser, the delimiters can be changed using delimiter (separating each labeled value) and label_delimiter (separating the label from the value). For example, with delimiter set to ; and label_delimiter set to =, the record {"my1stValue":"blah", "secondValue":"more blah", "thirdValue":"you guessed – blah"} would be output as my1stValue=blah;secondValue=more blah;thirdValue=you guessed – blah. As the values are labeled, the need for separating records by lines is reduced, and it is therefore possible to switch off the use of newlines to separate each record with add_newline false (by default, the value is set to true).
Just like ltsv and out_file, the delimiter can be defined by setting the delimiter attribute, which defaults to a comma. Additionally, the csv output allows us to define which values are included in the output using the fields attribute. Reusing the same record to illustrate, if my event is {"my1stValue":"blah", "secondValue":"more blah", "thirdValue":"you guessed – blah"} and I set the fields attribute to secondValue,thirdValue, then the output would be "more blah","you guessed – blah". If desired, the quoting of each value can be disabled with a Boolean value for force_quotes.
As with the msgpack parser, the formatter works with the MessagePack framework, which takes the core log event record and uses the MessagePack library to compress the content. Typically, we would only expect to see this used with HTTP forward output plugins where the recipient expects MessagePack content. To get similar compression performance gains, we can use gzip for file and block storage.
NOTE You may have noticed that most formatters (out_file being an exception) omit the time and tag from the log event. As a result, if you want those to be carried through, it will be necessary to ensure that they are incorporated into the log event record or that the output plugin utilizes the values explicitly. Adding additional data into the log event payload can be done using the inject plugin, which can be used within a match or filter directive. We will pick up the inject feature when we address filtering in chapter 6.
We can extend our existing configuration to move from the current implicit configuration to being explicit. Let’s start with the simplest output using the default out_file formatter, using a comma as a delimiter (delimiter comma) and excluding the log event tag (output_tag false). We will continue to use the same sources as before to demonstrate the effect of formatters. This out_file formatter configuration is illustrated in the following listing.
<match *>
  @type file
  @id bufferedFileOut
  path ./Chapter4/fluentd-file-output
  <buffer>
    delayed_commit_timeout 10
    flush_at_shutdown true
    chunk_limit_records 50     ❶
    flush_interval 30
    flush_mode interval
  </buffer>
  <format>
    @type out_file     ❷
    delimiter comma     ❸
    output_tag false     ❹
  </format>
</match>
❶ To reduce the number of files being generated, we have made the number of records a lot larger per file.
❷ We explicitly define the output formatter to be out_file, as we want to override the default formatter behavior.
❸ Replace the tab delimiter with a comma.
❹ Exclude the tag information from the output.
Assuming the existing log files and outputs have been removed, we can start the example using the following commands:
fluentd -c Chapter4/Fluentd/rotating-file-read-file-out3.conf
groovy LogSimulator.groovy Chapter4/SimulatorConfig/jul-log-file2.properties ./TestData/medium-source.txt
Your organization has decided that as standard practice, all output should be done using JSON structures. This approach ensures that any existing or applied structural meaning to the log events is not lost. To support this goal, the configuration file /Chapter4/Fluentd/rotating-file-read-file-out3.conf
will need modifying.
The format declaration in the configuration file should be reduced to look like the fragment in the following listing.
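The reduced format declaration would be along these lines (a sketch consistent with the stated goal of pure JSON output):

```
<format>
  @type json
</format>
```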
A complete example configuration for this answer can be found in /Chapter4/ExerciseResults/rotating-file-read-file-out3-Answer.conf.
While outputting to some form of file-based storage is a simple and easy way of storing log events, it doesn’t lend itself well to performing any analytics or data processing. To make log analysis a practical possibility, Fluentd needs to converse with systems capable of performing analysis, such as SQL or NoSQL database engines, search tools such as Elasticsearch, and even SaaS services such as Splunk and Datadog.
Chapter 1 highlighted that Fluentd has no allegiance to any particular log analytics engine or vendor, differentiating Fluentd from many other tools. As a result, many vendors have found Fluentd to be an attractive solution to feed log events to their product or service. To make adoption very easy, vendors have developed their adaptors.
We have opted to use MongoDB to help illustrate the approach for getting events into a tool capable of analytical processing logs. While MongoDB is not dedicated to textual search like Elasticsearch, its capabilities are well suited to log events with a good JSON structure. MongoDB is very flexible and undemanding in getting started, so don’t worry if you’ve not used MongoDB. The guidance for installing MongoDB can be found in appendix A.
The MongoDB plugin is provided in the Treasure Data Agent build of Fluentd but not in the vanilla deployment. If we want Fluentd to work with MongoDB, we need to install the Ruby gem if it isn’t already in place.
To determine whether the installation is needed, and to perform installations of gems, we can use fluent-gem, a wrapper utility that leverages the RubyGems tool. To see whether the gem is already installed, run the command fluent-gem list from the command line. The command will show the locally installed gems, which include Fluentd and its plugins. At this stage, there should be no sign of a fluent-plugin-mongo gem. We can, therefore, perform the installation using the command fluent-gem install fluent-plugin-mongo. This will retrieve and install the latest stable version of the gem, including the documentation and dependencies, such as the MongoDB driver.
Within a match, we need to reference the mongo plugin and set the relevant attributes. As when connecting to any database, we will need to provide an address (which could be achieved via a host [name] and port [number] on the network) and a user and password, or a connection string (e.g., mongodb://127.0.0.1:27017/Fluentd). In our example configuration, we have adopted the former approach and avoided the issue of credentials. The database (schema) and collection (comparable to a relational table) are needed to know where to put the log events.
Within a MongoDB output, there is no use of a formatter, as the MongoDB plugin assumes all content is already structured. As we have seen, without configuration, a default buffer will be adopted. As before, we’ll keep the buffer settings to support only low volumes so we can see things changing quickly. We can see the outcome in the following listing.
<match *>
  @type mongo     ❶
  @id mongo-output
  host localhost     ❷
  port 27017     ❸
  database Fluentd     ❹
  collection Fluentd     ❺
  <buffer>     ❻
    delayed_commit_timeout 10
    flush_at_shutdown true
    chunk_limit_records 50
    flush_interval 30
    flush_mode interval
  </buffer>
</match>
❶ Sets the output plugin type to mongo
❷ Identifies the MongoDB host server. In our dev setup, that’s simply the local machine; this could alternatively be defined by using a connection attribute.
❸ The port on the target server to communicate with. We can also express the URI by combining host, port, and database into a single string.
❹ As a MongoDB installation can support multiple databases, we need to name the database.
❺ The collection within the database we want to add log events to—a rough analogy to an SQL-based table
❻ Buffer configuration to ensure the log events are quickly incorporated into MongoDB
Ensure that the MongoDB instance is running (this can be done with the command mongod --version
in a shell window or by attempting to connect to the server using the Compass UI). If the MongoDB server isn’t running, then it will need to be started. The simplest way to do that is to run the command mongod
.
With Mongo running, we can then run our simulated logs and Fluentd configuration with the commands in each shell:
groovy LogSimulator.groovy Chapter4/SimulatorConfig/jul-log-file2-exercise.properties ./TestData/medium-source.txt
fluentd -c Chapter4/fluentd/rotating-file-read-mongo-out.conf
To see how effective MongoDB can be for querying the JSON log events, if you add the expression {"msg" : {$regex : ".*software.*"}}
into the FILTER field and click FIND, we’ll get some results. These results will show the log events where the msg
contains the word software. The query expression tells MongoDB to look in the documents, and if they have a top-level element called msg
, then evaluate this value using the regular expression.
If you examine the content in MongoDB, you will see that the content stored is only the core log event record, not the associated time or tags.
To empty the collection for further executions of the scenario, run a command in a mongo shell; for example (using the database and collection names from our configuration), db.Fluentd.deleteMany({}).
The mongo plugin has a couple of other tricks up its sleeve. When the attribute tag_mapped is included in the configuration, the tag name is used as the collection name, and MongoDB will create the collection if it does not exist. This makes it extremely easy to separate log events into different collections. If the tag names have been used hierarchically, then the tag prefix can be removed to simplify the tag_mapped feature. This is defined by the remove_tag_prefix attribute, which takes the name of the prefix to remove.
As collections can be established dynamically, it is possible to define characteristics within the configuration; for example, whether the collection should be capped in size.
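A match block using these attributes might be sketched as follows. The tag prefix app. is purely illustrative, and capped/capped_size are the fluent-plugin-mongo attributes for capped collections; treat the exact names and values as assumptions to check against the plugin’s documentation.

```
<match app.**>
  @type mongo
  host localhost
  port 27017
  database Fluentd
  tag_mapped
  remove_tag_prefix app.
  capped
  capped_size 100m
</match>
```

With this in place, an event tagged app.orders would land in a collection named orders, created on demand.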
In this configuration, we have not formally defined any username or password. This is because in the configuration of MongoDB, we have not imposed the credentials restrictions. And incorporating credentials in the Fluentd configuration file is not the best practice. Techniques for safely handling credentials within Fluentd configuration, not just for MongoDB but also for other systems needing authentication, are addressed in chapter 7.
Revise the configuration to define the connection by a connection attribute, not the host, port, and database used. This is best started by copying the configuration file Chapter4/fluentd/rotating-file-read-mongo-out.conf
. Adjust the run commands to use the new configuration. The commands should look something like this:
groovy LogSimulator.groovy Chapter4/SimulatorConfig/jul-log-file2-exercise.properties ./TestData/medium-source.txt
fluentd -c Chapter4/fluentd/my-rotating-file-read-mongo-out.conf
The configuration of the MongoDB connection should look like the configuration shown in the following listing.
<match *>
  @type mongo
  @id mongo-output
  connection_string mongodb://localhost:27017/Fluentd     ❶
  collection Fluentd
  <buffer>
    delayed_commit_timeout 10
    flush_at_shutdown true
    chunk_limit_records 50
    flush_interval 30
    flush_mode interval
  </buffer>
</match>
❶ Note the absence of the host, port, and database properties in favor of this. You could have used your host’s specific IP or 127.0.0.1 instead of localhost.
The example configuration can be seen at Chapter4/ExerciseResults/rotating-file-read-mongo-out-Answer.conf.
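For reference, MongoDB connection strings follow a standard URI format, in which the credentials and port are optional:

```
mongodb://[username:password@]host[:port]/database
mongodb://localhost:27017/Fluentd
```

This is why the single connection_string attribute can replace the separate host, port, and database attributes.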
Before we move on from MongoDB, note that we have focused on using MongoDB as an output so it can be used to query log events. But MongoDB can also act as an input into Fluentd, allowing new records added to MongoDB to be retrieved as log events.
In chapter 1, we introduced the idea of making log events actionable. So far, we have seen ways to unify the log events. To make log events actionable, we need a couple of elements:
The ability to send events to an external system that can trigger an action, such as a collaboration/notification platform where Ops people can see and react to log events, or even a mechanism to invoke a script or tool to perform an action
Separating the log events that need to be made actionable from those that provide information but require no specific action
The process of separating or filtering out events that need to be made actionable is addressed later in the book. But here, we can look at how to make events actionable without waiting until all the logs have arrived in an analytics platform and running its analytics processes.
One way to make a log event actionable is to invoke a separate application that can perform the necessary remediation as a result of receiving an API call. This could be as clever as invoking an Ansible Tower REST API (http://mng.bz/Nxm1) to initiate a template job that performs some housekeeping (e.g., moving logs to archive storage or telling Kubernetes about the internal state of a pod). We would need to control how frequently an action is performed; plugins such as flow_counter and notifier can help. To invoke generic web services, we can use the HTTP output plugin, which is part of the core of Fluentd. To give a sense of the art of the possible, here is a summary of the general capabilities supported by this plugin:
Configures content type, setting the formatter automatically (a formatter can also be defined explicitly)
Defines headers so extra header values required can be defined (e.g., API keys)
Configures the connection to use SSL and TLS certificates, including defining the location of the certificates to be used, versions, ciphers to be used, etc.
Logs errors when unsuccessful HTTP response codes are received
Supports basic authentication (no support for OAuth at the time of writing)
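To give a flavor of these capabilities, a minimal use of the HTTP output plugin might look like the following sketch. The endpoint URL, header value, and credentials are illustrative assumptions, not a real service:

```
<match remediation.**>
  @type http
  endpoint http://ops.example.com/api/actions    # hypothetical remediation service
  http_method post
  content_type json                              # formatter is set to match automatically
  headers {"X-API-Key":"not-a-real-key"}         # extra headers, e.g., API keys
  <auth>
    method basic                                 # basic authentication support
    username opsbot
    password not-a-real-password
  </auth>
</match>
```

Matching only a narrow tag pattern such as remediation.** keeps the web service call limited to the events that genuinely need action.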
While automating problem resolution in the manner described takes us toward self-healing systems, not many organizations are prepared for or necessarily want to be this advanced. They would rather trust a quick human intervention to determine cause and effect than rely on an automated diagnosis where a high degree of certainty can be difficult. People with sufficient knowledge and the appropriate information can quickly determine and resolve such issues. As a result, Fluentd has a rich set of plugins for social collaboration mechanisms. Here are just a few examples:
IRC (Internet Relay Chat) (https://tools.ietf.org/html/rfc2813)
Twilio (supporting many different comms channels) (www.twilio.com)
Jabber (https://xmpp.org)
Redmine (www.redmine.org)
Typetalk (www.typetalk.com)
PagerDuty (www.pagerduty.com)
Slack (https://slack.com/)
Obviously, we do need to include the relevant information in the social channel communications. A range of things can be done to help that, from good log events that are clear and can be linked to guidance for resolution, to Fluentd being configured to extract relevant information from log events to share.
Slack has become a leading team messaging collaboration tool with a strong API layer and a free version that is simply limited by the size of conversation archives. As a cloud service, it is a great tool to illustrate the intersection of Fluentd and actionable log events through social platforms. While the following steps are specific to Slack, the principles involved here are the same for Microsoft Teams, Jabber, and many other collaboration services.
If you are already using Slack, it is tempting to use an existing group to run the example. To avoid irritating the other users of your Slack workspace as a result of them receiving notifications and messages as you fill channels up with test log events, we suggest setting up your own test workspace. If you aren't using Slack, that's not a problem; in appendix A, we explain how to get a Slack account and configure it to be ready for use. Ensure you keep a note of the API token during the Slack configuration, recognizable by the prefix xoxb.
Slack provides a rich set of configuration options for the interaction. Unlike MongoDB and many IaaS- and PaaS-level plugins, as Slack is a SaaS service, the resolution of the Slack instance is both simplified and hidden from us by using a single token (no need for server addresses, etc.). The username isn't about credentials, but how to represent the bot that the Fluentd plugin behaves as; therefore, using a meaningful name is worthwhile. The channel relates to the Slack channel in which the sent messages will be displayed. The general channel exists by default, but you may wish to create a custom channel in Slack and restrict access to that channel if you want to control who sees the messages. After all, do you want everyone within an enterprise Slack setup to see every operational message?
The message and message_keys attributes work together, with the message using %s to indicate where the values of the identified payload elements are inserted. The references in the message relate to the JSON payload elements listed in message_keys in order sequence.
The title and title_keys attributes work in a similar way to message and message_keys, but for the title displayed in the Slack UI with that message. In our case, we're just going to use the tag. The final part is the flush_interval attribute; this tells the plugin how quickly to push the Slack messages to the user. Multiple messages may be grouped together if the period is too long. To keep things moving quickly, let's flush every second.
Edit the existing configuration provided (Chapter4/Fluentd/rotating-file-read-slack-out.conf) to incorporate the details captured in the Slack setup. This is illustrated in the following listing.
<match *>
  @type slack
  token xoxb-9999999999999-999999999999-XXXXXXXXXXXXXXXXXXXXXXXX    ❶
  username UnifiedFluent                                            ❷
  icon_emoji :ghost:                                                ❸
  channel general                                                   ❹
  message Tell me if you've heard this before - %s                  ❺
  message_keys msg                                                  ❻
  title %s                                                          ❼
  title_keys tag                                                    ❽
  flush_interval 1s                                                 ❾
</match>
❶ This is where the token obtained from Slack is put to correctly identify the workspace and legitimize the connection.
❷ Username that will be shown in the Slack conversation
❸ Defines an individual emoji (icon) to associate with the bot
❹ Which channel to place the messages into, based on the name. In our demo, we’re just using the default channel.
❺ Message to be displayed, referencing values in the order provided in the configuration. The message attribute works in tandem with the message_keys attribute.
❻ Names the log event's JSON elements to be included in the message. The named element's value is inserted into the message text where the %s appears.
❼ Title for the message, taking its values from title_keys; ordering maps between the configuration values.
❽ Names the log event elements used to populate the title
❾ Defines the frequency at which Fluentd will push the log events to Slack
Before running the solution, we also need to install the Slack plugin with the command fluent-gem install fluent-plugin-slack. Once that is installed, we can start up the log simulator and Fluentd with the following commands:
fluentd -c Chapter4/Fluentd/rotating-file-read-slack-out.conf
groovy LogSimulator.groovy Chapter4/SimulatorConfig/social-logs.properties ./TestData/small-source.txt
Once this is started, if you open the #general channel in the web client or app, you will see messages from Fluentd flowing through.
All the details for the Slack plugin can be obtained from https://github.com/sowawa/fluent-plugin-slack. Our illustration of Slack use is relatively straightforward (figure 4.3). By using several plugins, we can quickly go from the source tag to routing the Slack messages to the most relevant individual or group directly. Alternatively, we have different channels for each application and can direct the messages to those channels.
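As a hedged sketch of the per-application approach, tag-based routing can direct each application's events to its own channel. The tag patterns and channel names here are invented, and the token is the placeholder from the earlier listing:

```
<match appA.**>
  @type slack
  token xoxb-9999999999999-999999999999-XXXXXXXXXXXXXXXXXXXXXXXX
  username UnifiedFluent
  channel appA-ops          # appA's events go to its own channel
  message %s
  message_keys msg
  flush_interval 1s
</match>

<match appB.**>
  @type slack
  token xoxb-9999999999999-999999999999-XXXXXXXXXXXXXXXXXXXXXXXX
  username UnifiedFluent
  channel appB-ops          # appB's events are kept separate
  message %s
  message_keys msg
  flush_interval 1s
</match>
```

Because Fluentd evaluates match directives in order, each application's events are consumed by the first matching block, so its messages only reach the team watching that channel.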
For a long time, good security practice has told us not to hardwire credentials into code and configuration files, as anyone who can view the file can obtain the sensitive credentials. In fact, if you commit to GitHub a configuration file or piece of code containing a string that looks like a security token for a service GitHub knows about, GitHub will flag it to the service provider. If Slack is told and decides it is a valid token, you can be assured that it will revoke the token.
So how do we address such a problem? There are a range of strategies, depending upon your circumstances; a couple of options include the following:
Set the sensitive credentials up in environment variables with a limited session scope. The environment variable can be configured in several ways, such as with tools like Chef and Puppet setting up the values from a keystore.
Embed a means to access a keystore or a secrets management solution, such as HashiCorp’s Vault, into the application or configuration.
Based on what we have seen so far in the book, it may not look as though credentials can be secured in the configuration files. In fact, both of these approaches to securely managing credentials can be achieved within a Fluentd configuration file, as Fluentd allows us to embed Ruby fragments into the configuration. This doesn't mean we need to immediately learn Ruby. For the first of these approaches, we just need to understand a couple of basic patterns. The approach of embedding calls to Vault is more challenging but can be done.
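The first pattern simply wraps a Ruby expression in "#{...}" within a quoted configuration value, which Fluentd evaluates when the configuration is loaded. A hedged sketch (MY_TOKEN is an invented variable name for illustration):

```
# Read a sensitive value from an environment variable at startup:
token "#{ENV['MY_TOKEN']}"
# Any Ruby expression can be embedded; a commonly seen example is the host's name:
# "#{Socket.gethostname}"
```

The sensitive value then lives in the process environment rather than in the file committed to version control.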
The challenge is to set up your environment so that you have an environment variable called SlackToken, which is set to hold the token you previously obtained. Then customize Chapter4/Fluentd/rotating-file-read-slack-out.conf to use the environment variable, and rerun the example setup with the commands
fluentd -c Chapter4/Fluentd/rotating-file-read-slack-out.conf
groovy LogSimulator.groovy Chapter4/SimulatorConfig/social-logs.properties ./TestData/small-source.txt
Confirm that log events are arriving in Slack.
By setting up the environment variable, you'll have used either export (Linux/macOS) or set (Windows) to define SlackToken in the session used to run Fluentd.
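The command will have looked something like the following; the token shown is the placeholder from the earlier listing, so substitute your real xoxb- token:

```shell
# Linux/macOS - scoped to the current shell session:
export SlackToken=xoxb-9999999999999-999999999999-XXXXXXXXXXXXXXXXXXXXXXXX

# Windows (cmd.exe) equivalent:
# set SlackToken=xoxb-9999999999999-999999999999-XXXXXXXXXXXXXXXXXXXXXXXX
```

Because the variable is session-scoped, Fluentd must be started from the same shell (or one that inherits its environment) for the token to be visible.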
The configuration will now have changed to look like the example in the following listing.
<match *>
  @type slack
  token "#{ENV['SlackToken']}"
  username UnifiedFluent
  icon_emoji :ghost:
  channel general
  message Tell me if you've heard this before - %s
  message_keys msg
  title %s
  title_keys tag
  flush_interval 1s
</match>
In chapter 1, we highlighted the issue of different people wanting different tools for a range of reasons, such as the following:
Different tools for performing log analytics, which have different strengths and weaknesses
Multicloud, where specialist teams (and cost considerations of network traffic) mean using different cloud vendors' tools
Individual preferences and politics (previous experience, etc.) that influence decisions
As we have illustrated, Fluentd can support many social platforms and protocols. Of course, this wouldn’t be the only place for log events to be placed. One of the core types of destination is a log analytics tool or platform. Fluentd has a large number of plugins to feed log analytics platforms; in addition to the two we previously mentioned, other major solutions that can be easily plugged in include
Then, of course, we can send logs to a variety of data storage solutions to hold for later use or perform data analytics with; for example:
Postgres, InfluxDB, MySQL, Couchbase, DynamoDB, Aerospike, SQL Server, Cassandra
Storage areas such as AWS S3, Google Cloud Storage, Google BigQuery, WebHDFS
So, the question becomes, what are my needs, and which tool(s) fit best? If our requirements change over time, we can add or remove targets as needed. Changing the technology will probably raise more challenging questions about what to do with our current log events, not how to get the data into the solution.
Fluentd has an extensive range of output plugins covering files, other Fluentd nodes, databases such as MongoDB, search and analytics engines such as Elasticsearch, and so on.
Plugin support extends beyond analytics and storage solutions to collaboration and notification tools, such as Slack. This allows Fluentd to drive more rapid reactions to significant log events.
Fluentd provides some powerful helper plugins, including formatters and buffers, making log event output configuration very efficient and easy to use.
Log events can be made easy for tools such as analytics and visualization platforms to consume. Fluentd provides the means to format log events using formatter plugins, such as out_file and json.
Buffer helper plugins can support varying life cycles depending on the need, from the simple synchronous cache to the fully asynchronous. With this, the buffer storage can be organized by size or number of log events.
Buffers can be configured to flush their contents not just on shutdown, but also on other conditions, such as events having been held in the buffer for a period of time.