Chapter 6. Postmortem and Improvement

This chapter covers the following topics:

Collected Incident Data

Root-Cause Analysis and Lessons Learned

Building an Action Plan

After any security incident, you should hold a postmortem. At this postmortem, you should review the full chronology of events that took place during the incident. This chapter describes common best practices for documenting a security incident postmortem.

The postmortem is one of the most critical steps in incident management. Base it on an analysis of the gaps that allowed the security incident to occur and on the resulting recommendations for improvement. These recommendations will affect your policies, processes, standards, and guidelines. They will also indirectly affect people, including your staff and other personnel. Based on the gap analysis, you should design and implement solutions as necessary.

Postmortems can also help you justify budget increases for technology solutions that could prevent the kind of damage you experienced during the incident. This is why it is important to identify all weaknesses and holes in systems, infrastructure defenses, or policies that allowed the incident to take place.

Collected Incident Data

The postmortem is one of the most important parts of incident response and is also the part that is most often omitted. As mentioned in the previous chapter, you need to document the events that occurred during the previous phases (identification, classification, traceback, and reaction) to create a useful postmortem after a security incident. Collecting this data is important because you can use it to improve processes, policies, and device configurations. You can also use it to calculate the cost of the incident and the total hours of involvement, which may help you justify additional funding for the incident response team.

This data can also help you understand changes in security threats and trends. You can use the data and lessons learned from the postmortem as input to improve security policies, processes, and system configurations. This is illustrated in Figure 6-1.

Figure 6-1. Postmortem Looped Feedback

Try to address the "who, what, how, when, why" questions in your postmortem. Table 6-1 demonstrates this approach.

Table 6-1. Typical Questions Answered in a Postmortem

The answers to questions like those in Table 6-1 should be collected collaboratively by the team members who worked on the identification, classification, traceback, and reaction phases. Scope the questions carefully. If a question is too broad, staff members may answer from different perspectives; this is not necessarily a problem, but it makes clear and concrete facts harder to extract. If a question is too narrow, you may limit the input that you can collect and analyze from your team's experience during the incident. In either case, collect data that is clear and concrete, rather than collecting data simply because it is available; such data may be incorrect.

The analysis of the data collected in the postmortem will also help you to measure the success of the incident response team. However, the postmortem process will fail miserably if the problem review board is used as a forum to point fingers at specific staff members or organizational divisions. The most important thing is to understand that the data collected in the initial stage of the postmortem helps you organize a list of lessons learned during the incident.

Figure 6-2 shows the first part of a basic incident response report and postmortem. In this example, Joe Doe from a fictitious company called SecureMe is the author of the report.

Figure 6-2. Incident Response Report and Postmortem Example

In Figure 6-2, a member of the SecureMe incident response team reports that numerous ICMP packets were sent to a web server farm that is part of an e-commerce solution that belongs to its sales department. The fields on the form include most of the questions listed in Table 6-1. Figure 6-2 is merely a basic example. You can expand this form by incorporating more detailed information that is appropriate for your environment and organization, such as the following:

• Total person-hours spent working on the incident

• Elapsed time from the beginning of the incident to its resolution

• Elapsed time for each stage of the incident handling process

• Total hours spent by the incident response team in responding to the initial report of the incident

• Estimated monetary damage from the incident
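Several of these figures can be derived from timestamps that you already record for each phase. The following Python sketch is one possible way to do that, not a prescribed format: the StageRecord fields, the sample timestamps, the head counts, and the hourly rate are all hypothetical. It also assumes that every responder works for the full duration of a stage, and it estimates labor cost only, which is just one component of the monetary damage.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class StageRecord:
    """One incident-handling stage (field names are illustrative)."""
    name: str
    start: datetime
    end: datetime
    responders: int  # people assumed to work the whole stage


def stage_hours(stage: StageRecord) -> float:
    """Elapsed wall-clock hours for a single stage."""
    return (stage.end - stage.start).total_seconds() / 3600


def incident_metrics(stages: list[StageRecord], hourly_rate: float) -> dict:
    """Summarize elapsed time, person-hours, and estimated labor cost."""
    first_start = min(s.start for s in stages)
    last_end = max(s.end for s in stages)
    person_hours = sum(stage_hours(s) * s.responders for s in stages)
    return {
        "elapsed_hours": (last_end - first_start).total_seconds() / 3600,
        "hours_per_stage": {s.name: stage_hours(s) for s in stages},
        "person_hours": person_hours,
        "estimated_labor_cost": person_hours * hourly_rate,
    }


# Made-up sample data for the four phases named in this chapter.
stages = [
    StageRecord("identification", datetime(2024, 1, 10, 9, 0),
                datetime(2024, 1, 10, 10, 0), 2),
    StageRecord("classification", datetime(2024, 1, 10, 10, 0),
                datetime(2024, 1, 10, 10, 30), 2),
    StageRecord("traceback", datetime(2024, 1, 10, 10, 30),
                datetime(2024, 1, 10, 12, 30), 3),
    StageRecord("reaction", datetime(2024, 1, 10, 12, 30),
                datetime(2024, 1, 10, 14, 0), 4),
]
metrics = incident_metrics(stages, hourly_rate=80.0)
print(metrics)
```

With this sample data, the incident spans 5 elapsed hours and 15 person-hours. In practice you would replace the head-count assumption with actual time entries from your team.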

Root-Cause Analysis and Lessons Learned

Always remember that "lessons learned" is knowledge or understanding gained by experience (in this case, by the experience during the security incident). The Lessons Learned section in your postmortem should focus on identifying incremental and innovative improvements that will measurably improve the following areas of the organization:

• Processes and policies

• Technology and configurations

The postmortem should include both negative and positive experiences. Highlight what worked so that successful outcomes recur, and document what failed so that unsuccessful outcomes do not.

The Lessons Learned section in the postmortem will also help you to improve your risk management processes. You can incorporate these lessons learned into several areas of risk management. One of the key inputs to risk identification is historical information, which the postmortem records. Identified risks, which you can also obtain from your postmortem, are an input to both qualitative and quantitative risk analysis. Each incident response team should evolve to reflect new threats, improved technology, and lessons learned.

You should establish criteria for a lessons learned process. More importantly, you should turn "lessons learned" into "applied lessons." The following section gives you tips on how to build an action plan from the lessons learned during each phase of the incident response.

Figure 6-3 shows the Lessons Learned section of the SecureMe Incident Response Report and Postmortem.

Figure 6-3. Lessons Learned Section of Report

The questions and information in the form outlined in Figure 6-3 are just examples of the items you can incorporate within your Lessons Learned section in your postmortem. In addition, you can build a rating system of different areas within your incident response ecosystem. For instance, you can list several areas under several major sections, such as the following:

• Tools and resources

• Incident response policies and processes

• Incident response team

• Timeliness of resolution

• Collaboration with other teams

Under each of these categories, you can list more detailed items or subcategories and then rate them. You can use a simple scale from 1 to 5, such as the following:

1. Poor

2. Needs improvement

3. Average

4. Good

5. Excellent

Note

The rating system outlined here is just an example. The numbering scheme should be based on the needs of your organization.

At the end of this phase, you can calculate an overall average and use metrics to rate the effectiveness of your incident response process and resources.
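The averaging step can be sketched in a few lines of Python. This is only an illustration: the category names come from the list above, but the subcategory names, the scores, and the threshold of 3 are hypothetical. The overall figure here is the average of the category averages, so each category carries equal weight regardless of how many items it contains; your organization may prefer a different weighting.

```python
from statistics import mean

# Hypothetical scores on the 1-to-5 scale (1 = Poor, 5 = Excellent).
ratings = {
    "Tools and resources": {"Log collection": 4, "Packet capture": 2},
    "Incident response policies and processes": {"Escalation procedure": 3},
    "Incident response team": {"Staffing": 4, "Training": 3},
    "Timeliness of resolution": {"Time to contain": 2},
    "Collaboration with other teams": {"Network operations": 5,
                                       "Sales department": 3},
}


def category_averages(ratings):
    """Average score of the items listed under each category."""
    return {cat: mean(items.values()) for cat, items in ratings.items()}


def overall_average(ratings):
    """Average of the category averages (equal weight per category)."""
    return mean(category_averages(ratings).values())


def needs_attention(ratings, threshold=3):
    """Categories whose average falls below the chosen threshold."""
    return [cat for cat, avg in category_averages(ratings).items()
            if avg < threshold]


print(category_averages(ratings))
print(round(overall_average(ratings), 2))
print(needs_attention(ratings))
```

Categories that fall below the threshold are natural candidates for the action plan described in the next section.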

Building an Action Plan

After you have collected all necessary information and documented the different lessons learned, you should build a comprehensive action plan to address any deficiencies in processes, policies, or technology. Some underlying causes may remain unknown at the time of the initial post-incident meetings; however, you can capture these causes as open action items to be closed when you have completed your final research.

Prioritize the gaps that you identify so that you address the most critical ones first. In addition, make sure that you understand the root cause of each gap and problem. Incident postmortems sometimes neglect to explore the reasons behind the problems identified. If you do not pay attention to underlying causes, you may fix specific problems and improve particular procedures; however, you will likely encounter different consequences of the same fundamental errors that caused those particular problems.

When you build an improvement plan based on the information collected in the lessons learned, each action item should have the following (at the very minimum):

• Clear description

• Person assigned

• Due date for follow-up

• Priority

This reduces risks that could develop if you fail to follow up on items that can present future threats. This concept is illustrated in Figure 6-4.

Figure 6-4. Action Items

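The four minimum fields listed above map naturally onto a small record type. The following Python sketch shows one way to track action items and surface the ones that need follow-up. The class names, the priority levels, the sample items, and the dates are all hypothetical, and the "closed" flag is an added assumption so that open items can be distinguished from completed ones.

```python
from dataclasses import dataclass
from datetime import date
from enum import IntEnum


class Priority(IntEnum):
    """Illustrative levels; a lower value means more urgent."""
    CRITICAL = 1
    HIGH = 2
    MEDIUM = 3
    LOW = 4


@dataclass
class ActionItem:
    description: str       # clear description
    owner: str             # person assigned
    due: date              # due date for follow-up
    priority: Priority
    closed: bool = False   # assumption: track completion explicitly


def open_items_by_priority(items):
    """Open items, most critical first; ties go to the earliest due date."""
    return sorted((i for i in items if not i.closed),
                  key=lambda i: (i.priority, i.due))


def overdue(items, today):
    """Open items whose follow-up date has already passed."""
    return [i for i in items if not i.closed and i.due < today]


# Made-up sample items for illustration only.
items = [
    ActionItem("Rate-limit ICMP at the network edge", "Network operations",
               date(2024, 2, 1), Priority.HIGH),
    ActionItem("Document escalation contacts", "Incident coordinator",
               date(2024, 1, 20), Priority.MEDIUM),
    ActionItem("Finish root-cause research on delayed detection",
               "CSIRT lead", date(2024, 1, 25), Priority.CRITICAL),
    ActionItem("Archive the incident report", "Report author",
               date(2024, 1, 15), Priority.LOW, closed=True),
]

for item in open_items_by_priority(items):
    print(item.priority.name, item.due, item.owner, item.description)
```

Calling overdue(items, date(2024, 1, 22)) on this sample returns only the escalation-contacts item, which is the kind of missed follow-up the plan is meant to expose.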

Summary

It is highly recommended that your Computer Security Incident Response Team (CSIRT) perform a postmortem after any security incident. This postmortem should identify the strengths and weaknesses of the incident response effort. With this analysis, you can identify weaknesses in systems, infrastructure defenses, or policies that allowed the incident to take place. In addition, the postmortem can help you identify problems with communication channels, interfaces, and procedures that hampered the efficient resolution of the reported problem.

This chapter offered you several tips on how to create effective postmortems and how to execute post-incident tasks. It included guidelines for collecting post-incident data, documenting lessons learned during the incident, and building action plans to close any gaps that are identified.

It is worth mentioning that many individuals claim to always conduct post-incident analysis; however, they rarely execute and close the gaps identified. Always make sure that you follow up an incident by addressing all the gaps and communicating the lessons learned to other members of the organization. Follow up by educating employees, especially the incident coordinators. Having a group of people who know all the processes and who can guide the various departments of the company to cooperate in response to an issue is important. Work with incident coordinators to fix processes or create new ones. Incident coordinators may also be able to help educate the rest of the company on these processes. You definitely want everyone in the organization to understand at least where to report a suspected problem or concern.
