Chapter 15. Incident Management with ESM

 

“Gettin’ good players is easy. Gettin’ ’em to play together is the hard part.”

 
 --Casey Stengel

Incident Management Basics

In 2005, I conducted a webcast for the SANS Institute (SysAdmin, Audit, Network, Security) on incident management with a gentleman by the name of Matthew Klunder, a senior consultant with a Big Four consultancy firm. Together we explored the makeup of a strong incident management program and received some excellent feedback from SANS listeners. Since the webcast was tightly associated with ESM capabilities for incident management, I decided to build this chapter on the framework we used, and to include the details we garnered from listener feedback. This chapter will help summarize the specific capabilities of ESM as part of a larger incident management initiative.

Incident management is an outgrowth of incident response. It associates all the fundamentals of actually responding to an incident with the broader requirements of ensuring that the process—from beginning to end and back to the beginning again—aligns with overall business objectives.

Thus far I’ve discussed how ESM can be leveraged when addressing insider threats, and I’ve touched on a number of incident management capabilities such as notifications, reporting, and remediation. However, the subject of incident management is deep and warrants a dedicated chapter. While we won’t exhaust the subject (there are plenty of detailed books on incident response already out there), this chapter’s focus is specific to incident management with ESM, and to the teamwork required for building a successful incident management program.

I’ve never seen or heard of a situation where one person acting alone managed an insider threat. As I’ve indicated, an insider threat can become political and so requires multiple individuals and groups to manage it. Remediation truly takes a team effort. Choosing the right team to manage these threats, assigning responsibility, planning, practicing, and keeping the incident management program up to date can all be tremendously challenging.

Drilling on incident response programs (sometimes called war games or dry runs) is a valuable technique because it surfaces questions that desktop reviews usually miss, such as, “Do we need to go to the media before this becomes public? If so, who is our spokesperson?” These drills should be done at least once a year, and they should involve all key team members. For an insider event this will include more than the IT team; it will also include human resources, legal departments, and other related groups. The best part of war games is that the organization will discover what works and what doesn’t. Often it will discover overlapping tasks, problems in the communication channel, and other things that simply don’t work or make sense in practice. This is a positive outcome because it allows the organization to refine the procedure and adapt to changes in the environment.

Another benefit is that certain personalities react particularly well to high stress situations, and through these war games leaders emerge. These leaders should then be assigned greater responsibilities during a crisis situation to help coordinate efforts. Additionally, these individuals, working with appropriate members of the security team, should meet periodically to review the incident response policies and procedures, and to make refinements as needed.

While this sounds like a lot of work with no clear place to start, applying a framework to incident management allows these programs to take shape. The framework for incident management should have risk mitigation at its core. At every step along the process, the mitigation of risk should be considered as it relates to prevention, detection, response, remediation, and their supporting processes. With an incident management program in place, an organization can improve risk management, improve compliance, and reduce costs.

Improved Risk Management

An incident management program can foster more effective communications. People will have a better understanding of who is involved, who is in charge, what roles everybody plays, and what their own responsibilities are. When this happens, remediation efforts become more focused so that issues can be resolved more quickly. With a focused strategy, time and resources aren’t misspent. Finally, by defining the incident types (such as an insider threat) and the related incident management processes, the organization can create more detailed subgroups that define the appropriate action. Again, responding to an external, nameless, faceless threat is considerably different from responding to an insider, and these differences should be clarified within the process.

Improved Compliance

Incident management is a key ingredient in regulatory compliance. There are potential legal exposures if sensitive information leaks. Also, the need to demonstrate long-term compliance can be associated with having an integrated incident management program. Other important aspects include the ability to report on specific compliance criteria, analyze trends, and calculate efficiencies lost or gained.

Reduced Costs

Addressing an incident requires time, human resources, and money. The efforts range from rebuilding servers and restoring data to talking with the media and placating customers, employees, partners, and shareholders. Spending too much money on these efforts has a negative impact on the bottom line and erodes confidence in the organization. An effective incident management program should minimize these costs and establish a framework that requires less time for investigation and remediation.

Current Challenges

The challenges in incident management programs are usually process, organization, and technology related.

Process

In the past, incident management programs haven’t had their processes fully linked with other IT processes. This created situations in which the programs were fundamentally prevented from taking maximum advantage of IT capabilities. With insider threats like those outlined in the book’s case studies, IT can play a crucial role. It’s also important to align incident management with two other processes—change management and compliance management.

As with IT, there is a significant overlap between incident management programs and those designed for change and compliance management. For example, an anomaly in change management—such as a system modification outside of a change management schedule, or a device bound by HIPAA regulations having extraneous services running—could be an indicator of a security incident, and perhaps of an insider incident. If these management processes are not integrated with the incident management processes—for example, through an ESM—then having a holistic view of an organization’s security posture is not possible.
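The change management example above can be illustrated with a minimal sketch. It assumes a hypothetical list of approved change windows and a hypothetical modification record; the names `ChangeWindow` and `is_unscheduled_change` are illustrative and do not come from any particular ESM product.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class ChangeWindow:
    """Hypothetical approved change window for a single asset."""
    asset: str
    start: datetime
    end: datetime


def is_unscheduled_change(asset, changed_at, windows):
    """Return True when a modification falls outside every approved window for the asset."""
    return not any(
        w.asset == asset and w.start <= changed_at <= w.end
        for w in windows
    )


# Illustrative data only: one approved overnight window for one server.
windows = [
    ChangeWindow("crm-db-01", datetime(2024, 3, 2, 22, 0), datetime(2024, 3, 3, 2, 0)),
]

scheduled = is_unscheduled_change("crm-db-01", datetime(2024, 3, 2, 23, 30), windows)
unscheduled = is_unscheduled_change("crm-db-01", datetime(2024, 3, 5, 14, 0), windows)
```

A modification flagged this way is only an indicator; it would still need to be correlated with other events before being treated as an incident.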

Organization

Still, simply running events through an ESM—generating cases, alerts, and reports—will not by itself yield the required process results. A mechanism for analyzing and responding to discovered issues must be well defined and practiced to be effective. In order to achieve effectiveness, organizational issues must be considered.

Many organizations lack stakeholder involvement within incident management groups. Key stakeholders must be represented, and there has to be coordination among the participants. The group must choose one person to take the responsibility of leading it. Training must be conducted, and roles must be understood.

Technology

Too often, organizations have plenty of technology, but haven’t done a good job of integrating the technology into their incident management program. This is where ESM can help a great deal in incident management. It provides a central, secure repository for an organization’s events, business logic, assets, vulnerabilities, and best practices. It allows different groups to have different views of the organization, and to have visualization and reporting capabilities that make information easier to understand in less time. The ESM may alert one group within the organization by e-mail, send pages to or open cases for another group, and for yet another group, generate reports. An executive isn’t likely to be concerned with the bits and bytes of an attack, while an analyst trying to remediate the event doesn’t need to track response time trends and overall operational impact through a high-level report.

The ESM not only acts as a collection and investigation point, it can also be used to manage and track the entire incident. All actions can be tracked in the ESM case management system, and all alerts can be tracked to ensure they have been acknowledged, and if not acknowledged, escalated to another tier such as from a level-1 analyst to level-2 analyst or to a team manager. After the fact, the process can be reviewed to discover where the organization can develop greater efficiency.
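The escalation path described above might be sketched as follows. The tier names come from the text, but the 15-minute acknowledgement deadline and the function `responsible_tier` are assumptions made for illustration; a real ESM would expose its own escalation settings.

```python
from datetime import datetime, timedelta

# Tiers as named in the text; the timeout is an assumed value.
TIERS = ["level-1 analyst", "level-2 analyst", "team manager"]
ACK_TIMEOUT = timedelta(minutes=15)


def responsible_tier(raised_at, now, acknowledged_at=None):
    """Return the tier that should own an alert.

    An unacknowledged alert moves up one tier for each elapsed timeout,
    stopping at the last tier. Once acknowledged, the alert stays with
    the tier that owned it at acknowledgement time.
    """
    reference = acknowledged_at if acknowledged_at is not None else now
    missed_deadlines = max(0, int((reference - raised_at) / ACK_TIMEOUT))
    return TIERS[min(missed_deadlines, len(TIERS) - 1)]


raised = datetime(2024, 1, 1, 9, 0)
fresh = responsible_tier(raised, raised + timedelta(minutes=10))
overdue = responsible_tier(raised, raised + timedelta(minutes=20))
long_overdue = responsible_tier(raised, raised + timedelta(hours=2))
```

Recording both the raise time and the acknowledgement time is what makes the after-the-fact review possible.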

Building an Incident Management Program

With the primer in place, I’ll discuss the incident management program in eight areas:

  1. Defining risk based on what is important to the business

  2. Process

  3. Training

  4. Stakeholder involvement

  5. Remediation

  6. Documentation

  7. Reporting and metrics

  8. Automation

Defining Risk

Organizations must manage several types of risk, including those related to compliance, legal, financial, and technological drivers. Past incident management programs put too much focus on the technological risk. This was, in part, because technological risk is easier to define in terms of impact, and systems are relatively easy to quantify. Often the incident management team was from an IT organization where technology was the specialty. I suppose that, had accounting departments run these programs in the past, I would be talking about too much emphasis on accounting practices and not enough on technology. But the result of a myopic, technology-centric approach to incident management was an imbalanced program where risk was not associated with the overall business; thus, the information security teams were viewed as out-of-touch with business objectives. To better address this, and for risk definition, we can follow a five-step process.

Five Steps to Risk Definition for Incident Management

Step 1. Define the risks to be managed—for example, an insider maliciously handling customer records.

Step 2. Map specific incident types to those risks—for example, brute-force login attempts, suspicious activity, or questionable patterns.

Step 3. Define what those incident types look like and what their indicators are—for example, an ESM alert to multiple failed login attempts followed by a success from the same source to the same destination, or the ESM’s alert to a removable media device having been plugged into a system under regulatory compliance.

Step 4. Identify the information you want from impacted systems—for example, the ESM should be monitoring network devices, servers and applications, and have a context for asset values, vulnerabilities, actors, data content, and policy.

Step 5. Configure systems to generate required data—for example, if the CRM system is creating audit data related to an insider, but that information isn’t being monitored by the ESM, it doesn’t add any value.
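The indicator in Step 3, multiple failed logins followed by a success from the same source to the same destination, can be sketched as a simple correlation rule. This is a minimal illustration, not an ESM rule language: the event tuple layout, the threshold of five failures, and the ten-minute window are all assumed values.

```python
from collections import defaultdict, deque
from datetime import datetime, timedelta

FAILURE_THRESHOLD = 5            # assumed value
WINDOW = timedelta(minutes=10)   # assumed value


def detect_bruteforce_success(events):
    """Yield (source, destination, time) for each suspicious success.

    `events` is an iterable of (timestamp, source, destination, outcome)
    tuples sorted by timestamp, where outcome is "failure" or "success".
    A success is suspicious when at least FAILURE_THRESHOLD failures for
    the same source/destination pair occurred within WINDOW before it.
    """
    recent_failures = defaultdict(deque)
    for ts, src, dst, outcome in events:
        failures = recent_failures[(src, dst)]
        # Drop failures that have aged out of the window.
        while failures and ts - failures[0] > WINDOW:
            failures.popleft()
        if outcome == "failure":
            failures.append(ts)
        elif outcome == "success":
            if len(failures) >= FAILURE_THRESHOLD:
                yield (src, dst, ts)
            failures.clear()


# Illustrative events: five failures then a success from one source,
# and a single failure then a success from another.
t0 = datetime(2024, 1, 1, 9, 0)
events = [(t0 + timedelta(minutes=i), "10.0.0.5", "crm-app", "failure") for i in range(5)]
events.append((t0 + timedelta(minutes=5), "10.0.0.5", "crm-app", "success"))
events.append((t0 + timedelta(minutes=6), "10.0.0.9", "crm-app", "failure"))
events.append((t0 + timedelta(minutes=7), "10.0.0.9", "crm-app", "success"))

alerts = list(detect_bruteforce_success(events))
```

Only the first source produces an alert here; a single mistyped password followed by a success stays below the threshold.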

Process

Process is sometimes a dirty word, but in incident management it helps define what should be done with the data and dictates actions both for incidents that are already well defined and for those that are not. To ensure that the individuals on the incident management team understand how to use the ESM in an emergency, rather than trying to invent a strategy as they go, an effective process will have the characteristics below. The process will be:

  • Reasonable

  • Flexible

  • Repeatable

  • Measurable

  • Consistent with legal and regulatory obligations

  • Agreed upon by all major stakeholders

And it will address root causes of issues for broader problem solving.

Based on the above guidelines, I suggest that a broad incident management template and a set of incident management plans for specific incident types be put in place. This will aid in risk reduction for known incident types and provide flexibility when an unexpected incident type emerges.

One can also use meta-process development to create a generic incident management template that allows for process acceptance and provides cross-system integration with existing workflow processes. Incident-specific process development will further enable opportunities for process automation and allow for consistent risk management across the organization. Additionally, it will leverage common data sources and investigative steps for mitigation.

The meta-process diagram in Figure 15.1 defines a high-level security incident management process that combines process, people, and technology. It is a good representation of an incident management workflow that can be integrated into ESM.

Figure 15.1. 

Training

The staff must be trained to respond. Many people on the incident management team have likely never been involved in an incident management program. A trained staff that practices through tabletop discussion and through acting out the events in war games can better prepare for a real incident. It also helps to point out efficiencies gained by—and flaws within—the current incident management framework. This will help ensure that the staff members understand how they can utilize ESM for their role in an effective manner. During an emergency, people should not be guessing how to use the ESM to find information.

Ongoing training is necessary for any mature incident management program. Organizations are dynamic; people come and go and change roles, and technologies change. Most of all, within a crisis situation, training is what keeps the process on track and keeps the team members working cohesively within their roles. I’ve seen this many times; without training, one or two people give up because of the stress, and some people try to run the entire effort themselves. Neither is an acceptable alternative to a well-trained, cohesive team.

Stakeholder Involvement

Incident management involves groups outside of security and outside of IT. It is important not only to get them involved, but also to get their buy-in for the incident management process. There should be cross-departmental mechanisms for invoking the process and methods for handing off responsibilities. This will reduce the risk of confusion and minimize the possibility of the investigation being mishandled. It also helps define a backup strategy for key roles in case a key person isn’t present at the time of the incident. Depending on the type of incident, some stakeholders who should be involved are:

  • Human Resources

  • Legal Counsel

  • Public Relations

  • IT, Facilities, and Telephony

  • Security and Network/System Operations

    • Typically one person or a group of people within the security or operations groups will be running the ESM at the core of the investigatory efforts.

  • At least one member of Executive Management

  • The Incident Management Team

Remediation

There are two types of remediation—technical and non-technical. In technical remediation, the incident management team relies on the ESM itself to respond with or without human intervention. This might mean blocking an IP address, turning off a port on a switch, or disabling a user account. Non-technical forms include providing training and awareness, employee reviews, formal disciplinary actions, staffing changes, and so forth.
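As a rough sketch of technical remediation with or without human intervention, the mapping below pairs incident types with one of the actions named above and a flag for whether a person must approve it first. The incident type names, the `PLAYBOOK` table, and the returned action strings are all hypothetical; they stand in for whatever response integrations a given ESM provides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Remediation:
    """Hypothetical playbook entry: an action and whether it needs sign-off."""
    action: str
    needs_approval: bool


# Illustrative playbook only; real entries depend on the organization's policy.
PLAYBOOK = {
    "external_scan": Remediation("block_ip", needs_approval=False),
    "rogue_device": Remediation("disable_switch_port", needs_approval=False),
    "insider_data_theft": Remediation("disable_user_account", needs_approval=True),
}


def plan_remediation(incident_type, approved=False):
    """Return the next step for an incident type as a plain string."""
    step = PLAYBOOK.get(incident_type)
    if step is None:
        # Unknown incident types fall back to the incident management team.
        return "escalate_to_incident_team"
    if step.needs_approval and not approved:
        return "await_approval:" + step.action
    return step.action


automatic = plan_remediation("external_scan")
held = plan_remediation("insider_data_theft")
released = plan_remediation("insider_data_theft", approved=True)
```

Requiring approval for the insider case reflects the point made earlier in the chapter that insider incidents involve HR and legal, not only IT.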

Documentation

The incident should be well documented. Keeping a history of who was involved, what they did, and the outcome helps track the process and aids in improving skills for dealing with the next incident. ESM will provide for tracking the incident, annotating events, generating reports, and keeping a knowledge base of information. Often, the incident management program will be built into the knowledge base and, during the response period, be treated like a checklist. In this way—directly from the ESM—individuals can be notified, events can be escalated, and tracking can be centralized.

Processing the information within the ESM helps preserve chain-of-custody best practices and assists in creating an evidence trail. Access to the information is also tracked. In a crisis situation, audit trails are often the last thing anyone considers. Built-in ESM audit, ACL, and tracking capabilities help ensure that integrity is maintained.

Finally, the investigation process information, the events, and notes from the incident can all be captured into the case management system. Once there, this information is archived and can be reported on to make analysis of the incident after-the-fact more understandable.

Reporting and Metrics

I’ve heard it said that if you can’t report on it or measure it, it doesn’t exist. Pre-defined reports for tracking an incident are a huge time saver. They can be high-level or very detailed, and can be reviewed along with other process notes and cases. They help determine what needs to be improved upon. For example: How long did it take to resolve the incident? Did it take longer than the last incident? How many people were involved and who did what? Reporting also creates ongoing proof of compliance that establishes due diligence and serves as a record of security posture improvements.
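The questions above translate directly into simple metrics. The sketch below assumes a hypothetical export of closed cases as (case ID, opened, resolved, responders) tuples in chronological order; the data is invented for illustration and the field layout is not taken from any ESM case management system.

```python
from datetime import datetime
from statistics import mean

# Hypothetical closed cases, oldest first.
cases = [
    ("case-1", datetime(2024, 1, 10, 9, 0), datetime(2024, 1, 10, 17, 0), {"analyst-a", "hr-b"}),
    ("case-2", datetime(2024, 2, 3, 8, 0), datetime(2024, 2, 3, 12, 0), {"analyst-a"}),
]


def hours_to_resolve(opened, resolved):
    """Elapsed time between opening and resolving a case, in hours."""
    return (resolved - opened).total_seconds() / 3600


# How long did each incident take, and what is the average?
durations = [hours_to_resolve(opened, resolved) for _, opened, resolved, _ in cases]
mean_hours = mean(durations)

# Did the latest incident take longer than the one before it?
latest_took_longer = durations[-1] > durations[-2]

# How many people were involved in each incident?
people_involved = [len(responders) for _, _, _, responders in cases]
```

Tracked over many incidents, the same figures show whether response times are trending up or down.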

Summary

An enterprise-level ESM can provide a secure, centralized, real-time solution for event collection, event processing, incident notification, incident remediation, and incident management. Additionally, it can apply the same capabilities to forensic information. ESM can collect data from a breadth and depth of products, correlate that information, and prioritize alerts using not only event data but also asset information, vulnerability information, compliance requirements, locations, geographies, and other business relationships.

ESM can provide chain-of-custody best practices along with a native case management system and/or integration with third party case management systems for seamless workflow. The ESM knowledge base can be a repository for policies, procedures, guidelines, contact information, best practices, and the like.

Another valuable concept when responding to an incident is sharing information across departments. Information that is valuable to HR is much different from that valued by IT and executive management, so each group will need different forms of access or at least different forms of reports. From a security analyst’s perspective, real-time situational awareness assists with incident identification and investigation. Additionally, correlation, anomaly detection, and pattern discovery create a holistic view of the organization’s security posture and help identify outlier events and patterned incidents. All this is extremely valuable operationally, but an executive manager may need a high-level static report that explains the net risk. The executive manager may also require metrics for measuring employee and technology effectiveness per incident or as trends over time. For a successful incident management program, ESM must provide all these functions.

Finally, enterprise security management solutions are designed to offer enterprise-level, mission-critical solutions. They are extremely powerful, scalable, and extensible. They can be used for security management, compliance, and insider threat. They leverage correlation, anomaly detection, pattern discovery, reporting, and automation, thus reducing costs, increasing efficiencies, and delivering useful metrics. With organizations merging traditionally disparate roles such as network operations, system administration, security, compliance, and others, having visibility across an organization’s entire environment is paramount. ESM is particularly effective when leveraged as part of an overall strategy that also considers people and process along with technology.
