Overseeing Compliance of Security Operations ◾  219
© 2011 by Taylor & Francis Group, LLC
Isolate/Contain
Containing the incident means halting the negative effect on the enterprise and stopping
the spread of the incident. Containment is a difficult balance of maintaining effective
operations, stopping the spread of the incident (e.g., the reach of an attacker within
the enterprise network), and not tipping off a potential intruder. Deploying an
on-site team to survey the situation is most effective; get the right people to the right
place quickly to determine how to approach the containment balance. A judgment
call is necessary on whether to continue operations as usual or to halt operations until
you better understand the nature of the anomaly. The purpose of security is to manage risk in
balance with minimized cost and maximized revenue. Disrupting a revenue stream is
a serious decision, and one that should include upper management, IT, and security.
Tipping off an intruder gives the intruder the chance to cover his or her tracks
(e.g., begin to delete log files or remove applications the intruder planted). It also
tells the intruder that you can detect that particular activity, which may prompt
the intruder to return with a more sophisticated attack that is harder to detect.
Recovery and Restoration of Business Function
Upon containing the affected system, the next priority is to restore the disrupted
business function. This may or may not involve continued use of the affected system.
Perhaps an alternative system is available or a mirror system (e.g., in a scenario
where multiple systems are used for load balancing, an additional benefit is
redundancy in the event of system failure). Moreover, perhaps an alternative process is
necessary that may achieve the same results. For example, the order-processing
department may not be able to enter orders online, but it may be able to fill out a
manual form and fax it. Is this effective? Yes, in that it produces the desired result
of continuing to take customer orders. Is this efficient? Probably not, because filling
out a manual form and faxing takes more time and costs more due to the use
of paper and phone lines. However, customer orders are still being processed, albeit
not as efficiently as with the online system. This is the purpose of restoring business
functionality: restore effectiveness first, restore efficiency second, and restore the
original service once you find and treat the root cause of the problem.
Root Cause Analysis
There is a tremendous difference between treating a symptom and treating the
problem. For example, medicine for the common cold does not treat the problem;
it merely masks the symptoms of coughing, sneezing, and sore throat. Even though
the symptom severity is less, the real problem of the disease still exists and runs its
course. Likewise, containing an incident and restoring business functionality may
be just treating the symptoms. Further investigation into the details is necessary to
find the root cause and fix the problem.
220 ◾  Official (ISC)2® Guide to the ISSMP® CBK®
One useful root cause analysis (RCA) process is DOE-NE-STD-1004-92, the
Department of Energy Guideline Root Cause Analysis Guidance Document, February
1992. Even though the publication is no longer official DOE doctrine, it is still a
good foundation for an enterprise RCA process.
Fix the Problem
To eradicate an anomaly means to remove it from the system in question.
Eradication requires knowledge of the root problem to ensure that you are indeed
treating the problem and not just a symptom. An RCA will help achieve this. Part
of the investigation may involve backing up the system under investigation. Back
up the affected system to media separate from the typical backup process. This is to
avoid the spread of the anomaly to backup media.
Problem eradication may require restoration from the last clean backup. How
can you be sure which backup is the last clean one? The RCA may provide insight
into this. If you restore a backup, validate the security of the system by performing
thorough scanning. If you cannot establish a high level of confidence in any
backup restoration, then recreate the system from the original installation media
and restore the data from backups as appropriate.
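The choice of a restore point can be sketched as follows. This is only an illustration, assuming the RCA yields an estimated time of earliest compromise; the `Backup` record and the `last_clean_backup` helper are hypothetical names and do not belong to any backup product.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, Optional


@dataclass(frozen=True)
class Backup:
    """Hypothetical record describing one backup set."""
    label: str
    taken_at: datetime


def last_clean_backup(backups: Iterable[Backup],
                      earliest_compromise: datetime) -> Optional[Backup]:
    """Return the newest backup taken strictly before the estimated
    compromise time, or None if every backup is suspect."""
    candidates = [b for b in backups if b.taken_at < earliest_compromise]
    # max() with default=None handles the "no trustworthy backup" case.
    return max(candidates, key=lambda b: b.taken_at, default=None)
```

A result of `None` corresponds to the fallback described above: rebuild from the original installation media. A candidate that is returned is still only a candidate and should be scanned after restoration, because the compromise estimate itself may be wrong.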
Detecting a hacker or malware presence is like detecting a mouse in your house.
People don't have a mouse, people have mice; a single presence is more a clue to the
presence of many. Therefore, look for additional presences starting with similar
systems and increase monitoring activities in general to look for additional anomalies.
Especially monitor the recovered system for subsequent probes. Once a hacker finds
a door into the enterprise network, the hacker is likely to come back, check the locks,
and rattle the handle to see whether it is still open.
Enterprise Feedback
Now that you have eradicated the problem from the specific system, examine the
remainder of the enterprise for systems with the same vulnerability and fix them as
well. Leverage the lessons learned across the enterprise by informing similar
operations nationally and internationally.
Develop a follow-up report immediately following the incident resolution. The
results of this follow-up report will provide insight for conducting a root cause
analysis and otherwise capture lessons learned from the incident response experience.
The follow-up report may prompt enterprise-wide activity as a result of the
lessons learned, including the following:
Vulnerability analysis
    Local
    Other parts of the enterprise as a result of
        RCA
        Lessons learned
Improve defenses … from RCA and lessons learned
Remove the cause of the incident
Modify policy, standards, procedure, practice accordingly
A vulnerability analysis will look for the same or similar vulnerabilities exploited
during the incident. An RCA is time-consuming and expensive and may or may not
be necessary depending on the nature of the incident. The follow-up activities will
also provide insight into additional security services and mechanisms, or modifications
to existing services and mechanisms. The bottom line of follow-up is to remove
the cause of the incident, minimize the likelihood of incident recurrence, and minimize
enterprise impact in the event of incident recurrence.
Evaluating Incident Response Capabilities
Incident response is an expensive and valuable enterprise capability. To help discern
whether incident response is achieving its objectives, and to gauge the business
value of incident response in general, evaluate the following:
Incident response workflow efficiency
Incident response timeliness
Incident enterprise effect
Customer satisfaction
The incident response evaluation results may be compared to industry benchmarks
that reflect many other organizations' performance. You may engage a third party
that has experience in defining and evaluating against industry-standard quality
characteristics to evaluate or audit incident response performance. The
evaluation may take on two perspectives:
1. How well the IR capability is prepared to address a potential incident
2. How well the IR capability actually handles an incident
The former method is similar to a risk assessment in that certain assumptions are
made as well as comparisons to industry standards to gauge preparedness. The latter
looks at actual performance. The next section provides ideas on metrics
for actual performance.
Incident Response Metrics
Incident response metrics reflect how well the organization is doing at incident
response. Examining metrics for each phase will help identify potential areas of
improvement. The difference between the time of event occurrence ($\mathrm{time}_0$) and the
time of event detection ($\mathrm{time}_d$) provides insight into the effectiveness of the monitor
and detect capability. The calculation of ($\mathrm{time}_d - \mathrm{time}_0$) is the time to detect an event.
This time may be seconds, minutes, hours, days, weeks, or months. Very sophisticated
attacks may even take years to first detect. Antivirus scanning of e-mail is
intended to detect malware in incoming e-mail. This scanning is only as good
as the signature files as of the day and hour of scanning. Rescanning e-mails with a
newer signature file may find malware that previously made it through, i.e., time to
detection is greater for some malware than for others. This provides
you a clue to look into updating signature files more often, perhaps running multiple
antivirus scanners to detect more malware at initial scan, or adding retroactive
scanning of e-mail with new signatures as part of standard operating procedure.
The time of event receipt at the Help Desk or SOC is the time of notification
($\mathrm{time}_n$); ($\mathrm{time}_n - \mathrm{time}_d$) is the time from detection to notification. A long time
between detection and notification may provide a clue that employees need better
training on notification procedures. The time at which the Help Desk has triaged the event
and assigned remediation responsibility is the triage time ($\mathrm{time}_t$); ($\mathrm{time}_t - \mathrm{time}_n$)
provides insight into triage capabilities. Further intervals worth tracking are the time
necessary to determine the need to escalate the incident to a subject matter expert,
the time to restore business functionality (which may not include restoring the actual
system), and the time to conduct an RCA, implement the fix to the real problem, and
then provide enterprise feedback for broad remediation efforts.
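The interval arithmetic described above can be sketched in a few lines. This is a minimal illustration, not a prescribed tool: the phase names in `PHASES` and the `phase_intervals` function are assumptions chosen to mirror occurrence, detection, notification, and triage, and real incident records would come from a ticketing or monitoring system.

```python
from datetime import datetime, timedelta
from typing import Dict

# Assumed phase order, mirroring the text: occurrence, detection,
# notification, and triage.
PHASES = ["occurred", "detected", "notified", "triaged"]


def phase_intervals(timestamps: Dict[str, datetime]) -> Dict[str, timedelta]:
    """Compute the elapsed time between each pair of consecutive phases."""
    intervals = {}
    for earlier, later in zip(PHASES, PHASES[1:]):
        delta = timestamps[later] - timestamps[earlier]
        if delta < timedelta(0):
            # A negative interval usually means a recording error.
            raise ValueError(f"{later} is recorded before {earlier}")
        intervals[f"{earlier}_to_{later}"] = delta
    return intervals


# Hypothetical incident record used only for illustration.
incident = {
    "occurred": datetime(2011, 5, 2, 9, 0),
    "detected": datetime(2011, 5, 2, 13, 30),
    "notified": datetime(2011, 5, 2, 14, 0),
    "triaged": datetime(2011, 5, 2, 14, 45),
}
for name, delta in phase_intervals(incident).items():
    print(name, delta)
```

Collecting these intervals across many incidents, rather than reading any single one, is what shows which phase consistently takes longest.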
The more granular the incident response steps, the more focused and useful the
metrics are to show how each step is doing and where modifications may be necessary
to improve incident response time. Similarly, there can be efficiency metrics
assigned at each step that measure the quality of the response, e.g., the correct
people are notified of progress and results, the details of the notification are sufficient
for them to make appropriate business decisions, the correct people are assigned for
incident handling, etc.
Additional metrics are possible for the number of known vulnerabilities, pending
patches, patches installed, and priority threats. All of these contribute to an objective
assessment of the current risk posture, security posture, threat space, asset
space, and vulnerability space. All metrics contribute to establishing good business
practice, to showing due diligence in protecting customer information, and toward
reducing culpability of the enterprise in the event of litigation. No enterprise is
expected to be perfect, but all enterprises are expected to provide reasonable
precautions and protective measures. Formal incident response and tracking incident
response metrics go a long way toward showing this.
Problem Management
When encountering interruptions to operations, the expedient response is to work
through or around the interruption to achieve the task at hand. Discovery of the
error or problem that causes the interruption is often secondary to achieving the task.
The error control process is an iterative process that diagnoses errors with the intent
to eliminate them with appropriate changes. The error control process is complementary
to the problem control process, whose purpose is to handle problems in an
efficient manner, from problem identification through notification, recording,
classification, investigation, diagnosis, and treatment.
The key to effective error and problem control is root cause analysis (RCA).
Often, the visible aspects of the operational interruptions are symptoms of the
real cause and not the root cause itself. Treating symptoms is fine with the
objective of achieving the task, but finding and treating the root cause is necessary
to eliminate the error or problem. There are many RCA methods, including
the following:
Ishikawa Diagram
Failure mode and effects analysis
Rapid problem resolution (RPR)
Fault tree analysis
5 Whys
A useful resource to assist with RCA is the now inactive Department of Energy
DOE-NE-STD-1004-92 Root Cause Analysis Guidance Document. Though inactive,
it still remains a useful, and free, guide to developing an in-house RCA process.
The major cause categories in the DOE guide are as follows:
Equipment/Material Problem
Procedure Problem
Personnel Error
Design Problem
Training Deficiency
Management Problem
External Phenomena
The DOE cause categories are complementary, and a bit of a refinement, to the
Ishikawa Diagram (Figure 3.3). Using multiple RCA methods or combining the best
attributes of multiple methods into one RCA approach is fine. Finding the root
cause is often very difficult, and any approach that facilitates looking at the
situation from various perspectives and in as comprehensive breadth and depth as
possible increases the likelihood of finding and fixing the root cause.
The Ishikawa Diagram is a fishbone diagram (because it resembles the structure
of a fishbone) that shows factors that may contribute to the problem. The major
categories include men, machines, and material. The smaller arrows represent
subcauses to the major causes. For example, a subcause to "Men" may be management
and operators; a subcause to both of these may be knowledge. A flaw in knowledge
may be the root cause of the problem: both groups are performing in line with
an old training program because they were not prepared with
details under the new training program.
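One way to make the fishbone structure concrete is to record it as a nested mapping of causes to subcauses. The sketch below is an assumption-laden illustration, not part of the Ishikawa method itself: the `fishbone` data reuses the men/management/operators/knowledge example, and the helper names `cause_paths` and `shared_leaf_causes` are hypothetical.

```python
from collections import Counter

# Each key is a cause; its value maps subcauses in the same way.
# An empty dict marks a cause with no recorded subcauses.
fishbone = {
    "Men": {
        "Management": {"Knowledge": {}},
        "Operators": {"Knowledge": {}},
    },
    "Machines": {},
    "Material": {},
}


def cause_paths(tree, prefix=()):
    """List every path from a major category down to a leaf cause."""
    paths = []
    for cause, subcauses in tree.items():
        path = prefix + (cause,)
        if subcauses:
            paths.extend(cause_paths(subcauses, path))
        else:
            paths.append(path)
    return paths


def shared_leaf_causes(tree):
    """Return leaf causes that appear under more than one branch."""
    counts = Counter(path[-1] for path in cause_paths(tree))
    return [cause for cause, n in counts.items() if n > 1]


print(shared_leaf_causes(fishbone))
```

A leaf that recurs under several branches, as "Knowledge" does here, is only a candidate for the root cause; it still has to be confirmed by investigation.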