Chapter 7

Security Operations

This chapter covers the following topics:

  • Investigations: Concepts discussed include forensic and digital investigations and procedures, reporting and documentation, investigative techniques, evidence collection and handling, and digital forensics tools, tactics, and procedures.

  • Logging and Monitoring Activities: Concepts discussed include audit and review, intrusion detection and prevention, security information and event management, continuous monitoring, egress monitoring, log management, threat intelligence, and user and entity behavior analytics (UEBA).

  • Configuration and Change Management: Concepts discussed include configuration management and change management, resource provisioning, baselining, and automation.

  • Security Operations Concepts: Concepts discussed include need to know/least privilege; account, group, and role management; separation of duties and responsibilities; privileged account management; job rotation and mandatory vacation; two-person control; sensitive information procedures; record retention; information life cycle; and service-level agreements.

  • Resource Protection: Concepts discussed include protecting tangible and intangible assets and managing media, hardware, and software assets.

  • Incident Management: Concepts discussed include event versus incident, incident response team and incident investigations, rules of engagement, authorization, scope, incident response procedures, incident response management, and the steps in the incident response process.

  • Detective and Preventive Measures: Concepts discussed include IDS/IPS, firewalls, whitelisting/blacklisting, third-party security services, sandboxing, honeypots/honeynets, anti-malware/antivirus, clipping levels, deviations from standards, unusual or unexplained events, unscheduled reboots, unauthorized disclosure, trusted recovery, trusted paths, input/output controls, system hardening, vulnerability management systems, and machine learning and artificial intelligence (AI) based tools.

  • Patch and Vulnerability Management: Concepts discussed include the enterprise patch management process.

  • Recovery Strategies: Concepts discussed include creating recovery strategies; backup storage strategies; recovery and multiple site strategies; redundant systems, facilities, and power; fault-tolerance technologies; insurance; data backup; fire detection and suppression; high availability; quality of service; and system resilience.

  • Disaster Recovery: Concepts discussed include response, personnel, communications, assessment, restoration, training and awareness, and lessons learned.

  • Testing Disaster Recovery Plans: Concepts discussed include read-through test, checklist test, table-top exercise, structured walk-through test, simulation test, parallel test, full-interruption test, functional drill, and evacuation drill.

  • Business Continuity Planning and Exercises: Concepts discussed include business continuity planning and exercises.

  • Physical Security: Concepts discussed include perimeter security controls and internal security controls.

  • Personnel Safety and Security: Concepts discussed include duress, travel, monitoring, emergency management, and security training and awareness.

Security operations involves ensuring that all operations within an organization are carried out in a secure manner. It is concerned with investigating, managing, and preventing events or incidents; logging activities as they occur; provisioning and protecting resources as needed; recovering from events and disasters; and providing physical security. Ultimately, security operations involves the day-to-day operation of an organization. Security professionals should receive the appropriate training in these areas or employ experts in them to ensure that the organization’s assets are properly protected.

The Security Operations domain within CISSP addresses a broad array of topics, including investigations, logging, monitoring, provisioning, security operations concepts, resource protection, incident management, detective and preventive measures, patch and vulnerability management, change management, disaster recovery, business continuity, physical security, and personnel safety. The Security Operations domain carries an average weight of 13 percent of the CISSP certification exam, the third highest weight of the eight domains, tied with two other domains. So, pay close attention to the many details in this chapter!

Foundation Topics

Investigations

Investigations must be carried out in the appropriate manner to ensure that any evidence collected can be used in court. Without proper investigations and evidence collection, attackers will not be held responsible for their actions. In the following sections, we discuss forensic and digital investigations and evidence.

Forensic and Digital Investigations

Computer investigations require different procedures than regular investigations because the timeframe for the investigator is compressed, and a security or other technical expert might be required to assist in the investigation. Also, computer information is intangible and often requires extra care to ensure that the data is retained in its original format. Finally, the evidence in a computer crime is much more difficult to gather.

After a decision has been made to investigate a computer crime, you, as a security professional, should follow standardized procedures, including the following:

  • Identify what type of system is to be seized.

  • Identify the search and seizure team members.

  • Determine the risk that the suspect will destroy evidence.

After law enforcement has been informed of a computer crime, the constraints on the organization’s investigators increase. You might need to turn the investigation over to law enforcement to ensure that evidence is preserved properly.

In the investigation of a computer crime, evidentiary rules must be addressed. Computer evidence should prove a fact that is material to the case and must be reliable. The chain of custody must be maintained, as described later in the chapter. Computer evidence is less likely to be admitted in court as evidence if the process for producing the evidence has not been properly documented.

Any forensic investigation involves the following steps:

  1. Identification

  2. Preservation

  3. Collection

  4. Examination

  5. Analysis

  6. Presentation

  7. Decision

The forensic investigation process is shown in Figure 7-1.


Figure 7-1 Forensic Investigation Process

The following sections cover these forensic investigation steps in detail as well as explain forensic procedures, reporting and documentation, IOCE/SWGDE and NIST, the crime scene, MOM, the chain of custody, interviewing, and investigative techniques.

Identify Evidence

The first step in any forensic investigation is to identify and secure the crime scene and identify the evidence. Evidence is identified by reviewing audit logs and monitoring systems and by analyzing user complaints and detection mechanisms. Initially, the investigators might be unsure of which evidence is important. Preserving evidence that you might not need (preservation is the next step in the process) is always better than wishing you had evidence that you did not retain.

In digital investigations, the attacked system is considered the crime scene. In some cases, the system from which the attack originated can also be considered part of the crime scene. However, fully capturing the attacker’s systems may not always be possible. For this reason, you should ensure that you capture any data, such as IP addresses, usernames, and other identifiers, that can point to a specific system.

Preserve and Collect Evidence

The next steps in forensic investigations include preserving and collecting evidence. This process involves making system images, implementing chain of custody (which is discussed in detail in its own section later), documenting the evidence, and recording timestamps.

Before collecting any evidence, consider the order of volatility. This order ensures that investigators collect evidence from the components that are most volatile first.

The order of volatility, from most volatile to least volatile, is as follows:

  1. Memory contents

  2. Swap files

  3. Network processes

  4. System processes

  5. File system information

  6. Raw disk blocks
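
The ordering above can be encoded directly when planning a collection. The following Python sketch is an illustration only (the function and list names are assumptions, not a standard tool); it sorts identified artifacts so the most volatile are collected first:

```python
# Volatility rank: lower index = more volatile = collect first.
VOLATILITY_ORDER = [
    "memory contents", "swap files", "network processes",
    "system processes", "file system information", "raw disk blocks",
]

def collection_plan(artifacts):
    """Sort identified artifacts so the most volatile are collected first.

    Unknown artifact names sort last, after all known categories.
    """
    rank = {name: i for i, name in enumerate(VOLATILITY_ORDER)}
    return sorted(artifacts, key=lambda a: rank.get(a, len(VOLATILITY_ORDER)))
```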

To make system images, you need to use a tool that creates a bit-level copy of the system. In most cases, you must isolate the system and remove it from production or the live environment to create this bit-level copy. You should ensure that two copies of the image are retained. One copy of the image will be stored to ensure that an undamaged, accurate copy is available as evidence. The other copy will be used during the examination and analysis steps. Message digests should be used to ensure data integrity.
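
Hashing both copies is how such an integrity check is commonly performed in practice. As a simple illustration (the function names here are assumptions, not part of any standard), the following Python sketch computes a SHA-256 message digest of an image file in chunks and compares the working copy against the retained original:

```python
import hashlib

def sha256_of_file(path, chunk_size=1 << 20):
    """Hash a (potentially large) image file in 1 MB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def images_match(original_path, copy_path):
    """True only if the working copy is bit-identical to the retained original."""
    return sha256_of_file(original_path) == sha256_of_file(copy_path)
```

Recording the digest of the original at collection time allows any later copy, or the original itself, to be re-verified before analysis or presentation.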

Although the system image is usually the most important piece of evidence, it is not the only piece of evidence you need. You might also need to capture data that is stored in cache, process tables, memory, and the registry. When documenting a computer attack, you should use a bound notebook to keep notes.

Remember that you might need to include experts in digital investigations to ensure that evidence is properly preserved and collected. Investigators usually assemble a field kit to help in the investigation process. This kit might include tags and labels, disassembly tools, and tamper-evident packaging. Commercial field kits are available, or you could assemble your own based on organizational needs.

Examine and Analyze Evidence

After evidence has been preserved and collected, the investigator then needs to examine and analyze the evidence. While examining evidence, any characteristics, such as timestamps and identification properties, should be determined and documented. After the evidence has been fully analyzed using scientific methods, the full incident should be reconstructed and documented.
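
Timestamp properties can be captured programmatically during examination. As a simple illustration (the function name and output fields are assumptions, not a standard format), the following Python sketch records a file’s modification, access, and change times in UTC:

```python
import os
from datetime import datetime, timezone

def file_timestamps(path):
    """Record modification, access, and change times (UTC) for a file.

    Examination should run against the working copy, never the retained
    original, because merely reading a file can alter access timestamps.
    """
    st = os.stat(path)
    to_utc = lambda ts: datetime.fromtimestamp(ts, tz=timezone.utc).isoformat()
    return {
        "path": path,
        "modified": to_utc(st.st_mtime),
        "accessed": to_utc(st.st_atime),
        "changed": to_utc(st.st_ctime),  # metadata change (POSIX) or creation (Windows)
        "size_bytes": st.st_size,
    }
```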

Present Findings

After the evidence has been examined and analyzed, the findings must be presented in court. In most cases, presenting the findings in a format the audience can understand is best. An expert should be used to testify to the findings, and it is important that the expert be able to articulate the details of the evidence to a nontechnical audience.

Decide

At the end of the court proceeding, a decision will be made as to the guilt or innocence of the accused party. At that time, evidence may no longer need to be retained, provided there is no possibility of an appeal. However, documenting any lessons learned from the incident is important. Any individuals involved in any part of the investigation should be a part of this lessons-learned session.

Forensic Procedures

Collecting digital evidence is trickier than collecting physical evidence and must be completed by trained forensic technicians and investigators. These individuals must stay abreast of the latest tools and technologies that can be used to investigate a computer crime.

Technicians and investigators must follow established forensic procedures to ensure that any evidence collected will be admissible in a court of law. It is the trained individual’s responsibility to ensure that the procedures they use follow established standards. Organizations such as the National Institute of Standards and Technology (NIST) and the International Organization for Standardization and the International Electrotechnical Commission (ISO/IEC) establish standards that help guide organizations in the proper establishment of these and other procedures. Always consult these standards prior to performing any investigation to determine whether suggested procedures have changed or whether new tools are available.

Reporting and Documentation

After any investigation is over, security professionals should provide reports and documentation to management regarding the incident. This report should be presented to management as quickly as possible so that management can determine whether controls need to be implemented to prevent a recurrence of the incident. This submission to management will often happen prior to the presentation of any legal findings in a court of law. Organizations should establish procedures for ensuring that individuals to whom the reports are submitted have the appropriate clearance. It may also be necessary to redact certain parts of the report to ensure that any criminal cases are not negatively affected.

Although internal reporting is important, security professionals should also have guidelines for when to report incidents to law enforcement. The earlier that law enforcement is involved, the more likely that evidence will be admissible in a court of law. However, most local law enforcement personnel do not have the knowledge or skills to carry out a full digital investigation. If the organization does not have properly trained personnel, a forensic investigator will need to be called in to perform the investigation. Legal professionals should also be brought in to help.

Proper documentation must be maintained throughout the investigation and include logs, chain of custody forms, and documented procedures and guidelines.

IOCE/SWGDE and NIST

The International Organization on Computer Evidence (IOCE) and Scientific Working Group on Digital Evidence (SWGDE) are two groups that study digital forensics and help to establish standards for digital investigations. Both groups release guidelines on many forms of digital information, including computer data, mobile device data, automobile computer systems data, and so on. Any investigators should ensure that they comply with the principles from these groups.

Although the IOCE is no longer a functioning evidence body, it did establish some principles that are still applicable today. The main principles as documented by IOCE are as follows:

  • The general rules of evidence should be applied to all digital evidence.

  • Upon the seizure of digital evidence, any actions taken to preserve the evidence should not change that evidence in any way.

  • When a person needs to access original digital evidence, that person should be suitably trained and authorized for the purpose.

  • All activity relating to the seizure, access, storage, or transfer of digital evidence must be fully documented, preserved, and available for review.

  • An individual is responsible for all actions taken with respect to digital evidence while the digital evidence is in that individual’s possession.

  • Any agency that seizes, accesses, stores, or transfers digital evidence is responsible for compliance with IOCE principles.

NIST SP 800-86, “Guide to Integrating Forensic Techniques into Incident Response,” provides guidelines on the data collection, examination, analysis, and reporting related to digital forensics. It explains the use of forensic investigators, IT staff, and incident handlers as part of any forensic investigation. It discusses how cost, response time, and data sensitivity should affect any forensic investigation.

Crime Scene

A crime scene is the environment in which potential evidence exists. After the crime scene has been identified, steps should be taken to ensure that the environment is protected, including both the physical and virtual environment. To secure the physical crime scene, an investigator might need to isolate the systems involved by removing them from a network. However, the systems should not be powered down until the investigator is sure that all digital evidence has been captured. Remember: Live computer data is dynamic and is possibly stored in several volatile locations.

In response to a possible crime, it is important to ensure that the crime scene environment is protected using the following steps:

  1. Identify the crime scene.

  2. Protect the entire crime scene.

  3. Identify any pieces of evidence or potential sources of evidence that are part of the crime scene.

  4. Collect all evidence at the crime scene.

  5. Minimize contamination by properly securing and preserving all evidence.

Remember that there can be more than one crime scene, especially in digital crimes. If an attacker breaches an organization’s network, all assets that were compromised are part of the crime scene, and any assets that the attacker used are also part of the crime scene.

Access to the crime scene should be tightly controlled and limited only to individuals who are vital to the investigation. As part of the documentation process, make sure to note anyone who has access to the crime scene. After a crime scene is contaminated, no way exists to restore it to the original condition.

MOM

Documenting motive, opportunity, and means (MOM) is the most basic strategy for determining suspects. Motive concerns why the crime was committed and points to who might have committed it. Opportunity concerns where and when the crime occurred. Means concerns how the suspect carried out the crime. Any suspect who is considered must possess all three of these qualities. For example, a suspect might have a motive for a crime (being dismissed from the organization) and an opportunity for committing the crime (user accounts were not disabled properly) but might not possess the means to carry out the crime.

Understanding MOM can help any investigator narrow down the list of suspects.

Chain of Custody

At the beginning of any investigation, you should ask the questions who, what, when, where, and how. These questions can help you get all the data needed for the chain of custody. The chain of custody shows who controlled the evidence, who secured the evidence, and who obtained the evidence. A proper chain of custody must be preserved to successfully prosecute a suspect. To preserve a proper chain of custody, you must collect the evidence following predefined procedures in accordance with all laws and regulations.

Chain of custody forms should be used to track who has access to the evidence, when that access occurs, and other valuable details based on the organization’s or investigation’s needs. This chain of custody form should be kept with the evidence at all times. For example, if a forensic investigator plans to analyze the contents of a digital log, that investigator should complete the appropriate information on the chain of custody form to indicate when a copy of the digital log was obtained, the type of analysis being performed, and other details.
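
A chain of custody form can be modeled as an append-only log in which each entry embeds a hash of the previous entry, making later alteration detectable. The following Python sketch is illustrative only; the class, field names, and record format are assumptions, not any standard:

```python
import hashlib
import json
from datetime import datetime, timezone

class ChainOfCustody:
    """Append-only custody log; each entry embeds a hash of the previous one,
    so any later alteration of an earlier entry breaks verification."""

    def __init__(self, evidence_id):
        self.evidence_id = evidence_id
        self.entries = []

    def record(self, custodian, action, details=""):
        prev_hash = self.entries[-1]["entry_hash"] if self.entries else "0" * 64
        entry = {
            "evidence_id": self.evidence_id,
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "custodian": custodian,
            "action": action,        # e.g., "collected", "transferred", "analyzed"
            "details": details,
            "prev_hash": prev_hash,
        }
        payload = json.dumps(entry, sort_keys=True).encode()
        entry["entry_hash"] = hashlib.sha256(payload).hexdigest()
        self.entries.append(entry)
        return entry

    def verify(self):
        """Recompute every hash link; False means the log was altered."""
        prev = "0" * 64
        for e in self.entries:
            if e["prev_hash"] != prev:
                return False
            body = {k: v for k, v in e.items() if k != "entry_hash"}
            payload = json.dumps(body, sort_keys=True).encode()
            if hashlib.sha256(payload).hexdigest() != e["entry_hash"]:
                return False
            prev = e["entry_hash"]
        return True
```

Each call to record() captures who handled the evidence, what was done, and when; verify() then confirms that no earlier entry has been silently rewritten.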

The primary purpose of the chain of custody is to ensure that evidence is admissible in court. Law enforcement officers emphasize chain of custody in any investigations that they conduct. Involving law enforcement early in the process during an investigation can help to ensure that the proper chain of custody is followed.

Interviewing

An investigation often involves interviewing suspects and witnesses. One person should be in charge of all interviews. Because the interviews may yield evidence, it is important that the interviewer understands what information needs to be obtained and which questions to cover. Reading rights to a suspect is necessary only if law enforcement is performing the interview. Recording the interview might be a good idea to provide corroboration later when the interview is used as evidence.

If an employee is suspected of a computer crime, a representative of the human resources department should be involved in any interrogation of the suspect. The employee should be interviewed only by an individual who is senior to that employee.

Investigative Techniques

A computer crime involves the use of investigative techniques, which include interviewing (discussed above), surveillance, forensics, and undercover operations.

Surveillance includes both physical surveillance and computer surveillance. Physical surveillance uses security cameras, wiretaps, and visual tracking to monitor movement. Computer surveillance monitors elements of computer use and online behavior. It may also include sting operations, like setting up a honeypot or honeynet.

After interviews are completed and surveillance gathers enough evidence, investigators will want to perform advanced forensic analysis. Organizations can do this by continually monitoring activity, but if law enforcement is involved, a warrant will need to be obtained that will allow forensic analysis of identified computers and devices. Investigators should follow the electronic trail wherever it leads, looking for digital fingerprints in emails, files, and web-browsing histories.

In some cases, crimes may require investigators to go undercover, adopting fake online personae to trap criminals. In this case, investigators should log all interactions as evidence and may even arrange a face-to-face meeting to arrest the perpetrator.

Evidence Collection and Handling

For evidence to be admissible, it must be relevant, legally permissible, reliable, properly identified, and properly preserved. Relevant means that it must prove a material fact related to the crime in that it shows a crime has been committed, provides information describing the crime, provides information regarding the perpetrator’s motives, or verifies what occurred. Legally permissible means that the evidence is deemed by the judge to be useful in helping the jury or judge reach a decision and cannot be objected to on the basis that it is irrelevant, immaterial, or violates the rules against hearsay and other objections. Reliable means that it has not been tampered with or modified. Properly identified means that the evidence is labeled appropriately and entered into the evidence log. Properly preserved means that the evidence is not damaged, modified, or corrupted in any way.

All evidence must be tagged. When creating evidence tags, be sure to document the mode and means of transportation, a complete description of evidence including quality, who received the evidence, and who had access to the evidence.

Any investigator must ensure that evidence adheres to the five rules of evidence (see the following section). In addition, the investigator must understand each type of evidence that can be obtained and how each type can be used in court. Investigators must follow surveillance, search, and seizure guidelines. Finally, investigators must understand the differences among media, software, network, and hardware/embedded device analysis.

Five Rules of Evidence

When gathering evidence, an investigator must ensure that the evidence meets the five rules that govern it:

  • Be authentic.

  • Be accurate.

  • Be complete.

  • Be convincing.

  • Be admissible.

Although digital evidence is more volatile than other evidence, it must still meet these five rules.

Types of Evidence

An investigator must be aware of the types of evidence used in court to ensure that all evidence is admissible. Sometimes the type of evidence determines its admissibility.

The types of evidence that you should understand are as follows:

  • Best evidence

  • Secondary evidence

  • Direct evidence

  • Conclusive evidence

  • Circumstantial evidence

  • Corroborative evidence

  • Opinion evidence

  • Hearsay evidence

Best Evidence

The best evidence rule states that when evidence, such as a document or recording, is presented, only the original will be accepted unless a legitimate reason exists for why the original cannot be used. In most cases, digital evidence is not considered best evidence because investigators must capture copies of the original data and state.

However, courts can apply the best evidence rule to digital evidence on a case-by-case basis, depending on the evidence and the situation. In such a case, the copy must be authenticated by an expert witness who can testify to its contents and confirm that it is an accurate copy of the original.

Secondary Evidence

Secondary evidence has been reproduced from an original or substituted for an original item. Copies of original documents and oral testimony are considered secondary evidence.

Direct Evidence

Direct evidence proves or disproves a fact through oral testimony based on information gathered through the witness’s senses. A witness can testify on what he saw, smelled, heard, tasted, or felt. This is considered direct evidence. Only the witness can give direct evidence. No one else can report on what the witness told them because that is considered hearsay evidence.

Conclusive Evidence

Conclusive evidence does not require any other corroboration and cannot be contradicted by any other evidence.

Circumstantial Evidence

Circumstantial evidence provides inference of information from other intermediate relevant facts. This evidence helps a jury come to a conclusion by using a fact to imply that another fact is true or untrue. An example is implying that a former employee committed an act against an organization due to his dislike of the organization after his dismissal. Circumstantial evidence is often dismissed or never presented, although it is impossible to control the behavior of jurors in this regard once they start deliberation.

Corroborative Evidence

Corroborative evidence supports another piece of evidence. For example, if a suspect produces a receipt to prove she was at a particular restaurant at a certain time and then a waiter testifies that he waited on the suspect at that time, then the waiter provides corroborating evidence through his testimony.

Opinion Evidence

Opinion evidence is based on what the witness thinks, feels, or infers regarding the facts. A witness giving opinion evidence is not normally an expert; when an expert witness is used, that expert testifies to facts based on knowledge in a certain area. For example, a psychiatrist can testify as to conclusions on a suspect’s state of mind. Expert testimony is not considered opinion evidence because of the expert’s knowledge and experience.

Hearsay Evidence

Hearsay evidence is evidence that is secondhand, where the witness does not have direct knowledge of the fact asserted but knows it only from being told by someone. In some cases, computer-based evidence is considered hearsay, especially if an expert cannot testify as to the accuracy and integrity of the evidence.

Surveillance, Search, and Seizure

Surveillance, search, and seizure are important facets of any investigation. Surveillance is the act of monitoring behavior, activities, or other changing information, usually of people. Search is the act of pursuing items or information. Seizure is the act of taking custody of physical or digital components.

Investigators use two types of surveillance: physical surveillance and computer surveillance. Physical surveillance occurs when a person’s actions are reported or captured using cameras, direct observance, or closed-circuit TV (CCTV). Computer surveillance occurs when a person’s actions are reported or captured using digital information, such as audit logs.

A search warrant is required in most cases to actively search a private site for evidence. For a search warrant to be issued, probable cause that a crime has been committed must be proven to a judge. The judge must also be given corroboration regarding the existence of evidence. The only time a search warrant does not need to be issued is during exigent circumstances, which are emergency circumstances that are necessary to prevent physical harm, the destruction of evidence, the suspect’s escape, or some other consequence improperly frustrating legitimate law enforcement efforts. Exigent circumstances will have to be proven when the evidence is presented in court.

Evidence can be seized only if it is specifically listed in the search warrant, unless the evidence is in plain view. In addition, the search can occur only in areas specifically listed in the warrant.

Search and seizure rules do not apply to private organizations and individuals. Most organizations warn their employees that any files stored on organizational resources are considered property of the organization. This is usually part of any no-expectation-of-privacy policy.

A discussion of evidence would be incomplete without discussing jurisdiction. Because computer crimes can involve assets that cross jurisdictional boundaries, investigators must understand that the civil and criminal laws of countries can differ greatly. It is always best to consult local law enforcement personnel for any criminal or civil investigation and follow any advice they give for investigations that cross jurisdictions.

Media Analysis

Investigators can perform many types of media analysis, depending on the media type. A media recovery specialist may be employed to provide a certified forensic image, which is an expensive process. Artifacts in a digital forensics investigation include items such as registry keys, files, timestamps, and event logs. These are the traces security professionals follow in digital forensic work, and they vary depending on the device type, operating system, and other factors.

The following types of media analysis can be used:

  • Disk imaging: Creates an exact image of the contents of the hard drive.

  • Slack space analysis: Analyzes the slack (marked as empty or reusable) space on the drive to see whether any old (marked for deletion) data can be retrieved.

  • Content analysis: Analyzes the contents of the drive and gives a report detailing the types of data by percentage.

  • Steganography analysis: Analyzes the files on a drive to see whether the files have been altered or to discover the encryption used on the file.
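
Slack space analysis often relies on file signatures (magic numbers) to recover data the file system no longer references. As a simplified illustration (a real carver must also handle fragmentation and many file types), the following Python sketch scans raw bytes for JPEG start and end markers:

```python
def carve_jpegs(raw_bytes):
    """Scan a raw byte stream for JPEG start/end markers and return the
    byte ranges of candidate files, whether or not the file system still
    references them."""
    SOI, EOI = b"\xff\xd8\xff", b"\xff\xd9"  # JPEG start/end-of-image markers
    found, pos = [], 0
    while True:
        start = raw_bytes.find(SOI, pos)
        if start == -1:
            break
        end = raw_bytes.find(EOI, start + len(SOI))
        if end == -1:
            break
        found.append((start, end + len(EOI)))
        pos = end + len(EOI)
    return found
```

The same approach applies to other formats by swapping in their signatures; running it against an image of unallocated or slack space can surface files marked for deletion.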

Software Analysis

Software analysis is a little more difficult to perform than media analysis because it often requires the input of an expert on software code, including source code, compiled code, or machine code. It often involves decompiling or reverse engineering. This type of analysis is often used during malware analysis and copyright disputes.

Software analysis techniques include the following:

  • Content analysis: Analyzes the content of software, particularly malware, to determine the purpose for which the software was created.

  • Reverse engineering: Retrieves the source code of a program to study how the program performs certain operations.

  • Author identification: Attempts to determine the software’s author.

  • Context analysis: Analyzes the environment the software was found in to discover clues to determining risk.
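
A common first pass in content analysis is extracting printable strings from a binary, which can surface embedded URLs, file paths, and commands. The following Python sketch is a minimal, illustrative stand-in for the Unix strings utility:

```python
import re

def extract_strings(data, min_len=4):
    """Pull runs of printable ASCII out of a binary blob -- a quick way to
    surface URLs, paths, and command names during malware content analysis."""
    pattern = rb"[\x20-\x7e]{%d,}" % min_len
    return [m.decode("ascii") for m in re.findall(pattern, data)]
```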

Network Analysis

Network analysis involves the use of networking tools to preserve logs and activity for evidence.

Network analysis techniques include the following:

  • Communications analysis: Analyzes communication over a network by capturing all or part of the communication and searching for particular types of activity.

  • Log analysis: Analyzes network traffic logs.

  • Path tracing: Traces the path of a particular traffic packet or traffic type to discover the route used by the attacker.
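
Log analysis is frequently automated. As an illustration (the OpenSSH-style message format and the threshold are assumptions, not any standard), the following Python sketch counts failed logins per source IP and flags sources at or above a threshold:

```python
import re
from collections import Counter

# Matches OpenSSH-style failure messages, e.g.
# "sshd[101]: Failed password for root from 203.0.113.5 port 2200 ssh2"
FAILED = re.compile(r"Failed password .* from (\d{1,3}(?:\.\d{1,3}){3})")

def failed_logins_by_source(log_lines, threshold=3):
    """Count failed logins per source IP; return sources at/above threshold."""
    counts = Counter()
    for line in log_lines:
        m = FAILED.search(line)
        if m:
            counts[m.group(1)] += 1
    return {ip: n for ip, n in counts.items() if n >= threshold}
```

A result like {"203.0.113.5": 3} would point the investigator toward a source worth tracing further.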

Hardware/Embedded Device Analysis

Hardware/embedded device analysis involves using the tools and firmware provided with devices to determine the actions that were performed on and by the device. The techniques used to analyze the hardware/embedded device vary based on the device. In most cases, the device vendor can provide advice on the best technique to use depending on what information you need. Log analysis, operating system analysis, and memory inspections are some of the general techniques used.

Hardware/embedded device analysis is also used when mobile devices are examined. NIST makes the following recommendations for performing this type of analysis:

  • Any analysis should not change the data contained on the device or media.

  • Only competent investigators should access the original data, and they must explain all actions they took.

  • Audit trails or other records must be created and preserved during all steps of the investigation.

  • The lead investigator is responsible for ensuring that all these procedures are followed.

  • All activities regarding digital evidence, including its seizure, access to it, its storage, or its transfer, must be documented, preserved, and available for review.
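One common way to demonstrate that an examination did not change the data is to hash the media image before and after analysis and compare the digests. The following sketch uses an in-memory byte string as a stand-in for a disk image; in practice the image is acquired through a hardware write blocker.

```python
import hashlib

# Hash the evidence image before and after analysis; matching digests show
# the evidence is bit-for-bit unchanged. The byte string is a stand-in for
# a real disk image.
def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

evidence_image = b"\x00\x01disk-image-contents\xff"
before = sha256_hex(evidence_image)

# ... read-only analysis happens here (behind a write blocker in practice) ...

after = sha256_hex(evidence_image)
print(before == after)   # -> True
```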

Digital Forensic Tools, Tactics, and Procedures

For evidence collection, investigators will need a digital toolkit. The following should be included as part of any digital toolkit:

  • Forensic laptops and power supplies

  • Tool sets

  • Digital camera

  • Case folder

  • Blank forms

  • Evidence collection and packaging supplies

  • Software

  • Air card for Internet access

  • Cables for data transfer (network, crossover, USB, and so on)

  • Blank hard drives and other media

  • Hardware write blockers

The digital toolkit should contain forensic tools that will enable an investigator to obtain data that can be used as evidence. The tools used by investigators are classified according to the type of information they obtain, as shown in the following list:

  • Disk and data capture tools

  • File viewers

  • File analysis tools

  • Registry analysis tools

  • Internet analysis tools

  • Email analysis tools

  • Mobile device analysis tools

  • macOS analysis tools

  • Network forensic tools

  • Database forensic tools

Many of the tools available today can provide services in multiple areas listed here. Investigators should obtain training in the proper usage of these tools.

Tools that can be included in a digital forensic toolkit include the following:

  • Digital Forensics Framework (DFF)

  • Open Computer Forensics Architecture (OCFA)

  • Computer Aided INvestigative Environment (CAINE)

  • X-Ways Forensics

  • SANS Investigative Forensics Toolkit (SIFT)

  • EnCase Forensic

  • Registry Recon

  • The Sleuth Kit (TSK)

  • LibForensics

  • Volatility

  • WindowsSCOPE

  • The Coroner’s Toolkit (TCT)

  • Oxygen Forensic Suite

  • Bulk_Extractor

  • Xplico

  • RedLine

  • Computer Online Forensic Evidence Extractor (COFEE)

  • PlainSight

  • XRY

  • Helix3

  • UFED

Investigators must also be familiar with the digital forensic tactics and procedures that are commonly used. For this reason, investigators should be properly trained to ensure that tools, tactics, and procedures are applied correctly so that the evidence collected will be admissible in court. Keep in mind that the CISSP exam should not test you on the functionality of individual tools or on specific digital forensics tactics and procedures; however, you should understand that these tools, tactics, and procedures provide digital forensic investigation automation and compliance with investigatory standards. A CISSP candidate's job role is not to perform individual forensic investigation tasks; rather, the CISSP professional should be familiar enough with the available tools, tactics, and procedures to ensure that an organization's investigators obtain the appropriate tools for digital investigations and follow appropriate tactics and procedures.

Logging and Monitoring Activities

As part of operations security, administrators must ensure that user activities are logged and monitored regularly. These activities include audit and review, intrusion detection and prevention, security information and event management, continuous monitoring, egress monitoring, log management, threat intelligence, and user and entity behavior analytics (UEBA).

Audit and Review

Accountability is impossible without a record of activities and a review of those activities. Capturing and monitoring audit logs provides the digital proof needed to identify who performed a particular activity, whether that person acted legitimately or maliciously. In many cases, the goal is to determine who misconfigured something rather than who stole something. Audit trails based on access and identification codes establish individual accountability. The questions to address when reviewing audit logs include the following:

  • Are users accessing information or performing tasks that are unnecessary for their jobs?

  • Are repetitive mistakes (such as deletions) being made?

  • Do too many users have special rights and privileges?

The level and amount of auditing should reflect the security policy of the company. Audits can be either self-audits or performed by a third party. Self-audits always introduce the danger of subjectivity to the process. Logs can be generated on a wide variety of devices including intrusion detection systems (IDSs), servers, routers, and switches. In fact, a host-based IDS makes use of the operating system logs of the host machine.

When assessing controls over audit trails or logs, address the following questions:

  • Does the audit trail provide a trace of user actions?

  • Is access to online logs strictly controlled?

  • Is there separation of duties between security personnel who administer the access control function and those who administer the audit trail?

Keep and store logs in accordance with the retention policy defined in the organization’s security policy. They must be secured to prevent modification, deletion, and destruction. When auditing is functioning in a monitoring role, it supports the detection security function in the technical category. When formal review of the audit logs takes place, it is a form of detective administrative control. Reviewing audit data should be a function separate from the day-to-day administration of the system.


Log Types

Logging is the process of recording event information to a log file or database. It captures system events, changes, messages, and other information that shows the activities that occur on a system or device. The different types of logs that security professionals use include security logs, systems logs, application logs, firewall logs, proxy logs, and change logs.

Security logs record access to resources, including access to files, folders, and printers. They can record when a user accesses, modifies, or deletes a file or folder. Although most systems will record when key files are accessed, it is often necessary for an administrator to enable auditing on other resources, such as data folders or network printers. When auditing is running on a device, it will affect the performance of that device. For this reason, security professionals should configure auditing only when necessary based on the organization’s security policies.

System logs record system events, such as system and service startup and shutdown. They can help a security professional to determine the actions taken by a malicious user.

Application logs record actions that occur within a specific application. Security professionals should work with application developers or vendors to determine which types of information should be logged.

Firewall logs record network traffic information, including incoming and outgoing traffic. This usually includes important data, such as IP addresses and port numbers that can be used to determine the origin of an attack.

Proxy logs record details on the Internet traffic that passes through the proxy server, including the sites being visited by users, how much time is being spent on those sites, and whether attempts are being made to access prohibited sites.

Change logs report changes made to a specific device or application as part of the change management process.

Audit Types

When auditing is enabled, administrators can select individual events to monitor to ensure user accountability. Audit types include access review audits, user privilege audits, and privileged group audits.

Access review audits ensure that object access and user account management practices adhere to the organization’s security policy. User privilege audits monitor right and permission usage for all users. Privileged group audits monitor when high-level groups and administrator accounts are used.

Intrusion Detection and Prevention

IDSs alert organizations when unauthorized access or actions occur, while intrusion prevention systems (IPSs) monitor the same kind of activity but actually work to prevent the actions from being successful. IDS and IPS devices can be used during investigations to provide information regarding traffic patterns that occur just before an attack succeeds. Security professionals must constantly tune IDS and IPS devices to ensure that the correct activity is being detected or prevented. As changes occur in the way that attacks are carried out, these systems must be adjusted.

Security Information and Event Management (SIEM)

SIEM can collect log and system information to comply with regulatory requirements, provide internal accountability, provide risk management, and perform monitoring and trending. SIEM stores raw information from various systems and devices and aggregates that information into a single database. Security professionals must work together to ensure that the appropriate actions will be monitored and to ensure that the correct examinations of the records occur. Because SIEM systems are centralized repositories of security information, organizations should take particular care to provide adequate security for these systems to ensure that attackers cannot access or alter the records contained in them.
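The aggregation step a SIEM performs can be sketched as follows. The source names and event field layouts here are invented for illustration and do not reflect any particular product; the point is that heterogeneous events are normalized into one common record format so they can be correlated.

```python
# Hedged sketch of SIEM aggregation: events arrive in source-specific
# shapes (these dictionaries are invented) and are normalized into one
# common record format for storage and correlation.
def normalize(source, event):
    if source == "firewall":
        return {"source": source, "time": event["ts"],
                "user": None, "action": event["verdict"]}
    if source == "windows":
        return {"source": source, "time": event["TimeCreated"],
                "user": event["TargetUserName"], "action": event["EventName"]}
    raise ValueError(f"unknown source: {source}")

raw_events = [
    ("firewall", {"ts": "10:00:01", "verdict": "DENY"}),
    ("windows", {"TimeCreated": "10:00:05",
                 "TargetUserName": "jsmith", "EventName": "Logon"}),
]

# The "single database" of the SIEM, reduced to a list of uniform records.
store = [normalize(src, ev) for src, ev in raw_events]
print([rec["action"] for rec in store])   # -> ['DENY', 'Logon']
```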

Continuous Monitoring

Any logging and monitoring activities should be part of an organizational continuous monitoring program. The continuous monitoring program must be designed to meet the needs of the organization and implemented correctly to ensure that the organization’s critical infrastructure is guarded. Organizations may want to look into Continuous Monitoring as a Service (CMaaS) solutions deployed by cloud service providers.

Egress Monitoring

Egress monitoring occurs when an organization monitors the outbound flow of information from one network to another. The most popular form of egress monitoring is carried out using firewalls that monitor and control outbound traffic.

Data leakage occurs when sensitive data is disclosed to unauthorized personnel either intentionally or inadvertently. Data loss prevention (DLP) software attempts to prevent data leakage. It does this by maintaining awareness of actions that can and cannot be taken with respect to a document. For example, it might allow printing of a document but only at the company office. It might also disallow sending the document through email. DLP software uses ingress and egress filters to identify sensitive data that is leaving the organization and can prevent such leakage.

Another scenario might be the release of product plans that should be available only to the Sales group. A security professional could set a policy like the following for that document:

  • It cannot be emailed to anyone other than Sales group members.

  • It cannot be printed.

  • It cannot be copied.

A DLP can be implemented in two locations:

  • Network DLP: Installed at network egress points near the perimeter, network DLP analyzes network traffic.

  • Endpoint DLP: Endpoint DLP runs on end-user workstations or servers in the organization.

You can use both precise and imprecise methods to determine what is sensitive:

  • Precise methods: These methods involve content registration and trigger almost zero false-positive incidents.

  • Imprecise methods: These methods can include keywords, lexicons, regular expressions, extended regular expressions, metadata tags, Bayesian analysis, and statistical analysis.

The value of a DLP system lies in the level of precision with which it can locate and prevent the leakage of sensitive data.
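An imprecise method can be sketched with regular expressions. The patterns below are illustrative only; production DLP rules are considerably more elaborate and still trade false positives against false negatives.

```python
import re

# Sketch of an "imprecise" DLP method: regular expressions matching text
# that looks like sensitive data. Patterns are illustrative, not complete.
PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "credit_card": re.compile(r"\b(?:\d{4}[ -]?){3}\d{4}\b"),
}

def scan_outbound(text):
    """Return the names of sensitive-data patterns found in outbound text."""
    return sorted(name for name, rx in PATTERNS.items() if rx.search(text))

print(scan_outbound("Invoice for 4111-1111-1111-1111, thanks!"))
# -> ['credit_card']
print(scan_outbound("Quarterly roadmap attached."))
# -> []
```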

Log Management

Log management is the process of storing and handling log events generated by applications, devices, and infrastructure components. It includes collecting, aggregating, parsing, storing, analyzing, searching, archiving, and disposing of logs. The goal of log management is to use the events recorded in the logs for troubleshooting.

Log files contain a record of events and are often divided into categories. Through log management, an administrator can gather event data in one place and examine it together, which allows the administrator to analyze the data and identify issues and patterns. Log management improves monitoring, troubleshooting, operations, resource usage tracking, and security.

Log management includes five key functions:

  • Log collection

  • Log aggregation

  • Log search and analysis

  • Log monitoring and alerting

  • Log visualization and reporting

Organizations should adopt a log management policy. This policy should include guidelines on what to log, where to store logs, how long to store logs, how often logs should be reviewed, and whether logs should be encrypted or archived for audit purposes.
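One of these policy decisions, how long to store logs, can be sketched as follows. The 90-day retention period is an invented example; the actual period must come from the organization's retention policy.

```python
from datetime import date, timedelta

# Sketch of enforcing a log retention policy: identify archived logs older
# than the retention period so they can be disposed of. The 90-day figure
# is illustrative only.
RETENTION_DAYS = 90

def expired(archive_dates, today):
    cutoff = today - timedelta(days=RETENTION_DAYS)
    return [d for d in archive_dates if d < cutoff]

archives = [date(2023, 1, 1), date(2023, 3, 20), date(2023, 4, 1)]
print(expired(archives, today=date(2023, 4, 15)))
# -> [datetime.date(2023, 1, 1)]
```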

Threat Intelligence

Threat intelligence is information about threats that allows organizations to implement controls to protect against them. A threat intelligence feed (TI feed), also referred to as a threat feed, is an ongoing stream of data related to identified potential threats to an organization's security, usually provided by threat intelligence sources. Threat intelligence sources include open-source intelligence (OSINT), social media intelligence, human intelligence, and technical intelligence, including intelligence from the dark web. Threat hunting is a cyber defense activity that proactively and iteratively searches networks to detect and isolate advanced threats that evade existing security solutions. The actual process of threat hunting is beyond the scope of the management-focused CISSP exam. Threat hunting requires that an organization employ an experienced cybersecurity analyst or firm to use manual or machine-based techniques to proactively identify threats that currently deployed automated detection methods did not catch.

User and Entity Behavior Analytics (UEBA)

User and entity behavior analytics (UEBA), also known as user behavior analytics (UBA), is the process of gathering data on daily user network events so that normal user conduct is understood. Organizations can use UEBA to detect compromised credentials, lateral movement, and other malicious behavior. UEBA is often added as a layer on top of existing SIEM deployments. The key to UEBA is detecting behavior anomalies that could be indicative of a threat.
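The anomaly detection at the heart of UEBA can be sketched as a simple statistical baseline. The three-standard-deviation threshold and the logon-hour data below are illustrative choices, not a standard; real UEBA products model many behavioral dimensions at once.

```python
import statistics

# Hedged sketch of UEBA-style anomaly detection: build a baseline of a
# user's historical logon hours, then flag a new logon that deviates
# sharply from it. Threshold and data are illustrative.
def is_anomalous(history, new_value, threshold=3.0):
    mean = statistics.mean(history)
    stdev = statistics.pstdev(history) or 1.0   # avoid division by zero
    return abs(new_value - mean) / stdev > threshold

logon_hours = [9, 9, 10, 8, 9, 10, 9, 8]   # user normally logs on ~9 a.m.
print(is_anomalous(logon_hours, 9))    # -> False: within normal behavior
print(is_anomalous(logon_hours, 3))    # -> True: 3 a.m. logon is an outlier
```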

Configuration and Change Management

Configuration management specifically focuses on bringing order out of the chaos that can occur when multiple engineers and technicians have administrative access to the computers and devices that make the network function.

Configuration management documents the setup for computers and devices, including operating system and firmware versions. Change management provides the mechanism whereby the configuration can be changed through formal approval. Some standards consider configuration management a subset of change management, whereas others say change management is a subset of configuration management. Regardless of which view is taken, it is important to recognize that the two are closely related: configuration management documents the configuration items (CIs), which can then be changed via the change management process.

The following are the functions of configuration management:

  • Report the status of change processing.

  • Document the functional and physical characteristics of each CI.

  • Perform information capture and version control.

  • Control changes to the CIs, and issue versions of CIs from the software library.

Examples of these types of changes are

  • Operating system configuration

  • Software configuration

  • Hardware configuration

From a CISSP perspective, the biggest contribution of configuration management control is ensuring that changes to the system do not unintentionally diminish security. This is the main reason why all changes must be documented and why all network diagrams, both logical and physical, must be updated regularly and consistently to accurately reflect the current state of each CI, not its state as of a few months or years ago. Verifying that all configuration management policies are being followed should be an ongoing process.

In many cases, it is beneficial to form a configuration control board. The tasks of the configuration control board can include

  • Ensuring that changes made are approved, tested, documented, and implemented correctly.

  • Meeting periodically to discuss configuration status accounting reports.

  • Maintaining responsibility for ensuring that changes made do not jeopardize the soundness of the verification system.

In summary, the components of configuration management are

  • Configuration control

  • Configuration status accounting

  • Configuration audit

All networks and devices evolve, grow, and change over time. Companies and their processes also evolve and change, which is a good thing. But organizations should manage change in a structured way so as to maintain a common sense of purpose about the changes. By following the recommended steps of a formal process, an organization can keep change under the control of the process rather than allowing change to control the process. The following are guidelines to include as part of any change control policy:

  • All changes should be formally requested. Change logs should be maintained.

  • Each request should be analyzed to ensure that it supports all goals and policies. This analysis includes baselining and security impact analysis.

  • Prior to formal approval, all costs and effects of the methods of implementation should be reviewed. Using the collected data, changes should be approved or denied.

  • After they’re approved, the change steps should be developed.

  • During implementation, incremental testing should occur, and it should rely on a predetermined fallback strategy if necessary. Versioning should be used to effectively track and control changes to a collection of entities.

  • Complete documentation should be produced and submitted with a formal report to management.

One of the key benefits of following this method is the ability to make use of the documentation in future planning. Lessons learned can be applied and even the process itself can be improved through analysis.
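The guidelines above can be sketched as a simple state machine that refuses to skip steps. The state names are an invented mapping of the process onto code, not a standard workflow.

```python
# Sketch of change control as a state machine: each change request must
# pass through every stage in order, and skipping a stage is rejected.
ORDER = ["requested", "analyzed", "approved", "implemented", "documented"]

class ChangeRequest:
    def __init__(self, description):
        self.description = description
        self.state = "requested"

    def advance(self, new_state):
        if ORDER.index(new_state) != ORDER.index(self.state) + 1:
            raise ValueError(f"cannot go from {self.state} to {new_state}")
        self.state = new_state

cr = ChangeRequest("Open TCP/443 on the perimeter firewall")
cr.advance("analyzed")
cr.advance("approved")
try:
    cr.advance("documented")          # skipping implementation is rejected
except ValueError as err:
    print(err)   # -> cannot go from approved to documented
```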

Resource Provisioning

Resource provisioning is a process in security operations that ensures that an organization deploys only the assets it currently needs. Resource provisioning must follow the organization's resource life cycle. To properly manage the resource life cycle, an organization must maintain an accurate asset inventory and use appropriate configuration management processes. Resources involved in provisioning include physical assets, virtual assets, cloud assets, and applications.

Asset Inventory and Management

An asset is any item of value to an organization, including physical devices and digital information. Recognizing when assets are stolen or improperly deployed is impossible if no item count or inventory system exists or if the inventory is not kept updated. All equipment should be inventoried, and all relevant information about each device should be maintained and kept up to date. Each asset should be fully documented, including serial numbers, model numbers, firmware version, operating system version, responsible personnel, and so on. The organization should maintain this information both electronically and in hard copy. Maintaining this inventory will aid in determining when new assets should be deployed or when currently deployed assets should be decommissioned.

Security devices, such as firewalls, network address translation (NAT) devices, and IDSs and IPSs, should receive the most attention because they relate to physical and logical security. Beyond this, devices that can easily be stolen, such as laptops, tablets, and smartphones, should be locked away. If that is not practical, then consider locking these types of devices to stationary objects (for example, using cable locks with laptops).

When the technology is available, tracking of small devices can help mitigate the loss of both devices and their data. Many smartphones now include tracking software that allows you to locate a device after it has been stolen or lost by using either cell tower tracking or GPS. Deploy the device tracking technology when available.

Another useful feature available on many smartphones and other portable devices is a remote wiping feature. This allows the user to send a signal to a stolen device, instructing it to wipe out all the data contained on the device. Similarly, these devices typically also come with the ability to be remotely locked when misplaced or stolen.

Strict control of the use of portable media devices can help prevent sensitive information from leaving the network or office premises. These devices include CDs, DVDs, flash drives, and external hard drives. Although written rules should be in effect about the use of these devices, using security policies to prevent the copying of data to these media types is also possible. Allowing the copying of data to these drive types as long as the data is encrypted is also possible. If these functions are provided by the network operating system, you should deploy them.

It should not be possible for unauthorized persons to access and tamper with any devices. Tampering includes defacing, damaging, or changing the configuration of a device. Organizations should use integrity verification programs to look for evidence of data tampering, errors, and omissions.

Encrypting sensitive data stored on devices can help prevent the exposure of data in the event of a theft or in the event of inappropriate access of the device.

Physical Assets

Physical assets include servers, desktop computers, laptops, mobile devices, and network devices that are deployed in the enterprise. Physical assets should be deployed and decommissioned based on organizational need. For example, suppose an organization deploys a wireless access point (WAP) for use by a third-party auditor. Proper resource provisioning should ensure that the WAP is decommissioned after the third-party auditor no longer needs access to the network. Without proper inventory and configuration management, the WAP may remain deployed and can be used at some point to carry out a wireless network attack.

Virtual Assets

Virtual assets include software-defined networks, virtual storage-area networks (VSANs), guest operating systems deployed on virtual machines (VMs), and virtual routers. As with physical assets, the deployment and decommissioning of virtual assets should be tightly controlled as part of configuration management because virtual assets, just like physical assets, can be compromised. For example, a Windows 10 VM deployed on a Windows Server system should be retained only until it is no longer needed. As long as the VM is being used, it is important to ensure that the appropriate updates, patches, and security controls are deployed on it as part of configuration management. When users no longer access the VM, it should be removed.

Virtual storage occurs when physical storage (including hard drives, DVDs, and other storage media) from multiple network devices is compiled into a single logical space and appears as a single drive to the regular user. Block virtualization separates the logical storage from the physical storage. File virtualization eliminates the dependency between data accessed at the file level and the physical storage location of the files. Host-based virtual storage requires software running on the host. Storage device–based virtual storage runs on a storage controller and allows other storage controllers to be attached. Network-based virtual storage uses network-based devices, such as iSCSI or Fibre Channel, to create a storage solution.

Cloud Assets

Cloud assets include virtual machines, storage networks, and other services contracted through a cloud service provider. Cloud assets are usually billed based on usage and should be carefully provisioned and monitored to prevent the organization from paying for portions of service that it does not need. Configuration management should ensure that the appropriate monitoring policies are in place to make certain that only resources that are needed are deployed.

Applications

Applications include commercial applications that are locally installed, web services, and any cloud-deployed application services, such as Software as a Service (SaaS). The appropriate number of licenses should be maintained for all commercial applications. An organization should periodically review its licensing needs. For cloud deployments of software services, configuration management should be used to ensure that only personnel who have valid needs for the software are given access to it.

Baselining

Baselining is the process of documenting the attributes of a CI at a point in time, which serves as a basis for defining change. Configuration baselines should be documented for all CIs. If a change is approved and completed to a CI, then the baseline of the CI needs to be adjusted based on the changes made.

Baseline configurations should be maintained over time. This requires creating new baselines as organizational information systems change.
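Baselining and the drift detection it enables can be sketched as follows. The CI attribute names are invented for illustration; real configuration management tools track far richer attribute sets.

```python
# Minimal sketch of baselining: record a CI's attributes at a point in
# time, then diff the current configuration against that baseline to spot
# drift. Attribute names are illustrative.
def diff_against_baseline(baseline, current):
    return {key: (baseline.get(key), current.get(key))
            for key in set(baseline) | set(current)
            if baseline.get(key) != current.get(key)}

baseline = {"os_version": "22.04", "ssh_root_login": "no", "ntp": "on"}
current  = {"os_version": "22.04", "ssh_root_login": "yes", "ntp": "on"}

print(diff_against_baseline(baseline, current))
# -> {'ssh_root_login': ('no', 'yes')}
```

The same diff supports rollback: any drifted attribute can simply be reset to its baseline value.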

Automation

Automation software reduces cost, complexity, and errors in configuration management by maintaining a CI database. With automation tools, CI baselines can be saved. Then if a change has unanticipated consequences, administrators can simply roll the CI back to the saved baseline.

Automation tools allow administrators to make changes and deployments faster and remove the potential for administrator error. They also allow administrators to track the state of resources, preventing duplicated effort, such as attempting to install something that is already deployed to a CI. Finally, configuration management tools can audit an organization's CIs so that administrators can easily pinpoint CIs with certain issues or needs.


Security Operations Concepts

Throughout this book, you’ve seen references made to policies and principles that can guide all security operations. In the following sections, we review more completely some concepts that have already been touched on and introduce some new issues concerned with maintaining security operations.

Need to Know/Least Privilege

With regard to allowing access to resources and assigning rights to perform operations, always apply the concept of least privilege (also called need to know). In the context of resource access, that means that the default level of access should be no access. Give users access only to resources required to do their job, and that access should require manual implementation after the requirement is verified by a supervisor.

Discretionary access control (DAC) and role-based access control (RBAC) are examples of systems based on a user’s need to know. Ensuring least privilege requires identifying each user’s job and granting each user the lowest clearance required for their tasks. Another example is the implementation of views in a database. Need to know requires that the operator have the minimum knowledge of the system necessary to perform a task.

Managing Accounts, Groups, and Roles

Devices, computers, and applications implement user and group accounts and roles to allow or deny access. User accounts are created for each user needing access. Group accounts are used to configure permissions on resources. User accounts are added to the appropriate group accounts to inherit the permissions granted to that group. User accounts can also be assigned to roles. Roles are most often used by applications.

Security professionals should understand the following accounts:

  • Root or built-in administrator accounts: These are the most powerful accounts on the system. Root accounts are used in Linux-based systems, whereas administrator accounts are used in Windows-based systems. It is best to disable such an account after you have created another account with the same privileges, because most of these account names are well known and can be used by attackers. If you decide to keep these accounts, most vendors suggest that you change the account name and give it a complex password. Root or administrator accounts should be used only when performing administrative duties, and use of these accounts should always be audited.

  • Service accounts: These accounts are used to run system services and applications. Therefore, security professionals can limit the service account’s access to the system. Always research the default user accounts that are used. Make sure that you change the passwords for these accounts on a regular basis. Use of these accounts should always be audited.

  • Regular administrator accounts: These administrator accounts are created and assigned only to a single individual. Any user who has an administrative account should also have a regular/standard user account to use for normal day-to-day operations. Administrative accounts should be used only when performing administrative-level duties, and use of these accounts should always be audited.

  • Power user accounts: These accounts have more privileges and permissions than normal user accounts. These accounts should be reviewed on a regular basis to ensure that only users who need the higher-level permissions have these accounts. Most modern operating systems limit the abilities of the power users or even remove this account type entirely.

  • Regular/standard user accounts: These are the accounts users use while performing their normal everyday job duties. These accounts must strictly follow the principle of least privilege.

Separation of Duties and Responsibilities

The concept of separation of duties prescribes that sensitive operations be divided among multiple users so that no one user has the rights and access to carry out the operation alone. Separation of duties and responsibilities is valuable in deterring fraud by ensuring that no single individual can compromise a system. It is considered a preventive administrative control. An example would be one person initiating a request for a payment and another authorizing that payment to be made.

Privileged Account Management

Security professionals should ensure that organizations establish the proper life cycle management procedures so that accounts, groups, and roles are properly created, managed, and removed. The provisioning life cycle is covered in more detail in Chapter 5, “Identity and Access Management (IAM).”

Inevitably, some users, especially supervisors or those in the IT support department, will require special rights and privileges that other users do not possess. For example, one requirement might be that a set of users who work the help desk might need to be able to reset passwords or perhaps make changes to user accounts. These types of rights carry with them a responsibility to exercise the rights responsibly and ethically.

Although in a perfect world we would like to assume that we can expect ethical and secure behavior from all users, in the real world we know this is not always true. Therefore, one of the things to monitor is the use of these privileges and privileged accounts. Although security professionals should be concerned with the amount of monitoring performed and the amount of data produced by the monitoring of privilege usage, recording the exercise of special privileges or the use of privileged accounts should not be sacrificed, even if it means regularly saving the data as a log file and clearing the event gathering system.

Job Rotation and Mandatory Vacation

From a security perspective, job rotation refers to training multiple users to perform the duties of a position to help prevent fraud by any individual employee. The idea is that when more than one person is familiar with the legitimate functions of a position, unusual activities by any one person are more likely to be noticed. This approach is often used in conjunction with mandatory vacations, in which all users are required to take time off, allowing another person to fill the position while they are gone, which enhances the opportunity to discover unusual activity. Beyond the security aspects of job rotation, additional benefits include

  • Trained backup in case of emergencies

  • Protection against fraud

  • Cross training of employees

Rotation of duties, separation of duties, and mandatory vacations are all administrative controls.

Two-Person Control

A two-person control, also referred to as a two-man rule, occurs when certain access and actions require the presence of two authorized people at all times. Common examples are the requirement for two people to sign checks over a certain dollar amount or for two people to be present to perform a certain activity, such as opening a safe.

Sensitive Information Procedures

Access control and its use in preventing unauthorized access to sensitive data are important for organizational security. It follows that the secure handling of sensitive information is critical. Although we tend to think in terms of the company’s information, it is also critical that the company protect the private information of its customers and employees. A leak of employees’ or customers’ personal information causes, at a minimum, embarrassment for the company and possibly fines and lawsuits.

Regardless of whether the aim is to protect company data or personal data, the key is to apply the access control principles to both sets of data. When you are examining access control procedures and policies, the following questions need to be answered:

  • Are data or privileges available to the user that are not required for the job?

  • How many users have access to sensitive data, and why?

Record Retention

Proper access control is not possible without auditing, which allows us to track activities and discover problems before they are fully realized. Because auditing can generate a mountain of data to analyze, you should focus monitoring on the most sensitive activities and then retain and review the resulting records. Moreover, in many cases companies are required by law or regulation to maintain records of certain data.

Most auditing systems allow for the configuration of data retention options. In some cases, the default behavior is to start overwriting the oldest records in the log when the maximum log size is reached. Regularly saving and then clearing the log prevents this from happening and avoids the loss of important events. In cases of extremely sensitive data, it is even advisable to have a server shut off access when the security log is full and can record no more events.
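The save-and-clear approach described above can be sketched as follows; the event cap and archive file naming are illustrative assumptions, not a real auditing API:

```python
import json
import pathlib

MAX_EVENTS = 3  # unrealistically small so the rollover is easy to see

def append_event(log, event, archive_dir):
    """Append an audit event, archiving and clearing the log when it is full.

    Rather than letting old records be silently overwritten, a full log is
    saved to a numbered archive file and then cleared.
    """
    archive_dir = pathlib.Path(archive_dir)
    if len(log) >= MAX_EVENTS:
        seq = len(list(archive_dir.glob("security-*.json")))
        (archive_dir / f"security-{seq}.json").write_text(json.dumps(log))
        log.clear()
    log.append(event)
```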

Information Life Cycle

In security operations, security professionals must understand the life cycle of information, which includes creation/reception, distribution, usage, maintenance, and disposal of information. After information is gathered, it must be classified to ensure that only authorized personnel can access the information.

Service-Level Agreements

Service-level agreements (SLAs) are agreements about the ability of the support system to respond to problems within a certain timeframe while providing an agreed-upon level of service. SLAs can be internal between departments or external to a service provider. When parties agree on the speed and accuracy with which various problems related to the provided service are addressed, some predictability is introduced to the response to problems, which ultimately supports the maintenance of access to resources.

An SLA should contain a description of the services to be provided and the service levels and metrics that the customer can expect. It also includes the duties and responsibilities of each party to the SLA. It lists the service specifics, exclusions, service levels, escalation procedures, and cost, and it should include a clause regarding payments or credits due to the customer resulting from a breach of the SLA. Keep in mind that SLAs are not automatically transferable by law, so they should be revisited if either party changes hands. Metrics that should be measured include service availability, service levels, defect rates, technical quality, and security. SLAs should be periodically reviewed to ensure that the business needs, technical environment, or workloads have not changed. In addition, metrics, measurement tools, and processes should be reviewed to determine whether they can be improved.
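Measuring service availability against an SLA commitment is simple arithmetic. A hedged sketch follows; the 99.9 percent target and the downtime figures are hypothetical:

```python
def availability_pct(uptime_minutes, total_minutes):
    """Measured service availability as a percentage of the period."""
    return 100.0 * uptime_minutes / total_minutes

def sla_breached(measured_pct, committed_pct):
    """True when the provider failed to meet the committed service level."""
    return measured_pct < committed_pct

# A 30-day month is 43,200 minutes; 90 minutes of downtime leaves 43,110 up.
measured = availability_pct(43110, 43200)   # about 99.79 percent
breach = sla_breached(measured, 99.9)       # 99.79 < 99.9, so a breach
```

Even 90 minutes of downtime in a month misses a "three nines" commitment, which is why SLA targets need to be set with realistic repair times in mind.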

Resource Protection

Enterprise resources include both assets you can see and touch (tangible), such as computers and printers, and assets you cannot see and touch (intangible), such as trade secrets and processes. Although typically you would think of resource protection as preventing the corruption of digital resources and as the prevention of damage to physical resources, this concept also includes maintaining the availability of those resources. In the following sections, we discuss both aspects of resource protection.

Protecting Tangible and Intangible Assets

In some cases, among the most valuable assets of a company are intangible assets such as secret recipes, formulas, and trade secrets. In other cases the value of the company is derived from its physical assets such as facilities, equipment, and the talents of its people. All are considered resources and should be included in a comprehensive resource protection plan. Next, we explore some specific concerns with these various types of resources.

Facilities

Usually, the largest tangible asset an organization has is the building in which it operates and the surrounding land. Physical security is covered later in this chapter, but it bears emphasizing that vulnerability testing (discussed more fully in Chapter 6) ought to include the security controls of the facility itself. Vulnerability testing as it relates to facilities should answer questions such as

  • Do doors close automatically, and does an alarm sound if they are held open too long?

  • Are the protection mechanisms of sensitive areas, such as server rooms and wiring closets, sufficient and operational?

  • Does the fire suppression system work?

  • Are sensitive documents shredded as opposed to being thrown in the dumpster?

Beyond the access issues, the main systems that are needed to ensure operations are not disrupted include fire detection/suppression, HVAC (including temperature and humidity controls), water and sewage systems, power/backup power, communications equipment, and intrusion detection.

Hardware

Another of the more tangible assets that must be protected is all the hardware that makes the network operate. This hardware includes not only the computers and printers with which the users directly come in contact but also the infrastructure devices that they never see such as routers, switches, and firewall appliances. Maintaining access to these critical devices from an availability standpoint is covered later in the sections “Redundancy and Fault Tolerance” and “Backup and Recovery Systems.”

From a management standpoint, these devices are typically managed remotely. Special care must be taken to safeguard access to these management features as well as protect the data and commands passing across the network to these devices. Some specific guidelines include

  • Change all default administrator passwords on the devices.

  • Limit the number of users who have remote access to these devices.

  • Rather than Telnet (which sends commands in cleartext), use an encrypted command-line tool such as Secure Shell (SSH).

  • Manage critical systems locally.

  • Limit physical access to these devices.
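A configuration audit that checks devices against the guidelines above might look like the following sketch. The device record keys and the sample default-credential list are hypothetical, not any vendor's actual schema:

```python
# A tiny sample of well-known factory defaults; real audits use far larger lists.
DEFAULT_CREDENTIALS = {("admin", "admin"), ("admin", "password"), ("cisco", "cisco")}

def audit_device(device):
    """Flag common remote-management weaknesses in a device record.

    `device` is a dict describing the configuration; the keys used here
    ('username', 'password', 'telnet_enabled', 'remote_admins') are
    illustrative assumptions.
    """
    findings = []
    if (device["username"], device["password"]) in DEFAULT_CREDENTIALS:
        findings.append("default administrator credentials still set")
    if device.get("telnet_enabled"):
        findings.append("Telnet enabled; use SSH instead (Telnet is cleartext)")
    if len(device.get("remote_admins", [])) > 3:
        findings.append("too many accounts with remote access")
    return findings
```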

Software

Software assets include any proprietary applications, scripts, or batch files developed in house that are critical to the operation of the organization. Secure coding and development practices can help to prevent weaknesses in these systems. Security professionals also must pay attention to preventing theft of these assets.

Moreover, closely monitoring the use of commercial applications and systems in the enterprise can prevent unintentional breach of licensing agreements. One of the benefits of giving users only the applications they require to do their job is that it limits the number of users that have an application, helping to prevent exhaustion of licenses for software.
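A simple seat-count comparison is often enough to spot license exhaustion before it becomes a breach of the agreement. An illustrative sketch (the field names are assumptions):

```python
def license_status(assigned_users, seats_owned):
    """Compare assignments against purchased seats for one application."""
    in_use = len(set(assigned_users))   # de-duplicate user IDs
    return {
        "in_use": in_use,
        "seats_owned": seats_owned,
        "compliant": in_use <= seats_owned,
    }
```

Restricting the application to only the users who need it keeps `in_use` low, which is exactly the licensing benefit of least privilege described above.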

Information Assets

Information assets are the last asset type that needs to be discussed, but by no means are they the least important. The primary purpose of operations security is to safeguard information assets that are resident in the system. These assets include recipes, processes, trade secrets, product plans, and any other type of information that enables the enterprise to maintain competitiveness within its industry. The principles of data classification and access control apply most critically to these assets. In some cases the dollar value of these assets might be difficult to determine, although it might be clear to all involved that the asset is critical. For example, the secret formula for Coca-Cola has been closely guarded for many years due to its value to the company.

Asset Management

In the process of managing these assets, several issues must be addressed. Certainly, access to the asset must be closely controlled to prevent its deletion, theft, or corruption (in the case of digital assets) and from physical damage (in the case of physical assets). Moreover, the asset must remain available when needed. This section covers methods of ensuring availability, authorization, and integrity.

Redundancy and Fault Tolerance

One way to provide uninterrupted access to information assets is through redundancy and fault tolerance. Redundancy refers to providing multiple instances of either a physical or logical component such that a second component is available if the first fails. Fault tolerance is a broader concept that includes redundancy but refers to any process that allows a system to continue making information assets available in the case of a failure.

In some cases, redundancy is applied at the physical layer, such as network redundancy provided by a dual backbone in a local network environment or by using multiple network cards in a critical server. In other cases, redundancy is applied logically such as when a router knows multiple paths to a destination in case one fails.

Fault tolerance countermeasures are designed to combat threats to design reliability. Although fault tolerance can include redundancy, it also refers to systems such as Redundant Array of Independent Disks (RAID), in which data is written across multiple disks in such a way that a disk can fail and the data can be quickly made available from the remaining functioning disks in the array without resorting to backup media. Be familiar with a number of RAID types because not all provide fault tolerance. Regardless of the technique employed, for fault tolerance to operate, a system must be capable of detecting and correcting the fault.

Backup and Recovery Systems

Although comprehensive coverage of backup and recovery systems is found throughout this chapter, it is important to emphasize here the role of operations in carrying out those activities. After the backup schedule has been designed, there will be daily tasks associated with carrying out the plan. One of the most important parts of this system is an ongoing testing process to ensure that all backups are usable in case a recovery is required. The time to discover that a backup did not succeed is during testing and not during a live recovery.
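One inexpensive form of ongoing backup testing is comparing cryptographic checksums of the original data and a restored copy. A minimal sketch using Python's standard library:

```python
import hashlib
import pathlib

def checksum(path):
    """SHA-256 digest of a file's contents."""
    return hashlib.sha256(pathlib.Path(path).read_bytes()).hexdigest()

def verify_backup(original, restored):
    """A restore test in miniature: the copy must read back byte-identical."""
    return checksum(original) == checksum(restored)
```

This only proves that one file restored correctly; a real testing program restores samples on a schedule and verifies them, so that a failed backup is discovered during testing rather than during a live recovery.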

Identity and Access Management

From an operations perspective, it is important to realize that managing these things is an ongoing process that might require creating accounts, deleting accounts, creating and populating groups, and managing the permissions associated with all of these concepts. An essential task is ensuring that the rights to perform these actions are tightly controlled and that a formal process is established for removing permissions when they are no longer required and disabling accounts that are no longer needed.

Another area to focus on is the control of the use of privileged accounts, which are accounts that have rights and permissions exceeding those of a regular user account. Although this control obviously applies to built-in administrator, root, or supervisor accounts that have vast permissions, it also applies to any account that confers special privileges on the user.

Moreover, as a security professional, you should maintain the same tight control over the numerous built-in groups that exist in Windows to grant special rights to the group members. When using these groups, make note of any privileges held by the default groups that are not required for your purposes. You might want to remove some of the privileges from the default groups to support the concept of least privilege. You learn more about identity and access management in Chapter 5.

Media Management

Media management is an important part of operations security because media is where data is stored. Media management includes RAID, SAN, NAS, and HSM.

RAID

Redundant Array of Independent Disks (RAID) refers to a system whereby multiple hard drives are used to provide either a performance boost or fault tolerance for the data. When we speak of fault tolerance in RAID, we mean maintaining access to the data even in a drive failure without restoring the data from backup media. The following are the types of RAID with which you should be familiar.

RAID 0, also called disk striping, writes the data across multiple drives or disks. Although it improves performance, it does not provide fault tolerance. Figure 7-2 depicts RAID 0.

An illustration of RAID 0 with two disks. Disk 0 holds blocks A1, A3, A5, and A7; Disk 1 holds blocks A2, A4, A6, and A8.

Figure 7-2 RAID 0

RAID 1, also called disk mirroring, uses two drives or disks and writes a copy of the data to both disks, providing fault tolerance in the case of a single drive failure. Figure 7-3 depicts RAID 1.

An illustration of RAID 1 with two disks. Disk 0 and Disk 1 each hold identical copies of blocks A1, A2, A3, and A4.

Figure 7-3 RAID 1

RAID 3 requires at least three drives or disks. Data is striped across all drives, and parity information is written to a single dedicated drive. The parity information is used to regenerate the data in the case of a single drive failure. The drawback is that the dedicated parity drive is a single point of failure: if it goes bad, the array loses its protection. Figure 7-4 depicts RAID 3.


Figure 7-4 RAID 3

RAID 5 also requires at least three drives or disks. Data is striped across all drives, and parity information is likewise distributed across all drives. The parity information is used in the same way as in RAID 3, but because it is not stored on a single drive, there is no single point of failure for the parity data. With hardware RAID 5, the drives that replace failed drives are usually hot swappable, meaning they can be replaced while the server is running. Figure 7-5 depicts RAID 5.


Figure 7-5 RAID 5
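The parity used by RAID 3 and RAID 5 is ordinarily a bytewise XOR of the data blocks, which is why a single failed drive can be rebuilt from the survivors. A minimal sketch of the idea (the block contents are arbitrary sample data):

```python
def parity(*blocks):
    """XOR parity across equal-length data blocks."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)

# Striping two data blocks plus parity across "three drives":
d1, d2 = b"CISSP", b"NOTES"
p = parity(d1, d2)

# If the drive holding d2 fails, XOR of the survivors regenerates it,
# because d1 ^ (d1 ^ d2) == d2.
recovered = parity(d1, p)
```

The same property holds no matter which single drive is lost, which is the whole basis of parity-based fault tolerance.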

RAID 10, which requires at least four drives or disks, is a combination of RAID 0 and RAID 1. First, a RAID 1 volume is created by mirroring two drives together. Then a RAID 0 stripe set is created on each mirrored pair. Figure 7-6 depicts RAID 10.


Figure 7-6 RAID 10

Although RAID can be implemented with software or with hardware, certain types of RAID are faster when implemented with hardware. When software RAID is used, it is a function of the operating system. Both RAID 3 and RAID 5 are examples of RAID types that are faster when implemented with hardware because of their parity calculations. Simple striping or mirroring (RAID 0 and 1), however, tends to perform well in software because it does not involve parity calculations. Table 7-1 summarizes the RAID types.


Table 7-1 RAID Levels

  • RAID 0 (minimum 2 drives or disks): Data striping without redundancy. Strengths: highest performance. Weaknesses: no data protection; if one drive fails, all data is lost.

  • RAID 1 (minimum 2 drives or disks): Disk mirroring. Strengths: very high performance; very high data protection; very minimal penalty on write performance. Weaknesses: high redundancy cost overhead; because all data is duplicated, twice the storage capacity is required.

  • RAID 3 (minimum 3 drives or disks): Byte-level data striping with a dedicated parity drive. Strengths: excellent performance for large, sequential data requests. Weaknesses: not well suited to transaction-oriented network applications; the single parity drive does not support multiple, simultaneous read and write requests.

  • RAID 5 (minimum 3 drives or disks): Block-level data striping with distributed parity. Strengths: best cost/performance for transaction-oriented networks; very high performance and very high data protection; supports multiple simultaneous reads and writes; can also be optimized for large, sequential requests. Weaknesses: write performance is slower than RAID 0 or RAID 1.

  • RAID 10 (minimum 4 drives or disks): Disk mirroring with striping. Strengths: same fault tolerance as RAID 1; same overhead as with mirroring; provides high I/O rates; can sustain multiple simultaneous drive failures. Weaknesses: very expensive; all drives must move in parallel to properly track, which reduces sustained performance; very limited scalability at a very high cost.

SAN

Storage-area networks (SANs) are composed of high-capacity storage devices that are connected by a high-speed private network (separate from the LAN) using storage-specific switches. This storage information architecture addresses the collection, management, and use of data.

NAS

Network-attached storage (NAS) serves the same function as a SAN, but clients access the storage in a different way. In a NAS, almost any machine that can connect to the LAN (or is interconnected to the LAN through a WAN) can use protocols such as NFS, CIFS, or HTTP to connect to the NAS and share files. In a SAN, only devices that can use the Fibre Channel SCSI network can access the data, so access is typically provided through a server that has this capability. Figure 7-7 shows a comparison of the two systems.


Figure 7-7 NAS and SAN

HSM

A hierarchical storage management (HSM) system is a type of backup management system that provides continuous online backup by using optical or tape “jukeboxes.” It operates by automatically moving data between high-cost and low-cost storage media as the data ages. When continuous availability (24-hours-a-day processing) is required, HSM provides a good alternative to tape backups. It also strives to use the proper media for the scenario. For example, a rewritable, erasable (CD-RW) optical disc is sometimes used for backups that require short-term storage of changeable data but faster file access than tape provides.
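The age-based migration at the heart of HSM can be sketched in a few lines; the tier names and day thresholds below are illustrative assumptions, not values from any real product:

```python
def tier_for(age_days):
    """Pick a storage tier by data age (thresholds are illustrative only)."""
    if age_days < 30:
        return "fast-disk"        # high-cost media, frequently accessed data
    if age_days < 365:
        return "optical-jukebox"  # mid-cost media, slower retrieval
    return "tape"                 # low-cost media for rarely touched data

def migrate(files):
    """Return the tier each file should live on, HSM-style.

    `files` maps file names to age in days (a hypothetical catalog format).
    """
    return {name: tier_for(age) for name, age in files.items()}
```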

Media History

Security professionals must accurately maintain media library logs to keep track of the history of the media. This task is important in that all media types have a maximum number of times they can safely be used. A log should be kept by a media librarian. This log should track all media (backup and other types such as OS installation discs and USB thumb drives). With respect to the backup media, use the following guidelines:

  • Track all instances of access to the media.

  • Track the number and location of backups.

  • Track age of media to prevent loss of data through media degeneration.

  • Inventory the media regularly.
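A media librarian’s log can be modeled as a small lookup keyed by media ID, with a use counter checked against the safe-use limit for that media type. The limits below are illustrative, not manufacturer figures:

```python
MAX_SAFE_USES = {"tape": 250, "optical": 1000}  # illustrative limits only

def log_media_use(library, media_id, media_type, location):
    """Record one use of a piece of media and flag it when it nears end of life.

    `library` is a dict acting as the media library log; the record fields
    are an illustrative schema.
    """
    entry = library.setdefault(media_id, {"type": media_type, "uses": 0,
                                          "location": location})
    entry["uses"] += 1
    entry["location"] = location        # track where the media currently is
    entry["retire"] = entry["uses"] >= MAX_SAFE_USES[media_type]
    return entry
```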

Media Labeling and Storage

All forms of storage media (tapes, optical, USB thumb drives, and so on) should be labeled plainly and stored safely. Some guidelines in the area of media control are to

  • Accurately and promptly mark all data storage media.

  • Ensure proper environmental storage of the media.

  • Ensure the safe and clean handling of the media.

  • Log data media to provide a physical inventory control.

The environment where the media will be stored is also important. For example, damage starts occurring to magnetic media at temperatures above 100 degrees Fahrenheit. The Forest Green Book is a Rainbow Series book that defines the secure handling of sensitive or classified automated information system memory and secondary storage media, such as degaussers, magnetic tapes, hard disks, and cards. The Rainbow Series is discussed in more detail in Chapter 3.

Sanitizing and Disposing of Media

During media disposal, you must ensure no data remains on the media. The most reliable, secure means of removing data from magnetic storage media, such as a magnetic tape cassette, is through degaussing, which exposes the media to a powerful, alternating magnetic field. It removes any previously written data, leaving the media in a magnetically randomized (blank) state. Some other disposal terms and concepts with which you should be familiar are

  • Data purging: Using a method such as degaussing to make the old data unavailable even with forensics. Purging renders information unrecoverable against laboratory attacks (forensics).

  • Data clearing: Rendering information unrecoverable by a keyboard attack, that is, an attack that extracts information from data storage media by executing software utilities, keystrokes, or other system resources from a keyboard. Unlike purged data, cleared data might still be recoverable with laboratory (forensic) techniques.

  • Remanence: Any data left after the media has been erased.
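To make the purge/clear distinction concrete, the following sketch performs a software overwrite, which corresponds to clearing; it does not claim to defeat laboratory forensics the way degaussing (purging) does:

```python
import os
import pathlib

def clear_file(path, passes=1):
    """Overwrite a file's contents with random bytes, then delete it.

    This approximates clearing: it defeats recovery via software utilities
    run from a keyboard, but not necessarily laboratory forensics. Media
    that must resist lab recovery should be purged (degaussed) or destroyed.
    """
    p = pathlib.Path(path)
    size = p.stat().st_size
    with open(p, "r+b") as f:
        for _ in range(passes):
            f.seek(0)
            f.write(os.urandom(size))
            f.flush()
            os.fsync(f.fileno())
    p.unlink()
```

Note that on modern SSDs and journaling file systems, overwriting in place gives weaker guarantees still, because the device may remap blocks; this is another reason purging or destruction is preferred for sensitive media.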

Network and Resource Management

Although the Security Operations domain focuses on providing confidentiality and integrity of data, availability of the data is also one of its goals. This means designing and maintaining processes and systems that maintain availability to resources despite hardware or software failures in the environment. The following principles and concepts are available to assist in maintaining access to resources:

  • Redundant hardware: Failures of physical components, such as hard drives and network cards, can interrupt access to resources. Providing redundant instances of these components can help to ensure a faster return to access. In some cases, changing out a component might require manual intervention, but in many cases these items are hot swappable (they can be changed with the device up and running), in which case a momentary reduction in performance might occur rather than a complete disruption of access.

  • Fault-tolerant technologies: Taking the idea of redundancy to the next level are technologies that are based on multiple computing systems working together to provide uninterrupted access even in the event of a failure of one of the systems. Clustering of servers and grid computing are both great examples of this approach.

  • Service-level agreements (SLAs): SLAs are agreements about the capability of the support system to respond to problems within a certain timeframe while providing an agreed level of service. They can be internal between departments or external to a service provider. By agreeing on the quickness with which various problems are addressed, some predictability is introduced to the response to problems, which ultimately supports the maintenance of access to resources.

  • MTBF and MTTR: Although SLAs are appropriate for services that are provided, a slightly different approach to introducing predictability can be used with regard to physical components that are purchased. Vendors typically publish values for a product’s mean time between failure (MTBF), which describes how often a component fails on average. Another valuable metric typically provided is the mean time to repair (MTTR), which describes the average amount of time it will take to get the device fixed and back online.

  • Single point of failure (SPOF): Though not actually a strategy, it is worth mentioning that the ultimate goal of any of these approaches is to avoid an SPOF in a system. All components and groups of components and devices should be examined to discover any single element that could interrupt access to resources if a failure occurs. Each SPOF should then be mitigated in some way.
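The MTBF and MTTR figures above combine into a steady-state availability estimate: availability = MTBF / (MTBF + MTTR). A quick sketch with hypothetical vendor numbers:

```python
def availability(mtbf_hours, mttr_hours):
    """Steady-state availability from vendor MTBF/MTTR figures."""
    return mtbf_hours / (mtbf_hours + mttr_hours)

# Hypothetical component: fails on average every 10,000 hours and takes
# 4 hours to repair and return to service.
a = availability(10000, 4)   # roughly 0.9996, i.e., about 99.96 percent
```

This is why a long MTBF alone is not enough: a component with a slow repair path (large MTTR) can still produce poor availability.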

Incident Management

Incident response and management are vital to every organization to ensure that any security incidents are detected, contained, and investigated. Incident response is the beginning of any investigation. After an incident has been discovered, incident response personnel perform specific tasks. Throughout the response, the incident response team must follow proper procedures so that evidence is preserved. Incident management ensures that the team handles the incident and returns service to normal as quickly as possible.

As part of incident response, security professionals must understand the difference between events and incidents (see the following section). The incident response team must have the appropriate incident response procedures in place to ensure that the incident is handled, but the procedures must not hinder any forensic investigations that might be needed to ensure that parties are held responsible for any illegal actions. Security professionals must understand the rules of engagement and the authorization and scope of any incident investigation.

Event Versus Incident

With regard to incident response, a basic difference exists between events and incidents. An event is a noticeable change of state that occurs. Whereas events include both negative and positive events, incident response focuses more on negative events—events that have been deemed as negatively impacting the organization. An incident is a series of events that negatively impact an organization’s operations and security.

Events can be detected only if an organization has established the proper auditing and security mechanisms to monitor activity. A single negative event might occur. For example, the auditing log might show that an invalid login attempt occurred. By itself, this login attempt is not a security concern. However, if many invalid login attempts occur over a period of a few hours, the organization might be undergoing an attack. The initial invalid login is considered an event, but the series of invalid login attempts over a few hours would be an incident, especially if it is discovered that the invalid login attempts all originated from the same IP address.
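The event-versus-incident judgment in the login example can be automated with a simple threshold over a time window. The threshold, window, and IP addresses below are illustrative assumptions:

```python
from collections import defaultdict
from datetime import datetime, timedelta

THRESHOLD = 10           # this many failures from one source is an incident
WINDOW = timedelta(hours=3)

def find_incidents(events):
    """Group failed-login events by source IP and flag any source that
    crosses the threshold inside the time window.

    `events` is a list of (timestamp, source_ip) tuples for failed logins.
    """
    by_ip = defaultdict(list)
    for ts, ip in events:
        by_ip[ip].append(ts)
    incidents = []
    for ip, times in by_ip.items():
        times.sort()
        for i in range(len(times)):
            # count the attempts that fall inside the window opening at times[i]
            j = i
            while j < len(times) and times[j] - times[i] <= WINDOW:
                j += 1
            if j - i >= THRESHOLD:
                incidents.append(ip)
                break
    return incidents
```

A lone failed login never trips the threshold (an event), while a burst of failures from one address does (an incident), mirroring the distinction described above.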

Incident Response Team and Incident Investigations

When establishing the incident response team, organizations must consider the technical knowledge of each individual. The members of the team must understand the organization’s security policy and have strong communication skills. Members should also receive training in incident response and investigations.

When an incident has occurred, the primary goal of the team is to contain the attack and repair any damage caused by the incident. Security isolation of an incident scene should start immediately when the incident is discovered. Evidence must be preserved, and the appropriate authorities should be notified.

The incident response team should have access to the incident response plan. This plan should include the list of authorities to contact, team roles and responsibilities, an internal contact list, procedures for securing and preserving evidence, and a list of investigation experts who can be contacted for help. A step-by-step manual should be created that the incident response team must follow to ensure that no steps are skipped. After the incident response process has been engaged, all incident response actions should be documented.

If the incident response team determines that a crime has been committed, senior management and the proper authorities should be contacted immediately.

Rules of Engagement, Authorization, and Scope

An organization ought to document the rules of engagement, authorization, and scope for the incident response team. The rules of engagement define which actions are acceptable and unacceptable if an incident has occurred. The authorization and scope provide the incident response team with the authority to perform an investigation and with the allowable scope of any investigation they must undertake.

The rules of engagement act as a guideline for the incident response team to ensure that they do not cross the line from enticement into entrapment. Enticement occurs when the opportunity for illegal actions is provided (luring) but attackers make their own decision to perform the action, whereas entrapment occurs when someone is encouraged to commit a crime that the individual might have had no intention of committing. Enticement is legal but does raise ethical arguments and might not be admissible in court; entrapment, conversely, is illegal.

Incident Response Procedures

When performing incident response, the incident response team must follow incident response procedures. Depending on where you look, you might find different steps or phases included as part of the incident response process.

For the CISSP exam, you need to remember the following steps:

  1. Detect the incident.

  2. Respond to the incident.

  3. Mitigate the effects of the incident.

  4. Report the incident to the appropriate personnel.

  5. Recover from the incident.

  6. Remediate all components affected by the incident to ensure that all traces of the incident have been removed.

  7. Review the incident, and document all findings as lessons learned.

The actual investigation of the incident occurs during the respond, mitigate, report, and recover steps. Following appropriate forensic and digital investigation processes during the investigation can ensure that evidence is preserved.
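The seven steps can be modeled as an ordered workflow so that, for example, recovery cannot be recorded before detection and response. A minimal sketch, not any official (ISC)² tooling:

```python
STEPS = ["detect", "respond", "mitigate", "report", "recover",
         "remediate", "lessons-learned"]

class Incident:
    """Walk an incident through the seven steps in order, recording each
    action for later documentation."""

    def __init__(self, name):
        self.name = name
        self.completed = []   # (step, notes) pairs, in the order performed

    def advance(self, step, notes=""):
        """Record the next step; reject any attempt to skip ahead."""
        expected = STEPS[len(self.completed)]
        if step != expected:
            raise ValueError(f"expected step '{expected}', got '{step}'")
        self.completed.append((step, notes))
        return expected
```

Because every `advance` call captures notes, the `completed` list doubles as the documentation trail required for the lessons-learned review.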

The incident response process is shown in Figure 7-8.

An illustration of the incident response process. The steps are detect, respond, mitigate, report, recover, remediate, and lessons learned.

Figure 7-8 Incident Response Process

Incident Response Management

Security events will inevitably occur, and the response to these events says much about how damaging the events will be to the organization. Incident response policies should be formally designed, well communicated, and followed. They should specifically address cyberattacks against an organization’s IT systems.

Detect

The first step in the incident response process is to detect the incident. Prior to any incident response investigation, security professionals must first perform the appropriate triage for the affected assets. This process includes initially detecting the incident and determining how serious the incident is. In some cases, during the triage phase, security professionals may determine that a false positive has occurred, meaning that an attack really did not occur, even though an alert indicated that it did. If an attack is confirmed, then the incident response will progress into investigative actions.

All detective controls, such as auditing, discussed in Chapter 1, are designed to provide this capability. The worst sort of incident is the one that goes unnoticed.

Respond

The response to the incident should be appropriate for the type of incident. Denial-of-service (DoS) attacks against the web server would require a quicker and different response than a missing mouse in the server room. Establish standard responses and response times ahead of time.

Response involves containing the incident and quarantining the affected assets, thus preventing other assets from being affected and reducing the potential impact. Different methods can be used, depending on the category of the attack, the asset affected, and the data criticality or infection risk.

After an attack is contained or isolated, analysts should work to examine and analyze the cause of the incident. This analysis includes determining where the incident originated. Security professionals should use experience and formal training to make the appropriate conclusions regarding the incident. After the root cause has been determined, security professionals should follow incident handling policies that the organization has in place.

Mitigate

Mitigation includes limiting the scope of what the attack might do to the organization’s assets. If damage has occurred or the incident may broaden and affect other assets, proper mitigation techniques ensure that the incident is contained to within a certain scope of assets. Mitigation options vary, depending on the kind of attack that has occurred. Security professionals should develop procedures in advance that detail how to properly mitigate any attacks that occur against organizational assets. Preparing these mitigation procedures in advance ensures that they are thorough and gives personnel a chance to test the procedures.

Report

All incidents should be reported within a timeframe that reflects the seriousness of the incident. In many cases, establishing a list of incident types and the person to contact when that type of incident occurs is helpful. Exercising attention to detail at this early stage while time-sensitive information is still available is critical.

Recover

Recovery involves a reaction designed to make the network or system that is affected functional again; it includes repair of the affected assets and prevention of similar incidents in the future. Exactly what recovery means depends on the circumstances and the recovery measures that are available. For example, if fault-tolerance measures are in place, the recovery might consist of simply allowing one server in a cluster to fail over to another. In other cases, recovery could mean restoring the server from a recent backup. The main goal of this step is to make all resources available again. Delay putting any asset back into operation until it is at least protected from the incident that occurred. Thoroughly test assets for vulnerabilities and weaknesses before reintroducing them into production.

Remediate

The remediation step involves eliminating any residual danger or damage to the network that still might exist. For example, in the case of a virus outbreak, it could mean scanning all systems to root out any additional affected machines. These measures provide a more thorough mitigation than was possible during the initial response and are applied when time allows.

Review and Lessons Learned

Finally, security professionals should review each incident to discover what could be learned from it. Changes to procedures might be called for. Lessons learned should be shared with all personnel who might encounter this type of incident again. Complete documentation and analysis are the goal of this step.

Detective and Preventive Measures

As you have probably gathered by now, a wide variety of security threats faces those charged with protecting the assets of an organization. Luckily, a wide variety of tools is available to accomplish this task. The following sections cover some common threats and mitigation approaches.

IDS/IPS

Setup, configuration, and monitoring of any intrusion detection and intrusion prevention systems (IDS/IPS) are also ongoing responsibilities of operations security. Many of these systems must be updated on a regular basis with the attack signatures that enable them to detect new attack types. The analysis engines that they use also sometimes have updates that need to be applied.

Moreover, the log files of systems that are set to log certain events rather than take specific actions when they occur need to have those logs archived and analyzed on a regular basis. Spending large sums of money on software that gathers important log information and then disregarding that log information makes no sense.

IDS and IPS are discussed in more detail earlier in this chapter and in Chapter 4.

Intrusion response is just as important as intrusion detection and prevention. Intrusion response is about responding appropriately to any intrusion attempt. Most systems use alarms and signals to communicate with the appropriate personnel or systems when an intrusion has been attempted. An organization must respond to alerts and signals in a timely manner.

Firewalls

Firewalls can be implemented on multiple levels to allow or prevent communication based on a variety of factors. If personnel discover that certain types of unwanted traffic are occurring, it is often fairly simple to configure a firewall to prevent that type of traffic. Firewalls can protect the boundaries between networks, traffic within a subnetwork, or a single system. Make sure to keep firewalls fully updated per the vendor’s recommendations. Firewalls are discussed in more depth in Chapter 4.

Whitelisting/Blacklisting

Whitelisting occurs when a list of acceptable email addresses, Internet addresses, websites, applications, or other identifiers is configured as allowed. Blacklisting identifies bad senders or disallowed items. Graylisting falls between the two, listing entities that cannot yet be identified as whitelist or blacklist items. In the case of graylisting, the new entity must pass a series of tests to determine whether it will be whitelisted or blacklisted.

Whitelisting, blacklisting, and graylisting are commonly used with spam filtering tools.
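
The three list types can be sketched as a simple decision function for a spam filter. This is a minimal illustration, not any specific product's behavior; the list contents and the single-retry graylist test are assumptions.

```python
# Hypothetical whitelist/blacklist/graylist handling for inbound mail senders.
WHITELIST = {"partner@example.com"}
BLACKLIST = {"spammer@bad.example"}
graylist_seen = set()  # unknown senders that have been deferred once

def filter_sender(sender: str) -> str:
    """Return 'accept', 'reject', or 'defer' for a sending address."""
    if sender in WHITELIST:
        return "accept"      # explicitly allowed
    if sender in BLACKLIST:
        return "reject"      # explicitly denied
    # Graylisting: defer unknown senders once; legitimate mail servers
    # retry delivery, while most spam engines do not.
    if sender in graylist_seen:
        return "accept"      # passed the retry test
    graylist_seen.add(sender)
    return "defer"
```

In this sketch, an unknown sender is deferred on first contact and accepted on retry, which is the test that moves a graylisted entity toward the whitelist.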

Third-Party Security Services

Security professionals may need to rely on third-party security services to find threats in the enterprise. Some common third-party security services include malware/virus detection and honeypots/honeynets. It is often easier to rely on a solution developed by a third party than to try to develop your own in-house solution. Always research the features provided with a solution to determine if it meets the needs of your organization. Compare the different products available to ensure that the organization purchases the best solution for its needs.

Sandboxing

Sandboxing is a software virtualization technique that allows applications and processes to run in an isolated virtual environment. Applications and processes in the sandbox are not able to make permanent changes to the system and its files.

Some malware attempts to delay or stall code execution until the sandbox times out. A sandbox can use hooks and environmental checks to detect such evasive malware, but these methods do not catch every type of malware. For this reason, third-party security services are important.

Honeypots/Honeynets

Honeypots are systems that are configured with reduced security to entice attackers so that administrators can learn about attack techniques. In some cases, entire networks called honeynets are attractively configured for this purpose. These types of approaches should be undertaken only by companies with the skill to properly deploy and monitor them. Some third-party security services can provide this function for organizations.

Anti-malware/Antivirus

All updates of antivirus and anti-malware software are the responsibility of operations security. It is important to deploy a comprehensive anti-malware/antivirus solution for the entire enterprise.

Clipping Levels

Clipping levels set a baseline for normal user errors, and violations exceeding that threshold will be recorded for analysis of why the violations occurred. When clipping levels are used, a certain number of occurrences of an activity might generate no information, whereas recording of activities begins when a certain level is exceeded.

Clipping levels are used to

  • Reduce the amount of data to be evaluated in audit logs.

  • Provide a baseline of user errors above which violations will be recorded.
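
The threshold behavior can be sketched in a few lines. The clipping level of three and the failed-login scenario below are illustrative assumptions.

```python
from collections import defaultdict

# Sketch of a clipping level: errors at or below the threshold generate no
# audit records; once a user exceeds the threshold, every further
# occurrence is recorded for analysis.
CLIPPING_LEVEL = 3
error_counts = defaultdict(int)
audit_log = []

def record_error(user: str, event: str) -> None:
    error_counts[user] += 1
    if error_counts[user] > CLIPPING_LEVEL:
        audit_log.append((user, event, error_counts[user]))

for _ in range(5):
    record_error("alice", "failed login")
# Only occurrences 4 and 5 exceed the clipping level and are logged.
```

This shows why clipping levels reduce audit volume: the first three routine errors produce no records at all.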

Deviations from Standards

One of the methods that you can use to identify performance problems is to develop standards or baselines for the performance of certain systems. After these benchmarks have been established, deviations from the standards can be identified. This information is especially helpful in identifying certain types of DoS attacks as they occur. Beyond the security benefit, identifying these problems also aids in identifying systems that might need upgrading before the situation affects productivity.
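
A baseline comparison can be sketched as follows. The baseline figures and the 50 percent tolerance are hypothetical; real deployments would derive both from historical monitoring data.

```python
# Hypothetical performance baselines for a web server.
BASELINE = {"web01_requests_per_sec": 200, "web01_cpu_percent": 35}
TOLERANCE = 0.50  # flag metrics more than 50% above baseline

def deviations(current: dict) -> list:
    """Return the metrics that deviate beyond tolerance from the baseline."""
    flagged = []
    for metric, value in current.items():
        base = BASELINE.get(metric)
        if base is not None and value > base * (1 + TOLERANCE):
            flagged.append(metric)
    return flagged
```

A sudden surge in requests per second (a possible DoS attack) would be flagged, while CPU usage slightly above baseline would not.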

Unusual or Unexplained Events

In some cases, events occur that appear to have no logical cause. That explanation should never be accepted when problems occur. Although the focus is typically on getting systems up and running again, the root causes of issues must be identified. You should avoid the temptation to implement a quick workaround (often at the expense of security). When time permits, using a methodical approach to find exactly why the event happened is best, because the problem will inevitably come back if the root cause has not been addressed.

Unscheduled Reboots

When systems reboot on their own, this behavior is typically a sign of hardware problems of some sort. Reboots should be recorded and addressed. Overheating is the cause of many reboots. Sometimes reboots may also be the result of a DoS attack. You should have system monitoring in place to record all system reboots and investigate any reboot that was not initiated by a human or by an automatic upgrade.

Unauthorized Disclosure

The unauthorized disclosure of information is a major threat to organizations. Related threats include destruction of information, interruption of service, theft of information, corruption of information, and improper modification of information. Enterprise solutions must be deployed to monitor for any potential disclosure of information.

Trusted Recovery

When an application or operating system suffers a failure (crash, freeze, and so on), it is important that the system respond in a way that leaves the system in a secure state or that it makes a trusted recovery. A trusted recovery ensures that security is not breached when a system crash or other system failure occurs. You might recall from Chapter 3 that the Orange Book requires a system be capable of a trusted recovery for all systems rated B3 or A1.

Trusted Paths

A trusted path is a communication channel between the user (or the program through which that user is working) and the trusted computing base (TCB). The TCB provides the resources to protect the channel and prevent it from being compromised. Conversely, a communication path that is not protected by the system’s normal security mechanisms is called a covert channel. Taking this a step further, if the interface offered to the user is secured in this way, it is referred to as a trusted shell.

Operations security must ensure that trusted paths are validated. This validation occurs using log collection, log analysis, vulnerability scans, patch management, and system integrity checks.

Input/Output Controls

The main thrust of input/output control is to apply controls or checks to the input that is allowed to be submitted to the system. Performing input validation on all information accepted into the system can ensure that it is of the right data type and format and that it does not leave the system in an insecure state.
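
Input validation can be sketched as a check of type, format, and range before a value is accepted. The field rules below (the username pattern and the port range) are illustrative assumptions, not requirements from any particular system.

```python
import re

# Reject-on-failure input validation: values that do not match the expected
# data type and format are refused rather than sanitized.
USERNAME_RE = re.compile(r"^[A-Za-z][A-Za-z0-9_]{2,31}$")

def validate_input(username: str, port: str) -> bool:
    """Return True only if both fields have the right type and format."""
    if not USERNAME_RE.fullmatch(username):
        return False           # wrong format: reject the submission
    if not port.isdigit():
        return False           # wrong data type for a port number
    return 1 <= int(port) <= 65535   # range check
```

Note that an injection string such as `'; DROP TABLE--` fails the format check outright, so it never reaches the system in a state that could cause harm.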

Also, secure output of the system (printouts, reports, and so on) should be ensured. All sensitive output information should require a receipt before release and have proper access controls applied regardless of its format.

System Hardening

Another of the ongoing goals of operations security is to ensure that all systems have been hardened to the extent that is possible and still provide functionality. The hardening can be accomplished both on a physical basis and on a logical basis. Physical security of systems is covered in detail later in this chapter. From a logical perspective,

  • Remove unnecessary applications.

  • Disable unnecessary services.

  • Block unrequired ports.

  • Tightly control the connecting of external storage devices and media if it’s allowed at all.
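
The logical hardening steps above amount to comparing what is actually running and listening on a host against an approved baseline. The inventories below are invented for illustration.

```python
# Hypothetical hardening audit: anything outside the approved baseline
# should be removed, disabled, or blocked.
APPROVED_SERVICES = {"sshd", "nginx"}
APPROVED_PORTS = {22, 443}

def hardening_findings(running_services: set, listening_ports: set) -> dict:
    """Report services to disable and ports to block."""
    return {
        "disable_services": sorted(running_services - APPROVED_SERVICES),
        "block_ports": sorted(listening_ports - APPROVED_PORTS),
    }

# An unapproved telnet daemon and its port fall outside the baseline.
findings = hardening_findings({"sshd", "nginx", "telnetd"}, {22, 23, 443})
```

In practice the running inventories would come from configuration management or scanning tools rather than hard-coded sets.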

Vulnerability Management Systems

The importance of performing vulnerability and penetration testing has been emphasized throughout this book. A vulnerability management system is software that centralizes, and to a certain extent automates, the process of continually monitoring and testing the network for vulnerabilities. These systems can scan the network for vulnerabilities, report them, and in many cases remediate the problem without human intervention. Although they’re a valuable tool in the toolbox, these systems, regardless of how sophisticated they might be, cannot take the place of vulnerability and penetration testing performed by trained professionals.

Machine Learning and Artificial Intelligence (AI)-Based Tools

Machine learning and artificial intelligence (AI)-based tools give systems the ability to learn and improve without much human input. Artificial intelligence is any technology that enables a machine to simulate human behavior. Machine learning, a type of AI, allows a machine to learn from what is happening and make decisions based on what it sees.

Using these technologies in a security context means that data can be used to detect threats and compile investigative information. These tools help cybersecurity professionals find, contextualize, and organize relevant data at any stage in the threat intelligence life cycle. With machine learning and AI, organizations can process the data much faster than if actual human effort was involved.

Patch and Vulnerability Management

Patch management is often seen as a subset of configuration management. Software patches are updates released by vendors that either fix functional issues with or close security loopholes in operating systems, applications, and versions of firmware that run on the network devices.

To ensure that all devices have the latest patches installed, deploy a formal system to make certain that all systems receive the latest updates after thorough testing in a nonproduction environment. It is not always possible for a vendor to anticipate every possible impact a change might have on business-critical systems in the network. The enterprise is responsible for ensuring that patches do not adversely impact operations.

The patch management life cycle includes the following steps:

  1. Patch prioritization and scheduling: Determine the priority of the patches and schedule the patches for deployment.

  2. Patch testing: Test the patches prior to deployment to ensure that they work properly and do not cause system or security issues.

  3. Patch installation: Install the patches in the live environment.

  4. Patch assessment and audit: After patches are deployed, ensure that the patches work properly.

Many organizations deploy a centralized patch management system to ensure that patches are deployed in a timely manner. With this system, administrators can test and review all patches before deploying them to the systems they affect. Administrators can schedule the updates to occur during nonpeak hours.
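
The first step of the life cycle, patch prioritization, can be sketched as a simple ordering of pending patches. The CVSS-style severity scores and patch identifiers below are invented for illustration.

```python
# Hypothetical pending-patch queue awaiting prioritization and scheduling.
pending = [
    {"id": "KB-001", "severity": 9.8, "reboot": True},
    {"id": "KB-002", "severity": 4.3, "reboot": False},
    {"id": "KB-003", "severity": 7.5, "reboot": False},
]

# Schedule highest severity first; among equal severities, prefer patches
# that do not require a reboot (False sorts before True).
schedule = sorted(pending, key=lambda p: (-p["severity"], p["reboot"]))
order = [p["id"] for p in schedule]
```

Real prioritization would also weigh asset criticality and exposure, but the principle is the same: the queue, not arrival order, determines deployment order.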

Vulnerability management identifies, classifies, remediates, and mitigates vulnerabilities in systems and applications. Vulnerability management tools, also referred to as vulnerability scanners, should be used to regularly assess the network, systems, and applications. Any identified vulnerabilities should be investigated and the appropriate remediation or mitigation steps taken. Nessus is a popular vulnerability scanner in use today. As with patch management systems and antivirus applications, it is necessary to ensure that vulnerability scanners have the latest signature files.

Recovery Strategies

Identifying preventive controls is the third step in the business continuity planning process outlined in NIST SP 800-34 R1. If preventive controls are identified in the business impact analysis (BIA), disasters or disruptive events might be mitigated or eliminated. These preventive measures deter, detect, and/or reduce impacts to the system. When preventive controls are feasible and cost effective, they are preferable to the actions that might otherwise be necessary to recover the system after a disruption.

The following sections discuss the primary controls that organizations can implement as part of business continuity and disaster recovery, including redundant systems, facilities, and power; fault-tolerance technologies; insurance; data backup; fire detection and suppression; high availability; quality of service; and system resilience.

Create Recovery Strategies

Organizations must create recovery strategies for all assets that are vital to successful operation. Higher-level recovery strategies identify the order in which processes and functions are restored. System-level recovery strategies define how a particular system is to be restored. Keep in mind that the individuals who best understand a system should define its recovery strategy. Although the business continuity planning (BCP) committee probably can develop the prioritized recovery lists and high-level recovery strategies, system administrators and other IT personnel need to be involved in the development of recovery strategies for IT assets.

Disaster recovery tasks include recovery procedures, personnel safety procedures, and restoration procedures. The overall business recovery plan should require a committee to be formed to decide the best course of action. This recovery plan committee receives its direction from the BCP committee and senior management.

All decisions regarding recovery should be made in advance and incorporated into the disaster recovery plan (DRP). Any plans and procedures that are developed should refer to functions or processes, not specific individuals. As part of the disaster recovery planning, the recovery plan committee should contact critical vendors ahead of time to ensure that any equipment or supplies can be replaced in a timely manner.

When a disaster or disruptive event has occurred, the organization’s spokesperson should report the bad news in an emergency press conference before the press learns of the news through another channel. The DRP should detail any guidelines for handling the press. The emergency press conference site should be planned ahead of time.

When resuming normal operations after a disruptive event, the organization should conduct a thorough investigation if the cause of the event is unknown. Personnel should account for all damage-related costs that occur as a result of the event. In addition, appropriate steps should be taken to prevent further damage to property.

The commonality between all recovery plans is that they all become obsolete. For this reason, they require testing and updating.

The following sections include a discussion of categorizing asset recovery priorities, business process recovery, facility recovery, supply and technology recovery, user environment recovery, data recovery, and training personnel.

Categorize Asset Recovery Priorities

As discussed in Chapter 1, the recovery time objective (RTO), work recovery time (WRT), and recovery point objective (RPO) values determine what recovery solutions are selected. An RTO stipulates the amount of time an organization will need to recover from a disaster, and an RPO stipulates the amount of data an organization can lose when a disaster occurs. The RTO, WRT, and RPO values are derived during the BIA process.

In developing the recovery strategy, the recovery plan committee takes the RTO, WRT, and RPO values and determines the recovery strategies that should be used to ensure that the organization meets these BIA goals.

Critical devices, systems, and applications need to be restored earlier than devices, systems, or applications that do not fall into this category. Keep in mind when classifying systems that most critical systems cannot be restored using manual methods. The recovery plan committee must understand the backup/restore solutions that are available and implement the system that will provide recovery within the BIA values and cost constraints. The window of time for recovery of data-processing capabilities is based on the criticality of the operations affected.
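
Matching recovery strategies against the BIA values can be sketched as a simple feasibility check. The candidate strategies and their time figures below are hypothetical.

```python
# A strategy qualifies only if it restores fast enough (RTO) and loses no
# more data than the organization can tolerate (RPO).
def meets_bia(strategy: dict, rto_hours: float, rpo_hours: float) -> bool:
    return (strategy["restore_hours"] <= rto_hours
            and strategy["backup_interval_hours"] <= rpo_hours)

strategies = {
    "nightly_tape": {"restore_hours": 12, "backup_interval_hours": 24},
    "hourly_disk":  {"restore_hours": 2,  "backup_interval_hours": 1},
}

# For a critical system with RTO = 4 hours and RPO = 2 hours, only the
# hourly disk strategy qualifies.
qualifying = [name for name, s in strategies.items() if meets_bia(s, 4, 2)]
```

Cost would then be weighed among the qualifying options; a strategy that fails either value is ruled out regardless of price.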

Business Process Recovery

As part of the DRP, the recovery plan committee must understand the interrelationships between the processes and systems. A business process is a collection of tasks that produces a specific service or product for a particular customer or customers.

For example, if the organization determines that an accounting system is a critical application and the accounting system relies on a database server farm, the DRP needs to include the database server as a critical asset. Although restoring the entire database server farm to restore the critical accounting system might not be necessary, at least one of the servers in the farm is necessary for proper operation.

Workflow documents should be provided to the recovery plan committee for each business process. As part of recovering the business processes, the recovery plan committee must also understand the process’s required roles and resources, input and output tools, and interfaces with other business processes.

Supply and Technology Recovery

Although facility recovery is not often a concern with smaller disasters or disruptive events, almost all recovery efforts usually involve the recovery of supplies and technology. Organizations must ensure that any DRPs include guidelines and procedures for recovering supplies and technology. As part of supply and technology recovery, the DRP should include all pertinent vendor contact information in the event that new supplies and technological assets must be purchased.

The DRP must include recovery information on the following assets that must be restored:

  • Hardware backup

  • Software backup

  • Human resources

  • Heating, ventilation, and air conditioning (HVAC)

  • Supplies

  • Documentation

Hardware Backup

Hardware that must be included as part of the DRP includes client computers, server computers, routers, switches, firewalls, and any other hardware that is running on the organization’s network. The DRP must include not only guidelines and procedures for restoring all the data on each of these devices, but also information regarding restoring these systems manually if the systems are damaged or completely destroyed. Legacy devices that are no longer available in the retail market should also be identified.

As part of preparing the DRP, the recovery plan team must determine the amount of time that it will take the hardware vendors to provide replacements for any damaged or destroyed hardware. Without this information documented, any recovery plans might be ineffective due to lack of resources. Organizations might need to explore other options, including purchasing redundant systems and storing them at an alternate location, if vendors are unable to provide replacement hardware in a timely manner. When replacement of legacy devices is possible, organizations should take measures to replace them before the disaster occurs.

Software Backup

Even if an organization has every device needed to restore its infrastructure, those devices are useless if the applications and software that run on the devices are not available. The applications and software include any operating systems, databases, and utilities that need to be running on the device.

Many organizations might think that this requirement is fulfilled if they have a backup of all their software on tape, DVD, flash drive, hard drive, or other media. But all software that is backed up usually requires at least an operating system to be running on the device on which it is restored. These data backups often also require that the backup management software is running on the backup device, whether that is a server or a dedicated device.

All software installation media, service packs, and other necessary updates should be stored at an alternate location. In addition, all license information should be documented as part of the DRP. Finally, frequent backups of applications should be taken, whether this is through the application’s internal backup system or through some other organizational backup. A backup is useful only if it can be restored, so the DRP should fully document all the steps involved.

In many cases, applications are purchased from a software vendor, and only the software vendor understands the coding in the applications. Because there are no guarantees in today’s market, some organizations might decide that they need protection against a software vendor’s demise. A software escrow is an agreement whereby a third party holds the source code of the software to ensure that the customer has access to it if certain conditions occur for the software vendor, such as bankruptcy or disaster.

Human Resources

No organization is capable of operating without personnel. An occupant emergency plan specifically addresses procedures for minimizing loss of life or injury when a threat occurs. The human resources team is responsible for contacting all personnel in the event of a disaster. Contact information for all personnel should be stored onsite and offsite. Multiple members of the HR team should have access to the personnel contact information. Remember that personnel safety is always the primary concern. All other resources should be protected only after the personnel are safe.

After the initial event is over, the HR team should monitor personnel morale and guard against employee stress and burnout during the recovery period. If proper cross-training has occurred, multiple personnel can be rotated in during the recovery process. Any DRP should take into consideration the need to provide adequate periods of rest for any personnel involved in the disaster recovery process. It should also include guidelines on how to handle situations in which personnel fall victim to the disaster.

The organization must ensure that salaries and other funding to personnel continue during and after the disaster. Because funding can be critical both for personnel and for resource purchases, authorized, signed checks should be securely stored offsite. Lower-level management with the appropriate access controls should have the ability to disburse funds using these checks in the event that senior management is unavailable.

An executive succession plan should also be created to ensure that the organization follows the appropriate steps to protect itself and continue operation.

Supplies

Often disasters affect the ability to supply an organization with its needed resources, including paper, cabling, and even water. The organization should document any resources that are vital to its daily operations and the vendors from which these resources can be obtained. Because supply vendors can also be affected by the disaster, alternative suppliers should be identified.

Documentation

For disaster recovery to be a success, the personnel involved must be able to complete the appropriate recovery procedures. Although the documentation of all these procedures might be tedious, it is necessary to ensure that recovery occurs. In addition, each department within the organization should be asked to decide what departmental documentation is needed to carry out day-to-day operations. This documentation should be stored in a central location onsite, and a copy should be retained offsite also. Specific personnel should be tasked with ensuring that this documentation is created, stored, and updated as appropriate.

User Environment Recovery

All aspects of the end-user environment recovery must be included as part of the DRP to ensure that the end users can return to work as quickly as possible. As part of this user environment recovery, end-user notification must occur. Users must be notified of where and when to report after a disaster occurs.

The actual user environment recovery should occur in stages, with the most critical functions being restored first. User requirements should be documented to ensure that all aspects of the user environment are restored. For example, users in a critical department might all need their own client computers. These same users might also need to access an application that is located on a server. If the server is not restored, the users will be unable to perform their job duties even if their client computers are available.

Finally, manual steps that can be used for any function should be documented. Because we are so dependent on technology today, we often overlook the manual methods of performing our job tasks. Documenting these manual methods might ensure that operations can still occur, even if they occur at a decreased rate.

Data Recovery

In most organizations, the data is one of the most critical assets when recovering from a disaster. The BCPs and DRPs must include guidelines and procedures for recovering data. However, the operations teams must determine which data is backed up, how often the data is backed up, and the method of backup used. So although we discuss data backup, remember that BCP teams do not actually make any data backup decisions. The BCP teams are primarily concerned with ensuring that the data that is backed up can be restored in a timely manner.

Next, we discuss the data backup types and schemes that are used as well as electronic backup methods that organizations can implement.

Data Backup Types and Schemes

To design an appropriate data recovery solution, security professionals must understand the different types of data backups that can occur and how these backups are used together to restore the live environments.

For the CISSP exam, you must understand the following data backup types and schemes:

  • Full backup

  • Differential backup

  • Incremental backup

  • Copy backup

  • Daily backup

  • Transaction log backup

  • First in, first out rotation scheme

  • Grandfather/father/son rotation scheme

The three main data backups are full backups, differential backups, and incremental backups. To understand these three data backup types, you must understand the concept of archive bits. When a file is created or updated, the archive bit for the file is enabled. If the archive bit is cleared, the file will not be archived during the next backup. If the archive bit is enabled, the file will be archived during the next backup.

With a full backup, all data is backed up. During the full backup process, the archive bit for each file is cleared. A full backup takes the longest time and the most space to complete. However, if an organization uses only full backups, then only the latest full backup needs to be restored. Any differential or incremental backup will first start with a full backup as its baseline. A full backup is the most appropriate for offsite archiving.

In a differential backup, all files that have been changed since the last full backup are backed up. During the differential backup process, the archive bit for each file is not cleared. A differential backup starts out quick and small, but it grows in both the time and the space it requires as more time passes since the last full backup. Each differential backup includes all the files in the previous differential backup if a full backup has not occurred since that time. In an organization that uses a full/differential scheme, only the full backup and the most recent differential backup must be restored, meaning only two backups are needed.

An incremental backup backs up all files that have been changed since the last backup of any type. During the incremental backup process, the archive bit for each file is cleared. An incremental backup usually takes the least amount of time and space to complete. In an organization that uses a full/incremental scheme, the full backup and each subsequent incremental backup must be restored. The incremental backups must be restored in order. If your organization completes a full backup on Sunday and an incremental backup daily Monday through Saturday, up to seven backups could be needed to restore the data. Figure 7-9 compares the different types of backups.
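
The archive-bit behavior described above can be simulated in a few lines. The filenames are illustrative; the point is which backup types clear the bit and which leave it set.

```python
# True = archive bit set (file created or modified since last cleared).
files = {"a.doc": True, "b.xls": True, "c.txt": True}

def full_backup():
    backed_up = list(files)                # all files, regardless of bit
    for f in files:
        files[f] = False                   # full backup clears the bit
    return backed_up

def differential_backup():
    return [f for f, bit in files.items() if bit]   # bit left set

def incremental_backup():
    changed = [f for f, bit in files.items() if bit]
    for f in changed:
        files[f] = False                   # incremental clears the bit
    return changed

full_backup()                    # backs up all three files
files["a.doc"] = True            # a.doc modified: archive bit set again
diff1 = differential_backup()    # a.doc only
files["b.xls"] = True            # b.xls modified later
diff2 = differential_backup()    # a.doc AND b.xls: differentials grow
```

Because the differential never clears the bit, each differential re-captures every file changed since the full backup, which is why only two backups are ever needed for a full/differential restore.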


Figure 7-9 Backup Types Comparison
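The archive-bit behavior described above can be sketched as a short simulation. This is an illustrative model only; the `File` class and function names are hypothetical, not part of any real backup product:

```python
from dataclasses import dataclass

@dataclass
class File:
    name: str
    archive_bit: bool = True  # set when a file is created or modified

def full_backup(files):
    """Back up every file and clear every archive bit."""
    backed_up = [f.name for f in files]
    for f in files:
        f.archive_bit = False
    return backed_up

def differential_backup(files):
    """Back up files changed since the last full backup.
    Archive bits are NOT cleared, so each differential grows
    until the next full backup."""
    return [f.name for f in files if f.archive_bit]

def incremental_backup(files):
    """Back up files changed since the last backup of any type,
    clearing each file's archive bit as it is backed up."""
    backed_up = []
    for f in files:
        if f.archive_bit:
            backed_up.append(f.name)
            f.archive_bit = False
    return backed_up
```

Running the model shows the key difference: after a full backup, a modified file appears in every subsequent differential backup but in only the first subsequent incremental backup.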

Copy and daily backups are two special backup types that are not considered part of any regularly scheduled backup scheme because they do not require any other backup type for restoration. Copy backups are similar to normal backups but do not reset the file’s archive bit. Daily backups use a file’s timestamp to determine whether it needs archiving. Daily backups are popular in mission-critical environments where multiple daily backups are required because files are updated constantly.

Transaction log backups are used only in environments where capturing all transactions that have occurred since the last backup is important. Transaction log backups help organizations to recover to a particular point in time and are most commonly used in database environments.

Although magnetic tape drives are still used to back up data, organizations today may back up their data to optical discs, including CD-ROMs, DVDs, and Blu-ray discs; high-capacity, high-speed magnetic drives; flash-based media; or even network storage. No matter the media used, retaining backups both onsite and offsite is important. Store onsite backup copies in a waterproof, heat-resistant, fire-resistant safe or vault.

Electronic Backup

Electronic backup solutions back up data more quickly and accurately than normal data backups and are best implemented when information changes often.

For the CISSP exam, you should be familiar with the following electronic backup terms and solutions:

  • Electronic vaulting: Copies files as modifications occur and periodically transmits them to an offsite location. The transmission occurs in batches.

  • Remote journaling: Copies the journal or transaction log to an offsite location. This method occurs in real time.

  • Tape vaulting: Creates backups over a direct communication line on a backup system at an offsite facility.

  • Hierarchical storage management (HSM): Stores frequently accessed data on faster media and less frequently accessed data on slower media.

  • Optical jukebox: Stores data on optical disks and uses robotics to load and unload the optical disks as needed. This method is ideal when 24/7 availability is required.

  • Replication: Copies data from one storage location to another. Synchronous replication updates both locations with each write to keep them nearly identical, whereas asynchronous replication updates the secondary location on a delay or predefined schedule.
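The contrast between synchronous and asynchronous replication can be illustrated with a minimal sketch. The class names and in-memory dictionaries are illustrative assumptions, not a real replication product:

```python
class SynchronousReplicator:
    """A write completes only after both copies are updated,
    keeping the two locations nearly identical at all times."""
    def __init__(self):
        self.primary, self.replica = {}, {}

    def write(self, key, value):
        self.primary[key] = value
        self.replica[key] = value  # replica updated before the write returns

class AsynchronousReplicator:
    """A write completes immediately; replica updates are queued
    and applied later on a predefined schedule."""
    def __init__(self):
        self.primary, self.replica = {}, {}
        self._pending = []

    def write(self, key, value):
        self.primary[key] = value
        self._pending.append((key, value))  # queued for the next sync

    def scheduled_sync(self):
        while self._pending:
            key, value = self._pending.pop(0)
            self.replica[key] = value
```

The trade-off the sketch makes visible: synchronous replication never loses acknowledged writes but adds latency to every write, while asynchronous replication is faster but can lose whatever is still queued when a disaster strikes.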

Many companies use cloud backup or replication solutions. Any organization considering a cloud solution should research the full security implications of this type of deployment.

Training Personnel

Even if an organization takes the steps to develop the most thorough BCPs and DRPs, these plans are useless if the organization’s personnel do not have the skills to completely recover the organization’s assets when a disaster occurs. Personnel should be given the appropriate time and monetary resources to ensure that adequate training occurs. This includes allowing personnel to test any DRPs.

Training should be obtained from both internal and external sources. When job duties change or new personnel are hired, policies should be in place to ensure the appropriate transfer of knowledge occurs.

Backup Storage Strategies

As part of any backup plan, an organization should also consider the backup storage strategy or rotation scheme that it will use. Cost considerations and storage considerations often dictate that backup media is reused after a period of time. If this reuse is not planned in advance, media can become unreliable due to overuse. Two of the most popular backup rotation schemes are first in, first out and grandfather/father/son.

In the first in, first out (FIFO) scheme, the newest backup is saved to the oldest media. Although this is the simplest rotation scheme, it does not protect against data errors. If an error in data exists, the organization might not have a version of the data that does not contain the error.

In the grandfather/father/son (GFS) scheme, three sets of backups are defined. Most often these three definitions are daily, weekly, and monthly. The daily backups are the sons, the weekly backups are the fathers, and the monthly backups are the grandfathers. Each week, one son advances to the father set. Each month, one father advances to the grandfather set.

Figure 7-10 displays a typical 5-day GFS rotation using 21 backup media. The daily backups are usually differential or incremental backups. The weekly and monthly backups must be full backups.


Figure 7-10 Grandfather/Father/Son Backup Rotation Scheme
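The rotation logic can be sketched as a simple classification function. The 5-business-day convention and function name here are illustrative assumptions; real schedules and tape counts vary:

```python
def gfs_backup_set(weekday, last_friday_of_month=False):
    """Classify a backup in a 5-day grandfather/father/son rotation.
    weekday: 0 = Monday through 4 = Friday (business days only)."""
    if weekday < 4:
        return "son"            # Mon-Thu: daily incremental or differential
    if last_friday_of_month:
        return "grandfather"    # monthly full backup, retained the longest
    return "father"             # weekly full backup
```

Under this convention, the four Monday-through-Thursday tapes (sons) are reused every week, the weekly Friday tapes (fathers) are reused every month, and the month-end tapes (grandfathers) are retained the longest.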

Recovery and Multiple Site Strategies

When dealing with an event that either partially or fully destroys the primary facility, the organization will need an alternate location from which to operate until the primary facility is restored. The DRP should define the alternate location and its recovery procedures, often referred to as a recovery site strategy.

The DRP should include not only how to bring the alternate location to full operation but also how the organization will return from the alternate location to the primary facility after it is restored. Also, for security purposes, the DRP should include details on the security controls that were used at the primary facility and guidelines on how to implement these same controls at the alternate location.

The most important factor in locating an alternate location during the development of the DRP is to ensure that the alternate location is not affected by the same disaster. This might mean that the organization must select an alternate location that is in another city or geographic region. The main factors that affect the selection of an alternate location include the following:

  • Geographic location

  • Organizational needs

  • Location’s cost

  • Location’s restoration effort

Testing an alternate location is a vital part of any DRP. Some locations are easier to test than others. The DRP should include instructions on when and how to periodically test alternate facilities to ensure that the contingency facility is compatible with the primary facility.

The alternate locations that you should understand for the CISSP exam include the following:

  • Hot site

  • Cold site

  • Warm site

  • Tertiary site

  • Reciprocal agreements

  • Redundant sites

Hot Site

A hot site is a leased facility that contains all the resources needed for full operation. This environment includes computers, raised flooring, full utilities, electrical and communications wiring, networking equipment, and UPSs. The only resource that must be restored at a hot site is the organization's data, and often only a portion of it. Bringing a hot site to full operation should take only minutes to hours.

Although a hot site provides the quickest recovery, it is the most expensive to maintain due to the ready-to-use asset conditions. In addition, it can be administratively hard to manage if the organization requires proprietary hardware or software. A hot site requires the same security controls as the primary facility and full redundancy, including hardware, software, and communication wiring.

Cold Site

A cold site is a leased facility that contains only electrical and communications wiring, air conditioning, plumbing, and raised flooring. No communications equipment, networking hardware, or computers are installed at a cold site until it is necessary to bring the site to full operation. For this reason, a cold site takes much longer to bring to full operation than a hot or warm site.

Although a cold site provides the slowest recovery, it is the least expensive to maintain. It is also the most difficult to test.

Warm Site

A warm site is a leased facility that contains electrical and communications wiring, full utilities, and networking equipment. In most cases, the only devices that are not included in a warm site are the computers. A warm site takes longer to restore than a hot site but less than a cold site.

A warm site falls between a hot site and a cold site in both restoration time and cost. It is the most widely implemented alternate leased location. Although a warm site is easier to test than a cold site, it requires much more testing effort than a hot site.

Figure 7-11 compares the components deployed in these three sites.


Figure 7-11 Hot Site, Warm Site, and Cold Site Comparison

Tertiary Site

A tertiary site is a secondary backup site that provides an alternate in case the hot site, warm site, or cold site is unavailable. Many large companies implement tertiary sites to protect against catastrophes that affect large geographic areas.

For example, if an organization requires a data center that is located on the coast, the organization might have its primary location in New Orleans, Louisiana, and its hot site in Mobile, Alabama. This organization might consider locating a tertiary site in Omaha, Nebraska, because a single hurricane could affect both the Louisiana and Alabama Gulf coasts.

Reciprocal Agreements

A reciprocal agreement is an agreement between two organizations that have similar technological needs and infrastructures. In the agreement, both organizations agree to act as an alternate location for the other if either organization's primary facility is rendered unusable. Unfortunately, in most cases, these agreements are hard to enforce due to various legalities.

A disadvantage of this alternate site is that it might not be capable of handling the required workload and operations of the other organization.

Redundant Sites

A redundant site (or mirrored site) is one that is identically configured as the primary site. A redundant or mirrored site is not a leased site but is usually owned by the same organization that owns the primary site. The organization is responsible for maintaining the redundant site. Multiple processing sites can also be configured to serve as operationally redundant sites.

Although redundant sites are expensive to maintain, many organizations today see them as a necessary expense to ensure that uninterrupted service can be provided.

Redundant Systems, Facilities, and Power

In anticipation of disasters and disruptive events, organizations should implement redundancy for critical systems, facilities, and power and assess any systems that have been identified as critical to determine whether implementing redundant systems is cost effective. Implementing redundant systems at an alternate location often ensures that services are uninterrupted. Redundant systems include redundant servers, redundant routers, redundant internal hardware, and even redundant backbones. Redundancy occurs when an organization has a secondary component, system, or device that takes over when the primary unit fails.

Redundant facilities ensure that the organization maintains a facility at whatever level it chooses to ensure that the organizational services can continue when a disruptive event occurs.

Power redundancy is implemented using uninterruptible power supplies (UPSs) and power generators.

Redundancy on individual components can also be provided. The spare components are either cold spares, warm spares, or hot spares. A cold spare is not powered up but can be inserted into the system if needed. A warm spare is in the system but does not have power unless needed. A hot spare is in the system and powered on, ready to become operational at a moment’s notice.

Fault-Tolerance Technologies

Fault tolerance enables a system to continue operation in the event of the failure of one or more components. Fault tolerance within a system can include fault-tolerant adapter cards and fault-tolerant storage drives. One of the most well-known fault-tolerant systems is RAID, which is discussed earlier in this chapter.

By implementing fault-tolerant technologies, an organization can ensure that normal operation occurs if a single fault-tolerant component fails.

Insurance

Although redundancy and fault tolerance can actually act as preventive measures against failures, insurance is not really a preventive measure. If an organization purchases insurance to provide protection in the event of a disruptive event, the insurance has no power to protect against the event itself. The purpose of the insurance is to ensure that the organization will have access to additional financial resources to help in the recovery.

Keep in mind that recovery efforts from a disruptive event can often incur large financial costs. Even some of the best estimates might still fall short when the actual recovery must take place. By purchasing insurance, the organization can ensure that key financial transactions, including payroll, accounts payable, and any recovery costs, are covered.

Insurance actual cash value (ACV) compensates property based on the item's value on the date of loss plus 10 percent. However, keep in mind that insurance on any printed materials covers only inscribed, printed, or written documents, manuscripts, or records. It does not cover money and securities. A special type of insurance called business interruption insurance provides monetary protection for expenses and lost earnings.

Organizations should annually review insurance policies and update them as necessary.

Data Backup

Data backup provides prevention against data loss but not prevention against disruptive events. All organizations should ensure that all systems that store important files are backed up in a timely manner. Users should also be encouraged to back up personal files that they might need. In addition, periodic testing of the restoration process should occur to ensure that the files can be restored.

Data recovery, including backup types and schemes and electronic backup, was covered in detail earlier in this chapter.

Fire Detection and Suppression

Organizations should implement fire detection and suppression systems as part of any BCP. Fire detection and suppression systems vary based on the detection/suppression method used and are discussed in greater detail in the “Environmental Security and Issues” section of Chapter 3.

High Availability

High availability in data recovery is a concept that ensures data is always available through redundancy and fault tolerance. Most organizations implement high-availability solutions as part of any DRP.

High-availability terms and techniques that you must understand include the following:

  • Redundant Array of Independent Disks (RAID): A hard-drive technology in which data is written across multiple disks in such a way that a disk can fail and the data can be quickly made available from the remaining disks in the array without restoring from a backup tape or other backup media.

  • Storage-area network (SAN): High-capacity storage devices (up to several petabytes) that are connected by a high-speed private network using storage-specific switches.

  • Failover: The capability of a system to switch over to a backup system if a failure in the primary system occurs.

  • Failsoft: The capability of a system to terminate noncritical processes when a failure occurs.

  • Clustering: A capability of a software product that provides load-balancing services. With clustering, one instance of an application server acts as a master controller and distributes requests to multiple instances using round-robin, weighted round-robin, or least-connections algorithms.

  • Load balancing: A capability of a hardware product that provides load-balancing services. Application delivery controllers (ADCs) support the same algorithms but also use complex number-crunching processes, such as per-server CPU and memory utilization, fastest response times, and so on, to adjust the balance of the load. Load-balancing solutions are also referred to as farms or pools.
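The round-robin and least-connections algorithms named above can be sketched as follows. This is a simplified illustration; real ADCs track far richer health metrics (CPU, memory, response times) than these classes do:

```python
import itertools

class RoundRobinBalancer:
    """Rotate through the server pool in fixed order."""
    def __init__(self, servers):
        self._cycle = itertools.cycle(servers)

    def next_server(self):
        return next(self._cycle)

class LeastConnectionsBalancer:
    """Send each request to the server with the fewest active connections."""
    def __init__(self, servers):
        self.active = {s: 0 for s in servers}

    def next_server(self):
        server = min(self.active, key=self.active.get)
        self.active[server] += 1
        return server

    def release(self, server):
        self.active[server] -= 1  # call when the connection closes
```

Round-robin is stateless and cheap but ignores how busy each server is; least-connections adapts to long-lived sessions at the cost of tracking per-server state.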

Quality of Service

Quality of service (QoS) is a technology that manages network resources to ensure a predefined level of service. It assigns priorities to the different types of traffic or protocols on a network. When a bottleneck occurs, QoS decides which traffic takes precedence, based on rules the administrator supplies. Importance can be based on IP address, MAC address, or even service name. However, QoS works only when the bottleneck occurs in the appropriate location and the configured bandwidth declarations match reality. For example, if the QoS settings assume more bandwidth than the ISP actually provides, a router that believes enough bandwidth is available will not prioritize traffic. And if the ISP's maximums are being reached, the ISP decides what is or is not important. The key to any QoS deployment is to tweak the settings and observe the network over time.
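The prioritization decision during a bottleneck can be illustrated with a strict-priority queue. The traffic classes and priority values here are illustrative assumptions, not a standard mapping:

```python
import heapq

class QosScheduler:
    """Strict-priority queueing: during congestion, packets from
    higher-priority classes are transmitted first."""
    PRIORITY = {"voice": 0, "video": 1, "default": 2, "bulk": 3}

    def __init__(self):
        self._queue = []
        self._seq = 0  # preserves FIFO order within the same class

    def enqueue(self, packet, traffic_class="default"):
        prio = self.PRIORITY.get(traffic_class, self.PRIORITY["default"])
        heapq.heappush(self._queue, (prio, self._seq, packet))
        self._seq += 1

    def dequeue(self):
        return heapq.heappop(self._queue)[2] if self._queue else None
```

In this model, a queued voice packet is transmitted before bulk traffic even when the bulk traffic arrived first, which is exactly the behavior an administrator's QoS rules aim to produce under congestion.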

System Resilience

System resilience is the ability of a system, device, or data center to recover quickly and continue operating after an equipment failure, a power outage, or another disruption. It involves the use of redundant components or facilities. When one component fails or is disrupted, the redundant component takes over seamlessly and continues to provide services to the users.

Disaster Recovery

Disaster recovery involves restoring services and systems from a contingency state, or the temporary state that operations may be in where they are running but not at the primary facility or on the optimum resources. The DRP is discussed in detail in Chapter 1. In this chapter, we discuss the disaster recovery process further, in terms of response, personnel, communications, assessment, restoration, and training and awareness.

Response

After an event has occurred, the appropriate personnel should be contacted to initiate the communications that alert the appropriate recovery team and the affected personnel of the event. All the teams listed in the personnel section then need to perform their duties. A process hierarchy must be developed so that each team performs its duties as part of the disaster recovery process in the correct order.

Personnel

When a disaster occurs, the number one priority is personnel safety and health, and the number two priority is damage mitigation. After these two are handled, recovering from the disaster quickly becomes an organization's priority. However, no organization can recover from a disaster if the personnel are not properly trained and prepared. To ensure that personnel can perform their duties during disaster recovery, they must know and understand their job tasks.

During any disaster recovery, financial management is important. Financial management usually includes the chief financial officer and any other key accounting personnel. This group must track the recovery costs and assess the cash flow projections. They formally notify any insurers of claims that will be made. Finally, this group is responsible for establishing payroll continuance guidelines, procurement procedures, and emergency costs tracking procedures.

Organizations must decide which teams are needed during a disaster recovery and ensure that the appropriate personnel are placed on each of these teams. The disaster recovery manager directs the short-term recovery actions immediately following a disaster.

Organizations might need to implement the following teams to provide the appropriate support for the DRP:

  • Damage assessment team

  • Legal team

  • Media relations team

  • Recovery team

  • Relocation team

  • Restoration team

  • Salvage team

  • Security team

Damage Assessment Team

The damage assessment team is responsible for determining the disaster’s cause and the amount of damage that has occurred to organizational assets. It identifies all affected assets and the critical assets’ functionality after the disaster. The damage assessment team determines which assets will need to be restored and replaced and contacts the appropriate teams that need to be activated.

Legal Team

The legal team deals with all legal issues immediately following the disaster and during the disaster recovery. The legal team oversees any public relations events that are held to address the disaster, although the media relations team will actually deliver the message. The legal team should be consulted to ensure that all recovery operations adhere to federal and state laws and regulations.

Media Relations Team

The media relations team informs the public and media whenever emergencies extend beyond the organization’s facilities according to the guidelines given in the DRP. The emergency press conference site should be planned ahead. When issuing public statements, the media relations team should be honest and accurate about what is known about the event and its effects. The organization’s response to the media during and after the event should be unified.

A credible, knowledgeable, experienced, and informed spokesperson appointed by the company should deliver the organization’s response. When dealing with the media after a disaster, the spokesperson should report bad news before the media discovers it through another channel. Anyone making disaster announcements to the public should understand that the audience for such announcements includes the media, unions, stakeholders, neighbors, employees, contractors, and even competitors.

Recovery Team

The recovery team’s primary task is recovering the critical business functions at the alternate facility. This task mostly involves ensuring that the physical assets are in place, including computers and other devices, wiring, and so on. The recovery team usually oversees the relocation and restoration teams.

Relocation Team

The relocation team oversees the actual transfer of assets between locations. This task includes moving assets from the primary site to the alternate site and then returning those assets when the primary site is ready for operation.

Restoration Team

The restoration team actually ensures that the assets and data are restored to operations. The restoration team needs access to the backup media.

Salvage Team

The salvage team recovers all assets at the disaster location and ensures that the primary site returns to normal. The salvage team manages the cleaning of equipment, oversees the rebuilding of the original facility, and identifies any experts to employ in the recovery process. In most cases, the salvage team can decide when operations at the disaster site can resume.

Security Team

The security team is responsible for managing the security at both the disaster site and any alternate location that the organization uses during the recovery. Because the geographic area that the security team must manage after a disaster is often much larger, the security team might need to hire outside contractors to aid in this process. It is usually better to use these outside contractors to guard physical access to the sites and to use internal resources to provide security inside the facilities, because the organization's reduced state might make issuing the appropriate access credentials to contractors difficult.

Communications

Communication during disaster recovery is important to ensure that the organization recovers in a timely manner. It is also important to ensure that no steps are omitted and that the steps occur in the correct order. Communication with personnel depends on who is being contacted about the disaster. Personnel who are affected by a disaster should receive communications that list the affected systems, the projected outage time, and any contingencies they should follow in the meantime. The different disaster recovery teams should receive communications that pertain to their duties during the recovery from the disaster.

During recovery, security professionals should work closely with the different teams to ensure that all assets remain secure. All teams involved in the process should also communicate often with each other to update each other on the progress.

Assessment

When an event occurs, personnel need to assess the event’s severity and impact. Doing so ensures that the appropriate response is implemented. Most organizations establish event categories, including nonincident, incident, and severe incident. Each organization should have a disaster recovery assessment process in place to ensure that personnel properly assess each event.

Restoration

The restoration process involves restoring the primary systems and facilities to normal operation. The personnel involved in this process depend on the assets that were affected by the event. Any teams involved in the recovery of assets should carefully coordinate their recovery efforts. Without careful coordination, recovery could be negatively impacted. For example, if full recovery of a web application requires that the database servers be operational, the database administrator must work closely with the web application and system administrators to ensure that both web applications and computer servers are returned to normal function.

Training and Awareness

Personnel at all levels need to be given the proper training on the disaster recovery process. Regular users just need to be given awareness training so that they understand the complexity of the process. Leadership needs training on how to lead the organization during a crisis. Technical teams need training on the recovery procedures and logistics. Security professionals need training on how to protect assets during recovery.

Most organizations include business continuity and disaster recovery awareness training as part of the initial training given to personnel when they are hired. Organizations should also periodically update personnel to ensure that they do not forget about disaster recovery.

Lessons Learned

Documenting lessons learned is the process of gathering information that reflects both the positive and negative experiences of a project, incident, or disaster recovery effort. The purpose of documenting lessons learned at the end of a disaster is to use the lessons to refine the disaster recovery plan and to provide future disaster recovery teams with information to increase efficiency.

By properly documenting these lessons, team members ensure that their experiences are carried forward to aid future teams.

Testing Disaster Recovery Plans

After the BCP is fully documented, an organization must take measures to ensure that the plan is maintained and kept up to date. At a minimum, an organization must evaluate and modify the BCP and DRP annually. This evaluation usually involves some sort of test to ensure that the plans are accurate and thorough. Frequent testing is important because no plan is viable until it has been tested. Through testing, inaccuracies, deficiencies, and omissions are detected.

Testing the BCP and DRP prepares and trains personnel to perform their duties. It also ensures that the alternate backup site can perform as needed. When testing occurs, the test is probably flawed if no issues with the plan are found.

The types of tests that are commonly used to assess the BCP and DRP include the following:

  • Read-through test

  • Checklist test

  • Table-top exercise

  • Structured walk-through test

  • Simulation test

  • Parallel test

  • Full-interruption test

  • Functional drill

  • Evacuation drill

Read-Through Test

A read-through test involves the teams that are part of any recovery plan. These teams read through the plan that has been developed and attempt to identify any inaccuracies or omissions in the plan.

Checklist Test

The checklist test occurs when managers of each department or functional area review the BCP. These managers make note of any modifications to the plan. The BCP committee then uses all the management notes to make changes to the BCP.

Table-Top Exercise

A table-top exercise is the most cost-effective and efficient way to identify areas of overlap in the plan before conducting higher-level testing. A table-top exercise is an informal brainstorming session that encourages participation from business leaders and other key employees. In a table-top exercise, the participants are given roles and responsibilities and agree to a particular disaster scenario on which they will focus.

Structured Walk-Through Test

The structured walk-through test involves representatives of each department or functional area thoroughly reviewing the BCP’s accuracy. This type of test is the most important one to perform prior to a live disaster.

Simulation Test

In a simulation test, the operations and support personnel execute the DRP in a role-playing scenario. This test identifies omitted steps and threats.

Parallel Test

A parallel test involves bringing the recovery site to a state of operational readiness but maintaining operations at the primary site.

Full-Interruption Test

A full-interruption test involves shutting down the primary facility and bringing the alternate facility up to full operation. This is a hard switch-over in which all processing occurs at the primary facility until the “switch” is thrown. This type of test requires full coordination between all the parties and includes notifying users in advance of the planned test. An organization should perform this type of test only when all other tests have been implemented and are successful.

Functional Drill

A functional drill tests a single function or department to see whether the function’s DRP is complete. This type of drill requires the participation of the personnel that perform the function.

Evacuation Drill

In an evacuation drill, personnel follow the evacuation or shelter-in-place guidelines for a particular disaster type. In this type of drill, personnel must understand the area to which they are to report when the evacuation occurs. All personnel should be accounted for at that time.

Business Continuity Planning and Exercises

After a test is complete, all test results should be documented, and the plans should be modified to reflect those results. The list of successful and unsuccessful activities from the tests will be the most useful to management when maintaining the BCP. All obsolete information in the plans should be deleted, and any new information should be added. In addition, modifying current information based on new regulations, laws, or protocols might be necessary.

Version control of the plans should be managed to ensure that the organization always uses the most recent version. In addition, the BCP should be stored in multiple locations to ensure that it is available if a location is destroyed by the disaster. Multiple personnel should have the latest version of the plans to ensure that the plans can be retrieved if primary personnel are unavailable when the plan is needed.

Physical Security

Physical security involves using the appropriate security controls to protect all assets from physical access. Perimeter security involves implementing the appropriate perimeter security controls, including gates and fences, perimeter intrusion detection, lighting, patrol force, and access control, to prevent access to the perimeter of a facility. Building and internal security involves implementing the appropriate building and internal security controls.


Perimeter Security Controls

When considering the perimeter security of a facility, taking a holistic approach, sometimes known as the concentric circle approach, is often helpful (see Figure 7-12). This approach relies on creating layers of physical barriers around information assets.

An illustration of a thief approaching a star at the center of four concentric circles representing the security layers. The layers are perimeter fence, exterior door, office door, and locked cabinet.

Figure 7-12 Concentric Circle Approach

Next, we look at implementing this concept in detail.

Gates and Fences

The outermost ring in the concentric circle approach is composed of the gates and fences that surround the facility. Within that are interior circles of physical barriers, each of which has its own set of concerns. Here, we cover considerations for barriers (bollards), fences, gates, and walls.

Barriers (Bollards)

Barriers called bollards have become quite common around the perimeter of new office and government buildings. These short vertical posts, placed at the building's entranceway and lining sidewalks, help protect against vehicles that might intentionally or unintentionally crash into the building or injure pedestrians. They can be made of many types of materials. The ones shown in Figure 7-13 are stainless steel.

A photograph of stainless steel bollards installed along a roadway in front of a building.

Figure 7-13 Stainless Steel Bollards

Fences

Fencing is the first line of defense in the concentric circle paradigm. When selecting the type of fencing to install, consider how determined the individuals you are trying to discourage from entry are likely to be. Use the following guidelines with respect to height:

  • Fences 3 to 4 feet tall deter only casual intruders.

  • Fences 6 to 7 feet tall are too tall to climb easily.

  • Fences 8 feet and taller deter more determined intruders, especially when these fences are augmented with razor wire.

A geo-fence is a geographic area within which devices are managed using some sort of radio frequency communication. For example, a geo-fence could be set up in a radius around a store or point location or within a predefined set of boundaries, such as around a school zone. It is used to track users or devices entering or leaving the geo-fence area. Alerts can be configured to notify both the device's user and the geo-fence operator of the device's location.
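At its core, a geo-fence check is a distance comparison against the fence radius. The following Python sketch illustrates the idea for a circular fence; the store coordinates and 200-meter radius are hypothetical, and the distance is computed with the standard haversine great-circle formula:

```python
import math

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance between two lat/lon points, in meters."""
    r = 6_371_000  # mean Earth radius in meters
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = math.radians(lat2 - lat1)
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def inside_geofence(device, center, radius_m):
    """True if the device's reported position falls within the fence radius."""
    return haversine_m(device[0], device[1], center[0], center[1]) <= radius_m

# Hypothetical store location with a 200 m geo-fence around it
store = (40.7580, -73.9855)
print(inside_geofence((40.7585, -73.9850), store, 200))  # nearby device: True
print(inside_geofence((40.7680, -73.9855), store, 200))  # ~1.1 km away: False
```

A real deployment would layer alerting and device-management actions on top of this entry/exit test, and polygonal fences would need a point-in-polygon test instead of a radius check.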

Gates

Gates can be weak points in a fence if not handled correctly. Gates are rated by Underwriters Laboratories (UL) in the following way; each step up in class requires additional levels of protection:

  • Class 1: Residential use

  • Class 2: Commercial use, where general public access is expected (for example, a parking garage)

  • Class 3: Industrial use, where limited access is expected (for example, a warehouse loading dock)

  • Class 4: Restricted access (for example, a prison or airport security area)

Walls

In some cases, walls might be called for around a facility. When that is the case, and when perimeter security is critical, perimeter intrusion detection systems, discussed next, can be deployed to alert security personnel of any breaching of the walls.

Perimeter Intrusion Detection

Regardless of whether an organization uses fences or walls, or even decides to deploy neither of these impediments, it can significantly reduce exposure by deploying one of the following types of perimeter intrusion detection systems. All the systems described next are considered physical intrusion detection methods.

Infrared Sensors

Passive infrared (PIR) systems operate by identifying changes in the infrared (heat) energy in an area. Because the presence of an intruder changes the heat signature of the monitored space, the system alerts or sounds an alarm when such a change occurs.

Electromechanical Systems

Electromechanical systems operate by detecting a break in an electrical circuit. For example, the circuit might cross a window or door and when the window or door is opened, the circuit is broken, setting off an alarm of some sort. Another example might be a pressure pad placed under the carpet to detect the presence of individuals.

Photoelectric Systems

Photometric systems (or photoelectric systems) operate by detecting changes in the light and thus are used in windowless areas. They send a beam of light across the area, and if the beam is interrupted (by a person or a stray animal, for example), the alarm is triggered.

Acoustical Detection Systems

Acoustical systems use strategically placed microphones to detect any sound made during a forced entry. These systems work well only in areas where there is not a lot of surrounding noise. They are typically very sensitive, which would cause many false alarms in a loud area, such as a door next to a busy street.

Wave Motion Detector

Wave motion devices generate a wave pattern in the area and detect any motion that disturbs the wave pattern. When the pattern is disturbed, an alarm sounds.

Capacitance Detector

Capacitance detectors emit an electrostatic field and monitor that field. If the field is disrupted, which will occur when a person enters the area, the alarm will sound.

CCTV

A closed-circuit television (CCTV) system uses sets of cameras that can either be monitored in real time or can record days’ worth of activity that can be viewed as needed at a later time. In very high-security facilities, these systems are usually monitored. One of the main benefits of using CCTV is that it increases a guard’s visual capabilities. Guards can monitor larger areas at once from a central location. CCTV is a category of physical surveillance, not computer/network surveillance.

Camera types include outdoor, infrared, fixed-position, pan/tilt, dome, and Internet Protocol (IP) cameras. When implementing cameras, organizations need to select the appropriate lens, resolution, frames per second (FPS), and compression. In addition, the lighting requirements of the different cameras must be understood; a CCTV system should work with the amount of light that the location provides. An organization must also understand the different types of monitor displays, including single-image, split-screen, and large-format displays. Finally, storage space will be required, whether the video is stored digitally on a server or physically on tapes. Storage can be a particular concern when continuous monitoring is implemented.
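Storage sizing for continuous recording follows directly from per-camera bitrate, recording hours, and retention period. The following Python sketch shows the arithmetic; the camera count, bitrate, and retention values are hypothetical:

```python
def cctv_storage_gb(cameras, bitrate_mbps, hours_per_day, retention_days):
    """Rough storage estimate: per-camera bitrate (megabits/s) -> gigabytes retained."""
    seconds = hours_per_day * 3600 * retention_days
    megabits = bitrate_mbps * seconds * cameras
    return megabits / 8 / 1000  # megabits -> megabytes -> gigabytes

# Hypothetical system: 16 cameras at 2 Mbps each, recording 24 hours a day,
# with a 30-day retention requirement
print(round(cctv_storage_gb(16, 2.0, 24, 30)))  # 10368 GB (roughly 10 TB)
```

Motion-triggered recording, lower frame rates, or more aggressive compression can cut these numbers substantially, which is why those parameters are selected together.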

Lighting

One of the best ways to deter crime and mischief is to shine a light on areas of concern. Next, we look at some types of lighting and some lighting systems that have proven to be effective. Lighting is considered a physical control for physical security.

Types of Systems

The security professional must be familiar with several types of lighting systems:

  • Continuous lighting: An array of lights that provide an even amount of illumination across an area.

  • Standby lighting: A type of system that illuminates only at certain times or on a schedule.

  • Movable lighting: Lighting that can be repositioned as needed.

  • Emergency lighting: Lighting systems with their own power source to use when power is out.

Types of Lighting

A number of options are available when choosing the illumination source or type of light. The following are the most common choices:

  • Fluorescent: A very low-pressure mercury-vapor gas-discharge lamp that uses fluorescence to produce visible light.

  • Mercury vapor: A gas-discharge lamp that uses an electric arc through vaporized mercury to produce light.

  • Sodium vapor: A gas-discharge lamp that uses sodium in an excited state to produce light.

  • Quartz lamps: A lamp consisting of an ultraviolet light source, such as mercury vapor, contained in a fused-silica bulb that transmits ultraviolet light with little absorption.

Regardless of the light source, it will be rated by its feet of illumination. When positioning the lights, you must take this rating into consideration. For example, if a controlled light fixture mounted on a 5-meter pole can illuminate an area 30 meters in diameter, then for security lighting purposes the fixtures should be spaced no more than 30 meters apart so that coverage is continuous. Moreover, there should be extensive exterior perimeter lighting of entrances or parking areas to discourage prowlers or casual intruders.
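The spacing rule above reduces to simple arithmetic: fixtures should sit no farther apart than the diameter each one illuminates. A minimal Python illustration (the 400-meter perimeter is a hypothetical value):

```python
import math

def fixture_spacing(illuminated_diameter_m, overlap_m=0.0):
    """Maximum spacing between fixtures so illuminated circles meet (or overlap)."""
    return illuminated_diameter_m - overlap_m

def fixtures_needed(perimeter_m, illuminated_diameter_m):
    """Fixtures required to light a closed perimeter at that spacing."""
    return math.ceil(perimeter_m / fixture_spacing(illuminated_diameter_m))

print(fixture_spacing(30))       # 30.0 m between poles
print(fixtures_needed(400, 30))  # 14 fixtures for a 400 m perimeter
```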

Patrol Force

An excellent augmentation to all other detection systems is the presence of a guard patrolling the facility. This option offers the most flexibility in reacting to whatever occurs. One of the keys to success is adequate training of guards so that they are prepared for any eventuality. There should be a prepared response for any possible occurrence. One of the main benefits of this approach is that guards can use discriminating judgment based on the situation, which automated systems cannot do.

The patrol force can be internally hired, trained, and controlled or can be outsourced to a contract security company. An organization can control the training and performance of an internal patrol force. However, some organizations outsource the patrol force to ensure impartiality and to reduce costs.

Access Control

When physical access to the facility is granted, a number of guidelines should be followed with respect to record keeping. Every attempt to enter the facility, whether or not admission was granted, should be recorded with the following details:

  • Date and time

  • Specific entry point

  • User ID employed during the attempt
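The fields above map naturally onto a structured log record. The following Python sketch is illustrative only; the entry-point names and user IDs are hypothetical:

```python
from dataclasses import dataclass
from datetime import datetime, timezone

@dataclass
class EntryAttempt:
    timestamp: str    # date and time of the attempt
    entry_point: str  # specific door, gate, or turnstile
    user_id: str      # credential presented during the attempt
    granted: bool     # whether admission was granted

def log_attempt(log, entry_point, user_id, granted):
    """Append one physical-access record, successful or not."""
    record = EntryAttempt(datetime.now(timezone.utc).isoformat(),
                          entry_point, user_id, granted)
    log.append(record)
    return record

access_log = []
log_attempt(access_log, "north-lobby-door", "jsmith", True)   # admitted
log_attempt(access_log, "server-room-door", "jsmith", False)  # denied
print(len(access_log), access_log[1].granted)  # 2 False
```

Recording denials alongside successes is what makes the log useful for spotting probing of entry points.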

Building and Internal Security Controls

Building and internal security involves the locks, keys, and escort requirements/visitor controls that organizations should consider. Building and internal security is covered in detail in Chapter 3.

Personnel Safety and Security

Human resources are the first priority for any organization to protect under all circumstances. In the event of a fire, the first action is always to evacuate all personnel. Their safety comes before all other considerations. Although equipment and, in most cases, the data can be recovered, human beings can be neither backed up nor replaced.

An Occupant Emergency Plan (OEP) provides coordinated procedures for minimizing loss of life or injury and for minimizing property damage in response to a physical threat. In a disaster of any type, personnel safety is the first concern.

The organization is responsible for protecting the privacy of each individual's information, especially personnel and medical records. Although this expectation of privacy does not necessarily (and usually does not) extend to employees' activities on the network, both federal and state laws hold organizations responsible for the release of this type of information; violations can result in heavy fines and lawsuits if the company is found liable.

Organizations should develop policies for dealing with employee duress, travel, monitoring, emergency management, and security training and awareness.

Duress

Employee duress occurs when an employee is coerced to commit an action by another party. This is a particular concern for high-level management or employees with high security clearances because they have access to extra assets. Organizations should train employees on what to do when under duress. For any security codes, PINs, or passwords that are used, it is a good policy to implement a secondary duress code. Then, if personnel are under duress, they use the duress code to access the systems, facilities, or other assets. Security personnel are alerted that the duress code has been used. Organizations should stress to personnel that the protection of life should trump any other considerations.
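A duress code scheme amounts to a second credential that grants access normally while tripping a silent alarm. The following Python sketch is hypothetical (made-up PINs and asset names; a real system would use a hardened credential store and proper PIN hashing):

```python
import hashlib
import hmac

def _digest(pin: str) -> str:
    # Illustrative only; production systems would use a salted, slow hash.
    return hashlib.sha256(pin.encode()).hexdigest()

# Hypothetical credential store: each asset has a normal PIN and a duress PIN
CREDENTIALS = {"vault-door": {"normal": _digest("4821"), "duress": _digest("4822")}}

def check_pin(asset: str, pin: str):
    """Returns (access_granted, silent_alarm). The duress PIN opens the
    asset exactly like the normal PIN but also covertly alerts security."""
    entry = CREDENTIALS[asset]
    digest = _digest(pin)
    if hmac.compare_digest(digest, entry["normal"]):
        return True, False
    if hmac.compare_digest(digest, entry["duress"]):
        return True, True   # grant access so the coercer sees nothing unusual
    return False, False

print(check_pin("vault-door", "4821"))  # (True, False)
print(check_pin("vault-door", "4822"))  # (True, True) -> silent alarm
```

The key design point is that the duress path must be indistinguishable to an observer: the coerced employee gets in, and only the security team knows the difference.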

Travel

Employees often travel for business purposes and take their organization-issued assets while traveling. Employees must be given the proper training to ensure that they keep organization-issued assets safe while traveling and are particularly careful when in public. They should also receive instructions on properly reporting lost or stolen assets.

Monitoring

Employee actions on organizational assets may need to be monitored, particularly for personnel with high clearance levels. However, it is important that personnel understand that they are being monitored. Organizations that will monitor employees should issue a no-expectation-of-privacy statement. Employees should be given a copy of this statement when hired and should sign a receipt for the statement. In addition, periodic reminders of this policy should be placed in prominent locations, including on bulletin boards, login screens, and websites.

For any monitoring to be effective, organizations should capture baseline behavior for users.
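A simple way to use a captured baseline is to flag observations that deviate sharply from it. The following naive Python sketch uses hypothetical activity counts and a basic standard-deviation test; real UEBA products use far richer behavioral models:

```python
from statistics import mean, stdev

def is_anomalous(baseline, observed, threshold=3.0):
    """Flag an observation more than `threshold` standard deviations
    from the user's baseline (a deliberately simple UEBA-style check)."""
    mu, sigma = mean(baseline), stdev(baseline)
    if sigma == 0:
        return observed != mu
    return abs(observed - mu) / sigma > threshold

# Hypothetical baseline: files accessed per day over two weeks
baseline = [42, 38, 45, 40, 41, 39, 44, 43, 37, 40, 42, 41, 39, 38]
print(is_anomalous(baseline, 41))   # typical day -> False
print(is_anomalous(baseline, 400))  # sudden mass access -> True
```

Even this crude check shows why the baseline must exist first: without it, there is no notion of "normal" against which to judge behavior.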

Emergency Management

Organizations should have specific emergency management policies and procedures in place. Emergency management teams should be formed to document the types of emergencies that could occur and prepare the appropriate emergency plans to be used if a specific emergency occurs.

These plans should be periodically tested to ensure that personnel understand what to do in the event of an emergency and revised based on the results of these tests.

Emergencies that should be anticipated include weather events (such as tornadoes, hurricanes, and winter storms), active shooter situations, and power outages. Emergency management is concerned with the immediate reaction to the emergency, whereas business continuity and disaster recovery focus on returning the organization to normal operations. If the effects of an emergency are long term, emergency management often leads into business continuity and disaster recovery, but not every emergency requires a full recovery effort. For example, if an organization is notified that a tornado warning has been issued, it should implement the emergency plan for tornadoes. If the tornado does not affect the facility, operations can return to normal as soon as the warning expires and the tornado passes. If the tornado does affect the facility, however, it might be necessary to implement the business continuity and disaster recovery plans.

Security Training and Awareness

Personnel should receive security training and awareness regularly. Security training and awareness are covered in detail in Chapter 1.

Exam Preparation Tasks

As mentioned in the section “About the CISSP Cert Guide, Fourth Edition” in the Introduction, you have several choices for exam preparation: the exercises here, Chapter 9, “Final Preparation,” and the exam simulation questions in the Pearson Test Prep Software Online.

Review All Key Topics

Review the most important topics in this chapter, noted with the Key Topic icon in the outer margin of the page. Table 7-2 lists a reference of these key topics and the page numbers on which each is found.

Table 7-2 Key Topics for Chapter 7

Key Topic Element: Description (Page Number)

List: Forensic investigation steps (page 639)

List: Order of volatility (page 640)

List: IOCE principles (page 643)

List: Crime scene steps (page 643)

List: Five rules of evidence (page 646)

List: Types of evidence (page 647)

List: Media analysis types (page 650)

List: Software analysis techniques (page 650)

List: Network analysis techniques (page 651)

Section: Log Types (page 655)

List: Functions of configuration management (page 659)

Section: Security Operations Concepts (page 664)

Table 7-1: RAID Levels (page 675)

List: Incident response steps (page 682)

List: Patch management life cycle steps (page 690)

List: Data backup types and schemes (page 696)

Figure 7-11: Hot Site, Warm Site, and Cold Site Comparison (page 702)

List: Types of tests used to assess BCP and DRP (page 711)

Section: Perimeter Security Controls (page 713)

Define Key Terms

Define the following key terms from this chapter and check your answers in the glossary:

acoustical systems

artifact

asset

baselining

best evidence rule

blacklisting

bollards

chain of custody

checklist test

circumstantial evidence

civil investigation

Class 1 gate

Class 2 gate

Class 3 gate

Class 4 gate

clipping levels

closed-circuit television (CCTV) system

cold site

conclusive evidence

content analysis

continuous lighting

copy backup

corroborative evidence

crime scene

criminal investigation

daily backup

data clearing

data loss prevention (DLP) software

data purging

differential backup

direct evidence

disk imaging

duress

egress monitoring

electronic vaulting

emergency lighting

event

failover

failsoft

fault tolerance

feet of illumination

first in, first out (FIFO)

fluorescent

full backup

full-interruption test

grandfather/father/son (GFS)

hearsay evidence

hierarchical storage management (HSM) system

high availability

honeynet

honeypot

hot site

incident

incremental backup

intangible assets

job rotation

least privilege

log management

means

mercury vapor

motive

movable lighting

need to know

network-attached storage (NAS)

operations security

opinion evidence

opportunity

parallel test

passive infrared (PIR) system

photometric system

quality of service (QoS)

quartz lamp

RAID 0

RAID 1

RAID 3

RAID 5

RAID 10

read-through test

reciprocal agreement

redundancy

redundant site

regulatory investigation

remanence

resource provisioning

sandboxing

search

seizure

secondary evidence

separation of duties

service-level agreement (SLA)

simulation test

slack space analysis

sodium vapor

standby lighting

steganography analysis

storage-area network (SAN)

structured walk-through test

surveillance

system resilience

table-top exercise

tangible assets

tertiary site

threat feed

threat hunting

threat intelligence

threat intelligence feed (TI feed)

threat intelligence sources

transaction log backup

trusted path

trusted recovery

two-person control

user and entity behavior analytics (UEBA)

user behavior analytics (UBA)

warm site

whitelisting

Answer Review Questions

1. What is the first step of the incident response process?

  1. Respond to the incident.

  2. Detect the incident.

  3. Report the incident.

  4. Recover from the incident.

2. What is the second step of the forensic investigations process?

  1. Identification

  2. Collection

  3. Preservation

  4. Examination

3. Which of the following is not one of the five rules of evidence?

  1. Be accurate.

  2. Be complete.

  3. Be admissible.

  4. Be volatile.

4. Which of the following refers to allowing access to users only to the resources required to do their jobs?

  1. Job rotation

  2. Separation of duties

  3. Need to know/least privilege

  4. Mandatory vacation

5. Which of the following is an example of an intangible asset?

  1. Disc drive

  2. Recipe

  3. People

  4. Server

6. Which of the following is not a step in incident response management?

  1. Detect

  2. Respond

  3. Monitor

  4. Report

7. Which of the following is not a backup type?

  1. Full

  2. Incremental

  3. Grandfather/father/son

  4. Transaction log

8. Which term is used for a facility that contains all the resources needed for full operation?

  1. Cold site

  2. Hot site

  3. Warm site

  4. Tertiary site

9. Which electronic backup type stores data on optical discs and uses robotics to load and unload the optical discs as needed?

  1. Optical jukebox

  2. Hierarchical storage management

  3. Tape vaulting

  4. Replication

10. What is failsoft?

  1. The capacity of a system to switch over to a backup system if a failure in the primary system occurs

  2. The capability of a system to terminate noncritical processes when a failure occurs

  3. A software product that provides load-balancing services

  4. High-capacity storage devices that are connected by a high-speed private network using storage-specific switches

11. An organization’s firewall is monitoring the outbound flow of information from one network to another. What specific type of monitoring is this?

  1. Egress monitoring

  2. Continuous monitoring

  3. CMaaS

  4. Resource provisioning

12. Which of the following are considered virtual assets? (Choose all that apply.)

  1. Software-defined networks

  2. Virtual storage-area networks

  3. Guest OSs deployed on VMs

  4. Virtual routers

13. Which of the following describes the ability of a system, device, or data center to recover quickly and continue operating after an equipment failure, power outage, or other disruption?

  1. Quality of service (QoS)

  2. Recovery time objective (RTO)

  3. Recovery point objective (RPO)

  4. System resilience

14. Which of the following are the main factors that affect the selection of an alternate location during the development of a DRP? (Choose all that apply.)

  1. Geographic location

  2. Organizational needs

  3. Location’s cost

  4. Location’s restoration effort

15. Which of the following is a hard-drive technology in which data is written across multiple disks in such a way that when one disk fails, data can be made available from other functioning disks?

  1. RAID

  2. Clustering

  3. Failover

  4. Load balancing

16. You need to record incoming and outgoing network traffic information in order to determine the origin of an attack. Which of the following logs would be appropriate for this purpose?

  1. System log

  2. Application log

  3. Firewall log

  4. Change log

17. What should you perform on all information accepted into a system to ensure that it is of the right data type and format and that it does not leave the system in an insecure state?

  1. Clipping levels

  2. Two-person control

  3. Access review audits

  4. Input validation

18. Which of the following defenses would you implement to discourage a determined intruder?

  1. 3 to 4 feet tall fence

  2. 6 to 7 feet tall fence

  3. 8 feet and taller fence

  4. Geo-fence

19. Which of the following actions could you perform to logically harden a system? (Choose all that apply.)

  1. Remove unnecessary applications.

  2. Disable unnecessary services.

  3. Block unrequired ports.

  4. Tightly control the connecting of external storage devices and media.

Answers and Explanations

1. b. The steps of the incident response process are as follows:

  1. Detect the incident.

  2. Respond to the incident.

  3. Report the incident to the appropriate personnel.

  4. Recover from the incident.

  5. Remediate all components affected by the incident to ensure that all traces of the incident have been removed.

  6. Review the incident and document all findings.

2. c. The steps of the forensic investigation process are as follows:

  1. Identification

  2. Preservation

  3. Collection

  4. Examination

  5. Analysis

  6. Presentation

  7. Decision

3. d. The five rules of evidence are as follows:

  • Be authentic.

  • Be accurate.

  • Be complete.

  • Be convincing.

  • Be admissible.

4. c. When security professionals allow access to resources and assign rights to perform operations, the concept of least privilege (also called need to know) should always be applied. In the context of resource access, this means the default level of access should be no access. Users should be given access only to resources required to do their jobs, and that access should require manual implementation after the requirement is verified by a supervisor.

5. b. In many cases, some of the most valuable assets for a company are intangible ones, such as secret recipes, formulas, and trade secrets.

6. c. The steps in incident response management are

  1. Detect the incident.

  2. Respond to the incident.

  3. Mitigate the incident.

  4. Report the incident.

  5. Recover from the incident.

  6. Remediate the incident.

  7. Review and document lessons learned.

7. c. Grandfather/father/son is not a backup type; it is a backup rotation scheme.

8. b. A hot site is a leased facility that contains all the resources needed for full operation.

9. a. An optical jukebox stores data on optical discs and uses robotics to load and unload the optical discs as needed.

10. b. Failsoft is the capability of a system to terminate noncritical processes when a failure occurs.

11. a. Egress monitoring occurs when an organization monitors the outbound flow of information from one network to another. The most popular form of egress monitoring is carried out using firewalls that monitor and control outbound traffic. Continuous monitoring and Continuous Monitoring as a Service (CMaaS) are not specific enough to answer this question. Any logging and monitoring activities should be part of an organizational continuous monitoring program. The continuous monitoring program must be designed to meet the needs of the organization and implemented correctly to ensure that the organization’s critical infrastructure is guarded. Organizations may want to look into CMaaS solutions deployed by cloud service providers. Resource provisioning is the process in security operations that ensures that the organization deploys only the assets that it currently needs.

12. a, b, c, d. Virtual assets include software-defined networks (SDNs), virtual storage-area networks (VSANs), guest operating systems deployed on virtual machines (VMs), and virtual routers. As with physical assets, the deployment and decommissioning of virtual assets should be tightly controlled as part of configuration management because virtual assets, like physical assets, can be compromised.

13. d. System resilience is the ability of a system, device, or data center to recover quickly and continue operating after an equipment failure, power outage, or other disruption. It involves the use of redundant components or facilities. Quality of service (QoS) is a technology that manages network resources to ensure a predefined level of service. It assigns traffic priorities to the different types of traffic on a network. A recovery time objective (RTO) stipulates the amount of time an organization needs to recover from a disaster, and a recovery point objective (RPO) stipulates the amount of data an organization can lose when a disaster occurs.

14. a, b, c, d. The main factors that affect the selection of an alternate location during the development of a disaster recovery plan (DRP) include the following:

  • Geographic location

  • Organizational needs

  • Location’s cost

  • Location’s restoration effort

15. a. Redundant Array of Independent Disks (RAID) is a hard-drive technology in which data is written across multiple disks in such a way that a disk can fail and the data can be quickly made available from remaining disks in the array without restoring from a backup tape or other backup media. Clustering refers to a software product that provides load-balancing services. With clustering, one instance of an application server acts as a master controller and distributes requests to multiple instances using round-robin, weighted round-robin, or least-connections algorithms. Failover is the capacity of a system to switch over to a backup system if a failure in the primary system occurs. Load balancing refers to a hardware product that provides load-balancing services. Application delivery controllers (ADCs) support the same algorithms but also use complex number-crunching processes, such as per-server CPU and memory utilization, fastest response times, and so on, to adjust the balance of the load. Load-balancing solutions are also referred to as farms or pools.

16. c. Firewall logs record network traffic information, including incoming and outgoing traffic. This usually includes important data, such as IP addresses and port numbers that can be used to determine the origin of an attack. System logs record system events, such as system and service startup and shutdown. Applications logs record actions that occur within a specific application. Change logs report changes made to a specific device or application as part of the change management process.

17. d. The main thrust of input/output control is to apply controls or checks to the input that is allowed to be submitted to the system. Performing input validation on all information accepted into the system can ensure that it is of the right data type and format and that it does not leave the system in an insecure state. Clipping levels set a baseline for normal user errors, and violations exceeding that threshold will be recorded for analysis of why the violations occurred. A two-person control, also referred to as a two-man rule, occurs when certain access and actions require the presence of two authorized people at all times. Access review audits ensure that object access and user account management practices adhere to the organization’s security policy.

18. c. Fencing is the first line of defense in the concentric circle paradigm. When selecting the type of fencing to install, consider the determination of the individuals you are trying to discourage. Use the following guidelines with respect to height:

  • Fences 3 to 4 feet tall deter only casual intruders.

  • Fences 6 to 7 feet tall are too tall to climb easily.

  • Fences 8 feet and taller deter more determined intruders, especially when those fences are augmented with razor wire.

A geo-fence is a geographic area within which devices are managed using some sort of radio frequency communication. It is used to track users or devices entering or leaving the geo-fence area.

19. a, b, c, d. An ongoing goal of operations security is to ensure that all systems have been hardened to the extent that is possible and still provide functionality. The following actions can be performed to logically harden a system:

  • Remove unnecessary applications.

  • Disable unnecessary services.

  • Block unrequired ports.

  • Tightly control the connecting of external storage devices and media if it’s allowed at all.
