Chapter 8

Security Operations

Terms you’ll need to understand:

  • Redundant array of inexpensive disks (RAID)

  • Clustering

  • Distributed computing

  • Cloud computing

  • Media management

  • Least privilege

  • Mandatory vacations

  • Due care

  • Due diligence

  • Privileged entities

  • Clipping level

  • Resource protection

Topics you’ll need to master:

  • Disaster recovery processes and plans

  • How to understand and support investigations

  • Foundational security concepts

  • Different types of RAID

  • How to implement recovery strategies

  • How to participate in business continuity planning and exercises

  • Perimeter and internal physical controls

  • How to implement disaster recovery processes

  • Auditing and monitoring

Introduction

When preparing for the (ISC)2 CISSP exam or reviewing the Security Operations domain, you need to understand what resources should be protected; be familiar with best-practice principles; know the methods used to restrict access, protect resources, and monitor activity; and understand how to respond to incidents.

The Security Operations domain covers a wide range of topics involving operational security best practices. Security professionals apply operational controls to daily activities to keep systems running smoothly and facilities secure. This chapter reviews those controls and shows how their application to day-to-day activities can prevent or mitigate attacks.

The process starts before an employee is hired. Employers should perform background checks, reference checks, criminal history reports, and educational verification. Among many other onboarding tasks, a new employee must be trained on corporate policies.

Controls need to be put in place to limit the capabilities and access an employee has. Access is a major control and should be limited to just what is needed to complete required tasks; this limit is referred to as least privilege. Job rotation, dual control, and mandatory vacations are other examples of these types of controls.

Controls are not just about people. Controls are also needed to deal with system failure. Disaster recovery and business continuity planning and exercises are key controls in this area.

Many of the controls discussed in this chapter are technical in nature. These controls include intrusion prevention, network access control, anti-malware, RAID, and security information and event management (SIEM). Each of these controls is used in a unique way to prevent, detect, and recover from security incidents and exposures. Keep in mind that violations of operational security aren’t always malicious; sometimes things break or accidents happen. Operational security must be prepared to deal with such unintended occurrences by building in system resilience and fault tolerance.

Foundational Security Operations Concepts

Ask any seasoned security professional what it takes to secure an organization’s networks, systems, applications, and data, and the answer will most likely involve a combination of operational, technical, and physical controls. This process starts before you ever hire your first employee. Employees need to know what is expected of them. Accounts need to be configured, users need to have the appropriate level of access approved, and monitoring must be implemented. The following sections discuss these topics.

Managing Users and Accounts

One foundational way to increase accountability is to enforce specific roles and responsibilities in an organization. Most organizations have clearly defined controls that specify what each job role is responsible for. The following are some common roles in organizations:

  • Systems administrator: This role is responsible for the operation and maintenance of the LAN and associated components, such as Windows Server 2019, Linux, and possibly mainframes. A small organization might have only one systems administrator, and a larger one might have many.

  • Quality assurance specialist: This role can focus on either quality assurance or quality control. Quality assurance employees make sure programs and documentation adhere to standards; quality control employees perform tests at various stages of product development to make sure the products are free of defects.

  • Database administrator: This role is responsible for the organization’s data and maintains the data structure. The database administrator has control over all the data; therefore, detective controls and supervision of duties must be closely observed. This role is usually filled by a senior information systems employee because these employees have control over the physical database, implementation of data definition controls, and definition and initiation of backup and recovery.

  • Systems analyst: This role is involved in the software development lifecycle (SDLC) process and is responsible for determining the needs of users and developing the requirements and specifications for the design of needed software.

  • Network administrator: This role is responsible for maintenance and configuration of network equipment, such as routers, switches, firewalls, wireless access points, and so on.

  • Security architect: This role is responsible for examining the security infrastructure of the organization’s network.

Job titles can be confusing because different organizations tend to use different titles for identical positions. In addition, smaller organizations tend to combine duties under one position or title. For example, some network architects are called network engineers. The critical concept for a security professional is to understand that, to avoid conflicts of interest, certain roles should not be combined. Table 8.1 lists some examples of role combinations and whether it’s okay to combine them.

TABLE 8.1 Separation of Duties

First Job Role           Can Be Combined With?   Second Job Role
Systems analyst          No                      Security administrator
Application programmer   Yes                     Systems analyst
Help desk                No                      Network administrator
Data entry               Yes                     Quality assurance
Computer operator        No                      Systems programmer
Database administrator   Yes                     Systems analyst
Systems administrator    No                      Database administrator
Security administrator   No                      Application programmer
Systems programmer       No                      Security administrator

The titles and descriptions in Table 8.1 are just examples, and many organizations might describe them differently or assign more or less responsibility to particular job roles. To better understand the effect of role combinations that can conflict, consider a small company that employs one person as both the network administrator and the security administrator. This represents a real weakness because of the conflict of interest in the range of duties that a security administrator and a network administrator must perform: Whereas a network administrator is tasked with keeping the system up and running and keeping services available, a security administrator is tasked with turning services off, blocking them, and denying user access. A security professional should be aware of such incompatibilities and be concerned about the risks that can arise when certain roles are combined. Finally, any employee of the organization who has elevated access requires careful supervision. Such individuals should be considered privileged entities.

Privileged Entities

A privileged entity is anyone who has a higher level of access than a typical user. Privileged entities can include mainframe operators, security administrators, network administrators, power users, and anyone else with higher-than-typical levels of access. It is important that sufficient controls be placed on these entities so that misuse of their access is deterred or, if their access is misused, it can be detected and corrected.

Controlling Access

Before hiring employees, you must make sure that you have the right person for the right job. Items such as background checks, reference checks, education/certification checks, and Internet or social media checks might be run before new-hire orientation ever occurs. New employees might be asked to sign nondisclosure agreements (NDAs), agree to good security practices, and agree to acceptable use policies (AUPs).

When employees are onboarded, a number of controls can be used to control access and privilege. First, separation of duties describes the process of dividing duties so that more than one person is required to complete a particular task. Job rotation can be used to maintain redundancy, back up key personnel, and help identify fraudulent activities. The principle of least privilege is another important concept that can help an organization achieve its operational security goals. According to this principle, individuals should have only enough resources to accomplish their required tasks.

Controls such as mandatory vacations provide time for audits and for examining user activity for illicit activities. Controls need to be backed up by policies, procedures, and training. Keep in mind that organizations benefit when each employee actively participates in the security of the organization.

Clipping Levels

No one has the time to investigate every event or anomaly that occurs, but an organization must have systems in place to log and monitor activities. An organization can set a clipping level to identify an acceptable threshold for the normal mistakes a user might commit. Then, events that occur with a frequency in excess of the clipping level can trigger administrative notification and investigation.

A clipping level allows users to occasionally make mistakes, but if the established level is exceeded, violations are recorded or some type of response occurs. A network administrator might, for example, allow users to attempt to log in three times. If a user can’t get the password right by the third try, the account is locked, and the user is forced to call the help desk for support. If an administrator or a help desk staffer is contacted to reset a password, a second type of authentication should be required to protect against social engineering attacks. Chapter 7, “Security Assessment and Testing,” covers social engineering in detail.
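
To make the concept concrete, the following is a minimal Python sketch of a clipping level applied to failed logins. The threshold of three attempts and the lockout behavior are illustrative assumptions, not any particular product’s implementation:

# Minimal sketch of a clipping level for failed logins (threshold assumed).
from collections import defaultdict

CLIPPING_LEVEL = 3            # allowed failed attempts before a violation is recorded
failed_attempts = defaultdict(int)
locked_accounts = set()

def record_failed_login(user):
    """Count a failed login; lock the account once the clipping level is reached."""
    failed_attempts[user] += 1
    if failed_attempts[user] >= CLIPPING_LEVEL:
        locked_accounts.add(user)
        print(f"ALERT: {user} exceeded the clipping level; account locked")

def record_successful_login(user):
    """Reset the counter on success, provided the account is not locked."""
    if user in locked_accounts:
        raise PermissionError(f"{user} is locked out; contact the help desk")
    failed_attempts[user] = 0

for _ in range(3):            # three bad passwords trigger the lockout
    record_failed_login("alice")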

Tip

To prevent social engineering attacks, when individuals need to have their passwords reset by automated means, they should be required to authenticate by providing information such as user ID, PIN, or two or more cognitive passwords. For systems with higher security, physical retrieval or in-person verification should be required for password recovery.

Resource Protection

When you think of resource protection, you might think of servers or other tangible assets. But resources can be both tangible and intangible. Tangible assets include equipment and buildings, and intangible assets can include such things as patents, trademarks, copyrights, and brand recognition. Loss of a trade secret to a competitor can be just as devastating as employee theft of a laptop. An organization must take reasonable care to protect all items of value.

Due Care and Due Diligence

Due care is focused on taking reasonable ongoing care to protect the assets of an organization. Due diligence is the background research and investigation that informs those protections. For example, before accepting credit cards, you might want to research the laws that govern their use, storage, and handling. In this case, due diligence would be associated with reviewing the controls highlighted in PCI-DSS.

Note

The term due diligence first came into widespread use as a result of the U.S. Securities Act of 1933.

Organizations and their senior management are increasingly being held to higher levels of due care and due diligence. Depending on the law, senior management who are found negligent can be held responsible for criminal and/or financial damages. The Sarbanes-Oxley Act of 2002 and the Federal Information Security Modernization Act have increased an organization’s liability for maintaining industry compliance. For example, U.S. federal sentencing guidelines allow for fines in excess of $200 million.

When an organization’s due diligence is challenged, the court system looks at what a “prudent person” would have done; this is referred to as the reasonably prudent person rule. For example, a prudent person would implement PCI-DSS controls for credit card transactions at a retail store using a point of sale (POS) device with more than 80,000 transactions a year. The reasonably prudent person is a legal abstraction; in the context of cybersecurity, it would be a well-trained, certified, and educated professional who exercises common sense in cyberdefense.

Note

While PCI-DSS is a major standard for control of financial information, the Group of Eight (G8) started as a forum for the governments of eight of the world’s largest economies to discuss issues related to commerce. The broader Group of Twenty (G20) now serves the same purpose for 20 major economies.

Asset Management

Asset management is the process of identifying all the hardware and software assets in an organization, including the organization’s employees. There is no way to assess risk or to consider what proper operational controls are without good asset management. Asset management not only helps an organization gain control of its software and hardware assets but also increases the organization’s accountability. Consider the process of hardening, patching, and updating. This process cannot be effectively managed without knowing what operating systems and/or software an organization owns and on what systems those products are installed.

System Hardening

Once we know what assets we have, system hardening is used to eliminate all applications, processes, and services that are not required for the business to function. When attackers attempt to gain access to a system, they typically look for systems that are highly vulnerable or where there is “low-hanging fruit.” This phrase describes services and applications that are easily exploitable, often because they are unnecessary and unmanaged. The purpose of system hardening is to reduce the attack surface by removing anything that is not needed or at least to isolate vulnerable services away from sensitive systems. After a system has been reduced to its bare essentials, there are fewer avenues for a potential attacker to exploit.

Hardening should also be considered from a hardware perspective. Hardware components such as DVD drives and USB ports should be disabled or removed. Also, hardening can be extended to the physical premises. Wiring closets should be locked, data centers should permit limited access, and network equipment such as switches, routers, and wireless access points should be physically secured.

Note

After performing many security assessments, one of the first things I now look for when I enter a facility is a lack of physical controls on assets such as wireless access points, telecommunication equipment, servers, and riser rooms. If an asset is physically accessible to an intruder, it is insecure.

Once a system has been hardened and approved for release, a baseline needs to be approved. Baselining is simply capturing a configuration or an image at a point in time and understanding the current system security configuration.
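
As an illustration of baselining, the following hedged Python sketch captures hashes of a few configuration files at a point in time and flags later drift from that approved state. The file paths are examples only:

# Minimal sketch of configuration baselining: hash key files, compare later scans.
import hashlib, json, pathlib

FILES_TO_BASELINE = ["/etc/ssh/sshd_config", "/etc/passwd"]   # assumed paths

def snapshot(paths):
    """Return {path: sha256} for every file that exists."""
    result = {}
    for p in paths:
        path = pathlib.Path(p)
        if path.is_file():
            result[p] = hashlib.sha256(path.read_bytes()).hexdigest()
    return result

def compare(baseline, current):
    """Report files whose hashes differ from the approved baseline."""
    return [p for p, digest in baseline.items() if current.get(p) != digest]

baseline = snapshot(FILES_TO_BASELINE)
pathlib.Path("baseline.json").write_text(json.dumps(baseline, indent=2))

# A later run re-snapshots and flags drift from the approved configuration.
drift = compare(baseline, snapshot(FILES_TO_BASELINE))
print("Changed files:", drift or "none")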

All your work up to this point would do little good if the systems were not maintained in a secure state. This is where change management comes into play.

Change and Configuration Management

Organizations put a lot of effort into securing assets and hardening systems. To manage required system changes, controls must be put in place to make sure all changes are documented and approved. This is accomplished through the change management process. Any time a change is to be made, it is important to verify what is being requested, how it will affect the systems, and what unexpected actions might occur. Most organizations do not directly deploy a patch without first testing it to see what changes will occur after the patch has been installed. It is important to ensure that changes do not somehow diminish or reduce the security of a system. Configuration management should also provide a means to roll back or undo any applied changes in the event that negative effects occur because of the change. Although change management processes can be implemented slightly differently in various organizations, the following is a generic process:

  1. Request a change.

  2. Approve the change.

  3. Catalog the change.

  4. Schedule the change.

  5. Prepare a means to roll back the change, if needed.

  6. Implement the change.

  7. Test or confirm the change.

  8. Report the completion of the change to the appropriate individuals/groups.
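
The following minimal Python sketch shows one way such a workflow could be enforced in software. The state names simply mirror the eight generic steps above, and the patch identifier is hypothetical:

# Minimal sketch of a change record that enforces the workflow order above.
from dataclasses import dataclass, field

STEPS = ["requested", "approved", "cataloged", "scheduled",
         "rollback_prepared", "implemented", "tested", "reported"]

@dataclass
class ChangeRequest:
    description: str
    history: list = field(default_factory=list)

    def advance(self, step):
        """Allow steps only in order, so no change skips approval or testing."""
        expected = STEPS[len(self.history)]
        if step != expected:
            raise ValueError(f"Out of order: expected '{expected}', got '{step}'")
        self.history.append(step)

cr = ChangeRequest("Apply patch KB5001234 to web servers")   # hypothetical change
for step in STEPS:
    cr.advance(step)
print(cr.history)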

Tip

While some might question the need to have a rollback plan, things can go wrong. For example, in December 2020, Microsoft’s Windows 10 update conflicted with CORSAIR Utility Engine software and caused Windows to crash.

Despite the fact that different organizations might implement change management in different ways, there can be no argument over the value of using comprehensive change management. The primary benefits of change management include the following:

  • Verification that change is implemented in an orderly manner through formalized testing

  • Verification that the user base is informed of impending/completed changes

  • Review of the effects of changes on the system after implementation to create lessons learned for the next change

  • Mitigation of any adverse impact that changes might have had on services, systems, or resources

Change management can also be used to demonstrate due care and due diligence.

Trusted Recovery

Any failure that endangers the security of a system must be understood and investigated. It is critical that an organization’s environment be protected during recovery. Consider a server running Windows Server 2019. Have you ever noticed that when you shut down such a server, you are asked why you are shutting it down? The screen that asks this question is an example of an operational control.

To protect the environment during the reboot/restart process, access to the server must be limited. You want to prevent opportunities for people to disrupt the process.

Some examples of recovery limits include the following:

  • Preventing a system from being booted from the network, DVD, or USB

  • Logging restarts so that auditing can be performed

  • Blocking complementary metal-oxide semiconductor (CMOS) changes to prevent tampering

  • Denying forced shutdowns

Remote Access

As the transportation, utility, and other costs associated with traditional 9-to-5 employment rise, and as global changes like the COVID-19 pandemic alter the way business is conducted, organizations are increasingly permitting employees to telecommute, access resources remotely, and use cloud computing. Organizations are therefore being required to enable remote access to their networks. However, remote access offers attackers a potential means of gaining access to the protected network. Therefore, organizations need to implement good remote access practices to mitigate risk. Some basic remote access controls include the following:

  • Implementing caller ID

  • Using a callback system

  • Disabling unused authentication protocols

  • Using strong authentication, including MFA

  • Implementing remote and centralized logging

  • Using VPNs and encryption

Media Management, Retention, and Destruction

Resource protection techniques extend beyond a resource’s useful life to include its disposal. If data is held on hard drives, magnetic media, or thumb drives, those devices must eventually be sanitized. Sanitization is the process of clearing all identified content such that no data remnants can be recovered. The following are some of the methods used for sanitization:

  • Drive wiping: This method involves overwriting all information on a drive. It allows the drive to be reused.

  • Zeroization: This method involves overwriting the data with zeros and is defined in ANSI X9.17. (A minimal sketch follows this list.)

  • Degaussing: This method is used to permanently destroy the contents of a hard drive or magnetic media. With degaussing, a powerful magnet is used to penetrate the media and polarize the magnetic particles on the tape or hard disk platters. Degaussed media cannot be reused.

  • Physical destruction: This method may be required to sanitize newer solid-state drives.
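
As promised in the list above, here is a minimal Python sketch of the zeroization idea, applied to a single file by overwriting its contents with zeros before deletion. Real media sanitization relies on dedicated, verified tools, and solid-state drives often require physical destruction; this is illustration only:

# Minimal sketch of zeroization: overwrite a file's bytes with zeros, then delete it.
import os

def zeroize_file(path):
    size = os.path.getsize(path)
    with open(path, "r+b") as f:
        f.write(b"\x00" * size)   # replace every byte with zeros
        f.flush()
        os.fsync(f.fileno())      # push the zeros to stable storage
    os.remove(path)

with open("secret.tmp", "wb") as f:
    f.write(b"sensitive data")
zeroize_file("secret.tmp")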

Telecommunication Controls

Guglielmo Marconi probably had no idea that his contributions to the field of radio would lead to all the telecommunications systems available today. A security professional is not going to be tasked with building the first ship-to-shore radio system, like Marconi did, but she must be aware of current telecommunication systems and understand their usage and potential vulnerabilities. Concepts related to these systems that you must be aware of include cloud computing, email systems, fax machines, private branch exchanges (PBXs), whitelisting, sandboxing, and anti-malware.

Cloud Computing

Cloud computing refers to using Internet-based systems to perform on-demand computing. Users only have to pay for the services and computing resources they require and can increase usage when more computing resources are needed or reduce usage when the services are not needed. The following are some of the most common cloud computing models:

  • Monitoring-as-a-service (MaaS): MaaS allows IT and other organizations to remotely monitor and manage networks, applications, and services.

  • Communication-as-a-service (CaaS): With CaaS, the service provider seamlessly integrates multiple communication devices or channels for voice, video, IM, and email as a single solution.

  • Infrastructure-as-a-service (IaaS): IaaS enables organizations to rent storage and computing resources, such as servers, networking technology, storage, and data center space.

  • Platform-as-a-service (PaaS): PaaS provides access to platforms that let organizations develop, test, and deploy applications. It is a cloud computing service delivery model that delivers a set of software. In PaaS, the user’s application resides entirely on the cloud, from development to delivery.

  • Software-as-a-service (SaaS): SaaS enables an organization to use applications that are running in the service provider’s environment. It is a cloud service model that delivers prebuilt applications over the Internet on an on-demand basis. SaaS can use a multi-tenant architecture to deliver a single application to multiple customers within an organization.

Cloud computing models generally fall into the following categories:

  • Private: The entire cloud and all its components are managed by a single organization.

  • Community: Cloud components are shared by multiple organizations and managed by one of them or by a third party.

  • Public: The cloud is open for any organization or user to use and is managed by a third-party provider.

  • Hybrid: This service model has components of more than one of the private, community, and public service models.

ExamAlert

For the CISSP exam, you need to know not just cloud computing models like SaaS and MaaS but also the categories private, community, public, and hybrid.

Email

Email enables individuals to communicate electronically over the Internet or a data communications network. It is the most commonly used Internet application, and it is subject to several security concerns. Email was designed in a different era and, by default, sends information in plaintext. Anyone who is able to sniff plaintext traffic can read it. Email can be easily spoofed so that the true identity of the sender is masked. Email is also a major conduit for spam, phishing, and viruses.

Email functions by means of several underlying services, including the following:

  • Simple Mail Transfer Protocol (SMTP): SMTP is used to send mail and to relay mail to other SMTP mail servers. SMTP uses TCP port 25. A message sent through SMTP has two parts: an address header and message text. All types of computers can exchange messages by using SMTP.

  • Post Office Protocol (POP): POP is currently at version 3 (POP3) and is one of the protocols that can be used to retrieve messages from a mail server. POP3 performs authentication in plaintext on TCP port 110. An alternative to POP3 is IMAP.

  • Internet Message Access Protocol (IMAP): IMAP, which is used as a replacement for POP, operates on TCP port 143 and is designed to retrieve messages from an SMTP server. IMAP4, which is the current version, offers several advantages over POP. IMAP makes it possible to work with email remotely. Many of today’s email users need to access email from different locations and devices, such as smartphones, laptops, and desktops. IMAP makes it possible for multiple clients to access the email server and leave the email there until it’s deleted.

Tip

An updated version of POP that provides authentication is known as Authenticated Post Office Protocol (APOP).

With basic email operation, SMTP is used to send messages to the email server. To retrieve email, a client application, such as Outlook, might use POP or IMAP, as illustrated in Figure 8.1.

Anyone who uses email needs to be aware of the security risks. Spam is an ongoing problem, and techniques like graylisting can be used to deal with it. The sending of sensitive information in plaintext is another area of concern. If an organization has policies that allow email to be used for sensitive information, encryption should be mandatory. An organization needs to evaluate its needs related to email. Several solutions can make email more secure, including Pretty Good Privacy (PGP) and link encryption or secure email standards, such as Secure Multipurpose Internet Mail Extensions (S/MIME) and Privacy Enhanced Mail (PEM).

FIGURE 8.1 Email Configuration
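
To make the flow shown in Figure 8.1 concrete, the following hedged Python sketch sends a message over SMTP with STARTTLS (on submission port 587 rather than plain port 25) and retrieves mail over IMAP with SSL. The server names and credentials are hypothetical:

# Hedged sketch of the Figure 8.1 flow: send with SMTP, retrieve with IMAP.
import smtplib, imaplib
from email.message import EmailMessage

msg = EmailMessage()
msg["From"], msg["To"], msg["Subject"] = "alice@example.com", "bob@example.com", "Test"
msg.set_content("Plaintext email can be sniffed; use TLS in transit.")

with smtplib.SMTP("smtp.example.com", 587) as smtp:   # hypothetical server
    smtp.starttls()                                   # encrypt credentials and content
    smtp.login("alice@example.com", "app-password")   # hypothetical credentials
    smtp.send_message(msg)

with imaplib.IMAP4_SSL("imap.example.com", 993) as imap:   # hypothetical server
    imap.login("bob@example.com", "app-password")
    imap.select("INBOX")
    status, data = imap.search(None, "UNSEEN")        # mail stays on the server
    print("Unread message IDs:", data)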

Whitelisting, Blacklisting, and Graylisting

Whitelisting, blacklisting, and graylisting are technical controls.

A whitelist is used to determine what is allowed access or what can be performed. Anything that is not included on the whitelist is prohibited.

Blacklists operate in the opposite way, banning or denying particular users, types of access, or resources. The problem with blacklisting is that as the list continues to grow, it requires more ongoing maintenance and oversight.

Many email administrators use graylists to deal with spam. A graylist temporarily rejects email from any sender that is unknown. A legitimate email server retransmits the message after a period of time, at which point the sender is moved off the graylist and onto the whitelist and the message is delivered to the inbox of the receiving account. Email is not necessarily blacklisted or deleted until a human evaluates the sender and decides whether to reject or accept it.
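
A minimal Python sketch of that graylisting logic follows. The five-minute retry window is an assumed value, and real implementations track more state, such as the expiry of graylist entries:

# Minimal sketch of graylisting: temp-reject unknown senders, accept on retry.
import time

GRAYLIST_DELAY = 300   # seconds before a retry is accepted (assumed value)
graylist = {}          # (sender, recipient) -> time first seen
whitelist = set()

def handle_incoming(sender, recipient, now):
    key = (sender, recipient)
    if key in whitelist:
        return "accept"
    if key not in graylist:
        graylist[key] = now
        return "temp-reject"            # SMTP 4xx: legitimate servers retry
    if now - graylist[key] >= GRAYLIST_DELAY:
        whitelist.add(key)              # sender retried like a real mail server
        return "accept"
    return "temp-reject"

t0 = time.time()
print(handle_incoming("a@example.com", "b@example.com", t0))        # temp-reject
print(handle_incoming("a@example.com", "b@example.com", t0 + 600))  # accept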

A related technique is sandboxing. A sandbox is often employed to run untested code or untrusted programs from third-party sources in isolation.

ExamAlert

For the CISSP exam, you should understand blacklists, graylists, and whitelists.

Firewalls

The CISSP exam might test you on the advantages and disadvantages of different types of firewalls and their design. A packet filter, which is the most basic form of firewall, operates at the network layer of the OSI model. This type of firewall filters traffic by using an access control list (ACL). The ACL determines what packets can be accepted and what packets should be denied access.
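
The following minimal Python sketch shows how a packet filter evaluates traffic against an ACL from the top down, with the first matching rule winning and an implicit deny if nothing matches. The rules and addresses are illustrative:

# Minimal sketch of packet filtering against an ACL (rules are illustrative).
from ipaddress import ip_address, ip_network

ACL = [
    ("permit", ip_network("10.0.0.0/8"), 443),   # internal hosts may reach HTTPS
    ("permit", ip_network("10.0.0.0/8"), 22),    # internal hosts may reach SSH
]

def filter_packet(src, dst_port):
    source = ip_address(src)
    for action, network, port in ACL:
        if source in network and port == dst_port:
            return action                # first matching rule wins
    return "deny"                        # implicit deny-all at the end

print(filter_packet("10.1.2.3", 443))    # permit
print(filter_packet("203.0.113.9", 22))  # deny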

Another type of firewall, a proxy firewall, can be an application-level proxy, a circuit-level proxy, or a kernel-level proxy. A kernel-level proxy is the most advanced and operates at the application layer of the OSI model. A kernel-level proxy firewall works faster than all the application-level firewalls because activity is centered in the kernel. When a packet ingresses a kernel proxy firewall, a new virtual stack is created that has only the protocol proxies needed to examine that specific packet.

Firewalls can be designed in three main ways:

  • Single-homed: With a single-homed firewall, one packet-filtering router is installed between the trusted and untrusted networks (usually the organization’s network and the Internet, respectively).

  • Dual-homed: A dual-homed gateway offers an improvement over a basic packet-filtering router because it comprises a bastion host that has two network interfaces. One important factor with a dual-homed gateway is that IP forwarding is disabled on the host. Additional protection can be provided by adding a packet-filtering router in front of the dual-homed host.

  • Demilitarized zone (DMZ): A DMZ (or screened subnet) is a subnet that is in between firewalls or off one leg of a firewall (see Figure 8.2). Because the DMZ sits in between the public Internet and private networks, it keeps the internal, private network isolated from the external network and provides an area of middle ground where you can host web, mail, and authentication servers.

FIGURE 8.2 Screened Host

Phone, Fax, and PBX

Three techniques attackers can use to target phone users are phone hijacking, slamming, and cramming. Phone hijacking occurs when hackers use personal information to deceive a phone company’s customer service representatives into transferring a victim’s phone number to them. Slamming refers to switching users’ long-distance phone carriers without their knowledge. Cramming refers to unauthorized phone charges. One cramming technique is to send a fake SMS message that, when clicked on, authorizes the attacker to bill the victim a small amount each month.

Fax machines can present some security problems if they are being used to transmit sensitive information. Fax systems can be secured by using fax servers, encryption, and activity logs.

Caution

Although fax servers have solved many security problems, they have their own challenges. Many of them use hard drives where organizations store large numbers of commonly used administrative documents and forms. Others allow HTTP and/or FTP access to the print queue, where someone can capture the files. These issues must be addressed before effective security can be achieved.

Private organizations use PBX systems, which permit users to connect to a public switched telephone network (PSTN). A PBX can be used to assign extensions, provide voicemail, and enable special services for internal users and customers. Like other organizational resources, a PBX can be a potential target. If hacked, the PBX can be used to allow callers to call out and make free long-distance phone calls that are charged to the organization. PBX hacking is not as prevalent today as it was in the past, but a PBX can still pose a threat to operational security. Individuals who target PBX and phone systems are known as phreakers. Phreaking is the art of hacking phone systems. Although this might sound like a rather complicated affair, back in the early 1970s, it was discovered that free phone calls could be made by playing a 2600 Hz tone into a phone. This tone allowed the phreaker to bypass the normal billing process. The first device tailored to this purpose was known as a blue box. These boxes were invented in the 1970s and used until the early 1990s.

Although these tools are primarily historical, phreakers can still carry out activities like caller ID spoofing and SIM swapping attacks, and they might even target VoIP phone systems with DoS or sniffing attacks.

Anti-malware

Malware is a problem that computer users are faced with daily. Training users in safe computing practices is a good start, but anti-malware tools are still needed to protect an organization’s computers. When you find suspected malware, there are generally two ways to examine it: using static analysis or active analysis. Whereas static analysis requires you to decompile or disassemble the code, active analysis requires the suspected malware to be executed. Because executing malware on a live production environment can be dangerous, it is typically done on a standalone system or virtual machine referred to as a sandbox. The sandbox allows you to safely view or execute the suspected malware or any untrusted code while keeping it contained.

Caution

Keep in mind that even when malware is run in a sandbox, there is always some possibility that it may escape and infect other systems.

Anti-malware is software that helps you prevent malware from executing on your systems. Anti-malware software should be installed on servers, workstations, and even portable devices. It can use one or more techniques to check files and applications for viruses and other types of common malware. These techniques include the following:

  • Signature scanning: In a similar fashion to intrusion detection system (IDS) pattern-matching systems, signature scanning looks at the beginning and end of an executable file for known virus signatures. Virus creators attempt to circumvent the signature scanning process by making viruses polymorphic.

  • Heuristic scanning: Heuristic scanning examines computer files for irregular or unusual instructions. For example, think of your word processing program: it probably creates, opens, and updates text files, so an attempt by it to modify other executables would be flagged as unusual.

  • Integrity checking: An integrity checker works by building a database of checksums or hashed values. Periodically, new scans are performed, and the results are compared to the stored results. Although integrity checking is not always effective for data files, this technique is useful for executables because their contents rarely change. For example, the md5sum hashed value of the Linux bootable OS Kali Linux is a66bf35409f4458ee7f35a77891951eb. Any change to the Kali.iso would result in a change in the hashed value, and an integrity checker would easily detect the change. (See the sketch after this list.)

  • Activity blockers: An activity blocker intercepts a virus when it starts to execute and blocks it from infecting other programs or data. Activity blockers are usually designed to start at bootup and continue until the computer shuts down.
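
Here is the minimal Python sketch of integrity checking promised above. The monitored paths are examples, and SHA-256 is used in place of the older MD5 mentioned in the Kali example:

# Minimal sketch of integrity checking: store hashes, re-scan, compare.
import hashlib, pathlib

def sha256_of(path):
    return hashlib.sha256(pathlib.Path(path).read_bytes()).hexdigest()

targets = ["/usr/bin/python3"]           # example executable paths
known_good = {p: sha256_of(p) for p in targets if pathlib.Path(p).is_file()}

# A later scan flags any executable whose hash no longer matches.
for path, stored in known_good.items():
    status = "ALERT: modified" if sha256_of(path) != stored else "OK: unchanged"
    print(f"{status}: {path}")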

Honeypots and Honeynets

Honeypots and honeynets are much like IDSs in that they are tools for detecting intrusion attempts.

A honeypot is really a tool of deception. Its purpose is to fool an intruder into believing that the honeypot is a vulnerable computer. Honeypots are used for diversion and analysis of an attacker’s tactics, tools, and methods. Honeypots are simply fake systems or networks. Honeypots contain files, services, and databases that have no real value to an organization if compromised but are generally attractive to a hacker. Honeypots are effective because they can appear attractive without putting sensitive information at risk. To be effective, a honeypot must adequately persuade hackers that they have discovered a real system.

Some honeypot vendors sell products that can simulate an entire network, including routers and hosts, that are actually located on a single workstation; these are called honeynets. A honeynet can be deployed so that it is a separate server that is not being used in production.

Real servers can generate tons of traffic, which can make it hard to detect malicious activity. Because no production activity occurs on a honeypot or honeynet, any activity there can easily be flagged as a potential intrusion.

Honeypots can be configured for low interaction or high interaction. Low-interaction honeypots simulate only some parts of a service. For example, using a tool like netcat as a low-interaction honeypot, you can set a listener on a common port as shown here:

nc -v -n -l -p 80

This command listens (-l) on port 80 (-p 80), in verbose mode (-v) and without DNS resolution (-n). It would show the port as open but would not return a banner.

In contrast, a high-interaction honeypot would show the port as open and could also return the proper banner, as shown here:

HTTP/1.1 400 Bad Request
Server: Microsoft-IIS/5.0
Date: Wed, 18 Jul 2012 18:08:25 GMT
Content-Type: text/html
Content-Length: 87
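
As a hedged illustration, the following Python sketch implements a very simple high-interaction-style listener that answers any connection with a believable (and deliberately false) IIS banner. Port 8080 is used to avoid requiring root privileges; run anything like this only on an isolated system:

# Sketch of a banner-returning honeypot listener; every connection is suspect.
import socket

BANNER = (b"HTTP/1.1 400 Bad Request\r\n"
          b"Server: Microsoft-IIS/5.0\r\n"
          b"Content-Type: text/html\r\n"
          b"Content-Length: 0\r\n\r\n")

with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as srv:
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    srv.bind(("0.0.0.0", 8080))          # 8080 avoids needing root; real use: port 80
    srv.listen(1)
    conn, addr = srv.accept()
    with conn:
        print(f"Possible probe from {addr}")
        conn.recv(1024)                  # read (and ignore) the request
        conn.sendall(BANNER)             # reply with the fake IIS banner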

Honeypots can be configured in such a way that administrators will be alerted to their use so they have time to plan a defense for the real network. However, honeypots do have downsides. Just like any other security system on a network, a honeypot requires time and configuration effort. In addition, a honeypot, by design, attracts a malicious element into your domain. Also, administrators must spend time monitoring these systems. Another downside is that, if an attacker can successfully compromise a honeypot, he now has a base from which to launch further attacks.

Honeypots were originally designed for researching attack styles and designing improved architectures and anti-malware. More and more agencies are deploying honeypots to act as decoys, divert attackers from real systems, and provide early warning. It is important to understand that it is considered legal to entice someone, but it is not legal to entrap someone. The fuzzy distinction between enticement and entrapment can lead to interesting court cases.

Caution

A key issue with honeypots is to avoid entrapment, which is illegal. Using warning banners can help you avoid claims of entrapment by clearly noting that those who use or abuse the system will be monitored and potentially prosecuted.

Patch Management

Patch management is critical to resolving software flaws in an expedient manner and thereby reducing the overall risk of system compromise. Patch management is key to keeping applications and operating systems secure. An organization should have a well-developed patch management testing and deployment system in place. The most recent security patches should be tested and installed on host systems as soon as possible. The only exception is when an immediate installation would interfere with business requirements.

Before a patch can be deployed, it must be verified. Typical forms of verification include digital signatures, digital certificates, and checksums or other integrity verification mechanisms. Verification is a critical step that must be performed before testing and deployment to make sure a patch has not been maliciously or accidentally altered. When testing is complete, deployment can begin. Change management protocols should be followed throughout this process.

System Resilience, Fault Tolerance, and Recovery Controls

Things will surely go wrong; it is just a matter of when. Understanding how to react and recover from errors and failures is an important part of operational security.

Good operational security practices require security planners to perform contingency planning, which involves developing plans and procedures that can be implemented when things go wrong. Contingency planning should occur after you’ve identified operational risks and performed a risk analysis to determine the extent of the impact of possible adverse events.

Recovery Controls

Recovery controls are controls that are applied after an adverse event occurs. They are administrative in nature and are useful for contingency planning and disaster recovery. Most of us do contingency planning in our personal lives. For example, while writing this book, I had a hard drive failure. I was lucky to have backed up the data, and I needed to find a way to finish the chapter and get it emailed by the deadline. My contingency plan was to use my laptop until I could get the desktop system back up and running. Most major organizations need much more detailed contingency plans than this.

The process of recovery requires having a mechanism to restore lost services after a disruptive event. To ensure that recovery goes smoothly, an organization must eliminate single points of failure and consider mean time between failures (MTBF) and mean time to repair (MTTR).

MTBF is the average time until something fails. Engineers often discuss MTBF in terms of the bathtub curve, which is illustrated in Figure 8.3. The bathtub curve plots the failure rate of a population of devices over time: some devices fail early in life, but most are engineered to operate until their designed end of service.

Devices that survive until their end of life will start to fail at an increasing rate as they wear out. Good operational control practices dictate that an organization should have some idea how long a device is calculated to last. This helps the organization plan for replacement before outages occur and services are disrupted.

FIGURE 8.3 MTBF and the Bathtub Curve

For items that fail before the expected end of service, a second important variable is MTTR. The MTTR is the amount of time it will take to get the item back online. One of the major ways that organizations deal with such unknowns is to use service-level agreements (SLAs).
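
As a quick worked example of how these metrics combine, availability can be estimated as MTBF / (MTBF + MTTR). The figures in the following sketch are hypothetical; real values come from vendor data and SLAs:

# Hedged example: estimating availability from MTBF and MTTR.
mtbf_hours = 2000   # average time between failures (assumed)
mttr_hours = 4      # average time to repair (assumed)

availability = mtbf_hours / (mtbf_hours + mttr_hours)
print(f"Expected availability: {availability:.4%}")   # about 99.80%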

Monitoring and Auditing Controls

Computer resources are a limited commodity provided by an organization to help meet its overall goals.

Accountability must be maintained for network access, software usage, and data access. In a high-security environment, the level of accountability should be substantial, and users should be held accountable through logging and auditing of their activities.

Good practice dictates that audit logs be transmitted to a remote centralized site. Centralized logging makes it easier for the person assigned the auditing task to review the data. Exporting the logs to a remote site also makes it harder for hackers to erase the logs and cover their activity. If there is a downside to all the logging that occurs, it is that all the information must be recorded and reviewed. A balance must be found between collecting audit data and maintaining a manageable log size. Reviewing logs can be expedited by using audit reduction and correlation tools, such as security information and event management (SIEM) tools. These tools parse the data and eliminate unneeded information. Another useful tool is a variance detection tool, which looks for trends that fall outside the realm of normal activity. For example, if an employee normally enters the building around 7 a.m. and leaves around 4 p.m. but is seen entering at 3 a.m., a variance detection tool would detect this abnormality.
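
A minimal Python sketch of the variance detection idea follows, flagging badge-entry times that fall well outside an employee’s historical pattern. The data and the three-sigma threshold are illustrative:

# Minimal sketch of variance detection on badge-entry times.
import statistics

normal_entry_hours = [7.0, 7.2, 6.9, 7.1, 7.3, 7.0]   # historical ~7 a.m. entries

mean = statistics.mean(normal_entry_hours)
stdev = statistics.stdev(normal_entry_hours)

def is_variance(entry_hour, sigmas=3.0):
    """True when an entry time deviates more than `sigmas` standard deviations."""
    return abs(entry_hour - mean) > sigmas * stdev

print(is_variance(7.1))   # False: within the normal pattern
print(is_variance(3.0))   # True: a 3 a.m. entry is an anomaly worth reviewing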

Auditing and monitoring depend on accountability: if you don’t have accountability, you cannot perform an effective audit. True security relies on the capability to verify that individual users perform specific actions. Without the capability to hold individuals accountable, organizations can’t enforce security policies. Some of the primary ways to establish accountability are as follows:

  • Auditing user activity

  • Monitoring application controls

  • Using SIEM tools

  • Ensuring emanation security

  • Implementing network access control

  • Tracking the movement of individuals throughout the organization’s physical premises

Auditing User Activity

Auditing produces audit trails, which can be used to re-create events and verify whether security policies have been violated. The biggest disadvantage of the audit process is that it is detective in nature, and audit trails are usually examined after an event. Some might think of audit trails as only corresponding to logical access, but auditing can also be applied to physical access. Audit tools can be used to monitor who entered a facility and what time certain areas were accessed. A security professional has plenty of tools available to help isolate activities of individual users.

Many organizations monitor network traffic to look for suspicious activity and anomalies. Some monitoring tools enable administrators to examine just packet headers, whereas others can capture all network traffic. Snort, Wireshark, and tcpdump are three such tools. Regardless of the tools used to capture and analyze traffic, administrators need to make sure that policies detail how such uncovered activities will be handled. Warning banners and AUPs go a long way toward making sure users are adequately informed of what to expect when using organization resources.

ExamAlert

For the CISSP exam, you should understand the importance of monitoring employees and keep in mind that tools that examine activity are detective in nature.

Tip

A warning banner is the verbiage a user sees at the point of entry into a system. Its purpose is to identify the monitoring and conditions that users accessing the system will be subjected to. These banners also aid in attempts to prosecute those who violate the AUPs. A sample warning banner is shown here:

WARNING: Unauthorized access to this system is forbidden and will be prosecuted by law. By accessing this system, you agree that your actions may be monitored if unauthorized use is suspected.

Monitoring Application Transactions

Good security is about more than people. A big part of a security professional’s day is spent monitoring controls to ensure that people are working according to policy. Much of today’s computing activity occurs on servers that are connected to the Internet, and these systems must be monitored.

All input, processed, and output data should be monitored. Inputs must be validated. Consider the example of a dishonest individual browsing an e-commerce website and entering a quantity of –1 for an item that is worth $2,450.99. Hopefully, the application has been written in such a way as to not accept a negative quantity for any items advertised. Figure 8.4 shows an example of an application that lacks this control.

FIGURE 8.4 Shopping Cart with Altered Values
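
To show the control missing from the application in Figure 8.4, here is a minimal Python sketch of server-side input validation that rejects non-positive quantities; the price echoes the example above:

# Minimal sketch of server-side input validation for an order quantity.
def validate_quantity(raw):
    """Raise ValueError for anything other than a positive whole number."""
    qty = int(raw)                # also rejects non-numeric input
    if qty <= 0:
        raise ValueError("Quantity must be a positive integer")
    return qty

price = 2450.99
try:
    total = price * validate_quantity("-1")   # the attacker's input
except ValueError as err:
    print("Order rejected:", err)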

Note

One good example of an output control can be seen in many modern printer configurations. For example, some employee evaluation reviews might be configured so that they can be printed only to the supervisor’s printer. Another example can be seen in products such as Adobe’s Acrobat, which can limit printing of PDFs or embed password controls to limit who can open or edit PDFs.

Security Information and Event Management (SIEM)

Security information and event management (SIEM) refers to a set of tools and services used to collect and analyze auditable events. SIEM is the combination of two separate services: security information management (SIM) and security event management (SEM). SIM is used to process and handle the long-term storage of audit and event data, whereas SEM is used for real-time reporting of events. Combining these two technologies provides users with the ability to alert, capture, aggregate, and review log information from many different systems and sources. Vendors that offer SIEM tools include Splunk, LogRhythm, and Sentinel.

SIEM allows for centralized logging and log analysis and can work with a variety of log data, such as NetFlow, sFlow, jFlow, and syslog. Most SIEM products support controls for confidentiality, integrity, and availability of log data. SIEM products provide four functions: aggregation, normalization, correlation, and reporting. SIEM can be used to detect misconfigured systems, unresponsive servers, malfunctioning controls, and failed applications. SIEM is typically used for ingress and egress monitoring:

  • Ingress: The SIEM tools monitor data traffic that originates from outside the trusted network.

  • Egress: The SIEM tools are used to monitor data that is leaving a trusted network.

While SIEM can be used to spot attacks and security incidents, it can also be used for the day-to-day operational concerns of a network. SIEM can also handle the storage of log data by disregarding data fields that are not significant to computer security, thereby reducing network bandwidth and data storage. Most SIEM products support two ways of collecting logs from log generators:

  • Agentless: The SIEM server receives data from the hosts without needing to have any special software (agents) installed on those hosts.

  • Agent based: An agent program is installed on the hosts to collect and forward log data such as syslog and SNMP.

Although technologies such as SIEM are a great addition to a security professional’s toolkit, keep in mind that you should strive for defense in depth. For example, SIEM is typically used with a variety of other technologies. Data loss prevention (DLP) solutions are often used to help protect sensitive data as it moves around the network and makes its way to endpoint devices. Identity and access management (IAM) solutions complement DLP by connecting disparate authentication services together; therefore, when users need to access systems or applications, they can make requests through a single service. Combining these technologies with a SIEM provides much greater protection than using any one technology by itself.
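
To illustrate aggregation, normalization, and correlation in miniature, the following hedged Python sketch parses syslog-style lines into a common schema and flags repeated failures from a single source. The log format and the threshold of five are assumptions:

# Minimal sketch of SIEM-style normalization and correlation of log events.
import re
from collections import Counter

LOGS = [
    "Jan 10 03:11:01 web01 sshd: Failed password for root from 203.0.113.9",
    "Jan 10 03:11:04 web01 sshd: Failed password for root from 203.0.113.9",
    "Jan 10 03:11:07 web01 sshd: Failed password for root from 203.0.113.9",
    "Jan 10 03:11:09 web01 sshd: Failed password for root from 203.0.113.9",
    "Jan 10 03:11:12 web01 sshd: Failed password for root from 203.0.113.9",
    "Jan 10 09:00:00 web01 sshd: Accepted password for alice from 10.0.0.5",
]

PATTERN = re.compile(r"(Failed|Accepted) password for (\S+) from (\S+)")

# Normalization: every event becomes the same (outcome, user, source) tuple.
events = [m.groups() for line in LOGS if (m := PATTERN.search(line))]

# Correlation: many failures from one source suggests a brute-force attempt.
failures = Counter(src for outcome, _, src in events if outcome == "Failed")
for src, count in failures.items():
    if count >= 5:
        print(f"ALERT: {count} failed logins from {src}")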

Network Access Control

Network access control (NAC), which has grown out of the trusted computing movement, has the goal of unified security. NAC offers administrators a way to verify that devices meet certain health standards before allowing them to connect to the network. Laptops, desktop computers, and other devices that don’t comply with predefined requirements can be prevented from joining the network or can even be relegated to a controlled network where access is restricted until they are brought up to the required security standards.

Keystroke Monitoring

Keystroke monitoring, which can be accomplished with hardware or software devices, is used to monitor activity—for both legal and illegal purposes. As a compliance tool, a keystroke logger allows management to monitor a user’s activity and verify compliance. The primary issue of concern is the user’s expectation of privacy. Policies and procedures should be in place to inform the user that such technologies can be used to monitor compliance. The following is an example of an AUP that addresses keystroke monitoring:

This acceptable use policy defines the boundaries of the acceptable use of this organization’s systems and resources. Access to any organizational system or resources is a privilege that may be wholly or partially restricted without prior notice and without consent of the user. In cases of suspected violations or during the process of periodic review, employees can have activities monitored. Monitoring may involve a complete keystroke log of an entire session or sessions as needed to verify compliance with organizational policies and usage agreements.

Unfortunately, keystroke monitoring is not just for good guys. Hackers can use the same tools to monitor and record an individual’s activities. Although an outsider to an organization might have some trouble getting one of these devices installed, an insider is in a prime position to plant a keystroke logger. Keystroke loggers can be hardware or software based.

Emanation Security

The U.S. government was concerned enough about the possibility of emanations that the Department of Defense started a program to study them. Research actually began in the 1950s, based on the fear that attackers might try to sniff the stray electrical signals that emanate from electronic devices. TEMPEST technology resulted from this research. (Eavesdropping on the contents of a CRT by emanation leakage is referred to as Van Eck phreaking.) Devices that have been built to TEMPEST standards, such as cathode ray tube (CRT) monitors, have had TEMPEST-grade copper mesh, known as a Faraday cage, embedded in the case to prevent signal leakage. This costly technology is found only in very high-security environments.

TEMPEST is now considered somewhat dated; newer technologies such as white noise and control zones are now used to provide emanation security. White noise involves using special devices that send out streams of frequencies that make it impossible for an attacker to distinguish the real information. Control zones are facilities, walls, floors, and ceilings designed to block electrical signals from leaving the zones.

Perimeter Security Controls and Risks

Threats to physical security have existed for as long as humans have inhabited Earth. Consider the Incan city of Machu Picchu, built high on a mountain more than 7,000 feet above sea level. This ancient city was surrounded by thick stone walls and many natural exterior defenses that made it difficult to attack. Careful, ingenious planning is evident in the design of this city’s defense.

In the modern world, multinational organizations might not be headquartered on remote mountain peaks, but careful security planning is still needed to deal with a variety of threats to physical security. These threats can be divided into broad categories, such as natural disasters, human-caused threats, and technical problems. The sections that follow delve into these threats in greater detail.

Natural Disasters

Natural disasters come in many forms. Although it is impossible to prevent natural disasters, it is possible to create a disaster recovery plan to mitigate the impact of such an event. You can create and implement a recovery and corrective plan for facilities, information, and information systems that could be affected; in it, you can detail how you will respond when confronted with disasters. For example, organizations planning to establish a facility in New Orleans, Louisiana, might have minimal earthquake concerns; however, hurricanes would be considered an ever-present threat. Understanding a region and its associated weather-related issues is important in planning physical security.

Natural disasters that organizations should consider include the following:

  • Hurricanes, typhoons, and tropical cyclones: These natural products of the tropical ocean and atmosphere are powered by heat from the sea. They grow in strength and velocity as they progress across the ocean and spawn tornadoes and cause high winds and floods when they come ashore.

  • Tidal waves/tsunamis: The word tsunami is based on a Japanese word meaning “harbor wave.” This natural phenomenon consists of a series of huge and widely dispersed waves that cause massive damage when they crash on shore.

  • Floods: Floods can result when the soil has poor retention properties or when the amount of rainfall exceeds the ground’s capability to absorb water. Floods are also caused when creeks and rivers overflow their banks.

  • Earthquakes: Earthquakes occur because of movement of the earth along fault lines. For example, the Nepal earthquake of 2015 killed more than 8,000 people and injured more than 21,000. Some areas of the United States, such as California and Alaska, are especially vulnerable to earthquakes because they are on top of major active fault lines.

  • Tornadoes: Tornadoes are storms that descend to the ground as violent rotating columns of air. A tornado leaves a path of destruction that may be quite narrow or extremely broad (up to about a mile wide).

  • Fire: Fire, which can be caused by humans (intentionally or accidentally) or nature, is the most common cause of damage to property and loss of life. According to statistics at fema.gov, some 3,655 deaths were due to fire in the United States in 2018. That’s a great loss of life. Wildfires can also cause massive damage.

Human-Caused Threats

Human-caused threats are a major concern when planning an organization’s physical security. Whereas natural threats such as floods, hurricanes, and tornadoes cannot be prevented, human-caused threats can be mitigated by controls that minimize (or eliminate) opportunity of occurrence and provide for quick response in the event of any occurrence.

The following are examples of human-caused threats:

  • Terrorism: As demonstrated in events such as the Sri Lanka terrorist suicide attacks in April 2019 that killed more than 250 people, and as painfully understood by victims worldwide, terrorists act with calculated inhumane tactics to force their goals on society. Through risk analysis and threat modeling, organizations can determine what aspects of their businesses make them possible targets for terrorism (that is, “soft” targets). The answers could drive the need for physical security controls.

  • Vandalism: Since the Vandals sacked Rome in 455 CE, the term vandalism has been synonymous with the willful destruction of another’s property.

  • Theft: Theft of organizational assets can range from annoyance to legal liability. An organization’s laptop, tablet, or smartphone can likely be replaced, but what about the data on the device?

  • Destruction: Physical and logical assets are vulnerable to destruction by current employees, former employees, and/or outsiders. The Shamoon malware is believed to have destroyed 30,000 Saudi Aramco workstations in 2012.

  • Criminal activities: This category is a catchall for other malicious behaviors that threaten an organization’s employees or infrastructure.

Technical Problems

Unlike natural disasters or human-caused threats, technical problems are events that just seem to happen, often at highly inopportune times. These events can range from inconvenient glitches to potentially large-scale disasters.

Technical problems can include the following:

  • Images Communication loss: Voice and data communication systems play a critical role in today’s organizations. Communication loss can refer to outage of voice communication systems or data networks. As more organizations use convergence technologies such as network-controlled door locks, Internet Protocol (IP) video cameras, and VoIP (voice over IP), network failure means failure of data connection as well as voice communication.

  • Images Utility loss: Utilities include water, gas, communication systems, and electrical power. The loss of utilities can bring business to a standstill. Generators and backup supplies can be used to mitigate these problems.

  • Images Equipment failure: Equipment fails over time. This is why maintenance is so important. With insufficient planning, you might experience a business outage. A Fortune 1000 study found that 65% of all businesses that fail to resume operations within one week never recover at all and permanently cease operation.

Caution

Using service-level agreements (SLAs) is one good way to plan for equipment failure. In an SLA, a vendor agrees to repair or replace the covered equipment within a given time. Just keep in mind that while an SLA covers replacement of materials or repair time, it doesn’t cover costs related to the downtime or loss of credibility.

Facility Concerns and Requirements

Whether you are charged with assessing an existing facility, moving into a new facility, or planning to construct a new facility, physical security must be a high priority. It’s important to consider all the threats that have been discussed so far, as well as additional threats that might be unique to your operations. You don’t want to build a facility in an area where your employees fear for their personal safety. You also don’t want a facility to feel like a bank vault or be designed like a prison. You need a facility in which employees can be comfortable and productive, and where they can feel safe.

CPTED

A key component of achieving a balance between comfort and safety is Crime Prevention Through Environmental Design (CPTED). The benefits of CPTED include the following:

  • Images Natural access control

  • Images Natural surveillance

  • Images Territorial reinforcement

CPTED is unique in that it considers the factors that facilitate crime and seeks to use proper facility design to reduce the fear and incidence of crime. At the core of CPTED is the belief that physical environments can be structured to reduce crime.

Let’s look at a few examples of CPTED. Maybe you have noticed limited entrance and exit points into and out of mall parking lots. This is an example of natural access control. Or maybe you have seen an organization that has its employee parking lot in an area that is visible from the employee workspace. This enables employees to look out their windows in the office and see their parked cars. Even if this organization employs only a single guard, the facility’s design allows increased surveillance by all the employees.

CPTED causes a criminal to feel an increase in the threat of being discovered and provides natural surveillance that can serve as a physical deterrent control. CPTED can also be applied to CCTV.

CCTV cameras should be mounted so that potential criminals can easily see the cameras and know they face a high risk of getting caught. A CCTV system can serve as a physical deterrent control and a detective control as well. Criminals may be deterred from entering property by the presence of a warning sign that alerts intruders that the property is under surveillance. Police can refer to video, along with log books and other technical logs, to make human judgments about who, how, when, and where a crime was committed; therefore, a CCTV system is a great physical detective control. CCTV is also a great tool for detecting and deterring insider threats.

Every facet of facility design should be reviewed with a focus on CPTED. Even items such as hedges are important in natural surveillance. They should not be higher than 2.5 feet as overgrown hedges obstruct visibility.

The third benefit of CPTED is territorial reinforcement. Walls, windows, fences, barriers, landscaping, and so on can be used strategically to define areas and create a sense of ownership with employees. It is typically best to use fences, lighting, sidewalks, and designated parking areas on the outside of a facility and move critical assets toward the center of the facility.

Area Concerns

Finding a good location is important when planning a new facility. Key points to consider include the following:

  • Images Accessibility: An organization’s facility needs to be in a location that people can access. Requirements will vary depending on business and individual needs, but aspects such as roads, freeways, local traffic patterns, public transportation, and convenience to regional airports need to be considered.

  • Images Climatology and natural disasters: Mother Nature affects all of us. If you’re building in Phoenix, Arizona, you will not have the same weather concerns as someone building a facility in Anchorage, Alaska. Events such as hurricanes, earthquakes, floods, snowstorms, dust storms, and tornadoes should be discussed and planned before starting construction.

  • Images Local considerations: Issues such as freight lines, airline flight paths, toxic waste dumps, and insurance costs should be considered when determining where to build a facility. Although cheap land for a new facility might seem like a bargain, the discovery that it is next to a railway used to haul toxic chemicals could change your opinion.

  • Images Utilities: You should check that water, gas, and electric lines are adequate for the organization’s needs. This might seem like a nonissue, but California found out otherwise in the California energy crisis of 2019 and 2020, which left many without power and caused periods of rolling blackouts.

  • Images Visibility: Area population, terrain, and types of neighbors are concerns. Depending on the type of business, you might want a facility that blends into the neighborhood. You might design individual buildings that cloak activities taking place there. Some organizations might even place an earthen dike or barrier around the facility grounds to obstruct the view of those who pass by.

Location

The location of a facility is an important issue. Before construction begins, an organization should consider how the location fits with the organization’s overall tasks and goals. A good example is the NSA museum outside Baltimore. It’s the kind of place every cryptographic geek dreams of going. It’s actually behind the main NSA facility, in what used to be a hotel. (Rumor has it that the hotel was a favorite hangout of the KGB before the NSA bought it.) Although having facilities nearby for visitors and guests can be a good idea, the placement of the hotel so close to a critical agency might be a problem as it would allow for spying.

Keep in mind that the acquisition of a new corporate site involves more than just the cost of the property. Other factors are important as well. For example, if your organization manufactures rockets for satellites, you might want to be near fire stations and hospitals in case there’s an accident.

Construction

After you have chosen a location, your next big task is to determine how the facility will be constructed. In many ways, this is driven by what the facility will be used for and by federal, state, and local laws. Buildings used to store groundskeeping equipment have different requirements than those used as clean rooms for the manufacture of microchips. In other words, you need to know how various parts of a facility will be used. Remember to make sure that the facility is built to support whatever equipment you plan to put in it.

Tip

The load refers to how much weight a facility’s walls, floor, and ceiling are being asked to support.

Doors, Walls, Windows, and Ceilings

Have you ever wondered why most doors on homes open inward, whereas almost all doors on businesses open outward? This design is rooted in security. The door on your home is hinged on the inside to make it harder for thieves to remove the door and break in; it also gives you an easy way to remove the door when you bring in that big new leather couch. Years ago, the individuals who designed business facilities built them with the same type of doors. The problem is that open-in designs don’t work well when people panic.

It’s a sad fact that the United States has a long and tragic history of workplace fires. In 1911, nearly 150 workers, most of them women and young girls, died because they could not exit the Triangle Shirtwaist Factory when it caught fire: the emergency exit doors were locked. Because of this and other tragic losses of life, modern businesses are required to maintain exits that are accessible, unlocked, and open outward. These doors are more expensive than open-in doors because they are harder to install and remove, and special care must be taken to protect the hinges so that they cannot be easily removed. Many doors include a panic bar that permits quick exit: Just push, and you’re out. In emergencies or situations in which a crowd is exiting a building quickly, panic bars help keep people moving away from danger.

Maybe you have heard the phrase “security starts at the front door.” It is of the utmost importance to keep unauthorized individuals out of a facility or areas where they do not belong. Doors must be as secure as the surrounding walls, floor, and ceiling. If a door is protecting a critical area such as a data center or an onsite server room, the door needs to have its hinges on the inside so that the hinge pins cannot be removed. The structural components around the door must also be strengthened. The lock, hinges, strike plate, and door frame must all have enough strength to prevent someone from attempting to kick, pry, pick, or knock down the door.

The construction of doors varies. Critical infrastructure should be protected with solid core doors. The core material is the material within the door that is used to fill space, provide rigidity, and increase security. Hollow core doors simply contain a lattice or honeycomb made of corrugated cardboard or thin wooden slats. Unlike a hollow core door, a solid core door is hard to penetrate. Solid core doors consist of low-density particle board, rigid foam, solid hardwood, or even steel that completely fills the space within the door. Solid core flush doors have great strength. The outer portion of the door is the skin, which can be wood, steel, or another material, such as a polymer. Commercial steel doors are classified by ANSI/SDI A250.8-2014 into various categories that include standard duty, heavy duty, extra-heavy duty, and maximum duty. Selection of a steel door should be based on usage, degree of abuse, and required protection factor.

Many organizations use electrically powered doors to control access. For example, an employee might have to present an ID card to gain access to a facility; the card reader actuates an electric relay that allows the door to open. A security professional should know what state these door relays assume in the event of a power loss. An unlocked (or disengaged) state allows employees to enter or exit and not be locked in. If a door lock defaults to open during a power disruption, this is referred to as fail-safe. If the lock defaults to locked during a power disruption, this is referred to as fail-secure; in this situation, a panic bar or release must be provided so employees are not trapped inside the facility. For high-security doors, it is also important to consider delay alarms, which alert security when a door has been held open longer than a set time. A fail-safe configuration may be the best option and/or may be a regulatory requirement (depending on the local fire code) when people work within the facility; the code may differ for an unstaffed data warehouse.

Caution

Fail-safe locks protect employees in the event of power loss because they allow employees to exit the facility.

ExamAlert

The terms fail-safe and fail-secure have very different meanings when discussed in physical security than they have in logical security. When you take the CISSP exam, read the questions carefully to determine the context in which these terms are being used.
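The physical-security meaning is easy to internalize with a toy model. The following Python sketch (hypothetical names; the text itself describes only the hardware behavior) maps each lock mode to the state it assumes when power is lost:

```python
from enum import Enum

class LockMode(Enum):
    FAIL_SAFE = "fail-safe"      # defaults to unlocked on power loss
    FAIL_SECURE = "fail-secure"  # defaults to locked on power loss

def door_state_on_power_loss(mode: LockMode) -> str:
    """Return the state an electric door lock assumes when power is lost."""
    if mode is LockMode.FAIL_SAFE:
        return "unlocked"  # people can exit freely; property is less protected
    # Fail-secure keeps the property protected; a panic bar or release
    # must still allow people to get out.
    return "locked"

print(door_state_on_power_loss(LockMode.FAIL_SAFE))    # unlocked
print(door_state_on_power_loss(LockMode.FAIL_SECURE))  # locked
```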

Doors aren’t the only factor you need to consider. For example, data centers typically should have raised floors, constructed in such a way that they are grounded against static electricity. Cables and wiring should be run in conduit, not left loose above the raised floor where they would create a trip hazard. Walls must be designed to slow the spread of fires, and emergency lighting should be in place to light the way for anyone trying to escape in an emergency. Other considerations include the following:

  • Images Walls: Walls need to extend from the floor to the ceiling in critical areas and where they separate key departments. Walls should have an adequate fire rating and should be reinforced to keep unauthorized personnel from accessing secure areas, such as data centers or server rooms. Anyone who works in a cubicle environment understands the deficiency of short walls. A loud noise leads employees to “prairie dog” and look over their cubicle walls to see what is happening.

  • Images Ceilings: Ceilings need to be waterproof above the plenum space, have an adequate fire rating, and be reinforced to keep unauthorized personnel from accessing secure areas, such as server rooms.

  • Images Electrical and HVAC: It is important to plan for adequate power. Rooms that contain servers or other heat-producing equipment need additional cooling to protect that equipment. Heating, ventilating, and air conditioning (HVAC) systems should be controllable by fire-suppression equipment; otherwise, these systems can inadvertently provide oxygen and help feed a fire.

Caution

Air intakes should be properly designed to protect people from breathing toxins or other substances that might cause harm. For example, in 2020, OSHA released new guidelines for ventilation systems to decrease the airborne spread of COVID-19 (see www.businessinsider.com/osha-releases-covid-19-ventilation-guide-for-workplaces-2020-11?op=1).

  • Images Windows: Windows are a common point of entry for thieves, burglars, and others seeking access. Windows are usually designed with aesthetics, not security, in mind. Interior or exterior windows need to be fixed in place and should be shatterproof on at least the first and second floors. Windows can be standard glass, tempered, laminated, or acrylic, and they can be embedded with wire mesh to help prevent the glass from shattering. Alarms or sensors might also be needed.

  • Images Fire escapes: Fire escapes are critical because they enable personnel to exit in the event of a fire. It is critical that fire drills be performed to practice evacuation plans and determine real exit times. After the first attack on the World Trade Center towers in 1993, it was discovered that it took people two to three times longer to exit the facility than had been planned. Increased drills would have reduced evacuation time.

  • Images Fire detectors: Smoke detectors should be installed to warn employees of danger. Sprinklers and detectors should be used to reduce the spread of fire. Smoke detectors can be placed under raised floors, above suspended ceilings, in the plenum space, and within air ducts.

Asset Placement

Security management includes the appropriate placement of high-value assets, such as servers and data centers. Data centers should not be placed above the second floor of a facility because a fire might make them inaccessible. Likewise, you wouldn’t want a data center to be located in a basement because it could be at risk of flooding.

It’s not a good idea to have a data center with uncontrolled access or in an area where people will congregate or mill around. Even placing a data center off a main hallway is not a good idea. I often tell students that the location of the server room should be like Talkeetna, Alaska: If you are going there, you cannot be going anywhere else because that is where the road ends.

A data center should have limited accessibility and typically no more than two doors. A first-floor interior room is a good location for a data center. The ceilings should extend all the way up past the drop ceiling, access to the room should be controlled, and doors should be solid core with hinges to the inside. The goal in your design should be to make it as hard as possible for unauthorized personnel to gain access to the data center. Server rooms should not have exterior windows or walls. Placing a server room inside a facility protects the servers against potential destruction from storms and makes it more difficult for thieves or vandals to target them. If individuals can gain physical access to your servers, you have no security.

Environmental Controls

Heat can be damaging to computer equipment, so most data centers are kept at temperatures of around 70°F. Higher and lower temperatures can reduce the useful life of electronic devices. But temperature should not be your only concern when designing a data center. High humidity can cause electronics to corrode, and low humidity increases the risk of static electricity. What might feel like only a small shock to a human can totally destroy electronic components. Grounding devices such as antistatic wrist bands and antistatic flooring can be used to reduce the possibility of damage due to static electricity.

Heating, Ventilating, and Air Conditioning

Do you know what can be hotter than Houston in the summer? A room full of computers without sufficient HVAC. Data centers and other areas that are full of computer or electrical equipment generate heat. Modern electronic equipment is very sensitive to heat and can tolerate temperatures of only 110°F to 115°F before circuits are permanently damaged.

Data centers should have HVAC systems separate from the HVAC of the rest of the facility. The HVAC should maintain positive pressurization and ventilation to control contamination by pushing air outside. Pressurization and ventilation are especially important in case of fire because they ensure that smoke will be pushed out of the facility instead of being pulled in.

Security management should know who is in charge of the HVAC system and how that person can be contacted. Intake vents should be protected so that contaminants cannot spread easily, and these systems must be controlled to protect organizations and their occupants from chemical and biological threats. HVAC systems also produce water, either as vapor (affecting humidity) or as liquid condensate (encouraging mold growth, structural damage, and decay). As mentioned earlier in this section, high humidity causes rust and corrosion, and low humidity increases the risk of static electricity. The ideal humidity for a data center is between 40% and 60% relative humidity (RH), which limits both electrostatic discharge (ESD) and corrosion.
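As a rough illustration of these thresholds, here is a minimal monitoring sketch. The humidity band comes from the text; the temperature band is an assumed tolerance around the 70°F operating figure mentioned earlier, not a published standard:

```python
# Assumed bands: ~70F operating temperature with some tolerance,
# and the 40-60% RH range described in the text.
TEMP_RANGE_F = (64.0, 80.0)
HUMIDITY_RANGE_PCT = (40.0, 60.0)

def check_environment(temp_f, humidity_pct):
    """Return a list of alert strings for out-of-range sensor readings."""
    alerts = []
    if not TEMP_RANGE_F[0] <= temp_f <= TEMP_RANGE_F[1]:
        alerts.append(f"Temperature {temp_f}F outside {TEMP_RANGE_F}")
    if humidity_pct < HUMIDITY_RANGE_PCT[0]:
        alerts.append(f"Humidity {humidity_pct}% too low: static electricity risk")
    elif humidity_pct > HUMIDITY_RANGE_PCT[1]:
        alerts.append(f"Humidity {humidity_pct}% too high: corrosion risk")
    return alerts

print(check_environment(72.0, 35.0))  # flags the low-humidity (ESD) condition
```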

Note

The American Society of Heating, Refrigerating, and Air-Conditioning Engineers (ASHRAE) has expanded the allowable temperatures for data centers in an effort to promote green environmental practices and to provide a wider range of allowed temperatures.

Electrical Power

Electrical power, like HVAC, is a resource that most of us take for granted. Residents of the United States are lucky, but large portions of the world live without dependable electrical power. Even areas that have dependable power can be subject to line noise or might suffer from electromagnetic interference (EMI). Electrical motors and other electronic devices can cause EMI. You might have noticed that fluorescent lights can also cause electrical problems; this phenomenon is known as radio frequency interference (RFI). Table 8.2 lists some other power anomalies.

TABLE 8.2 Power Faults

| Fault     | Description                                                    |
|-----------|----------------------------------------------------------------|
| Blackout  | Prolonged loss of power                                        |
| Brownout  | Power degradation so that less power is available than normal  |
| Sag       | Momentary low voltage                                          |
| Fault     | Momentary loss of power                                        |
| Spike     | Momentary high voltage                                         |
| Surge     | Prolonged high voltage                                         |
| Noise     | Interference superimposed onto the power line                  |
| Transient | Electrical noise of a short duration                           |
| Inrush    | Initial surge of power at startup                              |

Luckily, power conditioners, surge protectors, and uninterruptible power supplies can provide clean power. Although most of the time we seek this clean power, there are times when we need to kill electricity quickly (such as when someone is electrocuted or when there is a danger of water coming into direct contact with a power source). National fire protection codes require that you have an emergency power off (EPO) switch located near server room exit doors to kill power quickly, if needed. These switches are typically big red buttons.

Caution

An EPO switch should have a plastic cover installed to prevent people from accidentally pressing it.

Uninterruptible Power Supplies (UPSs)

Because computers have become essential pieces of technology, downtime of any significant duration can be devastating to an organization. Power outages can happen, and businesses must be prepared to deal with them. Uninterruptible power supplies (UPSs) can help meet this challenge. Two categories of UPS exist:

  • Images Online system: An online system continuously converts incoming AC power to DC to charge a bank of batteries, while a power inverter converts the DC voltage back to AC to feed the computer systems at all times. Because the equipment always runs off the inverter, there is no switchover delay when utility power fails. These systems are good for short-term power outages.

  • Images Standby system: This type of system monitors a power line for a failure. When a failure is sensed, backup power is switched on. A standby system relies on generators or power subsystems to keep computers running during longer power outages. Most standby generators run on diesel fuel or natural gas:

    • Images Diesel fuel: An organization should maintain at least 12 hours’ worth of fuel.

    • Images Natural gas: Gas is an option in areas that have a good supply of natural gas and are geologically stable.

Equipment Lifecycle

Even when you do all the right things—perform preventive maintenance, keep equipment at the right operating temperature, and use surge protectors—equipment eventually ceases to function. This is why many organizations choose to maintain service-level agreements (SLAs).

ExamAlert

An SLA is a contract with a hardware vendor that provides a certain level of protection. For a fee, the vendor agrees to repair or replace the equipment within the contracted time.

Fire Prevention, Detection, and Suppression

A fire needs three things: oxygen, heat, and fuel. When all three are present, a fire can ignite and present a lethal threat. The fire tetrahedron in Figure 8.5 adds a fourth component: the chemical chain reaction that sustains combustion. Fires can be devastating to people and facilities. Saving human lives should always be your first priority. As a CISSP candidate, it’s important to understand that proper precautions, preparation, and training must be performed to help save lives and limit damage.

Fire prevention is a key to proactive defense against fires. A big part of prevention is making sure people are trained and know how to prevent potential fire hazards. Corporate policy must define how employees will be trained to deal with fires.

FIGURE 8.5 Fire Tetrahedron

Fire drills are another important part of building a good security policy. Fire drills should occur periodically but randomly. Employees should have a designated area to go to in a safe zone outside the facility. Supervisors or others should be in charge of the safe zone and responsible for performing employee head counts to ensure that everyone is present and accounted for. After a drill, employees should be required to use their IDs to reenter the facility to deter social engineering and piggybacking attacks.

Fire-Detection Equipment

Having plans and procedures to carry out in the event of a fire is only part of an overall fire-prevention program. Organizations should make sure they have appropriate and functioning fire-detection equipment so that employees can be alerted to possible danger. Fire detectors can work in different ways and can be activated by the following:

  • Images Heat: A heat-activated sensor is triggered when a predetermined temperature is reached or when the temperature rises quickly in a specified time period. The rate-of-rise type of sensor produces more false positives than the predetermined temperature type.

  • Images Smoke: A smoke-activated sensor can be powered by a photoelectric optical detector or by a radioactive smoke-detection device.

  • Images Flame: A flame-activated sensor is the most expensive of the three types discussed. It functions by sensing either the infrared energy associated with flame or the pulsation of flame.

Fire Suppression

Just being alerted to a fire is not enough. Employees need to know what to do and how to handle different types of fires. A fire can be suppressed by removing heat, fuel, or oxygen. Fires are rated according to the types of materials burning. Although it might be acceptable to throw water on smoldering paper, it would not be a good idea to throw water on a combustible metal fire, which could actually cause the fire to spread. Table 8.3 lists fire classes and corresponding suppression methods.

TABLE 8.3 Fire Classes and Suppression Methods

| Fire Class | Fire Type                          | Suppression Method                        |
|------------|------------------------------------|-------------------------------------------|
| Class A    | Paper or wood fires                | Water or soda acid                        |
| Class B    | Gasoline or oil fires              | CO2, soda acid, or halon                  |
| Class C    | Electronic or computer fires       | CO2 or halon replacement, such as FM-200  |
| Class D    | Fires caused by combustible metals | Dry powder or special techniques          |
| Class K    | Commercial kitchen fires           | Saponifying agents that blanket the fire  |

Tip

To remember the classes of fires, think of them in the order of frequency of occurrence: Paper (A) more than liquid (B), liquid more than electric (C), electric more than metal (D).

The two primary methods of corporate fire suppression are use of water sprinklers and gas discharge systems. Water is easy to work with, widely available, and nontoxic. Gas discharge systems are better suited for areas where humans are not present.

Water Sprinklers

Water sprinklers are an effective means of extinguishing Class A fires. The disadvantage of using sprinkler systems is that water is damaging to electronics. Four variants of sprinkler systems are available:

  • Images Dry pipe: As the name implies, this type of sprinkler system contains no standing water. The line contains compressed air. When the system is triggered, the clapper valve opens, air flows out of the system, and water flows in (see Figure 8.6). The benefit of this type of system is that it reduces the risk of accidental flooding and provides some time to cover or turn off electrical equipment. These systems are also great for cold-weather areas, unstaffed warehouses, and other locations where low temperatures could freeze any water standing in the system.

FIGURE 8.6 Dry-Pipe Fire-Suppression System

  • Images Wet pipe: Wet-pipe systems are more widely used than dry-pipe systems, and they are ready for activation at all times. This type of system is charged and full of water. When triggered, only the affected sprinklers activate. Wet-pipe systems are not triggered by smoke, and they are prone to leaks and accidental discharge. The next time you are staying in a hotel, take a look around, and you’ll probably see this type of system. Wet-pipe systems typically use some type of fusible link that allows discharge after a link breaks or melts.

  • Images Pre-action: This is a combination system in which pipes are initially dry and do not fill with water until air pressure is reduced. Even then, the system does not activate until a secondary mechanism triggers. The secondary mechanism might be some type of fusible link similar to what is used in a wet-pipe system. The advantage of pre-action systems is that they provide an extra level of control and reduce the chance of accidental triggering.

  • Images Deluge: This type of system is similar to a dry-pipe system, except that after the system is triggered, there is no holding back the water. A large volume of water covers a large area quickly. If your organization builds booster rockets for supplies being shuttled to the International Space Station, this might be your preferred suppression system.

Halon

Using halon is one of the oldest fire-suppression methods. It was considered the perfect fire-suppression method: Halon mixes easily with air, doesn’t harm computer equipment, and, once dissipated, leaves no solid or liquid residue. Halon is unique in that it does not remove or reduce any of the three necessary components of a fire; instead, halon interferes with the fire’s chemical reaction. There are two types of halon: halon 1211 and halon 1301.

The Montreal Protocol of 1987 designated halon as an ozone-depleting substance. Halon is 3 to 10 times more damaging to the ozone layer than CFCs. Other issues with halon exist. If it is deployed in concentrations greater than 10% and in temperatures of 900°F or more, it degrades into hydrogen fluoride, hydrogen bromide, and bromine—a toxic brew that people should not breathe.

If you currently have a halon fire-suppression system, you can leave it in place, but there are strict regulations on reporting discharges. Laws also govern the removal and disposal of halon fire-suppression systems. The EPA has approved some more ecological and less toxic replacements for halon, including the following:

  • Images FM-200

  • Images CEA-410

  • Images NAF-S-III

  • Images FE-13

  • Images Argon

  • Images Low-pressure water mist

  • Images Argonite

Alarm Systems

A range of technical controls can be used to enhance physical security. Alarm systems are one such control.

An alarm system is made up of many components, including an intrusion detection system, a control panel, arming systems, and annunciators. Every time an alarm occurs, someone must respond and determine whether the event is real.

Intrusion Detection Systems (IDSs)

Physical intrusion detection systems (IDSs) are used for detecting unauthorized physical access. IDS sensors around windows or attached to doors can detect the breakage of glass or the opening of doors. These systems are typically effective in detecting changes in the environment. The following are some common types of IDS sensors:

  • Images Audio detection or acoustical detection sensors: These sensors use microphones to listen for changes in the ambient noise level. They are susceptible to false positives.

  • Images Dry contact switches: These sensors detect the opening of a door or window.

  • Images Electro-mechanical sensors: These sensors trigger on a break in the circuit.

  • Images Motion detectors: You have probably seen this type of sensor on one of the many security lights sold commercially. Motion detectors can be triggered by audio, radio wave pattern, or capacitance.

  • Images Vibration sensors: These sensors use piezoelectric technology to detect vibration and trigger on movement.

  • Images Pressure-sensitive sensors: These sensors are sensitive to weight and typically measure a change in resistance that triggers the device. Pressure mats are an example of this type of technology.

  • Images Photoelectric sensors: These sensors use infrared light and are laid out as a grid over an area. If the grid is disturbed, the sensor detects a change.

  • Images Passive infrared sensors: These sensors can sense changes in heat generated by humans.

ExamAlert

When you encounter intrusion detection questions on the CISSP exam, note whether the question is referencing a physical intrusion detection system or a logical intrusion detection system.

An organization may choose not to use IDS solutions because they can produce false positives. A false positive result indicates that a condition is present when it actually is not. Before IDSs are deployed, a risk assessment should be performed to determine the true value of these devices to the organization. IDS solutions often are a layer of security used for monitoring and alerting a security guard to do some human inspection to determine whether further preventive measures need to be taken.

Monitoring and Detection

Alarm systems must be monitored and controlled. Either an in-house guard or a third-party organization needs to be assigned the task of monitoring the alarm system. Alarm systems use one of four basic designs, as shown in Table 8.4.

TABLE 8.4 Alarm Systems

| System Design      | Description                                                                                                      |
|--------------------|------------------------------------------------------------------------------------------------------------------|
| Local alarm        | The alarm triggers an audio and visual alert locally. A guard is required to respond.                            |
| Central station    | This system is operated by private third-party organizations that can respond to the customer’s premises within 10 to 15 minutes. |
| Proprietary system | This is an in-house system that is much like a central station except that it is owned and operated by the organization. |
| Auxiliary system   | This is a subcategory of any of the preceding three systems that can dial the police or fire department when a triggering event occurs. |

Although movies sometimes show criminals disconnecting alarm systems or cutting the red wire to disable them, in reality, this doesn’t work. Alarm systems have built-in tamper protection. Any attempt to compromise detection devices, controllers, annunciators, or other alarm components initiates a tamper alarm. Even if power is cut, the alarm will still sound because modern alarm systems are backed up by battery power. Many systems also provide cellular phone backup. The National Fire Protection Association’s NFPA 72 standard specifies that a local alarm system must provide 60 hours of battery backup and that a central station signaling system must provide 24 hours of backup. In any situation in which the annunciator signals an alarm, NFPA 72 states that the audible alert should be at least 105 dB and have a visual component for those who are deaf or hearing impaired.

No monitoring plan is complete without controls that monitor physical access. Some common facility access controls include the following:

  • Images CCTV: An organization can use CCTV to monitor who enters or leaves the facility. It can also correlate these logs with logical access policies for systems and facilities.

  • Images Card readers or biometric sensors: An organization can use these devices on server room doors to maintain a log of who accesses the area.

  • Images Alarm sensors: An organization can use these devices on doors and windows to detect possible security breaches.

  • Images Mantraps and gates: An organization can use these devices to control traffic and log entry to secured areas. Remember that mantraps are double doors used to control the flow of employees and block unauthorized individuals.

ExamAlert

Make sure you know the difference between audit and accountability for the CISSP exam. Audit controls are detective controls, which are used after an event occurs and are usually implemented to detect fraud or other illegal activities. Accountability is the capability to track actions, transactions, changes, and resource usage to a specific user in a system. It is accomplished in part by having unique identification for each user, using strong authentication mechanisms, and logging events.

Intrusion Detection and Prevention Systems

Intrusion detection involves monitoring network traffic, detecting attempts to gain unauthorized access to a system or resource, and notifying the appropriate individuals so that counteractions can be taken. An IDS is designed to function as an access control monitor.

A huge problem with an IDS is that it is an after-the-fact device: it alerts only after an attack is already under way. Other problems with IDSs are false positives and false negatives. A false positive occurs when an IDS triggers an alarm for normal traffic. A false negative is even worse: It occurs when a real attack occurs but the IDS does not pick it up. IDSs can be divided into two basic types: network-based intrusion detection systems (NIDSs), which monitor network traffic, and host-based intrusion detection systems (HIDSs), which reside on and monitor individual host computers.

Intrusion prevention systems (IPSs) build on the foundation of IDSs and take the technology a step further: an IPS can react automatically and block an attack in real time, often without user intervention. IPSs are considered the next generation of IDSs. The National Institute of Standards and Technology (NIST) uses the term IDPS (intrusion detection and prevention system) for modern devices that combine the functionality of IDS and IPS devices. These topics are covered in more detail in Chapter 9, “Software Development Security.”

Investigations and Incidents

Many different types of incidents might trigger investigations, including anything from unauthorized disclosure, to theft of property, to outage due to DoS attack, to the detection of an intrusion. As a CISSP candidate, you must understand the following investigation types:

  • Images Criminal: This type of investigation exists to preserve the public peace and protect the safety of people. Penalties can include fines and jail time.

  • Images Civil: This type of investigation exists to govern matters concerning legal disputes between citizens and organizations. Penalties include fines but not jail.

  • Images Regulatory: This type of investigation exists to ensure that administrative policies, procedures, and regulations are being observed. Penalties can include fines and jail time.

An organization needs to create a team to deal with an investigation. The team must understand legal issues and topics such as e-discovery, evidence collection, and proper handling of the crime scene. For example, the team needs to understand how to protect evidence and maintain it in the proper chain of custody. An organization also needs to establish specific roles and responsibilities within the team, including a team lead to be in charge of the response to any incident.

It is important to establish contacts within your organization with various departments such as HR. For example, if an employee is discovered to have been hacking, the supervisor may want to fire the employee but must first discuss the issue with human resources and the legal department. Even when an incident has been contained and is in the recovery phase, you need to think about what lessons can be learned and how you will report and document your findings. All this information must be included in an incident response policy.

Incident Response

The most important thing to understand when it comes to incident response is that every organization needs to have a plan in place before something unfortunate occurs. The basic stages of incident response are shown here:

  1. Preparation: Create an incident response team to address incidents.

  2. Identification: Determine what has occurred.

  3. Mitigation and containment: Halt the effect of the incident and prevent it from spreading further.

  4. Investigation: Determine the cause of the incident, who is responsible, and how to mitigate it.

  5. Eradication: Eliminate the problem.

  6. Recovery: Clean up any residual effects.

  7. Follow-up and resolution: Improve security measures to prevent or reduce the impact of future occurrences.

Incident response and forensics are very similar, except that incident response is more focused on finding the problem and returning to normal activities, whereas forensics is more focused on legal aspects and potentially prosecuting the accused. Also, only some organizations can justify forensic investigation due to the potential cost.

Digital Forensics, Tools, Tactics, and Procedures

Governments, the military, and law enforcement have practiced forensics for many years, but forensics is a much younger science for private industry. Its growth in recent years is due to the increased role of computers in the workplace and the types of information and access these computers maintain. There are four types of digital forensics:

  • Images Software forensics: This type of forensics includes analysis of malware and other types of malicious code, such as bots, viruses, worms, and Trojans. Organizations such as McAfee and Symantec perform such duties, and tools like decompilers and disassemblers are used.

  • Images Network forensics: This type of forensics includes review of network traffic and communication. Tools used include sniffers like Wireshark and Snort.

  • Images Computer forensics: This type of forensics includes review of hard drives, solid-state drives, and computer media, such as CDs, DVDs, and USB thumb drives. Tools used include hex editors, EnCase, and FTK.

  • Images Hardware/embedded device forensics: This type of forensics includes review of smartphones, tablets, routers, and other hardware devices.

Tip

Hardware forensics continues to grow in importance as our reliance on electronic devices increases. One report from a former Pentagon analyst alleges that a large amount of foreign-made Telco gear has built-in backdoors (see www.zdnet.com/former-pentagon-analyst-china-has-backdoors-to-80-of-telecoms-7000000908/).

Digital forensics is a complex field and includes the following stages:

  1. Plan and prepare by creating procedures and policies and conducting training.

  2. Secure and isolate the scene to prevent contamination.

  3. Record the scene by taking photographs and recording data in an investigator’s notebook.

  4. Interview suspects and witnesses.

  5. Systematically search for other physical evidence.

  6. Collect or seize the suspect system or media.

  7. Package and transport evidence.

  8. Submit evidence to a lab for analysis.

Before discussing the steps of digital forensics in more detail, let’s examine the overall concepts and targets of forensic activities. Digital forensics defines a precise methodology to preserve, identify, recover, and document computer or electronic data. Growth in this field is directly related to the ever-growing popularity of electronics.

Computers are some of the most commonly targeted items, but they are not the only devices subject to forensic analysis. Smartphones, tablets, digital cameras, iPods, USB drives, and just about any other electronic device can also be analyzed. Attempted hacking attacks and allegations of employee computer misuse have added to the need for organizations to examine and analyze electronic devices. Mishandling concerns can cost organizations millions of dollars. Organizations must handle each event in a legal and defensible manner. Digital forensics follows a distinct and measurable process that has been standardized.

Standardization of Forensic Procedures

In March 1998, the International Organization on Computer Evidence (IOCE; www.ioce.org) was appointed to draw up international principles for procedures relating to digital evidence. The goal was to harmonize methods and practices among nations and guarantee that digital evidence collected by one country can be used in the courts of another. The IOCE established the following six principles to govern these activities:

  • Images When dealing with digital evidence, all generally accepted forensic and procedural principles must be applied.

  • Images Upon seizing digital evidence, actions taken should not change that evidence.

  • Images When it is necessary for a person to access original digital evidence, that person should be trained in the techniques to be used.

  • Images All activity relating to the seizure, access, storage, or transfer of digital evidence must be fully documented, preserved, and available for review.

  • Images An individual is responsible for all actions taken with respect to digital evidence while the digital evidence is in his or her possession.

  • Images Any agency that is responsible for seizing, accessing, storing, or transferring digital evidence is responsible for compliance with these principles.

Digital Forensics

Digital forensics can be subdivided into three stages:

  1. Acquisition: Acquisition is usually performed by means of a bit-level copy, an exact duplicate of the original data made through a write blocker, which allows an examiner to scrutinize the copy while leaving the original intact.

  2. Authentication: An investigator must show that the original data is unchanged and has not been tampered with; the investigator must be able to prove that the bit-level copy is an exact copy. Authentication can be accomplished through the use of checksums and hashes, such as MD5 and SHA.

    Tip

    Message digests, such as MD5 and SHA, are used to ensure the integrity of files and data and to ensure that no changes have occurred.

  3. Analysis: An investigator must be careful while examining the data and ensure that all actions are thoroughly documented. The investigator recovers evidence by examining files, state information, drive slack space, file slack space, free space, hidden files, swap data, Internet cache, and other locations, such as the Recycle Bin. Copies of the original disks, drives, or data are usually examined to protect the original evidence.

Acquisition

Acquisition refers to assuming possession of evidence or contracting to assume possession. In many instances, a forensic analyst is asked to acquire hard drives, computers, media, or other items on site. Just as with any other investigation, an analyst in a digital forensics case should make careful notes about what physical evidence is recovered and show the chain of custody for all evidence acquired. Physical evidence and digital forensics can help build a relationship between an incident scene, a victim, and a suspect (see Figure 8.7).

FIGURE 8.7 Relationship of Evidence to Suspect

During the acquisition stage, the following processes occur:

  • Images Documentation and collection of the evidence

  • Images Protection of the chain of custody

  • Images Identification, transportation, and storage

  • Images Approved duplication and copying

During collection and handling of evidence, it is important to record everything. You can use a digital camera to record the layout of the scene. You need to document the condition of the computer systems, attachments, cables, physical layout, and all electronic media. You can also use a camera to take pictures of any screen settings visible on a running system. You should also document internal storage devices and hardware configuration, including hard drive make, model, size, jumper settings, location, and drive interface as well as internal components such as the sound card, video card, and network card.

Tip

The handling of evidence is of special importance to a forensic investigator. This is addressed through the chain of custody, a process that helps protect the integrity and reliability of the evidence by providing an evidence log that shows every access to evidence, from collection to appearance in court.

A forensic analyst needs to keep adequate records and build a proper chain of custody. Although the chain of custody is familiar to those in law enforcement, it might be new to many IT professionals, and it will surely be called into question for all digital evidence presented in a court of law. Chain of custody addresses the reliability and credibility of evidence: it is the process of documenting the journey of any and all evidence while keeping it under control. A chain of custody should be able to answer the following questions (a minimal log-entry sketch follows the list):

  • Images Who collected the evidence?

  • Images Where was the evidence collected?

  • Images When did possession of the evidence occur?

  • Images How was the evidence stored and protected (that is, what software tool was used)? Is this a best practice tool used by the industry? Is the professional trained on use of the tool? How many times has the professional used the tool?

  • Images If evidence was removed from storage, why, by whom, and for how long was it taken from storage?
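In data terms, a chain-of-custody log is an append-only record that can answer all of these questions. The following Python sketch shows one way to structure such an entry; the field names are illustrative assumptions, not a standard schema:

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class CustodyEntry:
    """One entry in a chain-of-custody log (illustrative fields only)."""
    evidence_id: str   # unique label on the evidence bag or drive
    action: str        # "collected", "checked out", "returned", ...
    handler: str       # who took the action
    location: str      # where the evidence was at the time
    tool_used: str     # e.g., the imaging tool, if applicable
    reason: str        # why the evidence was accessed or moved
    timestamp: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc))

# Example: record the initial seizure of a drive.
log = [CustodyEntry("HD-0042", "collected", "J. Smith",
                    "Suite 300, workstation 12", "write blocker + imager",
                    "initial seizure at incident scene")]
```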

Caution

Computer evidence is very volatile; it is therefore of utmost importance to protect the chain of custody throughout the entire evidence lifecycle.

Even though many forensic investigations might not lead to court cases or legal showdowns, some do, so you must always maintain the integrity of evidence. After collecting and recording evidence, you will likely need hard drives or fixed disks for duplication. Any analysis needs to be performed on a copy of the evidence so that the original can remain safely stored away. The objective of disk imaging is to preserve the original copy in a pristine state and to provide the analyst with a copy to use for investigation. This process usually consists of three steps:

  1. Remove the drive from the suspect’s computer.

  2. Connect the suspect’s drive to a write blocker and fingerprint it using a message digest.

  3. Copy the suspect’s drive to a cleanly wiped drive or to an image file.

The copy must be an exact copy of the original. This is known as a bit-level copy, or physical copy: a copy of everything, including all files, file slack, and drive slack or free space. A logical copy, by contrast, captures only the files and folders the file system exposes, so it misses slack and free space.
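Conceptually, a bit-level copy reads raw bytes rather than files. The following Python sketch is illustrative only: a real acquisition reads the raw device behind a hardware write blocker and typically uses dedicated imaging tools. It copies a source to an image file while fingerprinting it for the authentication step described later:

```python
import hashlib

def image_and_fingerprint(source_path, image_path, chunk=1024 * 1024):
    """Copy a source device/file to an image file while hashing it.

    Illustrative sketch: in practice the source would be a raw device
    (e.g., /dev/sdb) accessed through a write blocker.
    """
    digest = hashlib.sha256()
    with open(source_path, "rb") as src, open(image_path, "wb") as dst:
        while True:
            block = src.read(chunk)
            if not block:
                break
            digest.update(block)  # fingerprint the original bits
            dst.write(block)      # write the same bits to the image
    return digest.hexdigest()
```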

Caution

Investigators must use caution when seizing computer systems because the equipment might be booby-trapped. That is, the device may be set up to act as a dead man’s switch that will activate when a network connection is broken or when a computer case is opened. Such a switch can wipe all the information on the device, encrypt files, turn off a self-encrypted drive, or take other actions that make the data inaccessible.

It’s critical that a hard drive used to receive a copy of the evidence not have any files, data, or information stored on it. Common practice is to wipe the drive before using it to receive the copy. Drive wiping is the process of overwriting all addressable locations on the disk. The U.S. Department of Defense (DoD) drive-wiping standard 5220.22-M states, “All addressable locations must be overwritten with a character, its complement, then a random character and verify.” Drive wiping is useful for forensic purposes, for organizations that want to dispose of hard drives, and for criminals who want to dispose of evidence. By making up to seven wiping passes over the media, an organization can further decrease the possibility of data recovery.
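Here is a minimal sketch of the overwrite pattern the standard describes, applied to a single file for illustration. Real drive wiping must address the raw device, and SSD wear leveling can defeat file-level overwrites:

```python
import os
import secrets

def dod_style_wipe(path):
    """Three-pass overwrite in the spirit of DoD 5220.22-M:
    a character, its complement, then random data (verification omitted)."""
    size = os.path.getsize(path)
    patterns = [b"\x55", b"\xAA", None]  # None means random bytes
    with open(path, "r+b") as f:
        for pattern in patterns:
            f.seek(0)
            remaining = size
            while remaining > 0:
                n = min(65536, remaining)
                f.write(pattern * n if pattern else secrets.token_bytes(n))
                remaining -= n
            f.flush()
            os.fsync(f.fileno())  # force the pass to physical storage
    # A compliant process would re-read the media to verify the final pass.
```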

Authentication

Having an exact copy of the data in an investigation is just a start. You must also show that the copy and the original are exactly the same. This verification can be accomplished by means of hashing or other integrity algorithms that fingerprint the original drive and the forensically produced copy. Integrity checks ensure the veracity of the information and allow users of that information to have confidence in its correctness. There are many ways that data can become distorted, either accidentally or intentionally. A forensic analyst must protect against all distortion.
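A minimal verification sketch, assuming the SHA-256 fingerprint recorded at acquisition time (as in the earlier imaging sketch) is available for comparison:

```python
import hashlib

def matches_original(image_path, original_sha256, chunk=1024 * 1024):
    """Recompute the image's SHA-256 and compare it with the
    fingerprint taken when the evidence was acquired."""
    digest = hashlib.sha256()
    with open(image_path, "rb") as f:
        while block := f.read(chunk):
            digest.update(block)
    return digest.hexdigest() == original_sha256
```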

Integrity

Integrity can apply to paper documents as well as electronic ones. Forgers can copy and create fake paper documents, but that skill is not easily learned; integrity in electronic documents and data is much more difficult to protect. Forensic duplication and verification rely on cryptographic one-way hashing algorithms. Rules of evidence generally require that a duplicate admitted in place of the original data be an exact duplicate of the original. The hash values must match and be of sufficient strength to prove that tampering has not occurred. Not every investigation you become involved in will go to court, but ethics and good practice require that evidence be authenticated as unchanged from the moment of discovery to the point of disposal.

Tip

A primary image is the original image. It should be held in storage and kept unchanged. The working image is the image used for analysis purposes. Forensic examiners should work on the working image only.

Analysis

Analysis is the process of examining the evidence. Forensic analysts typically make two copies of the original drive and work with one of the copies. The following items are commonly analyzed in an investigation:

  • Images Word documents, compressed files, and images

  • Images Deleted items

  • Images Files created/accessed/modified on suspect dates (see the sketch after this list)

  • Images Email files (such as .PST files)

  • Images Files stored in NTFS streams
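As a simple illustration of the “suspect dates” item above, the following Python sketch walks a directory tree (in practice, a mounted working image) and yields files modified within a given window. Note that mounting media read-write can itself alter these timestamps, which is one more reason to work only on a copy:

```python
import os
from datetime import datetime, timezone

def files_touched_between(root, start, end):
    """Yield (path, mtime) for files modified within a suspect window.

    start and end are timezone-aware datetime objects.
    """
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            try:
                mtime = datetime.fromtimestamp(
                    os.path.getmtime(path), tz=timezone.utc)
            except OSError:
                continue  # unreadable entry; note it in the examiner's log
            if start <= mtime <= end:
                yield path, mtime
```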

Forensic investigators use many different programs to review the evidence. With dead analysis, a machine is turned off and the drive analyzed. Sometimes, a machine must be analyzed without being turned off; this is a live analysis. With live analysis, it is critical that evidence be examined from most volatile to least volatile.
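The text names the most-volatile-first principle but not the order itself. The list below follows the commonly cited RFC 3227 guidance, which is an assumption layered onto the text, and sketches how a collection plan could be sorted accordingly:

```python
# Order of volatility, most to least volatile, per RFC 3227 guidance
# (an assumption; the text states only the principle).
ORDER_OF_VOLATILITY = [
    "CPU registers and cache",
    "routing table, ARP cache, process table, kernel statistics, RAM",
    "temporary file systems (e.g., swap)",
    "data on disk",
    "remote logging and monitoring data",
    "physical configuration and network topology",
    "archival media (backups)",
]

def collection_plan(sources):
    """Sort available evidence sources so the most volatile are captured first."""
    rank = {name: i for i, name in enumerate(ORDER_OF_VOLATILITY)}
    return sorted(sources, key=lambda s: rank.get(s, len(rank)))
```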

Note

No single program will do everything you need to perform during an investigation. As an example, you may want to use hex editors to examine slack space and deleted items.

The Disaster Recovery Lifecycle

Disaster recovery is closely related to incident response and forensics. The purpose of disaster recovery is to get a damaged organization restarted so that critical business functions can resume. When a disaster occurs, the process of progressing back to normal operations includes the following stages:

  1. Crisis management

  2. Recovery

  3. Reconstitution

  4. Resumption

Federal and state government entities typically use a continuity of operations (COOP) site, which is designed to take on operational capabilities when the primary site is not functioning. The length of time that the COOP site is active and the criteria in which the COOP site is enabled depend on the business continuity and disaster recovery plans. Both government and nongovernment entities typically make use of a checklist to manage continuity of operations. Table 8.5 shows a sample disaster recovery checklist.

TABLE 8.5 Disaster Recovery Checklist

| Time Frame           | Activity                                                                                   |
|----------------------|--------------------------------------------------------------------------------------------|
| When disaster occurs | Notify the disaster recovery manager and recovery coordinator                              |
| Within 2 hours       | Assess damage, notify senior management, and determine immediate course of action          |
| Within 4 hours       | Contact offsite facility, recover backups, and replace equipment as needed                 |
| Within 8 hours       | Provide management with an updated assessment and begin recovery at the alternate site     |
| Within 36 hours      | Reestablish full processing at the alternate site and determine a timeline for return to the primary facility |

ExamAlert

The disaster recovery manager should direct short-term recovery actions immediately following a disaster.

Individuals responsible for emergency management need to assess damage and perform triage. The areas impacted the most need attention first. Protection of life is a priority while working to mitigate damage. Recovery from a disaster requires that personnel be sent to the recovery site. When employees and materials are at the recovery site, interim functions can resume operations. This might entail installing software and hardware. Backups might need to be loaded, and systems might require configuration.

The recovery process does not necessarily occur as a series of steps. For example, while the recovery process is taking place, teams are also dispatched to the disaster site to start the cleanup, salvage, and repair process. When those processes are complete, normal operations can resume.

When operations are moved from the alternate operations site back to the restored site, the efficiency of the restored site must be tested. Processes should be sequentially returned, from least critical to most critical. In the event that a few glitches need to be worked out in the restored facility, you can be confident that your most critical processes are still in full operation at the alternate site.

Teams and Responsibilities

Individuals involved in disaster recovery carry many responsibilities; when called to action, their activities focus on emergency response, damage assessment, recovery operations, and restoration. Figure 8.8 illustrates an example of disaster recovery activities.

FIGURE 8.8 Disaster Recovery Timeline

The recovery team has the necessary authority and responsibility to get the alternate site up and running. This site is used as a stand-in for the original site until full operations can be restored.

Caution

Physical security is of great importance after a disaster. Precautions such as guards, temporary fencing, and barriers should be deployed to prevent looting and vandalism.

Recovery Strategy

When a disaster occurs, a recovery strategy is needed. A recovery strategy involves planning for failure by building in resiliency. Developing a successful recovery strategy requires the support of senior management. To judge the best strategy to recover from a given interruption, the team must evaluate and complete the following:

  • Images Detailed documentation of all costs associated with each possible alternative

  • Images Quoted cost estimates for any outside services that might be needed

  • Images Written agreements with chosen vendors for all outside services

  • Images Possible resumption strategies in the event of a complete loss of the facility

  • Images Documentation of findings and conclusions as a report to management of chosen recovery strategy for feedback and approval

This information is used to determine the best course of action based on the analysis of data from the business impact analysis (BIA). With so much to consider, it is helpful to divide the organization’s recovery into specific areas, functions, or categories, such as the following:

  • Images Business process recovery

  • Images Facility and supply recovery

  • Images User recovery

  • Images Operations recovery

Business Process Recovery

Business processes may be interrupted by the loss of personnel, critical equipment, supplies, or office space, or by labor actions such as strikes. Even if a facility is intact after a disaster, people are required and are an important part of the business process recovery.

Workflow diagrams and documents can assist with business process recovery by mapping relationships between critical functions to evaluate interdependencies. Often, a critical process cannot be performed because a related process was left out of the workflow. For example, say that you bring in the hardware, software, electric supply, and a systems engineer to restore a computerized business process; however, you do not have any network cables to connect the equipment. Now all the vendors are closed because of the storm, and no $5 networking cables are available. A process flow created before disaster strikes can identify what needs to be done and what parts and components will be needed. Building a workflow diagram allows an organization to examine the resources required for each step and the functions that are critical for continued business operations.

Facility and Supply Recovery

Facility and supply interruptions can be caused by fire, loss of inventory, transportation or telecommunications problems, or even heating, ventilating, and air conditioning (HVAC) problems. An emergency operations center (EOC) must be established, and redundant services must be enabled for rapid recovery from interruptions. Many options are available, from a dedicated offsite facility, to agreements with other organizations for shared space, to the option of putting up a prefab building and leaving it empty as a type of cold backup site. The following sections examine some of these options.

Subscription Services

Building and running data-processing facilities is expensive. Organizations typically opt instead to contract their EOC facility needs to a subscription service. The CISSP exam categorizes these subscription services as hot, warm, and cold sites.

A hot site is ready to be brought online quickly. It is fully configured and equipped with the same systems as the regular production site. It can be made operational within just a few hours. A hot site needs staff, data, and procedural documentation. Hot sites are a high-cost recovery option but can be justified when a short recovery time is required. A hot site subscription service involves a range of associated fees, including monthly cost, subscription fees, testing costs, and usage or activation fees. Contracts for hot sites need to be closely examined because some services charge extremely high activation fees to discourage subscribers from utilizing these facilities for anything less than a true disaster. To get an idea of the types of costs involved, www.drj.com reports that subscriptions for hot sites average 52 months in duration, and costs can be as high as $120,000 per month.

Caution

It’s possible that during a disaster, one backup site might not be available. Many organizations therefore have a backup to a backup site. Such a site is known as a tertiary site.

Regardless of the fees involved, a hot site needs to be periodically tested to evaluate processing abilities as well as security. The physical security of a hot site should be at least at the same level as the security of the primary site. Finally, it is important to remember that a hot site is intended for short-term use only. With a subscriber-based service, there might be others in line for the same resource once your contract ends. An organization should have a plan to recover primary services quickly or move to a secondary location.

Caution

To decrease risk of sabotage and other potential disruptions, hot sites should not be externally identifiable.

For organizations that lack the funds for a hot site, or in situations where a short-term outage is tolerable, a warm site might be the right choice. A warm site is partially configured, with power, cabling, and some data-processing equipment in place. It could be made operational within a few hours to a few days. The assumption with a warm site is that the necessary computer equipment and software can be procured despite the disaster. Although a warm site might have some computer equipment installed, it is typically of lower processing power than the equipment at the primary site. The costs associated with a warm site are slightly lower than those of a hot site (see Figure 8.9).

FIGURE 8.9 Recovery Site Availability Versus Cost

In situations where even longer outages are acceptable, a cold site might be the right choice. A cold site is basically an empty room with only rudimentary electrical power and environmental support and no computing capability in place. Although it might have a raised floor and some racks, it is nowhere near ready for use. It might take several weeks to a month to get the site operational. Cold sites are less ready than hot and warm sites, but the associated costs are also much lower than for hot or warm sites, averaging $2,000 per month or more.

Tip

Cold sites are a good choice for the recovery of noncritical services.

Redundant Sites

The CISSP exam considers redundant sites to be sites owned by the organization. Although these sites might be either partially or totally configured, the CISSP exam does not typically expect you to know that level of detail. A redundant site is capable of handling all operations if another site fails. Although a redundant site can be expensive, it offers an organization fault tolerance, which is necessary for an organization that cannot withstand any downtime. If redundant sites are geographically dispersed, the possibility of more than one being damaged is reduced. For low- to medium-priority services, a distance of 10 to 20 miles from the primary site is considered acceptable. If the loss of services, for even a very short time, could cost the organization millions of dollars, the redundant site should be farther away. Therefore, redundant sites that are meant to support highly critical services should not be in the same geographic region or subject to the same types of natural disasters as the primary site.

An organization that has multiple sites dispersed in different regions of the world might choose to use multiple processing centers. This way, a branch in one area can act as backup for a branch in another area.

Mobile Sites

Mobile sites are usually tractor-trailer rigs that have been converted into data-processing centers. These sites contain all the necessary equipment and are mobile, permitting quick transport to any business location. Rigs can also be chained together to provide space for data processing and communication capabilities. Mobile units are a good choice for areas where no recovery facilities exist and are commonly used by the military and by organizations such as large insurance agencies for immediate response during a disaster. They can get critical services up and running quickly and commonly provide tactical satellite services, but they do not work as a long-term solution.

Note

Mobile sites are a non-mainstream alternative to traditional recovery options. Mobile sites typically consist of fully contained tractor-trailer rigs that come with all the facilities needed for a data center. Units can be quickly moved to any site and are well suited for use after storms, whose paths are hard to predict.

Whatever recovery method is chosen, regular testing is important to verify that the redundant site meets the organization’s needs and that the team can handle the workload to meet minimum processing requirements.

Reciprocal Agreements

In a reciprocal agreement, two organizations pledge to assist one another in the event of a disaster. The organizations would share space, computer facilities, and technology resources. On paper, this appears to be a cost-effective approach, but it has drawbacks. Each party to such an agreement must place its trust in the other organization to provide aid in the event of a disaster. However, the party that has not been affected by the disaster may be hesitant to follow through when a disaster actually occurs.

Also, confidentiality is an important issue with a reciprocal agreement. The damaged organization is in a vulnerable position and must trust the other party to safeguard its confidential information. Legal liability can also be a concern; for example, one organization might agree to help another and be hacked as a result. Finally, if the two parties to a reciprocal agreement are geographically near one another, there is a danger that a disaster could strike both of them, thereby rendering the agreement useless.

The biggest drawbacks to reciprocal agreements are that they are hard to enforce and that, many times, incompatibilities in organization hardware, software, and even cultures are not discovered until after a disaster strikes.

User Recovery

User recovery focuses on what employees need in order to do their jobs. User recovery must address the following:

  • Images Procedures, documents, and manuals

  • Images Communication systems

  • Images Means of mobility and transportation to and from work

  • Images Workspace and equipment

  • Images Alternate site facilities

  • Images Basic human requirements, such as food and water, sanitation facilities, rest, money, and morale

An organization might be able to get employees to a backup facility after a disaster, but if there are no phones, desks, or computers, the employees’ ability to work will be severely limited.

User recovery plans sometimes need to consider food. For example, my brother-in-law works for a large chemical company on the Texas Gulf Coast. During hurricanes and other disasters, he is required to stay at work as part of the emergency operations team. His job requires him to stay at the facility regardless of whether the disaster lasts two days or two weeks. During a simulation test several years ago, it was discovered that someone had forgotten to order food for the facility where the employees were to remain for the duration of the drill. Luckily, the 40 or so hungry employees were not really in a disaster and were able to have pizza delivered. Had it been a real disaster, however, no takeout would have been available.

Operations Recovery

Operations recovery addresses interruptions caused by equipment failure. Redundancy—redundant equipment, redundant arrays of inexpensive disks (RAID), backup power supplies (BPSs), and other redundant services—solves this potential loss of availability.

Hardware failures are some of the most common disruptions that can occur. Preventing this type of disruption is critical to operations. The best time to start planning hardware redundancy is when equipment is purchased. At purchase time, there are two important numbers that a buyer must investigate:

  • Images Mean time between failure (MTBF): Used to calculate the expected lifetime of a device. A higher MTBF number means the equipment should last longer.

  • Images Mean time to repair (MTTR): Used to estimate how long it would take to repair the equipment and get it back into production. Lower MTTR numbers mean the equipment requires less repair time and can be returned to service sooner.

You can use this formula to calculate availability of equipment:

MTBF / (MTBF + MTTR) = Availability
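As a quick sanity check of this formula, here is a minimal Python sketch. The 10,000-hour MTBF and 10-hour MTTR figures are hypothetical, chosen only to illustrate the arithmetic:

def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Availability = MTBF / (MTBF + MTTR)."""
    return mtbf_hours / (mtbf_hours + mttr_hours)

# Hypothetical figures: a device rated at 10,000 hours MTBF that takes
# 10 hours to repair is available about 99.9% of the time.
print(f"{availability(10_000, 10):.4%}")  # 99.9001%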

To maximize availability of critical equipment, an organization can consider obtaining a service level agreement (SLA). There are many kinds of SLAs, but an SLA for operations recovery is a contract between an organization and a hardware vendor in which the vendor promises to provide a certain level of protection and support. For a fee, the vendor agrees to repair or replace the covered equipment within the contracted time.

Fault tolerance can be applied at the server or drive level. For servers, clustering is a technology that allows for high availability; clustering means grouping multiple servers together so that they are viewed logically as a single server. Users see a cluster as one unit. The advantage is that if one server in the cluster fails, the remaining active servers pick up the load and continue operation.

Fault tolerance at the drive level is achieved primarily with RAID, which provides hardware fault tolerance and/or performance improvements. This is accomplished by breaking up the data and writing it across one or more disks. To applications and other devices, RAID appears as a single drive. Most RAID systems have hot-swappable disks. This means that faulty drives can be removed and replaced without turning off the entire computer system. If a RAID system uses parity and is fault tolerant, the parity data can be used to reconstruct the newly replaced drive. The technique for writing the data across multiple drives is called striping. With striping, although write performance remains almost constant, read performance is drastically increased.

All hard drives and data storage systems fail. It’s not a matter of if but when. There are many types of RAID; Table 8.6 lists and describes the nine most common types.

TABLE 8.6 RAID Levels

Level | Type | Description
RAID 0 | Striped disk without fault tolerance | Provides data striping but no fault tolerance. If one drive fails, all data in the array is lost.
RAID 1 | Mirroring and duplexing | Provides disk mirroring. Level 1 provides twice the read transaction rate of single disks and the same write transaction rate as single disks.
RAID 2 | Error-correcting | Stripes data at the bit level rather than the block level. This level of RAID is rarely used.
RAID 3 | Bit-interleaved parity | Offers byte-level striping with a dedicated parity disk.
RAID 4 | Dedicated parity drive | Provides block-level striping with a dedicated parity disk. If a data disk fails, the parity data is used to rebuild the replacement disk.
RAID 5 | Block-interleaved distributed parity | Provides block-level striping with parity distributed across all disks, good performance, and good fault tolerance. It is one of the most popular types of RAID.
RAID 6 | Independent data disks with double parity | Provides block-level striping with two sets of parity distributed across all disks, so the array can survive the loss of two drives.
RAID 10 | A stripe of mirrors | Combines RAID 1 mirrors with a RAID 0 stripe. This is not one of the original RAID levels; it is sometimes loosely referred to as 0+1, although strictly speaking RAID 0+1 is a mirror of stripes rather than a stripe of mirrors.
RAID 15 | Mirrors and parity | Combines RAID 1 mirroring with RAID 5 distributed parity. This is not one of the original RAID levels.

It is worth mentioning that RAID Level 0 is used for performance only and not for redundancy.

The most expensive RAID solution to implement is RAID Level 1 because all the data on disk A is mirrored on disk B. However, mirroring has a disadvantage: If data on disk A is corrupted, data on disk B will also become corrupted.

The most common form of RAID is RAID 5. RAID 5 striping is useful because it offers a balance of performance and usability. RAID 5 stripes both data and parity information across three or more drives, whereas RAID 3 uses a dedicated parity drive.

Striping the data and parity across all drives removes the drive stress that the dedicated parity drive inflicts. Fault tolerance is provided by ensuring that, if any one drive dies, the other drives maintain adequate information to allow for continued operation and eventual rebuilding of the failed drive (once replaced).
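The parity mechanism behind such a rebuild is ordinarily a simple XOR. The sketch below is illustrative Python, not a real RAID driver; it assumes two equal-size data blocks per stripe and shows how XORing the surviving block with the parity block reconstructs the lost one:

# Illustrative Python, not a real RAID driver. Parity is the XOR of the
# data blocks in a stripe; XORing the survivors with the parity block
# reconstructs any single lost block.

def xor_blocks(*blocks: bytes) -> bytes:
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, b in enumerate(block):
            out[i] ^= b
    return bytes(out)

d0, d1 = b"DATA", b"MORE"       # two data blocks in one stripe
parity = xor_blocks(d0, d1)     # parity block written to a third drive

# The drive holding d1 fails; rebuild its contents from the survivors:
rebuilt = xor_blocks(d0, parity)
assert rebuilt == d1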

Just a bunch of disks (JBOD) is somewhat like RAID, but it is really not RAID at all. JBOD can use existing hard drives of various sizes, combined together into one massive logical disk. JBOD provides no fault tolerance and no increase in speed. The only benefit of JBOD is that you can use existing disks, and if one drive fails, you lose the data on only that drive. Both of these advantages are minimal, so don’t expect to see too many organizations actually using this technique.

To better understand how RAID and JBOD technologies compare, take a moment to review Figure 8.10.

FIGURE 8.10 RAID Technologies

ExamAlert

Fault tolerance and RAID are important controls. For the CISSP exam, you should be able to define RAID and describe specific levels and each level’s attributes. For example, you should know that RAID 1 has the highest cost per byte, and RAID 5 is the most widely used type.

Fault Tolerance

Fault tolerance requires a redundant system so that in the event of a failure, a backup system can take the place of the primary system. Tape and hard drives are commonly used for fault tolerance. A tape-based system is an example of a sequential access storage device (SASD). If you need information from a portion of the tape, you must advance the tape to the required position in order to access the information. A hard drive is an example of a direct access storage device (DASD). The advantage of a DASD is that information can be accessed much more quickly.

One option that can be used to speed up the sequential process when large amounts of data need to be backed up is a redundant array of independent tapes (RAIT). RAIT is efficient when large numbers of write operations are needed for massive amounts of data. RAIT stripes the data across multiple tapes, much as a RAID array does, and it can function with or without parity.

Another technology, massive array of inactive disks (MAID), offers a distributed hardware option for storing data and applications. It was designed to reduce the operational costs and improve the long-term reliability of disk-based archives and backups. MAID is similar to RAID except that it provides power management and advanced disk monitoring. MAID might or might not stripe data and/or supply redundancy. A MAID system powers down inactive drives, which reduces heat output and electrical consumption and increases the disk drives' life expectancy.

Storing and managing so much data can become a massive task for an organization. Organizations might have tape drives, MAID, RAID, optical jukeboxes, and other storage solutions to manage. To control all these systems, many organizations now use storage area networks (SANs). Although SANs are not common in small organizations, large organizations with massive amounts of data can use them to provide redundancy, fault tolerance, and backups. The beauty of this type of system is that the end user does not have to know the location of the information; the user simply requests the data, and the SAN retrieves it.

It is not just data that can be made fault tolerant. Computer systems can also benefit from fault tolerance. Redundant servers can be used, and the computing process can be distributed to take advantage of the power of many computers. There are two related ways to provide fault tolerance for computer systems:

  • Images Clustering: Clustering involves grouping computers to reach a greater level of availability than is possible with redundant servers. Whereas a redundant server waits until it is needed, a clustered server actively participates in servicing the load. If one of the clustered servers fails, the remaining servers pick up the slack.

Note

A server farm can be used as a cluster of computers for complex tasks or in cases where supercomputers might have been used in the past.

  • Images Distributed computing: This technique is similar to clustering except there is no central control. Distributed computing, also known as grid computing, can be used for processes that require massive amounts of computing power. Because grid computing is not under centralized control, it should not be used for processes that require high security. Distributed computing also differs from clustering in that distributed nodes can join or leave the grid as they please.

Note

An example of distributed computing can be seen in the 2020 project Minecraft@Home, which was used to study questions related to Minecraft, such as the properties of worlds that can be generated from different random seeds.

Data and Information Recovery

Solutions to data interruptions include backups, offsite storage, and remote journaling. Because data processing is essential to most organizations, a data and information recovery plan is critical. The objective of such a plan is to back up critical software and data to enable quick restores with the least possible loss of content. Policy should dictate when backups are performed, where the media is stored, who has access to the media, and what the reuse or rotation policy will be. Types of backup media include tape reels, tape cartridges, removable hard drives, solid-state storage, disks, and cassettes.

Tape and optical systems still have the majority of the market share for backup systems. Common types of media include:

  • Images 8 mm tape

  • Images CD-R/RW media (recommended for temporary storage only)

  • Images Digital audio tape (DAT)

  • Images Digital linear tape (DLT)

  • Images Quarter inch tape (QIC)

  • Images Write-once/read-many (WORM)

ExamAlert

CISSP exam questions regarding different backup types can be quite tricky. Make sure you clearly know the differences before the exam. Backups can also be associated with disaster recovery planning metrics such as RPO, RTO, and MTTR.

Backups

Backups need to be stored somewhere, and they need to be accessible when it’s time to restore not just data but applications and configuration settings as well. Where the backup media is stored can have a real impact on how quickly data can be restored and brought back online. The media should be stored in more than one physical location to reduce the possibility of loss. Remote sites should be managed by a media librarian. It is this individual’s job to maintain the site, control access, rotate media, and protect this valuable asset. Unauthorized access to the media is a huge risk because it could impact the organization’s capability to provide uninterrupted service. Who transports the media to and from the remote site is also an important concern. Important backup and restoration considerations include the following:

  • Images Maintenance of secure transportation to and from the site

  • Images Use of bonded delivery vehicles

  • Images Appropriate handling, loading, and unloading of backup media

  • Images Use of drivers trained in proper procedures related to picking up, handling, and delivering backup media

  • Images Legal obligations for data, such as encrypted media, and separation of sensitive data sets, such as credit card numbers and credit card security codes

  • Images 24/7 access to the backup facility in the event of an emergency

An organization should contract its offsite storage needs with a known firm that demonstrates control of its facility and is responsible for its maintenance. Physical and environmental controls at offsite storage locations should be equal to or better than those in the organization’s own facility. A letter of agreement should specify who has access to the media and who is authorized to drop it off or pick it up. There should also be agreement on response times that will be met in the event of a disaster. Onsite storage should maintain copies of recent backups to ensure the capability to recover critical files quickly.

Backup media should be securely maintained in an environmentally controlled facility with physical controls appropriate for critical assets. The area should be fireproof, and a media librarian should log the access of everyone depositing or removing media.

Table 8.7 lists some sample functions and their recovery times.

TABLE 8.7 Organization Functions and Example Recovery Times

Function | Recovery Time | Recovery Strategy
Database | Minutes to hours | Database shadowing (covered later in this chapter, in the section “Other Data Backup Methods”)
Help desk | 7 to 14 days | Warm site
Research and development | Several weeks to a month | Cold site
Purchasing | 1 to 2 days | Hot site
Payroll | 1 to 5 days | Multiple sites

Software itself can be vulnerable, even when good backup policies are followed, because sometimes software vendors go out of business or no longer support needed applications. In these instances, escrow agreements can help. Escrow agreements allow an organization to obtain access to the source code of business-critical software if the software vendor goes bankrupt or otherwise fails to perform as required. Given the myriad compilers and operating systems, source code escrow agreements now address everything required to build a product, including operating systems, tools, and compilers.

Each backup method has benefits and drawbacks. Full backups are the most comprehensive but take the longest time to create. So, even though it might seem best to do a full backup every day, it might not be possible due to the time and expense.

Tip

Two basic methods can be used to back up data: automated and on-demand backups. Automated backups are scheduled to occur at a predetermined time. On-demand backups are run manually whenever they are needed.

Full Backups

During a full backup, all data is backed up, and no files are skipped or bypassed; you simply designate which server to back up. A full backup takes the longest to perform and the least time to restore because only one backup data set is required.

Differential Backups

With a differential backup, a full backup is typically done once a week, and a differential backup, which involves backing up all files that have changed since the last full backup, is done more frequently, typically daily. If you need to restore, you need the last full backup and the most recent differential backup.

Differential backups make use of files’ archive bits. The archive bit indicates that a file is ready for archiving, or backup. A full backup clears the archive bit for each backed-up file. Then, if anyone makes changes to one of these files, its archive bit is toggled on. During a differential backup, all the files that have the archive bit on are backed up, but the archive bit is not cleared until the next full backup. Because more files will likely be modified during the week, the differential backup time will increase each day until another full backup is performed; still, this method takes less time than a daily full backup. The value of a differential backup is that only two backup data sets are required: the full backup and the differential backup.

Incremental Backups

With an incremental backup strategy, a full backup is typically scheduled once a week, and daily incremental backups capture only the files that have changed since the previous full or incremental backup.

Unlike in a differential backup, in an incremental backup, the archive bit is cleared on backed-up files; therefore, incremental backups back up only changes made since the last incremental backup. This is the fastest backup option, but it takes the longest to restore because the full backup must be restored, and then all the incremental backups must be restored, in order.
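To tie the three backup types together, here is a minimal Python sketch of the archive-bit bookkeeping just described. It models only the bit itself (real backup software tracks far more state), and the file names are hypothetical:

# Minimal sketch of archive-bit behavior for the three backup types.
# Real backup software tracks far more state; this models only the bit.
# The file names are hypothetical.

files = {"payroll.db": True, "report.doc": True}  # True = archive bit set

def full_backup():
    backed_up = list(files)            # everything, regardless of the bit
    for f in files:
        files[f] = False               # a full backup clears the bit
    return backed_up

def incremental_backup():
    backed_up = [f for f, bit in files.items() if bit]
    for f in backed_up:
        files[f] = False               # incremental also clears the bit
    return backed_up

def differential_backup():
    return [f for f, bit in files.items() if bit]  # bit left untouched

full_backup()                  # Sunday: backs up both files, clears bits
files["payroll.db"] = True     # Monday: file modified, bit set again
print(differential_backup())   # ['payroll.db']; the bit stays set, so the
                               # file appears again in Tuesday's differential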

Tape Rotation Schemes

Tapes and other media used for backups eventually fail. It is important to periodically test backup media to verify its functionality. Some tape rotation methods include the following:

  • Images Simple: A simple tape-rotation scheme uses one tape for every day of the week and then repeats the pattern the following week. One tape can be for Monday, one for Tuesday, and so on. You add a set of new tapes each month and then archive the previous month’s set. After a predetermined number of months, you put the oldest tapes back into use.

  • Images Grandfather-father-son (GFS): With this scheme, you typically use one tape for monthly backups, four tapes for weekly backups, and four tapes for daily backups (assuming that you are using a five-day work week). It is called grandfather-father-son because the scheme establishes a kind of hierarchy: The grandfather is the single monthly backup, the fathers are the four weekly backups, and the sons are the four daily backups.

  • Images Tower of Hanoi: This tape-rotation scheme is named after a mathematical puzzle. It involves using five sets of tapes, labeled A through E. Set A is used every other day; set B is used on the first non-A backup day and is used every 4th day; set C is used on the first non-A or non-B backup day and is used every 8th day; set D is used on the first non-A, non-B, or non-C day and is used every 16th day; and set E alternates with set D.
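The Tower of Hanoi schedule can be computed from the binary "ruler" sequence: the set used on backup day n corresponds to the number of trailing zero bits in n, capped at the last set so that sets D and E alternate. The following short Python sketch of that scheduling logic is a thinking aid, not any vendor's implementation:

# Sketch of Tower of Hanoi rotation: the set used on backup day n follows
# the binary "ruler" sequence (count of trailing zero bits), capped at the
# last set so that set E alternates with set D.

SETS = "ABCDE"

def tape_set(day: int) -> str:
    trailing_zeros = (day & -day).bit_length() - 1
    return SETS[min(trailing_zeros, len(SETS) - 1)]

print("".join(tape_set(d) for d in range(1, 17)))
# -> ABACABADABACABAE  (A every other day, B every 4th, C every 8th, ...)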

Note

Some backup applications perform continuous backups and keep a database of backup information. These systems are useful when a restoration is needed because the application can provide a full restore, a point-in-time restore, or a restore based on a selected list of files.

Data Replication Techniques

Data replication can be handled using two basic techniques, each of which provides various capabilities:

  • Images Synchronous replication: This technique uses an atomic write operation, which either completes on both sides or is abandoned. Its strength is that it guarantees no data loss; the tradeoff is write latency, because the primary system must wait for the remote site to acknowledge each write.

  • Images Asynchronous replication: This technique updates as allowed but may experience some performance degradation. Its downside is that the remote storage facility may not have the most recent copy of data; therefore, some data may be lost in the case of an outage.
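The following conceptual Python sketch contrasts the two replication techniques. Replicas are modeled as plain in-memory lists rather than real storage APIs, so it is only a simplified illustration of the acknowledgment behavior:

# Conceptual sketch only: replicas are plain in-memory lists, not real
# storage APIs, and latency is implied rather than simulated.

class SyncReplica:
    def __init__(self):
        self.primary, self.remote = [], []

    def write(self, record):
        # Atomic pair: acknowledge only after BOTH copies are written.
        self.primary.append(record)
        self.remote.append(record)   # caller waits on remote round trip
        return "ack"                 # no data loss, higher write latency

class AsyncReplica:
    def __init__(self):
        self.primary, self.remote, self.queue = [], [], []

    def write(self, record):
        self.primary.append(record)
        self.queue.append(record)    # shipped later; lost if we crash now
        return "ack"                 # fast ack, remote copy may lag

    def drain(self):
        while self.queue:            # background replication catch-up
            self.remote.append(self.queue.pop(0))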

Other Data Backup Methods

Other alternatives exist for further enhancing an organization’s resiliency and redundancy. Some organizations use the following techniques by themselves, and others combine these techniques with other backup methods:

  • Images Database shadowing: Databases are high-value assets for most organizations. File-based incremental backups can read only entire database tables and are considered too slow. A database shadowing system writes the data to two physical disks. It creates good redundancy by duplicating the database sets to mirrored servers. Therefore, this is an excellent way to provide fault tolerance and redundancy. Shadowing mirrors changes to the database as they occur.

  • Images Electronic vaulting: Electronic vaulting involves making a copy of database changes to a secure backup location. It is a batch-process operation in which all current records, transactions, and/or files are copied to the offsite location. To implement vaulting, an organization typically loads a software agent onto the systems to be backed up, and then, periodically, the vaulting service accesses the software agent on these systems to copy changed data.

  • Images Remote journaling: Remote journaling is similar to electronic vaulting, except that information is duplicated to the remote site as it is committed on the primary system. By performing live data transfers, this mechanism allows alternate sites to be fully synchronized and fault tolerant at all times. Depending on the configuration, it is possible to configure remote journaling to record only the occurrence of transactions and not the contents of the transactions. Remote journaling can provide a very high level of redundancy. (A brief sketch of this idea appears after the Caution that follows this list.)

  • Images Storage area network (SAN): A SAN supports disk mirroring, backup and restore, archiving, and retrieval of archived data in addition to data migration from one storage device to another. A SAN can be implemented locally or can use storage at a redundant facility.

  • Images Cloud computing backup: This type of backup can offer a cost-savings alternative to traditional backup techniques. These backups should be carefully evaluated, as there are many concerns when using cloud-based services. Cloud backups can be deployed in a variety of configurations, such as onsite private clouds or offsite public or private clouds.

Caution

If you use offsite public cloud storage, you should encrypt the backup.
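Returning to remote journaling, here is the toy Python model promised earlier: each committed transaction is appended to a journal that is shipped to the remote site as it occurs, and the alternate site replays the journal to stay synchronized. All of the names are illustrative placeholders:

# Toy model of remote journaling: committed transactions are appended to
# a journal shipped to the remote site as they occur, and the alternate
# site replays the journal to stay synchronized. Names are placeholders.

class Journal:
    def __init__(self):
        self.entries = []

    def append(self, txn):
        self.entries.append(txn)

primary_state, remote_journal = {}, Journal()

def commit(key, value):
    primary_state[key] = value
    remote_journal.append((key, value))   # duplicated as it is committed

def replay_at_alternate_site():
    state = {}
    for key, value in remote_journal.entries:
        state[key] = value
    return state

commit("acct:1001", 500)
commit("acct:1001", 450)
assert replay_at_alternate_site() == primary_state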

Choosing the Right Backup Method

It is not easy to choose the right backup method. To start the process, a disaster recovery team must consider the length of outage the organization can endure and how current the restored information must be:

  • Images Recovery point objective (RPO): This metric indicates how much data an organization can afford to lose. The greater the RPO, the more tolerant the process is to interruption.

  • Images Recovery time objective (RTO): This metric specifies the maximum acceptable time to recover the data. The same metric applies to the applications that use the data and to the time it would take to transfer the data to an alternate site. The goal of disaster recovery planning is to determine the time it would take to get the data up and running, whether at the primary site or an alternate site. The greater the RTO, the longer the recovery process can take; an organization that can tolerate interruption can handle a larger RTO.

Figure 8.11 illustrates how RPO and RTO can be used to determine acceptable downtime.

FIGURE 8.11 RPO and RTO

ExamAlert

For the CISSP exam, you must know the terms RPO and RTO.

The RPO and RTO metrics are very important. What you should realize about them both is that the lower these time requirements are, the more it costs to maintain the faster restoration capabilities they demand. For example, most banks have a very small RPO because they cannot afford to lose any processed information. Think of the recovery strategy calculations as being designed to meet the required recovery time frames. You can calculate maximum tolerable downtime (MTD) as follows:

MTD = RTO + WRT

where WRT is the work recovery time, which is simply the remainder of the MTD used to restore all business operations (see Figure 8.12).

FIGURE 8.12 MTD, RTO, and WRT
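As a quick worked example of the formula, with hypothetical numbers (a 24-hour MTD and an 8-hour RTO are illustrative only):

def work_recovery_time(mtd_hours: float, rto_hours: float) -> float:
    """WRT = MTD - RTO: time left to restore full business operations."""
    return mtd_hours - rto_hours

# Hypothetical: a 24-hour MTD with an 8-hour RTO leaves 16 hours to
# verify data, test processes, and resume normal business operations.
print(work_recovery_time(24, 8))  # 16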

Plan Design and Development

After determining the RPO and the RTO, the next phase of the business continuity planning process is plan design and development. In this phase, the team designs and develops a detailed plan for the recovery of critical business systems. The plan should focus on major catastrophes and what to do in the event that the entire facility is destroyed. If the organization can handle these types of events, less severe events that render the facility unusable for a time can be readily dealt with.

A business continuity plan (BCP) should include information on both long-term and short-term goals and objectives. In creating the plan, the business continuity planning team should follow these steps:

  1. Identify time-sensitive critical functions and priorities for restoration.

  2. Identify support systems needed by time-sensitive critical functions.

  3. Estimate potential outages and calculate the minimum resources needed to recover from the catastrophe.

  4. Select recovery strategies and determine which vital personnel, systems, and equipment will be needed to accomplish the recovery. (There must be a team for the primary site and the alternate site.)

  5. Determine who will manage the restoration and testing process.

  6. Determine what type of funding and fiscal management are needed to accomplish these goals.

The plan should also detail how the organization will contact and mobilize employees, provide for ongoing communication between employees, interface with external groups and the media, and provide employee services. The following sections discuss these processes.

Personnel Mobilization

The process for contacting employees in the event of an emergency needs to be worked out before a disaster. The process chosen depends on the nature and frequency of the emergency. Outbound dialing systems and call trees are widely used. An outbound dialing system stores the numbers to be called in an emergency. These systems can provide various services, including the following:

  • Images Call rollover: If one number gets no response, the next is called.

  • Images Leave a recorded message: If an answering machine answers, a message can be left for the individual.

  • Images Request a call back: Even if a message is left, the system will continue to call back until the user calls in to the predefined phone number.

A call tree is a communication system in which the person in charge of the tree calls a lead person on each “branch,” who in turn calls all the “leaves” on that branch. If call trees are used, the team should verify that there is a feedback mechanism built in. For example, the last person on each branch of the tree calls back to confirm that he or she got the message. This helps ensure that everyone has been contacted. Call trees can be automated with VoIP, public switched telephone network (PSTN), and online services.
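A minimal Python sketch of a call tree with the feedback mechanism described above: leads call their branches, and each leaf confirms receipt so coordinators can verify full coverage. The names and tree layout are hypothetical, not a real notification product:

# Toy call tree with a feedback mechanism: leads call their branches,
# and each leaf reports back so coordinators can confirm full coverage.
# The names and tree layout are hypothetical.

CALL_TREE = {
    "DR manager": ["Lead A", "Lead B"],
    "Lead A": ["Alice", "Bob"],
    "Lead B": ["Carol", "Dan"],
}

def activate(person, confirmed):
    for contact in CALL_TREE.get(person, []):
        print(f"{person} calls {contact}")
        activate(contact, confirmed)
    if person not in CALL_TREE:       # a leaf: confirm receipt upstream
        confirmed.add(person)

confirmed = set()
activate("DR manager", confirmed)
print("Confirmed:", sorted(confirmed))  # verify everyone got the message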

Personnel mobilization can also be triggered by emails to tablets, smartphones, and so on. Such systems require the email server to be functioning.

It is also important to plan for executive succession. An organization needs to be able to continue even if key personnel are not available, so it should have measures in place that account for the potential loss of key individuals. Without executive succession planning, the loss of key individuals could leave the organization unable to continue.

Interface with External Groups

A public affairs officer (PAO) typically decides how to interact with external groups. Such interactions can affect the long-term reputation of your business. Damaging rumors can easily start, and it is important to have protocols in place for dealing with incidents, accidents, and catastrophes. An organization must decide how to deal with response teams, the fire department, the police department, and ambulance and other emergency response personnel. If you do not tell the public what you want them to know, the media will decide for you, or your employees or former employees may use social media to spread messages you may not want out there; therefore, it is important to have a policy and craft a statement for your PAO.

A media spokesperson should be identified to deal with the media. Negative public opinion can be costly. It is important to have a properly trained spokesperson to speak for and represent the organization. This person must be in the communication path to have the facts before speaking or meeting with the press. He or she should engage with senior management and legal counsel prior to making any public statements.

During a crisis, an organization should meet with the media only after adequately preparing. The organization’s plan should include generic communications that address each possible incident. The spokesperson also needs to know how to handle tough questions. Liability should never be assumed; the spokesperson should simply state that an investigation has begun. Tackling tough issues up front will enable an organization to create a preapproved framework to call on if a real disaster occurs.

Employee Services

Organizations have some responsibilities to employees and to their families: Paychecks must continue, and employees need to be taken care of. Employees must be trained in what to do in the event of emergencies and in what they can expect from the organization. Insurance and other necessary services must continue.

Caution

The number-one priority of any business continuity or disaster recovery plan is to protect the safety of humans.

Before a disaster occurs, senior management must determine who will be in charge during a disaster to avoid chaos and confusion. Employees must know what is expected of them and who is in charge. It is important to make the decision before an adverse event occurs and record it in policy. It is also important to specify a succession of command because people may die during a disaster.

Someone in an organization must have the authority to allocate emergency funding when needed. In addition, controls must be in place to ensure that funds are not misappropriated.

Insurance

Insurance is one option that organizations can consider to transfer a portion of the risk uncovered during the BIA. Just as individuals can purchase insurance for a host of reasons, organizations can purchase protection insurance. An organization may purchase hacker or cyber insurance (which might cover potential penalties and fines) and insurance for outages and business interruptions, and it may purchase insurance that covers the following assets:

  • Images Data centers

  • Images Software

  • Images Documents, records, and important papers

  • Images Errors and omissions

  • Images Media transportation

Insurance is not without drawbacks. Insurance often involves high premiums, delayed claim payouts, denied claims, and problems proving real financial loss. Also, most insurance policies pay for only a percentage of any actual loss and do not pay for lost income, increased operating expenses, or consequential loss. It is also important to note that many insurance companies will not insure organizations that have not exercised due care in the implementation of disaster recovery and business continuity plans.

Implementation

When the business continuity planning team finishes developing its plan, it is ready to submit a completed plan for implementation. A BCP is a result of all information gathered during the project initiation, the BIA, and the recovery strategies phase. A final checklist for completeness ensures that the plan addresses all relevant factors, including the following:

  • Images The type of funding and fiscal management needed to accomplish the stated goals

  • Images The procedures for declaring a disaster and under what circumstances this will occur

  • Images Evaluation of potential disasters and the minimum resources needed to recover from various catastrophes

  • Images Critical functions and priorities for restoration

  • Images The recovery strategy and equipment that will be needed to accomplish the recovery

  • Images Individuals who are responsible for each function in the plan

  • Images The individual(s) who will manage the restoration and testing process

The completed BCP should be presented to senior management for approval. References for the plan should be cited in all related documents so that the plan is maintained and updated whenever there is a change or update to the infrastructure. When senior management approves the plan, it must be released and disseminated to employees. Awareness training for the individuals who would be responsible for carrying out the plan is critical and will help ensure that everyone understands what their tasks and responsibilities are in the event of an emergency.

Awareness and Training

It is important to ensure that all employees as well as internal and external personnel involved in the BCP, including contractors and consultants, know what to do in the event of an emergency. Although you will certainly require support from external agencies, such as law enforcement, if a disaster occurs, they are not likely to have time to participate in your training; however, having a face-to-face meeting with them and getting to know them prior to a disaster is a good idea so that you understand their resources and capabilities.

If employees are untrained, they might simply stop what they’re doing and run for the door in the event of an emergency. Or, even worse, they might not leave when an alarm has sounded, even though the plan requires that they leave because of possible danger. Instructions should be written in easy-to-understand language that uses common terminology.

Caution

Although some organizations might feel that business continuity planning is done when the plan is complete, it is important to remember that no demonstrated recovery exists until the plan has been tested.

Testing

The final phase of the business continuity planning process is to test and maintain the plan. Training and awareness programs are also developed during this phase. Testing a disaster recovery plan is critical. Without performing a test, there is no way to know whether the plan will work. Testing, which transforms theoretical plans into reality, should be repeated at least once a year.

Tests should start with the easiest parts of the plan and then build to more complex items. The initial tests should focus on items that support core processing, and they should be scheduled during a time that causes minimal disruption to normal business operations. As a CISSP candidate, you should be aware of the five different types of business continuity planning tests:

  • Images Checklist test: Although it is not considered a replacement for a live test, a checklist test is a good first test. A checklist test is performed by sending copies of the plan to different department managers and business unit managers for review. Each recipient reviews the plan to make sure nothing has been overlooked.

  • Images Structured walkthrough: This test, also known as a tabletop test, is performed by having the members of the emergency management team and business unit managers meet to discuss the plan. They walk through the plan line by line to see how an actual emergency would be handled and to discover discrepancies. Reviewing the plan in this way often makes errors and omissions apparent.

Tip

The primary advantage of a structured walkthrough is that it helps you discover discrepancies between different departments.

  • Images Simulation: A simulation is a drill involving members of the response team acting in the same way they would if there were an actual emergency. This test proceeds to the point of recovery or to relocation to the alternate site. The primary purpose of this test is to verify that members of the response team can perform the required duties with only the tools they would have available in a real disaster.

  • Images Parallel test: A parallel test is similar to a structured walkthrough but actually invokes operations at the alternate site. Operations at the new and old sites are run in parallel.

  • Images Full interruption test: This type of test is the most detailed, time-consuming, and disruptive to a business. A full interruption test mimics a real disaster, and all steps are performed to complete backup operations. It includes all the individuals who would be involved in a real emergency, both internal and external to the organization. Although a full interruption test is the most thorough, it is also the scariest, and it can be so disruptive that it actually creates a disaster.

ExamAlert

For the CISSP exam, you need to know the differences between these test types. You should also know the advantages and disadvantages of each test.

The final step of the business continuity planning process is to combine all this information into the BCP and cross-reference it with the organization’s other emergency plans. Although the organization will want to keep a copy of the plan onsite, there should be another copy offsite. If a disaster occurs, rapid access to the plan will be critical.

Caution

Access to the BCP should be restricted so that only those with a need to know can access the entire plan. In the wrong hands, a BCP could become a playbook for an attack.

Monitoring and Maintenance

When the testing process is complete, a few additional items still need to be considered. All the hard work that has gone into developing the plan can be lost if controls are not put in place to maintain the current level of business continuity and disaster recovery. Life is not static, and an organization’s BCP should not be static either. The BCP should be a living document, subject to constant change.

To ensure that the BCP is maintained, you need to build in responsibility for the plan. You can do this using several vehicles:

  • Images Job descriptions: Individuals responsible for the plan should have this responsibility detailed in their job descriptions. Management should work with HR to have this information added to the appropriate documents. To enforce a plan, you need to have someone to hold accountable.

  • Images Performance reviews: The accomplishment (or lack of accomplishment) of appropriate plan maintenance tasks should be discussed in the responsible individual’s periodic evaluations.

  • Images Audits: The audit team should review the plan and make sure it is current and appropriate. The audit team should also inspect the offsite storage facility and review its security, policies, and configuration.

Table 8.8 lists the individuals responsible for specific parts of the business continuity planning process.

TABLE 8.8 Business Continuity Planning Process Responsibilities

Person or Department | Responsibility
Senior management | Project initiation, ultimate responsibility, overall approval, and support
Middle management or business unit managers | Identification and prioritization of critical systems
Business continuity planning committee and team members | Planning, day-to-day management, implementation, and testing of the plan
Functional business units | Plan implementation, incorporation, and testing

Disaster recovery implications for monitoring, maintenance, and recovery should be part of any discussions related to procuring new equipment, modifying current equipment, hiring key personnel, or making changes to the infrastructure. The best method to accomplish this is to add BCP review into all change management procedures. If changes to the approved plans are required, they must also be documented and structured using change management; the plan should be updated and distributed if even 10% of the plan, employees, or organization are affected by the change. A change control document should be kept with the plan at all times, and it should have good version control. A centralized command and control structure eases this burden.

Tip

Senior management is ultimately responsible for the BCP, including funding, project initiation, overall approval, and support.

Exam Prep Questions

1. You have been given an attachment that was sent to the head of payroll and was flagged as malicious. You have been asked to examine the malware, and you have decided to execute the malware inside a virtual environment. What is this environment called?

images A. Honeypot

images B. Hyperjacking

images C. Sandbox

images D. Decompiler

2. Which of the following is not a security or operational reason to use mandatory vacations?

images A. It allows the organization the opportunity to audit employee work.

images B. It ensures that the employee is well rested.

images C. It keeps one person from being able to easily carry out covert activities.

images D. It ensures that employees will know that illicit activities could be uncovered.

3. What type of control is an audit trail?

images A. Application

images B. Administrative

images C. Preventative

images D. Detective

4. Which of the following is not a benefit of RAID?

images A. Capacity benefits

images B. Increased recovery time

images C. Performance improvements

images D. Fault tolerance

5. Separation of duties is related to which of the following?

images A. Dual controls

images B. Principle of least privilege

images C. Job rotation

images D. Principle of privilege

6. Phreakers target which of the following resources?

images A. Mainframes

images B. Networks

images C. PBX systems

images D. Wireless networks

7. You recently emailed a colleague you worked with years ago. The email you sent him was rejected, and you have been asked to re-send it. What has happened with the message transfer agent?

images A. Whitelist

images B. Graylist

images C. Blacklist

images D. Black hole

8. Your organization has experienced a huge disruption. In this type of situation, which of the following is designed to take on operational capabilities when the primary site is not functioning?

images A. BCP

images B. Audit

images C. Incident response

images D. COOP

9. Which RAID type provides data striping but no redundancy?

images A. RAID 0

images B. RAID 1

images C. RAID 3

images D. RAID 4

10. Which of the following is the fastest backup option but takes the longest to restore?

images A. Incremental

images B. Differential

images C. Full

images D. Grandfathered

11. Which of the following types of intrusion detection systems compares normal to abnormal activity?

images A. Pattern-based IDS

images B. Statistical-based IDS

images C. Traffic-based IDS

images D. Protocol-based IDS

12. Which of the following processes involves overwriting data with zeros?

images A. Formatting

images B. Drive-wiping

images C. Zeroization

images D. Degaussing

13. What type of RAID provides a stripe of mirrors?

images A. RAID 1

images B. RAID 5

images C. RAID 10

images D. RAID 15

14. Which of the following is the name of a multidisk technique that offers no advantage in speed and does not mirror, although it does allow drives of various sizes to be used and can be used on two or more drives?

images A. RAID 0

images B. RAID 1

images C. RAID 5

images D. JBOD

15. You have been assigned to a secret project that requires a massive amount of processing power. Which of the following techniques is best suited for your needs?

images A. Redundant servers

images B. Clustering

images C. Distributed computing

images D. Cloud computing

16. Which of the following water sprinkler systems does not activate until triggered by a secondary mechanism?

images A. Dry pipe

images B. Wet pipe

images C. Pre-action

images D. Deluge

17. Which of the following is not a component of CPTED?

images A. Natural access control

images B. Natural reinforcement

images C. Natural surveillance

images D. Territorial reinforcement

18. The method of fire suppression used depends on the type of fire that needs to be extinguished. Which of the following fire-suppression methods does not suppress any of a fire’s three key elements and led to the creation of the fire tetrahedron?

images A. CO2

images B. Halon

images C. Water

images D. Dry-pipe system

19. Which of the following best describes the difference between MTBF and MTTR?

images A. MTBF is the estimated time that a piece of hardware, device, or system will operate before it fails. MTTR is an estimate of how long it will take to repair the equipment and get it back into use.

images B. MTBF is the time required to correct or repair a device in the event that it fails, and MTTR is the estimated time that a piece of hardware, device, or system will operate before it fails.

images C. MTBF is a value that can be used to compare devices to one another and also to determine the need for an SLA for a device and MTTR is the estimated time that a piece of hardware, device, or system will never fail before.

images D. There is no need for an organization to determine MTBF and MTTR for assets if it is located in an area where natural disasters such as hurricanes are not common.

20. Business continuity and disaster recovery planning is likely to be a very large, complex, and multidisciplinary process that brings together key associates within the organization. Which of the following best describes the role of senior management in this process?

images A. To budget money for the disaster recovery project manager, technology experts, process experts, and other financial requirements from various departments in the organization

images B. To make disaster recovery planning a priority, commit and allow staff time for the process, and set hard dates for completion

images C. To manage the multidisciplinary team and keep all the team members on the same page

images D. To serve as experts who understand the specific processes that require special skill sets

21. Which of the following BCP tests carries the most risk?

images A. Full interruption test

images B. Parallel test

images C. Walkthrough

images D. Checklist test

22. Which of the following is the best description of what a software escrow agreement does?

images A. Provides a vendor with additional assurances that the software will be used per licensing agreements

images B. Specifies how much a vendor can charge for updates

images C. Gives an organization access to source code under certain conditions

images D. Provides a vendor access to an organization’s code if there are questions of compatibility

23. Which of the following tape-rotation schemes involves using five sets of tapes, labeled A through E?

images A. Tower of Hanoi

images B. Son-father-grandfather

images C. Complex

images D. Grandfather-father-son

24. If the recovery point objective (RPO) is high, meaning the organization can tolerate losing several hours of data, which of the following techniques would be the most appropriate solution?

images A. Clustering

images B. Database shadowing

images C. Remote journaling

images D. Tape backup

25. You have been assigned to the business recovery planning team responsible for backup options and offsite storage. Your organization is considering purchasing software from a small startup operation that has a proven record for unique software solutions. To mitigate the potential for loss, which of the following should you recommend?

images A. Clustering

images B. Software escrow

images C. Insurance

images D. Continuous backup

26. When developing a business continuity plan, what should be the number-one priority?

images A. Minimizing outage times

images B. Mitigating damage

images C. Documenting every conceivable threat

images D. Protecting human safety

Answers to Exam Prep Questions

1. C. A virtual environment where you can safely execute suspected malware is called a sandbox. Answer A is incorrect because a honeypot is a fake vulnerable system deployed to lure attackers. Answer B is incorrect because hyperjacking is a type of attack against a virtual system. Answer D is incorrect because a decompiler is used to translate a compiled application back toward source code, not to execute it safely.

2. B. Mandatory vacations are not primarily for employee benefit but to better secure the organization’s assets. Answers A, C, and D are incorrect because they list valid reasons to use mandatory vacations: Mandatory vacations enable the organization to audit employee work, keep one person from being able to easily carry out covert activities, and ensure that employees will know that illicit activities could be uncovered.

3. D. Audit trails are considered a detective type of control. Answers A, B, and C are incorrect because audit trails are not application, administrative, or preventive controls.

4. B. RAID provides capacity benefits, performance improvements, and fault tolerance; therefore, answers A, C, and D are incorrect. Although RAID might reduce recovery time, it certainly won’t increase it.

5. B. Separation of duties is closely tied to the principle of least privilege. Separation of duties is the process of dividing duties so that more than one person is required to complete a task, and each person has only the minimum resources needed to complete the task. Answer A is incorrect because dual controls are implemented to require more than one person to complete an important task. Answer C is incorrect because job rotation is used to prevent collusion. Answer D is incorrect because the principle of privilege would be the opposite of what is required.

6. C. Phreakers target phone and voice (PBX) systems. Answer A is incorrect because phreakers do not typically target mainframes. Answer B is incorrect because hackers might target networks, but phreakers target phone systems. Answer D is incorrect because wireless war drivers or hackers, not phreakers, target networks.

7. B. Graylisting rejects any email sender that is unknown. Mail from a legitimate email server will be retransmitted after a period of time. This moves the graylisted email off the hold list and onto the whitelist and, at that time, places the email in the inbox of the receiving account. Whitelisting only approves what is on an allowed list, whereas blacklisting blocks specific items. Black holes silently discard or drop traffic without informing the source.
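
To make the graylisting flow concrete, here is a minimal Python sketch. It is an illustration only, not any real mail server's implementation; the function name check_sender and the 300-second delay are assumptions chosen for the example. The first delivery attempt from an unknown sender/recipient/IP triplet is temporarily rejected, and a properly delayed retry is accepted:

import time

GRAYLIST_DELAY = 300  # hypothetical delay a sender must wait before retrying
first_seen = {}       # triplet -> timestamp of first delivery attempt
whitelist = set()     # triplets that have retried correctly

def check_sender(sender, recipient, ip):
    triplet = (sender, recipient, ip)
    if triplet in whitelist:
        return "250 OK"                   # known-good sender; deliver to inbox
    now = time.time()
    if triplet not in first_seen:
        first_seen[triplet] = now
        return "451 Try again later"      # temporary rejection (graylisted)
    if now - first_seen[triplet] >= GRAYLIST_DELAY:
        whitelist.add(triplet)            # a legitimate server retried; promote it
        return "250 OK"
    return "451 Try again later"          # retried too soon; keep holding

Most spam engines never retry, so mail that comes back after the delay window is treated as legitimate.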

8. D. A continuity of operations plan (COOP) is designed to take on operational capabilities when the primary site is not functioning. Answer A is incorrect because business continuity plans focus more broadly on the continuation of business services through any type of interruption. Answers B and C are incorrect because an audit reviews controls after the fact and incident response deals with individual security events, not sustained operations at an alternate site.

9. A. RAID 0 provides data striping but no redundancy. Answers B, C, and D are incorrect because RAID 1 provides disk mirroring, RAID 3 provides byte-level striping with a dedicated parity disk, and RAID 4 provides block-level striping with a dedicated parity disk.
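
The striping-versus-mirroring distinction is easy to see in a short Python sketch. This is purely illustrative; real RAID is implemented in controllers or kernel drivers, and the function names here are made up:

def raid0_layout(blocks, drives=2):
    """RAID 0: blocks alternate across drives; no copy survives a failure."""
    layout = [[] for _ in range(drives)]
    for i, block in enumerate(blocks):
        layout[i % drives].append(block)
    return layout

def raid1_layout(blocks, drives=2):
    """RAID 1: every drive holds a full copy, so one drive can fail."""
    return [list(blocks) for _ in range(drives)]

data = ["B0", "B1", "B2", "B3"]
print(raid0_layout(data))  # [['B0', 'B2'], ['B1', 'B3']] striped, no redundancy
print(raid1_layout(data))  # both drives hold all four blocks

RAID 0 spreads blocks for speed but loses everything if either drive fails; RAID 1 survives a single failure at the cost of halving usable capacity.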

10. A. Incremental backup is the fastest backup option but has the longest restoration time. Answers B, C, and D are incorrect: A differential backup takes longer to perform than an incremental but is faster to restore, a full backup takes the longest to perform but is the fastest to restore, and grandfathered is not a valid backup type.
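
The speed/restore trade-off follows from how each backup type treats the archive bit. The following simplified Python model is a toy under stated assumptions, not any vendor's backup engine:

# True = file changed since its archive bit was last cleared
files = {"a.doc": True, "b.xls": False, "c.txt": True}

def full_backup(files):
    copied = list(files)            # copy everything, changed or not
    for name in files:
        files[name] = False         # clear every archive bit
    return copied

def incremental_backup(files):
    copied = [n for n, changed in files.items() if changed]
    for name in copied:
        files[name] = False         # clear bits so each run copies only new changes
    return copied

def differential_backup(files):
    # copies everything changed since the last full backup; archive bits are
    # NOT cleared, so each differential grows until the next full backup
    return [n for n, changed in files.items() if changed]

Restoring from incrementals requires the last full backup plus every incremental since, which is why the fastest backup is the slowest restore; a differential restore needs only the last full backup plus the latest differential.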

11. B. A statistical-based IDS builds a profile of normal activity and flags deviations from it as abnormal. Pattern-, traffic-, and protocol-based IDSs do not work this way; therefore, answers A, C, and D are incorrect.
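
In the spirit of a statistical-based IDS, the toy Python detector below learns a baseline and flags large deviations. It is illustrative only; real products model many attributes of traffic and behavior, not a single counter, and the three-sigma threshold is an assumption:

import statistics

baseline = [120, 130, 110, 125, 118, 122, 128]   # e.g., logins per hour during training
mean = statistics.mean(baseline)
stdev = statistics.stdev(baseline)

def is_anomalous(observation, threshold=3.0):
    """Flag anything more than `threshold` standard deviations from normal."""
    return abs(observation - mean) > threshold * stdev

print(is_anomalous(123))   # False: within the learned profile of normal
print(is_anomalous(900))   # True: far outside the baseline, so raise an alert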

12. C. Zeroization overwrites data with zeros. Answer A is incorrect because formatting merely rebuilds the file allocation table (FAT) and leaves the underlying data recoverable. Answer B is incorrect because drive-wiping writes patterns of ones and zeros. Answer D is incorrect because degaussing works by means of a magnetic field.
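
A single-pass zeroization of an ordinary file might look like the Python sketch below. Note the heavy assumptions: on journaling filesystems, SSDs with wear leveling, or drives with remapped sectors, overwriting in place does not guarantee the old bits are gone, which is why dedicated sanitization tools, degaussing, or physical destruction are used for sensitive media.

import os

def zeroize_file(path):
    size = os.path.getsize(path)
    with open(path, "r+b") as f:
        f.write(b"\x00" * size)   # overwrite every byte with zeros
        f.flush()
        os.fsync(f.fileno())      # push the zeros out to the device
    os.remove(path)               # then delete the now-zeroed file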

13. C. RAID 10 provides a stripe of mirrors. RAID 1 offers only mirroring. RAID 5 is seen as a combination of good, cheap, and fast because it provides block-level striping with distributed parity, good performance, and good fault tolerance. RAID 15 is a combination of RAID 1 and RAID 5.
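
RAID 5's fault tolerance rests on XOR parity, which a few lines of Python can demonstrate (the byte values are arbitrary examples):

d1, d2, d3 = 0b1010, 0b0110, 0b1100
parity = d1 ^ d2 ^ d3          # parity block stored alongside the data stripes

# Suppose the drive holding d2 fails; XOR the survivors to rebuild it.
rebuilt_d2 = d1 ^ d3 ^ parity
assert rebuilt_d2 == d2        # the lost block is recovered exactly

Because XOR is its own inverse, any single missing block can be reconstructed from the remaining blocks plus parity, which is why RAID 5 tolerates one drive failure.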

14. D. JBOD can use existing hard drives of various sizes, combined into one massive logical disk. There is no fault tolerance and no increase in speed. The only benefit of JBOD is that you can use existing disks, and if one drive fails, you lose the data on only that drive. Answers A, B, and C are incorrect because RAID 0, RAID 1, and RAID 5 do not match the description provided.

15. B. Clustering groups computers so that they operate as a single logical system, providing greater processing power and availability. Answer A is incorrect because a redundant server waits until it's needed before being used. Answer C is incorrect because distributed computing is not centrally controlled and, therefore, should not be used for sensitive or classified work. Answer D is incorrect because placing sensitive information in the cloud can be problematic in terms of security.

16. C. A pre-action fire sprinkler system is a combination system. Pipes are initially dry, and they do not fill with water until a predetermined temperature is reached. Even then, the system does not activate until a secondary mechanism triggers. Answers A, B, and D are incorrect because they are triggered without a secondary mechanism.

17. B. Natural reinforcement is not a component of CPTED. CPTED comprises natural access control (answer A), natural surveillance (answer C), and territorial reinforcement (answer D). CPTED is unique in that it considers the factors that facilitate crime and seeks to use the proper design of a facility to reduce the fear and incidence of crime. At the core of CPTED is the belief that physical environments can be structured in such a way as to reduce crime.

18. B. Halon is unique in that it does not work in the same way as most fire-suppression agents. Halon interferes with the chemical reaction of a fire, and that fourth mechanism led to the creation of the fire tetrahedron. The other answers are incorrect because water (answer C) removes heat, one of a fire's three key elements, and carbon dioxide (answer A) displaces oxygen, another of the three. A dry-pipe system (answer D) is a water-suppression system design that does not hold water in its pipes continuously.

19. A. MTBF is the estimated time that a piece of hardware, device, or system will operate before it fails; MTTR is the time required to repair the device and return it to service after a failure. Answer B reverses the two definitions. Answer C correctly notes that MTBF can be used to compare devices but misstates what MTTR measures. Answer D is incorrect because these values are worth determining regardless of an organization's exposure to natural disasters.
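
The two values combine into an availability estimate: availability = MTBF / (MTBF + MTTR). A quick worked example in Python, using hypothetical figures:

mtbf = 10_000   # hours the device is expected to operate between failures
mttr = 4        # hours to repair the device and return it to service

availability = mtbf / (mtbf + mttr)
print(f"{availability:.4%}")   # 99.9600% expected uptime

A device with a long MTBF and a short MTTR approaches 100 percent availability, which is why both values matter when comparing hardware or negotiating an SLA.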

20. B. If senior management does not fully support the DRP, the plan will likely fail. Answer A is not the best answer because it describes the role of a budget manager or budget department. Answer C is not the best answer because it describes the role of a project manager. Answer D is not the best answer because it describes the role of a subject matter expert.

21. A. A full interruption test halts normal operations and is therefore the test most likely to cause its own disaster. The other tests are less disruptive, so answers B, C, and D are incorrect.

22. C. A software escrow agreement allows an organization to obtain access to the source code of business-critical software if the software vendor goes bankrupt or otherwise fails to perform as required. Answer A is incorrect because an escrow agreement does not provide the vendor with additional assurances that the software will be used per licensing agreements. Answer B is incorrect because an escrow agreement does not specify how much a vendor can charge for updates. Answer D is incorrect because an escrow agreement does not address compatibility issues; it grants access to the source code only under certain conditions.

23. A. The Tower of Hanoi involves using five sets of tapes, labeled A through E. Set A is used every other day. Set B is used on the first non-A backup day and is used every 4th day. Set C is used on the first non-A or non-B backup day and is used every 8th day. Set D is used on the first non-A, non-B, or non-C day and is used every 16th day. Set E alternates with set D. Answer B is incorrect because son-father-grandfather is not the correct name of a backup type. Answer C is incorrect because complex does not refer to a specific backup type. Answer D is incorrect because grandfather-father-son includes four tapes for weekly backups, one tape for monthly backups, and four tapes for daily backups; this does not match the description in the question.
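
The rotation described above follows the binary "ruler" sequence, so the set used on backup day n can be computed from the position of the lowest set bit of n. The Python sketch below is one common way to express the scheme; starting the count at day 1 is an assumption:

def hanoi_tape(day):
    trailing_zeros = (day & -day).bit_length() - 1   # position of lowest set bit
    return "ABCDE"[min(trailing_zeros, 4)]           # cap so D and E alternate

schedule = [hanoi_tape(d) for d in range(1, 17)]
print("".join(schedule))   # ABACABADABACABAE

Reading the output confirms the description: A runs every other day, B every 4th day, C every 8th, and D and E trade off every 16th day.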

24. D. The RPO defines the maximum amount of data loss, measured in time, that an organization can tolerate. With a high (lenient) RPO, tape backup is acceptable because there is little need to capture the most current data. If the backup occurs at midnight and the failure is at noon the next day, the organization has lost 12 hours of data, which a high RPO permits. Answers A, B, and C are incorrect because each of those techniques would be used when a low RPO, meaning more current data, is required.

25. B. The core issue here is that the software provider is a small startup that may not be around in a few years, so the organization must protect its access to the source code. An escrow agreement allows an organization to obtain access to the source code of business-critical software if the software vendor goes bankrupt or otherwise fails to perform as required. Answers A, C, and D are incorrect: Clustering and continuous backup do nothing to provide access to the source code should the vendor cease to exist, and although insurance is an option, that expense is unnecessary if the organization already has rights and access to the code in the event that something occurs.

26. D. The protection of human safety is always the number-one priority of a security professional. Answers A, B, and C are incorrect. Minimizing outages and mitigating damage are important but not the top priority, and it is not possible to identify and place a dollar amount on every conceivable threat.

Need to Know More?

Snort IDS: www.it.uu.se/edu/course/homepage/sakdat/ht05/assignments/pm/programme/Introduction_to_snort.pdf

Security operations resource protection: www.process.st/it-security-processes/

The Open Source Security Testing Methodology Manual: www.isecom.org/OSSTMM.3.pdf

Digital forensic tools: https://www.guru99.com/computer-forensics-tools.html

The evolution of firewalls: https://www.techrepublic.com/article/understand-the-evolution-of-firewalls/

Disaster recovery testing: www.enterprisestorageforum.com/backup-recovery/disaster-recovery-testing.html

Configuration management best practices: https://blog.inedo.com/configuration-management-best-practices

Duress alarms: https://alltronic.com.au/security-blog/how-does-a-duress-alarm-work

System resilience and fault tolerance: www.itperfection.com/cissp/security-operations-domain/system-resilience-high-availability-qos-and-fault-tolerance/

Xcopy commands: https://www.lifewire.com/xcopy-command-2618103

Logging and monitoring best practices: https://www.dnsstuff.com/logging-monitoring-best-practices
