CHAPTER 7
Systems and Application Security

Which viewpoint is the best one to use when looking at your information security architecture, programs, policies, and procedures? If you’re the end user, you might think it’s the endpoints and apps that matter most. Senior managers and leaders might think it’s the data that inform decisions in business processes from the strategic to the operational levels and from long-range planning down to the transaction-by-transaction details of getting business done. Operations managers might emphasize core business processes as the right focal points for your security planning and attentiveness. Your chief financial officer, the purchasing and supply group, or your logistics support teams may see IT security as being parceled out across the different players in your IT supply chain. Other perspectives look more at deployment issues, especially if your organization has moved into the clouds (plural) or is planning such a major migration.

You, as the SSCP on scene, get to look at all of those views at the same time; you have to see it from the ground up.

This chapter is about keeping the installed base of hardware, systems software, apps, data, endpoints, infrastructure, and services all safe and sound, regardless of how these systems and elements are deployed and used. Virtualization of processing, storage, and networks can actually bring more security-enhancing capabilities to your systems and architectures than it does challenges and pitfalls, if you do your homework thoroughly. And as attackers are becoming more proficient at living off the land and not depending upon malware as part of their kill chain activities, it becomes harder to detect or suspect that something not quite normal is going on in your systems or endpoints.

It’s time to go back to thinking about the many layers of security that need to be planned for, built in, exercised, and used to mutually support each other.

Systems and Software Insecurity

Let’s set the context for this chapter by borrowing from the spirit of the OSI seven-layer Reference Model. This can give us a holistic view of everything you need to assess, as you make sure that your organization can do its business reliably by counting on its IT systems and data to be there when they need them to be. We’ll also invite our friend CIANA back to visit with us; confidentiality, integrity, availability, nonrepudiation, and authentication are primary characteristics of any information security program and posture, along with safety and privacy. Auditability and transparency are also taking on greater emphasis in many marketplaces and legal systems, so we’ll need to keep them in mind as part of our information security planning as well. From the bottom of the stack, let’s look at the kinds of questions or issues that need to be addressed as you assess your overall systems security and identify what actions you need to take to remedy any urgent deficiencies.

  • Physical: How do you protect the hardware elements of your systems throughout their lifecycle? Is your supply chain secure? Does your organization have proper administrative controls in place to ensure that those who provide spare parts, consumables, and repair and maintenance services are trusted and reliable? How are physical security needs addressed via maintenance agreements or warranty service plans? Are adequate records kept regarding repair and replacement actions? Is all of your hardware under configuration management and control? Do you know where all of it is, at any given time? What about equipment disposal, particularly the disposal of storage media or any hardware where data remanence may be a concern? Much of this was covered in Chapter 2, “Security Operations and Administration.”
  • Operating systems: Are all of your systems (endpoints, servers, network control and management devices, and security systems) properly maintained and kept up-to-date? What about your access control systems—are they kept up-to-date as well? Is your systems software supply chain secure? Do you receive update files or patch kits that are digitally signed by the vendor, and are those signatures validated before the organization applies the updates? How well does your organization exert configuration management and control of these software components? This, too, was covered in some depth in Chapter 2.
  • Network and systems management: Significant effort is required to design and implement your network architectures so that they meet the security needs as established by administrative policy and guidance. Once you’ve established a secure network baseline, you need to bring it under configuration management and control in order to keep it secure. In most larger network architectures, it’s wise to invest in automated systems and endpoint inventory workflows as part of configuration control as well. Chapter 6, “Network and Communications Security,” in conjunction with Chapter 2, “Security Operations and Administration,” addresses these important topics and provides valuable information and advice.
  • Data availability and integrity: Think about CIANA+PS for a moment: every one of those attributes addresses the need to have reliable data available, where it’s needed, when it’s needed, and in the form that people need to get work done with it. If the data isn’t there—or if the reliability of the system is so poor that workers at all levels of the organization cannot count on it—they will improvise, and they will make stuff up if they have to, to keep the production line flowing or to make a sale or to pay a bill. If they can. If they cannot—if the business logic simply cannot operate that way and there are no contingency procedures—then your business stops working. What is the process maturity of your organization’s approach to data quality? Do you have procedures to deal with faulty or missing data that maintain systems integrity and operational safety in spite of missing, incomplete, or just plain wrong data as input?
  • Data protection, privacy, and confidentiality: It’s one thing to have the high-quality data your organization needs ready and available when it’s needed to get the job done. It’s quite another to let that data slip away, through either inadvertent disclosure, small data leaks, or wholesale data exfiltration. Legal, regulatory, and market compliance regimes are becoming more demanding every day; many are now specifying that company officers can be held personally responsible for data breaches, and that responsibility can mean substantial fines and even imprisonment.
  • Applications: Applications can range from fully featured platforms with their own built-in database management capabilities to small, lightweight apps on individual smartphones. These may be commercial products, freeware, or shareware; they may even be programs, script files, or other stored procedures and queries written by individual end users in your organization. Your risk management process should have identified these baselines, and identified which ones needed to be under what degree of formal configuration management and control. Chapter 3, “Risk Identification, Monitoring, and Analysis,” showed you how to plan and account for all updates and patches, while Chapter 2 demonstrated ways for the information security team to know which systems and endpoints are hosting or using which apps and whether they are fully and properly updated or not.
  • Connections and sessions: From the security viewpoint, this layer is out here on top of the applications layer, since your users are the ones that use sessions to make connections to your applications and data. It’s at this layer that appropriate use policies, business process logic, partnerships and collaboration, and a wealth of other administrative decisions should come together to provide a strong procedural foundation for day-to-day, moment-by-moment use of your IT systems and the data they contain. It’s also at this layer where many organizations are still doing things in very ad hoc ways, with spur-of-the-moment decision-making either maintaining and enhancing information security or putting it at risk. From a process or capabilities maturity modeling perspective, is your organization using well-understood, repeatable, auditable, and effective business logic, processes, and operational procedures? Or are you still all learning it as you go? Chapter 3, with its focus on risk management, should provide ways to drive down to the level of individual processes and, in doing so, help the organization know how well it is managing its information risk by way of how it does business day to day.
  • Endpoints: This is where the data (which is abstract) gets transformed into action in the physical world and where real-world actions get modeled as data. Endpoints can be smartphones, workstations, laptops, point-of-sale terminals, Supervisory Control and Data Acquisition (SCADA) or industrial process control devices, or IoT devices. How well does your organization maintain and manage these? Which ones are subject to formal configuration management and control, and which ones are not? Can the organization quickly locate where each such device is, with respect to your networks and IT systems, and validate whether it is being used appropriately? What about the maintenance and update of the hardware, software, firmware, and control parameters and data onboard each endpoint, and the effective configuration management and control of these endpoints?
  • Access management, authentication, authorization, and accounting: This is probably the most critical layer of your security protocol stack! Everything that an attacker wants to achieve requires access to some part of your systems. Chapter 1 looked in detail at how to flow information security classification and risk management policies into the details of configuring your AAA systems (named for the critical functions of authentication, authorization, and accounting) and setting the details in place to restrict each subject and protect each object. Are your access control processes well-managed, mature, and under configuration management themselves? Do your audit capabilities help reveal potentially suspicious activities that might be an intruder in your midst?
  • Audit: Every aspect of your organization’s use of its IT infrastructures, its software, and its data is going to be subject to some kind of audit, by somebody, and on a continuing basis. From an information security perspective, any audit can find exploitable vulnerabilities in your systems or processes, and that’s an opportunity to make urgently needed repairs or improvements. Failing to have those problems fixed on a subsequent audit can result in penalties, fines, or even being denied the ability to continue business operations. Audit requirements force us to make sure that our systems can maintain the pedigree of every audit-critical data item, showing its entire life history from first acquisition through every change made to it. Maintaining these audit trails throughout your systems is both a design issue and an operations and administration issue (Chapter 2), as well as providing the foundation for any security incident investigations and analysis (Chapter 4); here, we’ll do a roundup of those issues and offer some specific advice and details.
  • Business continuity: Many different information security issues and control strategies come together to assure the continued survivability and operability of your organization’s business functions; in the extreme, information security architectures such as backup and restore capabilities provide for recovery and restart from disruptions caused by attacks, natural disasters, or accidents.

Two major categories of concerns might seem to be missing from that list—and yet, each layer in that “security maturity model” cannot function if they are not properly taken into account.

  • Weaponized malware: Although many new attack patterns are “living off the land” and using their target’s own installed software and systems against them, malware still is a significant and pernicious threat. More than 350,000 new species of malware are observed in the wild every day. The old ways of keeping your systems and your organization safe from malware won’t work anymore. This applies at every step: imagine, for example, receiving new motherboards for PCs, new endpoints, or new software that has been tampered with by attackers; how would your organization detect this when it happens? We’ll look at different approaches later in this chapter, in the section titled “Identify and Analyze Malicious Code and Activity.”
  • The security awareness and preparedness of your people: It’s time to junk that idea that your people are your weakest link—that’s only the case if you let them be ignorant of the threat and if you fail to equip them with the attitude, the skills, and the tools to help defend their own jobs by being part of the active defense of the organization’s information and its systems. As with any vulnerability, grab this one and turn it around. Start by looking at how your organization onboards new members of staff; look at position-specific training and acculturation. Use the risk management process (as in Chapter 3) to identify the critical business processes that have people-powered procedures that should be part of maintaining information security. From the “C-suite” senior leaders through to each knowledge worker at every level in the team, how well are you harnessing that talent and energy to improve and mature the overall information security posture and culture of the organization?

Let’s apply this layered perspective starting with software—which we find in every layer of this model, much as in the OSI seven-layer Reference Model stack, from just above the physical up to just below the people. From there, we’ll take a closer look at information quality, information integrity, and information security, which are all intimately related to each other.

Software Vulnerabilities Across the Lifecycle

All software is imperfect, full of flaws. Everything in it, every step across its lifecycle of development, test, use, and support, is the result of compromise. Users never have enough time to step away from their business long enough to thoroughly understand their business logic; analysts cannot spend enough time on working with users to translate that business process logic into functional requirements. Designers are up against the schedule deadlines, as are the programmers, and the testers; even with the best in automated large-scale testing workflows, there is just no way to know that every stretch of code has been realistically tested. Competitive advantage goes to the business or organization that can get enough new functionality, that is working correctly enough, into production use faster than the others in their marketplace. Perfect isn’t a requirement, nor is it achievable.

From the lowliest bit of firmware on up, every bit of device control logic, the operating systems, and the applications have flaws. Some of those flaws have been identified already. Some have not. Some of the identified flaws have been categorized as exploitable vulnerabilities; others may still be exploitable, but so far no one’s imagination has been spent to figure out how to turn those flaws into opportunities to do something new and different (and maybe malicious). Even the tools that organizations use to manage software development, such as their integrated development environments (IDEs) and their configuration management systems, are flawed. In some respects, having a software-dependent world reminds us of what Winston Churchill said about democracy: with all those flaws, those logic traps, and those bugs just waiting to bite us, this must be the worst possible way to run a business or a planet-spanning civilization, except for all the others.

That might seem overwhelming, almost unmanageable. And for organizations that do not manage their systems, it can quickly become a security expert’s worst nightmare waiting to happen. One approach to managing this mountain of possible bad news is to use a software development lifecycle model to think about fault detection, analysis, characterization, prioritization, mitigation, and remediation. You’ve probably run into software development lifecycle (SDLC) models such as Agile, Scrum, or the classic Waterfall before; as a security specialist, you’re not writing code or designing a graphical user interface (GUI). You’re not designing algorithms, nor are you implementing data structures or workflows by translating all of that into different programming languages. Let’s look at these SDLCs in a more generic way.

Software Development as a Networked Sport

In almost all situations, software is developed on a network-based system. Even an individual end user writing their own Excel formulas or a Visual Basic for Applications (VBA) macro, strictly for their own use, is probably using resources on some organization’s network. They’re doing this on a PC, Mac, or other device, which as an endpoint is a full-blown OSI 7-layer stack in its own right. They’re drawing upon resources and ideas they see out on the Web, and they’re using development tools that are designed and marketed to support collaborative creation and use of data—and sometimes, that data can be code as well. An Excel formula is a procedural set of instructions, as is a VBA macro. When these user-defined bits become part of business processes the company depends upon, they’ve become part of your shadow IT system—unknown to the IT team, outside of configuration management, invisible to risk assessment and information security planning.

Larger, more formalized software development is done with a team of developers using a network-enabled IDE, such as Microsoft’s Visual Studio; in the Linux world, NetBeans, Code::Blocks, or Eclipse CDT provide IDE capabilities. All of these IDEs support teams using networked systems to collaborate in their development activities.

Software development as a networked team sport starts with all of those gaps in analysis, assumptions about requirements, and compromises with schedule and resources that can introduce errors in understanding which lead to designed-in flaws. Now, put your team of developers onto a LAN segment, isolated off from the rest of the organization’s systems, and have them use their chosen IDE and methodologies to start transforming user stories and use cases into requirements, requirements into designs, and designs into software and data structures. The additional vulnerabilities the company is exposed to in such an approach can include the following:

  • Underlying vulnerabilities in the individual server or endpoint devices, their hardware, firmware, OS, and runtime support for applications.
  • Vulnerabilities in the supporting LAN and its hardware, firmware, and software.
  • Access control and user or subject privilege management errors, as reflected in the synchronization of network and systems management configuration settings such as access control lists, user groups and privileges, and applications whitelisting.
  • Anti-malware and intrusion detection systems that have known vulnerabilities, are not patched or up-to-date, and are using outdated definition or signature data.
  • Gaps in coverage of the configuration management and configuration control systems, such that shadow IT or other unauthorized changes can and have taken place; these may or may not introduce exploitable vulnerabilities. Many development teams, with their systems administrators’ support, add their own tools, whether homemade, third-party commercial products, or shareware, into their development environments; these can increase the threat surface in poorly understood ways.
  • Known exploits, reported in CVE databases, for which vendors have not supplied patches, updates, or procedural workarounds.
  • Process or procedural vulnerabilities in the ways in which the team uses the IDE.

Vulnerabilities also get built in by your designers due to some all-too-common shortcomings in the ways that many programmers develop their code.

  • Poor design practices: Applications are complex programs that designers build up from hundreds, perhaps thousands, of much smaller, simpler units of code. This decomposition of higher-level, more abstract functions into simpler, well-bounded lower-level functions is the heart of any good design process. When designers consistently use proven and well-understood design rules, the designs are more robust and resilient—that is, their required functions work well together and handle problems or errors in well-planned ways.
  • Inconsistent use of design patterns: A design pattern is a recommended method, procedure, or definition of a way to accomplish a task. Experience and analysis have shown us that such design patterns can be built successfully and used safely to achieve correct results. Yet many programs are developed as if “from scratch,” as if they are the first-ever attempt to solve that problem or perform that task. Assembling hundreds or thousands of such “first-time” sets of designs can be fraught with peril—and getting them to work can be a never-ending struggle.
  • Poor coding practices: Since the 1940s, we’ve known that about 20 classes of bad programming practice can lead to all-too-familiar runtime errors and exploitable vulnerabilities. Universities, schools, and on-the-job training teach programmers these “thou shalt nots” of programming; still, they keep showing up in business applications and systems software.
  • Inconsistent use (or no use at all) of proven, tested design and code libraries: Software reuse is the process of building new software from modules of code that have been previously inspected, tested, and verified for correct and safe execution. Such design and code libraries, when published by reputable development teams, are a boon to any software development effort—as long as the right library elements are chosen for the tasks at hand and then used correctly in the program being developed. High-quality libraries can bring a wealth of security-related features built into their designs and code; in many cases, the library developer provides ongoing technical support and participates in common vulnerability reporting with the information systems security community. Sadly, many software teams succumb to schedule and budget pressures and use the first bit of cheap (or free) code that they find on the Internet that seems to fit their needs. Sometimes, too, application programmers speed-read the high-level documentation of a library or a library routine and accept what they read as proof that they’ve found what they need. Then they just plug it into their application and pray that it works right, never taking the time to read the code itself or verify that it will correctly and safely do what they need it to do and do nothing else in the process. There is a growing body of research data that suggests that commercial code libraries are being developed with better security practices in mind, and hence are producing lower rates of vulnerabilities (per thousand lines of source code, for example) than comparable open source libraries are achieving. Open source libraries may also be targets of opportunity for black hats to sponsor their own bug hunts, as they seek currently unrecognized vulnerabilities to exploit.
  • Weak enforcement of data typing and data modeling during software development: A major business platform application, such as an enterprise resource planning (ERP) system, might have tens of thousands of identifiers—names for fields in records, for record types, for variables used in the software internally, and the like.
    • Data modeling is a formal process that translates the business logic into named data elements. It formalizes the constraints for initialization of each data item; how new values are input, calculated, produced, and checked against logical constraints; and how to handle errors in data elements. For example, one constraint on a credit card number field might specify the rules for validating it as part of a data set (including cardholder name, expiration date, and so forth); related constraints would dictate how to handle specific validation problems or issues.
    • Data typing involves the rules by which the programmer can write code that works on a data item. Adding dollars to dates, for example, makes no sense, yet preventing a programming error from doing this requires data typing rules that define how the computer stores calendar dates and monetary amounts, and the rules regarding allowable operations on both types taken together. Organizations that manage their information systems with robust data dictionaries and use rigorously enforced data typing in their software development tend to see fewer exploitable errors due to data format, type, or usage errors. A brief sketch of these ideas in code follows this list.
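
To make that a bit more concrete, here’s a minimal sketch in Python of what enforcing data modeling constraints and data typing at the moment a record is created can look like. The field names, the Luhn check, and the expiration rule are illustrative assumptions rather than anything drawn from a particular data dictionary; in a real system, these constraints would come from your organization’s own data model.

  from dataclasses import dataclass
  from datetime import date


  def luhn_ok(number: str) -> bool:
      """Return True if the digit string passes the Luhn checksum."""
      if not number.isdigit() or len(number) < 12:
          return False
      total = 0
      for i, ch in enumerate(reversed(number)):
          d = int(ch)
          if i % 2 == 1:        # double every second digit from the right
              d *= 2
              if d > 9:
                  d -= 9
          total += d
      return total % 10 == 0


  @dataclass(frozen=True)
  class CardRecord:
      holder_name: str
      card_number: str
      expires: date

      def __post_init__(self):
          # Constraints enforced at construction time, so no other code path
          # can ever create a record that violates the data model.
          if not self.holder_name.strip():
              raise ValueError("cardholder name is required")
          if not luhn_ok(self.card_number):
              raise ValueError("card number fails validation")
          if self.expires < date.today():
              raise ValueError("card is already expired")


  # Construction either yields a valid, immutable record or fails loudly.
  rec = CardRecord("A. Example", "4111111111111111", date(2031, 1, 31))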

Every one of those vulnerabilities is an opportunity for an adversary to poison the well by introducing malformed code fragments into libraries being developed or to distort test procedures so that the results mask a vulnerability’s existence. Automated testing can produce thousands of pages of output, if one were to actually print it out; more automated tools are used to summarize it, analyze it, and look for tell-tale flags of problems.

But wait, there’s more bad news. Once your developers have finished building and testing a new release of a system, an application, or a web app, they’ve now got to bundle it up as a distribution kit and push it out to all of the servers and endpoints that need to install that update. This process can have its own errors of omission and commission that can lead to an exploitable opportunity for things to go wrong, or to be made to go wrong by an intruder in your midst.

Are those vulnerabilities under management? And even if they are, are they easy for an attacker to find and exploit?

Vulnerability Management: Another Network Team Sport

It’s commonly accepted wisdom that responding to newly discovered vulnerabilities is important; we must urgently take action to develop and implement a fix, and update all of our affected systems; and while we’re waiting for that fix, we must develop a work-around, a procedural way to avoid or contain the risk associated with that vulnerability, as quickly as possible. Yet experience shows us that in almost every business or organizational environment, prudence has to prevail. Businesses cannot operate if faced with a constant stream of updates, patches, and procedural tweaks. For one thing, such nonstop change makes it almost impossible to do any form of regression testing to verify that fixing one problem didn’t create three more in some other areas of the organization’s IT systems and its business logic. Most organizations, too, do not have the resources to be constantly updating their systems—whether those patches are needed at the infrastructure level, in major applications platforms that internal users depend upon, or in customer-facing web apps. Planned, orderly, intentional change management gives the teams a way to choose which vulnerabilities matter right now, and which ones can and should wait until the next planned build of an update package to the systems as a whole.

The larger your organization and the more diverse and complex its information architectures are, the greater the benefit to the organization of such a planned and managed process. By bringing all functional area voices together—including those from the information security team—the organization can make a more informed plan that spells out which known vulnerabilities will get fixed immediately, which can wait until the next planned release or update cycle, and which ones will just have to continue to wait even longer. This decision process, for example, must decide when to fix which vulnerabilities in the production IT systems and applications and when to do which fixes in the development support systems. (Should you disrupt an ongoing development for an urgent-seeming security fix and possibly delay a planned release of urgently needed capabilities into production use? Or is the potential risk of an exploit too great and the release must be held back until it can be fixed?)

Vulnerability management needs to embrace all of the disciplines and communities of practice across the organization. Business operations departments, customer service organizations, finance, legal, logistics, and many other internal end user constituencies may all have some powerful insights to offer, and skills to apply, as your organization takes on information risk management in a significant way. As your vulnerability management processes mature, you may also find great value in selectively including strategic partners or members of your most important collaboration networks.

  Is Open Source Software More Vulnerable Than Commercial Code?

As with everything in information security, it depends. A variety of surveys and reports have been published that try to answer this question in the aggregate, such as the Coverity scan of 10 billion lines of open source and commercial software products done by Synopsys in 2014. Rather than rely on such industry averages, look at the specific inventory of open source software that your organization actually uses to determine what’s the best balance of new features, cost of ownership, and risk. Look at the published CVE data, consult with the provider of the open source software that your organization is using, and investigate further. Then, make an informed decision based on your own risk situation, and not the headlines.

Data-Driven Risk: The SDLC Perspective

Any software program—whether it’s part of the operating system, an element of a major cloud-hosted applications platform, or a small utility program used during software distribution and update—can be driven to behave badly by the input data it reads in and works on. The classic buffer overflow flaw is, of course, a logic and design flaw at its heart, but it allows user-entered data to cause the program to behave in unanticipated ways, and some of that bad behavior may lead to exploitable situations for the attacker. This is not using malware to hack into our systems—this is using malformed data as input to cause our own systems to misbehave. A traditional database-based attack—the fictional employee who nonetheless gets a paycheck every pay period—has been updated to become the false invoice attack, in which the attacker sends a bill (an invoice) to a large company’s accounts payable department. In many cases, small invoices for a few tens of dollars will get paid almost automatically (some human action will be necessary to enter the source of the invoice as if it were a new supplier). The amount of each false invoice can vary and slowly increase. In the United Kingdom alone (not a large marketplace), false invoicing cost businesses upward of £93 million (about US$130 million) in 2018, more than double the losses seen the year before, according to UK Finance, the trade association of Britain’s banking and finance sector.
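
As a minimal sketch of that idea, consider a program that parses a record in a hypothetical format: a four-byte length field followed by a payload. In a language like C, blindly trusting that attacker-controllable length field is the classic recipe for a buffer overflow; in Python the failure mode is softer (silent truncation and downstream logic errors), but the defensive habit of validating the data before acting on it is the same.

  import struct

  MAX_PAYLOAD = 64 * 1024          # an explicit, documented limit


  def parse_record_naive(blob: bytes) -> bytes:
      # Trusts the sender-supplied length field completely.
      (length,) = struct.unpack(">I", blob[:4])
      return blob[4:4 + length]    # silently returns whatever bytes are there


  def parse_record_defensive(blob: bytes) -> bytes:
      if len(blob) < 4:
          raise ValueError("record too short to contain a length field")
      (length,) = struct.unpack(">I", blob[:4])
      if length > MAX_PAYLOAD:
          raise ValueError(f"declared length {length} exceeds the allowed limit")
      if len(blob) != 4 + length:
          raise ValueError("declared length does not match the actual payload")
      return blob[4:]


  # A malformed record that claims to carry a 4 GB payload:
  evil = struct.pack(">I", 0xFFFFFFFF) + b"tiny"
  print(len(parse_record_naive(evil)))    # 4: quietly mismatched, no alarm raised
  try:
      parse_record_defensive(evil)
  except ValueError as err:
      print("rejected:", err)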

Is false invoicing a human problem or a software problem? Both, actually; failing to identify the proper controls in the overall business process led to software that seemed to trust that a new supplier and a small-value invoice was…routine. Safe. (False invoicing is just one of hundreds of examples of the so-called living off the land attack patterns.)

Data quality programs should prevent exploits such as false invoicing by instituting the right level of review before creating a new business relationship. Data quality, as a discipline, asks that something be done to verify that the identity claimed on the face of that invoice is actually a real business, with a real presence in the marketplace; furthermore, data quality controls should be validating that this business that just sent in a bill for us to pay is one that we actually do business with in the first place.
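
A minimal sketch of that kind of control might look like the following. The vendor master, the open purchase order table, and the hold reasons are hypothetical stand-ins for the lookups a real accounts payable workflow would make against its ERP data; the point is simply that an invoice from an unverified sender should generate questions, not a payment.

  APPROVED_VENDORS = {"V-10031", "V-10417"}           # vetted vendor master IDs
  OPEN_PURCHASE_ORDERS = {("V-10031", "PO-8821")}     # (vendor, PO) pairs


  def invoice_holds(vendor_id: str, po_number: str, amount: float) -> list:
      """Return the reasons, if any, to hold this invoice for human review."""
      holds = []
      if vendor_id not in APPROVED_VENDORS:
          holds.append("vendor is not in the vetted vendor master")
      if (vendor_id, po_number) not in OPEN_PURCHASE_ORDERS:
          holds.append("no matching open purchase order")
      if amount <= 0:
          holds.append("non-positive invoice amount")
      return holds


  # A small, routine-looking bill from an unknown sender is exactly what the
  # false-invoice pattern relies on; here it is flagged rather than paid.
  print(invoice_holds("V-99999", "PO-0001", 42.50))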

Those may seem commonsense questions; they are. Yet, they go unasked thousands of times every day.

Coping with SDLC Risks

The good news in all of this is that there’s a proven method for keeping these types of risks under control. You already know how to keep the exploitable vulnerabilities inherent in your SDLC systems, processes, and procedures from being turned against you. It takes a series of failures of your preexisting security measures, however, to let that happen and not detect its occurrence.

Access Control

Once inside your systems, intruders can and do have almost a free hand at exploiting any and every imaginable vulnerability, taking advantage of each step in your software development, test, deployment, and support processes to put in their own command and control logic. Multifactor authentication has become a must-have part of any access control system. Controlling and monitoring the lateral movement of users and subjects across your internal network segments must be part of your strategy.

Lateral Data and Code Movement Control

We’ve already heard more than enough horror stories of operational systems that were inadvertently loaded with test versions of data and software. Your IDE and the LAN segments that support it need stringent two-way controls on movement of code and data—to protect against bad data and malware moving into the development shop as well as escaping out from it and into your production environments or the Internet at large. This should prevent the movement of code and data out of the test environment and into production unless it’s part of a planned and managed “push” of a release into the update distribution process. This will also prevent malware, or maliciously formed bad data, from being exported into production, and the exfiltration of requirements, designs, and entire software libraries out into the wild or into the hands of your competitors or adversaries.
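
One way to picture such a control is a promotion gate that copies into the production distribution area only those artifacts named in an approved release manifest, and only when their hashes match what was approved. This is a sketch built on assumed conventions (a JSON manifest mapping filenames to SHA-256 digests, and simple directory paths), not a description of any particular release management tool.

  import hashlib
  import json
  import shutil
  from pathlib import Path


  def sha256_of(path: Path) -> str:
      return hashlib.sha256(path.read_bytes()).hexdigest()


  def promote(release_dir: Path, manifest_file: Path, prod_dir: Path) -> None:
      """Copy only the files named in the approved manifest, and only if their
      hashes match what was approved; everything else stays behind in test."""
      manifest = json.loads(manifest_file.read_text())   # {"filename": "sha256"}
      prod_dir.mkdir(parents=True, exist_ok=True)
      for name, approved_hash in manifest.items():
          candidate = release_dir / name
          if not candidate.is_file():
              raise FileNotFoundError(f"{name} is in the manifest but missing")
          if sha256_of(candidate) != approved_hash:
              raise ValueError(f"{name} does not match its approved hash")
          shutil.copy2(candidate, prod_dir / name)
      # Nothing outside the manifest is ever copied, so test data, scratch
      # builds, or planted files cannot ride along into production.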

Hardware and Software Supply Chain Security

At one level, your hardware, your operating systems, and even the IDE systems as applications themselves are probably coming into your organization via trusted or trustworthy suppliers. You have strong reason to believe, for example, that a major hardware or software vendor who publishes the digital signatures of their software distribution kits (for initial load as well as update of your systems) is worthy of trust—once, of course, you’ve verified the digital signatures on the items you’ve just taken delivery of from them. But your software developers may be using open source code libraries, or code snippets they’ve found on the Internet, which may actually do the job that needs to be done…but they may also have some hidden side effects, such as having a trap door or a Trojan horse feature built into them. It might be illuminating to have a conversation with someone in your development teams to gain insight as to how much of such unpedigreed software they incorporate into your organization’s systems, services, and products.

Applications Designed with Security in Mind

Strange as it may seem, most applications software is specified, designed, written, and tested by people who rather innocently assume that the world is not a dangerous place. So, how do you, as a non-code-bending, non-software-trained SSCP, help your company or organization get more defensive in the ways that it builds and maintains its software? A blog post by the editorial team at Synopsys, in their Software Architecture and Design blog at www.synopsys.com/blogs/software-security/principles-secure-software-design/, highlights four key points to ponder.

  • Be paranoid. Know that somebodies, somewhere, are out to get you. Lots of them.
  • Pay attention to abuse cases, instead of just the “business normal” use cases, as sources of your functional and nonfunctional requirements. Put on your gray hat and think like an attacker would; look at ways of deliberately trying to mislead or misuse your systems. This helps inoculate you (somewhat) against what the Synopsys team calls the three fallacies that lead to complacency about information security needs among software developers.
  • Understand that small vulnerabilities cascade together to become just as disruptive as a few large vulnerabilities can be.
  • Build things securely so that they last. Build for posterity.

One major problem has been that for decades, the software industry and academia have assumed that managers and senior designers are responsible for secure software design and development. It’s no good teaching brand-new programmers about it, because they don’t manage software projects, according to this view. As a result, an awful lot of insecure software gets written, as bad habits get engrained by use.

A great resource to learn with is the Open Web Application Security Project (OWASP), at www.owasp.org. OWASP is a nonprofit source of unbiased, vendor-neutral and platform-agnostic information and provides advice and ideas about the right ways to “bake in” security when designing web apps.

As the SSCP, you are the security specialist who can help your organization’s software developers better appreciate the threat, recognize the abuse cases, and advocate for penetration-style security testing during development—not just after deployment!

Baking the security in from the start of software development requires turning a classic programmer’s paradigm inside-out. Programmers are trusting souls; they trust that others will do their jobs correctly. They trust that users will “do the right thing,” that network and systems security will validate the user, or that access control “done by someone else” will prevent any abuses. (Managers like this paradigm, too, because it shifts costs and efforts to other departments while making their jobs simpler and easier.) Instead, it’s more than high time to borrow a page from the zero-trust school of thought for networks:

Trust no one, trust no input data, ever, and verify everything before you use it.

Listen to the Voice of the User

Too many organizations make it difficult for end users to call attention to problems encountered when they’re using the IT systems and applications on the job. Help desks can often seem unfriendly, even downright condescending, when their first response to a frustrated user’s call for help seems to be “Did you flush your cookies and browser cache yet?” And yes, it’s true that the more details that can be captured during the trouble ticket creation process, the greater the likelihood that the problem can be investigated and resolved.

As an SSCP, you’re asking the end users to maintain heightened awareness, and be on guard for any signs that their systems, applications, hardware, or even the flow of business around them seems…abnormal. You need to hear about phishing and vishing attempts; you need to multiply the numbers of eyes and ears and minds that are watching out for anomalies.

Do you have processes that truly invite end-user engagement and reporting? Do those processes provide useful feedback to users, individually, as groups, and to the entire organization, that demonstrates the value of the problem reports or security questions they submit? Do you provide them with positive affirmation that they’re getting it right when security assessments or ethical penetration testing produces good findings?

If not, you’ve got some opportunities here. Take advantage of them.

Risks of Poorly Merged Systems

The rapid pace of corporate mergers and acquisitions and the equally rapid pace at which business units get spun off to become their own separate entities mean that many organizations end up with a poorly integrated smash-up of IT architectures. Often these are badly documented to begin with; in the process of the corporate reorganization, much of the tacit knowledge of where things are, how they work, and what vulnerabilities still exist and need attention disappears with the key employees in IT, information security, or operational business units when they leave the company.

If you’ve inherited such a hodge-podge of systems, in all likelihood you do not have anything remotely close to a well-understood, clean, and secure environment. The chances could be very high that somewhere in that mess is a poorly implemented and highly vulnerable community outreach or charity event web page, started as a pet project of a former employee or manager. Such web pages, or other shadow IT systems, could easily allow an intruder to gain access to everything in your new merged corporate environment. Kate Fazzini, in her book Kingdom of Lies: Unnerving Adventures in the World of Cybercrime,1 illustrates one such example at a major financial institution, showing just how difficult it can be for even a well-equipped and highly trained security operations center team to find such gaping holes in their organization’s cyber-armor.

In such circumstances, your first best bet might be to do some gray-box ethical penetration testing, possibly using a purple team strategy (in which the attacking red team and the defending blue team coordinate their efforts, share their findings, and work hand-in-glove every day) to gain the best understanding of the overall architecture and its weaknesses. From there, you can start to prioritize.

Remember that your company’s own value chains are coupled into larger value streams as they connect with both upstream and downstream relationships. The suppliers, vendors, and partners that help your organization build and deliver its products and services may face these same mergers-and-acquisitions risks to systems integrity and security. Many of your strategic customers may also be facing similar issues. It’s prudent to identify those upstream and downstream players who might be significant sources of risk or vulnerability to your own value chain and actively engage with them to seek ways to collaborate on reducing everybody’s joint risk exposure and threat surfaces.

Hard to Design It Right, Easy to Fix It?

This is perhaps the most pernicious thought that troubles every software development team and every user of the software that they depend on in their jobs and in their private lives. Hardware, after all, is made of metal, plastic, glass, rubber, and dozens of other physical substances. Changing the hardware is hard work, we believe. Consider a design error in which our SOHO router overheats and burns out quickly because we didn’t provide enough ventilation; fixing it might require a larger plastic enclosure. That design change means new injection molds are needed to cast that enclosure’s parts; new assembly line processes are needed, maybe requiring changes to the fixtures and tooling; and new shipping and packing materials for the empty enclosure and the finished product will be needed. That’s a lot of work, and a lot of change to manage! But changing a few lines of code in something that exists only as a series of characters in a source code file seems easy by comparison.

This false logic leads many managers, users, and programmers to think that it’s easy and simple to add in a missing feature or change the way a function works to better suit the end user’s needs or preferences. It’s just a simple matter of programming, isn’t it, if we need to fix a bug we discovered after we deployed the application to our end users?

Right?

In fact, we see that software development is a constant exercise in balancing trade-offs.

  • Can we really build all of the requirements our users say they need?
  • Can we really test and validate everything we built and show that it meets the requirements?
  • Can we do that for a price that we quoted or contracted for and with the people and development resources we have?
  • Can we get it all done before the marketplace or the real world forces us to change the build-to requirements?

As with any project, software development managers constantly trade off risks versus resources versus time. Some of the risks involve dissatisfied customers when the product is finally delivered; some risks involve undetected but exploitable vulnerabilities in that product system. And all projects face a degree of uncertainty that the people, money, time, and other resources needed for development and acceptance testing won’t be as available as was assumed when the project was started—or that those resources will be there to support the maintenance phase once the project goes operational.

How much security is enough to keep what sort of applications secure? As with anything else in the IT world, the information security aspects of any app should be a requirements-driven process.

  Security Requirements: Functional or Nonfunctional?

In systems analysis, a functional requirement is one that specifies a task that must be done; the requirement may also specify how users can verify that the task has completed successfully or if one of many error conditions has occurred. For example, the requirement might state that pressing the “start engine” button causes prerequisite safety conditions to be checked and then activates various subsystems, step-by-step, to start the engine; failure of any step aborts the start process, returns all subsystems to their safe pre-start condition, and sends alerts to the operator for resolution. By contrast, a nonfunctional requirement states a general characteristic that applies to the system or subsystem as a whole but is not obviously present in any particular feature or function. Security requirements are often considered nonfunctional. This can be confusing, as a few requirements examples can suggest.

  • Safety requirements in a factory process control system might state that “the system will require two-factor authentication and two-step manual selection and authorization prior to allowing any function designated as safety-critical to be executed.” As a broad statement, this is hard to test for; yet, when allocated down to specific subfunctions, either these specific verification steps are present in the module-level requirements, then built into the design, and observable under test, or they are not. Any as-built system element that should do such safety checks that does not is in violation of the requirements. So, is such a safety requirement functional or nonfunctional?
  • Confidentiality requirements in a knowledge bank system might state that “no unauthorized users can view, access, download or use data in the system.” This (and other) requirements might drive the specification, design, implementation, and use of the identity management and access control systems elements. But does the flow-down of this requirement stop there? Or do individual applications inherit a user authentication and authorization burden from this one high-level requirement?
  • Nonrepudiation requirements for a clinical care system could dictate that there must be positive control for orders given by a physician, nurse practitioner, or other authorized caregiver, both as record of care decisions and as ways to prevent an order being unfilled or not carried out. The log of orders given is a functional requirement (somebody has to build the software that builds the log each time an order is entered). But is the nonrepudiation part functional or nonfunctional?

Many systems analysts will consider any requirement allocated to the human elements of the system as nonfunctional, since (they would argue) if the software or hardware isn’t built to execute that function, that function isn’t really a deliverable capability of the system. This also is the case, they’d argue, for functions “properly” allocated to the operating system or other IT infrastructure elements. Access control, for example, is rarely built into specific apps or platform systems, because it is far more efficient and effective to centralize the development and management of that function at the infrastructure level. Be careful—this train of thought leads to apps that have zero secure functions built into them, even the most trivial of input data validation!

Performance requirements, those analysts would say, are by nature functional requirements in this sense. The “-ilities”—the capabilities, availabilities, reliabilities, and all of the characteristics of the system stated in words that end in -ilities or -ility—are (they say) nonfunctional requirements.

As an SSCP, you’ll probably not be asked to adjudicate this functional versus nonfunctional argument. You may, however, have the opportunity to take statements from the users about what they need the system to do, and how they need to see it done, and see where CIANA-related concerns need to be assessed, analyzed, designed, built, tested, and then put to use. That includes monitoring, too, of course; there’s no sense building something if you do not keep an eye on how it’s being used and how well it’s working.

We live and work in a highly imperfect world, of course; it’s almost a certainty that a number of CIANA-driven functional and nonfunctional requirements did not get captured in the high-level systems requirements documentation. Even if they did, chances are good that not all of them were properly implemented in the right subsystems, elements, or components of the overall application system. The two-view look as described earlier (from requirements downward and from on-the-floor operational use upward) should help SSCPs make their working list of possible vulnerabilities.

Possible vulnerabilities, we caution. These are places to start a more in-depth investigation; these are things to ask others on the IT staff or information security team about. Maybe you’ll be pleasantly surprised and find that many of them are already on the known vulnerabilities or issues watch lists, with resolution plans in the works.

But maybe not.

Hardware and Software Supply Chain Security

Your organization’s IT supply chain provides all of the hardware, firmware, software, default initialization data, operators’ manuals, installation kits, and all of the spares and consumables that become the systems the organization depends upon. That chain doesn’t start with the vendor who sends in the invoice to your accounts payable or purchasing department—it reaches back through each vendor you deal with and through the suppliers who provide them with the subassemblies, parts, and software elements they bundle together into a system or product they lease or sell to your company. It reaches even further back to the chip manufacturers and to the designers they used to translate complex logic functions into the circuit designs and the firmware that drives them. (Every instruction your smartphone’s CPU executes is performed by microcode, that is, firmware inside the CPU, that figures out what that instruction means and how to perform it. And that microcode…well, at some point it does turn into circuits that do simple things, many layers further down.)

Chapter 2 laid the foundations for operating a secure IT supply chain by stressing the importance of having a well-controlled and well-managed baseline. That baseline has to capture every element—every hardware item and every separate piece of software and firmware—so that your configuration management and control systems can know each instance of it by its revision level and update date and version ID. Effective baseline management requires that you know where each item in that inventory is and who its authorized users or subjects are. It requires that you be able to detect when a configuration item (as each of these manageable units of hardware, software, and data are known) goes missing. Security policies should then kick in almost automatically to prevent CIs that have “gone walkabouts” from reconnecting to your systems until they’ve gone through a variety of health, status, use, and integrity checks—if you let them reconnect at all. (There’s a valid business case to be made to say that you just go ahead and “brick” or destructively lock out a device that’s been reported as lost or stolen, rather than spend the effort to evaluate and rehabilitate it and run the risk that you’ve missed something in the process.)
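
Much of that detection work boils down to continuously comparing what you can observe on the network against the configuration management baseline. The sketch below uses hypothetical device IDs and hard-coded dictionaries where a real implementation would query the CMDB on one side and a discovery scan, MDM, or endpoint-agent feed on the other.

  baseline = {                       # what configuration management says exists
      "LT-0142": {"owner": "j.ortiz", "os_build": "22631.3447"},
      "LT-0197": {"owner": "m.chen",  "os_build": "22631.3447"},
      "SRV-012": {"owner": "it-ops",  "os_build": "20348.2402"},
  }

  observed = {                       # what discovery or agent check-ins report
      "LT-0142": {"os_build": "22631.3447"},
      "SRV-012": {"os_build": "20348.2113"},   # lagging the approved build
      "LT-0999": {"os_build": "19045.4291"},   # not in the baseline at all
  }

  missing = sorted(set(baseline) - set(observed))     # CIs gone walkabout
  unknown = sorted(set(observed) - set(baseline))     # possible shadow IT
  drifted = sorted(ci for ci in set(baseline) & set(observed)
                   if baseline[ci]["os_build"] != observed[ci]["os_build"])

  print("missing from the network:", missing)    # ['LT-0197']
  print("unknown devices:", unknown)             # ['LT-0999']
  print("configuration drift:", drifted)         # ['SRV-012']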

It’s been mentioned several times already, but it bears repeating: software and firmware updates should not be applied unless they are digitally signed and those signatures have been verified, all as part of your change control process. Change management should also be involved, as a way of mitigating the risk of an unexpected update being not what it seems to be on its surface.
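
As a sketch of that verify-first, install-second rule, the following assumes the pyca/cryptography package and an Ed25519 publisher key obtained out-of-band; real vendors more often use X.509, Authenticode, or GPG signatures, but the control point is the same: the kit never reaches the installer until its detached signature verifies.

  from pathlib import Path

  from cryptography.exceptions import InvalidSignature
  from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PublicKey


  def signature_is_valid(kit: Path, sig: Path, publisher_key_bytes: bytes) -> bool:
      """Check the update kit against its detached signature."""
      publisher_key = Ed25519PublicKey.from_public_bytes(publisher_key_bytes)
      try:
          publisher_key.verify(sig.read_bytes(), kit.read_bytes())
          return True
      except InvalidSignature:
          return False


  def apply_update(kit: Path, sig: Path, publisher_key_bytes: bytes) -> None:
      if not signature_is_valid(kit, sig, publisher_key_bytes):
          raise RuntimeError(f"{kit.name}: signature check failed, update blocked")
      # Only now is the kit handed off to the installer, under change control.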

Positive and Negative Models for Software Security

Ancient concepts of law, safety, and governance give us the idea that there are two ways to control the behavior of complex systems. Positive control, or whitelisting, lists by name those behaviors that are allowed, and thus everything else is prohibited. Negative control, or blacklisting, lists by name those behaviors that are prohibited, and thus everything else is allowed. (These are sometimes referred to as German and English common law, respectively.)

Antivirus or anti-malware tools demonstrate both of these approaches to systems security. Software whitelisting, port forwarding rules, or parameters in machine learning behavioral monitoring systems all aim to let previously identified and authorized software be run or installed, connections be established, or other network or system behavior be considered as “normal” and hence authorized. Malware signature recognition and (again) machine learning behavioral monitoring systems look for things known to be harmful to the system or similar enough to known malware that additional human authorization steps must be taken to allow the activity to continue.
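
The contrast between the two models shows up clearly even in a few lines of code. In this minimal sketch, the hash sets are placeholders; a real deployment would draw its approved list from the organization’s software inventory and baseline, and its known-bad list from anti-malware or threat intelligence feeds.

  import hashlib
  from pathlib import Path

  APPROVED_HASHES = {"placeholder-digest-1", "placeholder-digest-2"}   # whitelist
  KNOWN_BAD_HASHES = {"placeholder-digest-3"}                          # blacklist


  def sha256_of(path: Path) -> str:
      return hashlib.sha256(path.read_bytes()).hexdigest()


  def whitelist_allows(path: Path) -> bool:
      # Positive model, default deny: unknown software stays blocked until
      # someone explicitly approves it.
      return sha256_of(path) in APPROVED_HASHES


  def blacklist_allows(path: Path) -> bool:
      # Negative model, default allow: anything not yet known to be bad runs.
      return sha256_of(path) not in KNOWN_BAD_HASHES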

A quick look at some numbers suggests why each model has its place. It’s been estimated that in 2018, more than a million new pieces of malware were created every month “in the wild.” As of this writing, AV-TEST GmbH notes on its website that it observes and categorizes more than 350,000 new malicious or potentially unwanted programs (PUPs) or applications (PUAs) every day, with a current total exceeding 875 million species. Although many are simple variations on exploits already in use, that’s a lot of new signatures to keep track of! By contrast, a typical medium to large-sized corporation might have to deal with authorizing from 1,000 to 10,000 new applications, or new versions of applications, for use on its systems and endpoints.

Positive control models, if properly implemented, can also be a major component of managing system and applications updates. The details of this are beyond the scope of this book and won’t be covered on the SSCP exam itself. That said, using a whitelisting system as part of how your organization manages all of its endpoints, all of its servers, and all of its devices in between can have several key advantages.

  • As new versions of apps (or new apps) are authorized for use, a “push” of the approved whitelist to all devices can help ensure that old versions can no longer run without intervention or authorization.
  • While new versions of apps are still being tested (for compatibility with existing systems or for operability considerations), the IT managers can prevent the inadvertent update of endpoints or servers.

  • Individual users and departments may have legitimate business needs for unique software, not used by others in the company; whitelisting systems can keep this under control, down to the by-name individual who is requesting exceptions or overriding (or attempting to override) the whitelisting system.

  • Whitelisting can be an active part of separation of duties and functions, preventing the execution of otherwise authorized apps by otherwise authorized users when not accessing the system from the proper set of endpoints.
  • Whitelisting can be an active part in license and seat management if a particular app is licensed only to a fixed number of users.

Is Blacklisting Dead? Or Dying?

SSCPs ought to ask this about every aspect of information systems security. Fundamentally, this question is asking whether a positive or negative security model provides the best approach to information risk management and control. Both blacklisting and whitelisting have their place in access control, identity management, network connectivity, and traffic routing and control, as well as with operating systems and application software installation, update, and use. Some business processes (and their underlying information infrastructures) simply cannot work with whitelisting but can achieve effective levels of security with blacklisting instead. Crowdsourcing for data (such as crowd-science approaches like Zooniverse) is impractical to operate if all users and the data they provide must be subject to whitelisting, for example. Anti-malware and advanced threat detection systems, on the other hand, are increasingly becoming more reliant on whitelisting. SSCPs need to appreciate the basic concepts of both control models (positive and negative, whitelisting and blacklisting) and choose the right approach for each risk context they face.

Let’s narrow down the question for now to application software only. NIST and many other authorities and pundits argue that whitelisting is the best (if not the only sensible) approach when dealing with highly secure environments. These environments are characterized by the willingness to spend money, time, and effort in having strong, positive configuration management and control of all aspects of their systems. User-written code, for example, just isn’t allowed in such environments, and attempts to introduce it can get one fired (or even prosecuted!). Whitelisting is trust-centric—for whitelisting to work, you have to trust your software logistics, support, and supply chain to provide you with software that meets or exceeds both your performance requirements and your information security needs across the lifecycle of that software’s use in your organization. Making whitelisting for software control work requires administrative effort; the amount of effort is strongly related to the number of applications programs you need to allow, the frequency of their updates, and the numbers of systems (servers, endpoints, or both) that need to be under whitelist control.

Blacklisting is of course threat-centric. It’s been the bedrock of anti-malware and antivirus software and hybrid solutions for decades. It relies on being able to define or describe the behavior signatures or other aspects of potentially harmful software. If a behavior, a digital signature, a file’s hash, or other parameters aren’t on the blacklist, the potential threat wins access to your system. The administrative burden here is shifted to the threat monitoring and intelligence community that supports the blacklist system vendor (that is, we transfer part of this risk to the anti-malware provider, rather than address it ourselves locally).

Whitelisting (or positive control) is sometimes described as requiring a strong authoritarian culture and mind-set in the organization; it’s argued that if users feel that they have an “inalienable right” to load and use any software that they want to, any time, then whitelisting stands in the way of them getting their job done. Yet blacklisting approaches work well (so far) when one central clearinghouse (such as an anti-malware provider) can push signature updates out to thousands if not millions of systems, almost all of them running different mixes of operating systems, applications, vendor-supplied updates and security patches, and locally grown code.

Software development shops probably need isolated “workbench” or lab systems on which their ongoing development software can evolve without the administrative burdens of a whitelisting system. (Containerized virtual machines are probably safer and easier to administer and control for such purposes.) Academic or white-hat hacking environments may also need to operate in a whitelist, blacklist, or no-list manner, depending on the task at hand. Ideally, other risk mitigation and control strategies can keep anything harmful in such labs from escaping (or being exfiltrated) out into the wild.

While the death certificate for negative control hasn’t been signed yet, the marketplace does seem to be trending strongly toward positive control. Until it is, and until all of the legacy systems that use blacklisting approaches are retired from the field, SSCPs will still need to understand how they work and be able to appreciate when they still might be the right choice for a specific set of information risk mitigation and control needs.

Information Security = Information Quality + Information Integrity

There’s a simple maxim that ought to guide the way we build, protect, and use every element of our IT systems: if the data isn’t correct, the system cannot behave correctly and may in fact misbehave in harmful, unsafe, or dangerous ways.

Whether you think about information systems security in terms of the CIA triplet, CIANA, or even CIANA+PS, every one of those attributes is about the information that our information systems are creating, storing, retrieving, and using. Whether we call it information or data doesn’t really matter; its quality ought to be the cornerstone of our foundational approach to systems security and integrity. Yet it’s somewhat frightening to realize how many systems security books, courses, practitioners, and programs ignore it; or at best, have delegated it to the database administrators or the end users to worry about.

Let’s go a step further: audit requirements force us to keep a change log of every piece of data that might be subject to analysis, verification, and review by the many different auditors that modern organizations must contend with. That audit trail or pedigree is the evidence that the auditors are looking for, when they’re trying to validate that the systems have been used correctly and that they’ve produced the correct, required results, in order to stamp their seal of compliance approval upon the system and the organization. That audit trail also provides a powerful source of insight during investigations when things go wrong; it can provide the indications and even the evidence that proves who was at fault, as well as exonerating those who were not responsible. Auditable data can be your “get out of jail free” card—and as a member of the security team, you may need one if the organization gets itself into a serious legal situation over a data breach, information security incident, or a systems failure of any kind.

  Be Prepared for Ransom Attacks!

These are rapidly becoming the attack of choice for many small bands of cybercriminals worldwide. When (not if) your systems are targeted by a ransom or ransomware attack, your senior leadership and managers will have a very stark and painful choice among three options: (1) pay up, and hope the attackers deliver an unlock code that works; (2) don’t pay up, and reload your systems from backup images and data (so long as you’re 100 percent sure that they are 100 percent free of the ransom intruder or their malware); or (3) don’t pay up, reload what you can, and rebuild everything else from scratch. Norsk Hydro demonstrates the pitfalls and the costs—£45 million so far and still climbing—of pursuing option 3, which the BBC described as “back to doing business with paper and pencil.” You can read more about this ransom attack at https://www.bbc.com/news/av/technology-48707033/ransomware-cyber-attacks-are-targeting-large-companies-and-demanding-huge-payments.

No matter which choice your leadership may make, you have to do the most thorough job you can to ensure that the new systems you rebuild do not have that same ransom intruder or their ransomware inside and that you’ve plugged the holes that they snuck in through.

Data Modeling

Data quality starts with data modeling, which is the process of taking each type of information that the business uses and defining the constraints that go with it. Purchase amounts, prices, and costs, for example, are defined in a specific currency; if the business needs to deal with foreign exchange, then that relationship (how many U.S. dollars a Mexican peso or Swiss franc is worth) is not tied to an inventory item or an individual payment. The business logic defines how the business should (emphasis on the imperative) know whether the value in that field makes sense for that item or does not. Almost all data that organizations deal with comes in groups or sets—consider all of the different types of information on your driver’s license, or in your passport, for example—and those sets of data are grouped or structured together into larger sets. Employee records all have certain items in common, but employees hired under one kind of contract may have different types of information associated with them than other employees do. (Salaried employees won’t have an hourly rate or overtime factor, for example.)

Data dictionaries provide centralized repositories for such business rules, and in well-managed applications development and support environments, the organization works hard to ensure that the rules in the metadata in the data dictionary are built into the logic of the application used by the business.
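As a simple illustration of what such metadata can look like—the field names, ranges, and currency rules here are hypothetical—a data dictionary entry can be represented as structured rules that application code consults before accepting a value:

```python
# Hypothetical data dictionary entries: each field carries the business rules
# (type, currency, allowed range) that application logic must enforce.
DATA_DICTIONARY = {
    "purchase_amount": {"type": float, "currency": "USD", "min": 0.01, "max": 250_000.00},
    "hourly_rate":     {"type": float, "currency": "USD", "min": 7.25, "max": 500.00},
}

def validate(field: str, value) -> list[str]:
    """Return a list of business-rule violations for a proposed field value."""
    rules = DATA_DICTIONARY[field]
    problems = []
    if not isinstance(value, rules["type"]):
        problems.append(f"{field}: expected a {rules['type'].__name__} value")
    elif not (rules["min"] <= value <= rules["max"]):
        problems.append(
            f"{field}: {value} is outside {rules['min']}..{rules['max']} {rules['currency']}"
        )
    return problems

print(validate("purchase_amount", -5.00))   # flags the out-of-range amount
```

When the rules live in one place like this, every application that touches the field can enforce the same constraints, which is exactly the consistency a data dictionary is meant to provide.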

Without a data dictionary as a driving force in the IT infrastructure, the organization resorts to old-fashioned people-facing procedures to capture those business logic rules. That procedural knowledge might be part of initial onboarding and training of employees; it might only exist in the user manual for a particular application or the desk-side “cheat sheet” used by an individual worker.

All organizations face a dilemma when it comes to procedural knowledge. Procedural knowledge is what workers in your organization use to produce value, at the point of work; this knowledge (and experience) guides them in shaping raw input materials and data into products and services. Quality management calls this point the gemba (a term borrowed from the Japanese), and as a result, many quality management and process maturity programs will advise “taking a gemba walk” through the organization as a way to gain first-hand insight into current practice within the organization. The smarter your people at the gemba—where the real value-producing work gets done in any organization—the more they know about how their jobs really get done. The more they understand the meaning of the data that they retrieve, use, create, receive, and process, the greater their ability to protect your organization when something surprising or abnormal happens. But the more we depend on smart and savvy people, the more likely it is that we do not understand all of our own business logic. This data about “how to do things” is really data about how to use data; we call it procedural metadata. (Think about taking a gemba walk of your own through your IT and information security workspaces throughout the organization; what might you learn?)

What can happen when that procedural metadata is not kept sufficiently secure? Loss or corruption of this procedural and business logic knowledge could cause critical business processes to fail to work correctly. At best, this might mean missed business opportunities (similar to suffering a denial-of-service [DoS] attack); at worst, this could lead to death or injury to staff, customers, or bystanders, damage to property, and expenses and exposure to liability that could kill the business.

What can the SSCP do? In most cases, the SSCP is not, after all, a knowledge manager or a business process engineer by training and experience. In organizations where a lot of the business logic and procedural knowledge exists in personal notebooks, on yellow stickies on physical desktops, or in human experience and memory, the SSCP can help reduce information risk and decision risk by letting the business impact analysis (BIA) and the vulnerability assessment provide guidance and direction. Management and leadership need to set the priorities—which processes, outcomes, or assets need the most security attention and risk management, and which can wait for another day. And when the SSCP is assessing those high-priority processes and finds evidence that much of the business logic is in tacit form, inside the heads of the staff, or in soft, unmanageable and unprotected paper notes and crib sheets, that ought to signal an area for process and security improvement.

  Modeling Data in Motion

There are many different design techniques used to understand data and identify the important business logic associated with it. Some of these modeling techniques capture the movement of data between processes far more informatively than others do. Don’t let the names confuse you: just because a particular modeling technique doesn’t have a word like flow in its name does not mean that it doesn’t reveal the lifecycle of a datum (or a lot of data items taken together), including its travels into, across, and out of the organization’s system. One primary example of this is the use case diagram and modeling approach; another set of techniques diagrams user stories, and while they focus on what a user (or a process that acts like a user) does, they reveal the flow of messages or data back and forth between users.

Serverless service provision is the next paradigm-busting revolution in cloud computing. It’s about ten years old, so it’s past the early adopters and becoming a much bigger force in the marketplace. Classical server models treat data as residing on the server until a process asks for it; serverless architectures treat the creation or arrival of data as an event, and let that event trigger the activation of a function or process. To nonprogrammers, that might not sound significant; but serverless cloud architectures can see data events happen half a planet away from the function that they trigger. Serverless architectural models look different than traditional relational database data models or data dictionaries do; they also don’t really resemble classical data flow diagrams either.
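As a sketch of that event-driven pattern—assuming an AWS Lambda–style handler signature and an S3-style “object created” event, with the bucket, key, and processing step left as placeholders—the arrival of new data is the trigger, and the function exists only to respond to it:

```python
# Sketch of a serverless, event-triggered function (Lambda-style signature).
import json
import logging

logger = logging.getLogger(__name__)

def handler(event: dict, context) -> dict:
    """Invoked by the platform only when a data-arrival event occurs."""
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]   # assumed event shape
        key = record["s3"]["object"]["key"]
        logger.info("New object %s arrived in %s; validating and processing", key, bucket)
        # ... validate the object against the data dictionary, then process it ...
    return {"statusCode": 200, "body": json.dumps("processed")}
```

There is no long-running server holding the data and waiting to be asked; the data event itself summons the code, wherever the platform chooses to run it.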

Your real concern, as the on-scene SSCP, is whether your organization understands the lifecycle of its data and how it chooses to model, control, and validate that lifecycle in action.

Preserving Data Across the Lifecycle

When we talk about data protection across its lifecycle, we actually refer to two distinct timeframes: from creation through use to disposal, and the more immediate cycle of data being at rest, in motion, and in use. In both cases, systems and security architects need to consider how the data might be corrupted, lost, disclosed to unauthorized parties, or not available in an authoritative and reliable form where authorized users and subjects need it, when they need it. Here are some examples:

  • Error correction and retransmission in transport protocols such as TCP actually provide protection against data loss (while UDP does not).
  • Digital signatures provide protection against undetected corruption or tampering of data.
  • HTTPS (and TLS underneath it) provide layers of protection against inadvertent disclosure or misrouting of data to the wrong party.
  • Other protocol measures that prevent MITM or session hijacking attacks protect data integrity and confidentiality and ensure that data flows only to known, authenticated recipients.
  • Encrypted backup copies of data sets, databases, or entire storage volumes provide forward security in time—when read back and decrypted at a future date, the file content, signatures, secure file digests, and similar attestations can all confirm their integrity (a minimal sketch of such an attestation check follows this list).
  • Access control systems should be accounting for all attempts to access data, whether it is stored in primary use locations (on production servers, for example), in hot backup servers in alternate locations, on load-balancing servers (whether in the cloud or in an on-premises data center), or in archival offline backup storage.
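One minimal way to provide that kind of forward attestation—key handling is deliberately simplified here, and in practice the key would come from a key vault or HSM rather than source code—is to record a keyed digest when the backup is written and verify it before any restore:

```python
# Sketch: attest to a backup's integrity with a keyed digest (HMAC-SHA256).
import hashlib
import hmac
from pathlib import Path

ATTESTATION_KEY = b"replace-with-a-key-from-your-key-vault"   # placeholder

def digest_backup(path: Path) -> str:
    """Compute an HMAC-SHA256 over the backup file, reading it in chunks."""
    mac = hmac.new(ATTESTATION_KEY, digestmod=hashlib.sha256)
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            mac.update(chunk)
    return mac.hexdigest()

def verify_before_restore(path: Path, recorded_digest: str) -> bool:
    """Refuse the restore if the backup no longer matches its recorded digest."""
    return hmac.compare_digest(digest_backup(path), recorded_digest)
```

The recorded digest travels with (or ahead of) the backup; at restore time, a mismatch tells you the copy has been altered or corrupted since it was written.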

Today, most cloud storage providers (even free Dropbox or OneDrive accounts) bulk encrypt individual customer files and folder trees and then stripe them across multiple physical storage devices, as a way of eliminating or drastically reducing the risks of inadvertent disclosure. This does mean that on any given physical device at your cloud hosting company’s data center, multiple customers’ data is commingled (or cohabitating) on the same device. The encryption prevents disclosure of any data if the device is salvaged or pulled for maintenance or if one customer or the cloud hosting company is served with digital discovery orders. Cloud providers know that providing this type of security is all but expected by the marketplace today; nonetheless, check your SLA or TOR to make sure, and if in doubt, ask.
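The mechanics are roughly “encrypt first, then stripe.” A toy sketch—using the third-party cryptography package’s Fernet recipe, with the stripe count and storage targets left as placeholders—might look like this:

```python
# Toy sketch of encrypt-then-stripe (requires: pip install cryptography).
from cryptography.fernet import Fernet

def encrypt_and_stripe(plaintext: bytes, stripes: int = 4) -> tuple[bytes, list[bytes]]:
    """Encrypt a customer object, then split the ciphertext across N 'devices'."""
    key = Fernet.generate_key()                  # per-customer or per-object key
    ciphertext = Fernet(key).encrypt(plaintext)
    size = -(-len(ciphertext) // stripes)        # ceiling division
    chunks = [ciphertext[i:i + size] for i in range(0, len(ciphertext), size)]
    return key, chunks                           # each chunk lands on a different device

def reassemble_and_decrypt(key: bytes, chunks: list[bytes]) -> bytes:
    return Fernet(key).decrypt(b"".join(chunks))

key, chunks = encrypt_and_stripe(b"customer ledger data")
assert reassemble_and_decrypt(key, chunks) == b"customer ledger data"
```

A salvaged drive holds only meaningless ciphertext fragments; without the key (and the other stripes), commingled data from other customers reveals nothing.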

Business continuity and disaster recovery planning have a vital role to play in being the drivers for establishing your organization’s data backup and recovery policies, plans, and procedures. Those plans, as Chapter 3 points out, are driven in large part by the business impact assessment (BIA), which was the output of the risk assessment process. The BIA also drove the creation of the information security classification guide and its procedures (your firm does have and use one of those, doesn’t it?). Taken together, these spell out what data needs to be made available, and how quickly, in the event of a device failure, a catastrophic crash of one or more of your systems or applications platforms, or a ransom attack or malware infestation. Chapter 4 showed that as part of incident response, systems restoration and recovery must not only get the hardware, OS, and applications back up and running again but must also get databases, files, and other instances of important data back to the state they need to be in so that business activities can be started up again.

  Preventing a Blast from the Past

There’s always the risk that the data, software, or systems images you write to backing storage are infected with malware that your current anti-malware scanners cannot detect. Protect against being infected by the sins of the past by reloading such backup data into isolated sandbox systems and thoroughly scanning it with the most up-to-date anti-malware signatures and definitions before using that data or software to reload into your production systems.

Two potential weaknesses in most systems architectures are data moving laterally within the organization—but still in transit—and data as it is processed for display and use in an endpoint device. (Remember that data that never goes to an endpoint is of questionable value to the organization; if the data dictionary defines no business logic for using it, it’s probably worth asking why you have it in the first place and why you keep it secure and backed up in the second place.)

  • East-West (Internal) Data Movement Chapter 5, “Cryptography,” looked at some of the issues in so-called east-west or internal data flows, particularly if the organization routinely encrypts such data for internal movements.2 These internal flows can occur in many ways depending upon your overall architecture.
    • Between physical servers in different organizational units or locations
    • Between virtual servers (which may support different applications or business processes)
    • Across system and network segmentation boundaries
    • From production servers to archival or backup servers
    • As content push from headquarters environments to field locations
    • As part of load balancing and service delivery tuning across an enterprise system spanning multiple time zones or continents
    • Between virtual servers, data warehouses, or data lakes maintained in separate cloud provider environments, but as part of a hybrid cloud system
    • To and from cloud servers and on-premise physical servers or data centers
    A simple example is the movement of data from a regional office’s Dropbox or Google Docs cloud storage services into a SharePoint environment used at the organization’s headquarters. A more complex example might involve record or transaction-level movement of customer-related data around in your system. In either case, you need to be able to determine whether a particular data movement is part of an authorized activity or not. When application logic and access control tightly control and monitor the use of sensitive data, then you are reasonably assured that the data has been under surveillance—that is, subject to full monitoring—regardless of where and how it is at rest, in use, or in transit. In the absence of such strong application logic, logging, and access control, however, you may be facing some significant CIANA+PS security concerns.

    Keep in mind that data movement east-west can be at any level of granularity or volume; it can be a single record, a file, a database, or an entire storage subsystem’s content that is in motion and therefore potentially a security concern. Real-time monitoring of such movement can be done in a number of ways, which all boil down to becoming your own man-in-the-middle on almost every connection within your systems. Fingerprinting techniques, such as generating a digital signature on any data flow that meets or exceeds certain security thresholds for sensitivity, can be used as part of enhanced logging (a minimal sketch of this appears after this list). Each step you add takes time, which impacts throughput, and each piece of metadata (such as a tag or signature) attached to a data flow adds to your overall storage needs.

    At the heart of the east-west data movement problem is the risk that your efforts to monitor and control this traffic put you in the role of being your own man-in-the-middle attacker. This can have ethical and legal consequences of its own. Most companies and organizations allow for some personal use of the organization’s IT systems and its Internet connection. This means that even the most innocent of HTTPS sessions—such as making reservations for an evening out with one’s family—can involve personal use of encrypted traffic. Employee VoIP calls to medical practitioners or educational services providers are (in most jurisdictions) protected conversations, and if your acceptable use policies allow them to occur from the workplace or over employer-provided IT systems, you run the risk of violating the protections required for such data.

    At some point, if most of your internal network traffic is encrypted, you have no practical way of knowing whether any given data stream is legitimately serving the business needs of the organization or is actually data being exfiltrated by an attacker—unless you have some way of inspecting the encrypted contents themselves or add another layer of encapsulation that can authoritatively show that a transfer is legitimate. Some approaches to this dilemma include the addition of hardware security modules (HSMs), which provide high-integrity storage of encryption keys and certificates, and then using so-called TLS inspection capabilities to act as a decrypting/inspecting/re-encrypting firewall at critical east-west data flow junctions. The throughput penalties of this can be substantial unless you make a significant investment in such hardware.

  • Data in Use at the Endpoint Imagine that your user has queried data from a corporate server via an authorized app installed on their smartphone or other endpoint. That app has been configured to retrieve that data via an encrypted channel, and thus a set of encrypted data arrives in that smartphone. Should the data be decrypted at the endpoint for use? If so, several possible use cases need to be carefully considered regarding how to protect this data in use at the endpoint.

    • Data display and output: The human user probably cannot make use of encrypted data; but while it’s being displayed for them, another person (or another device) might be able to shoulder-surf that display surface and capture the data in its unencrypted form. Malware, screen capture tools, or built-in diagnostic capabilities on the endpoint itself might also be able to capture the data as it is displayed or output.
    • Data download or copy to another device: The endpoint may be able to copy the now-decrypted data to an external storage device, or send it via Bluetooth, NFC, or another Wi-Fi or network connection to another device; this can happen with or without the end user’s knowledge or action.
    • Data remanence: Once the use of that decrypted data is complete, data still remains on the device. As shown in Chapter 5, special care must be taken to ensure that all copies of the decrypted data, including temporary, modified versions of it, are destroyed, wherever they might be on the endpoint. Endpoint management can provide significant levels of protection against loss or exposure of sensitive data in the event an endpoint (under management) is lost or stolen, as Chapter 6 examined.
    • Human covert paths: The human end user of this endpoint may deliberately or inadvertently commingle knowledge and information, exposing one set of sensitive data to the wrong set of users, for example. This can be as innocent as a consultant who “technically levels” what two clients are asking him to do into a proposal for a third (and might violate contracts, ethics, or both in doing so). It can also happen when endpoints are used for personal as well as work-related information processing.
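Returning to the fingerprinting idea raised earlier for east-west flows, here is a minimal sketch—the field names, logger name, and sensitivity threshold are hypothetical—of generating a digest of a sensitive transfer and writing it to an enhanced log, so that auditors can later match what moved where:

```python
# Sketch: fingerprint sensitive east-west transfers for enhanced logging.
import hashlib
import json
import logging
from datetime import datetime, timezone

logger = logging.getLogger("eastwest.audit")        # hypothetical audit channel
SENSITIVITY_THRESHOLD = 3                           # hypothetical classification level

def log_transfer(payload: bytes, src: str, dst: str, sensitivity: int) -> None:
    """Record who sent what to whom; the digest lets auditors match flows later."""
    if sensitivity < SENSITIVITY_THRESHOLD:
        return                                      # below threshold: no fingerprint
    record = {
        "time": datetime.now(timezone.utc).isoformat(),
        "src": src,
        "dst": dst,
        "bytes": len(payload),
        "sha256": hashlib.sha256(payload).hexdigest(),
    }
    logger.info(json.dumps(record))
```

Even this lightweight approach illustrates the trade-off noted earlier: every fingerprinted flow costs compute time and adds a log record that must be stored.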

Note that some data, and some encryption processes, can provide for ways to keep the data encrypted while it is being used and yet not have its meaning revealed by such use. This sounds like some kind of alchemy or magic going on—as if we are adding two secret numbers together and getting a third, the meaning of which we can use immediately but without revealing anything about the two numbers we started with. (Hmm. Said that way, it’s not so hard to imagine. There are an infinite number of pairs of numbers x, y that add together to the same result. Knowing that result doesn’t tell you anything about x or y, except perhaps that if one of them is larger than the sum, the other must be smaller. That’s not a lot to go on as a codebreaker.) Such homomorphic encryption systems are starting to see application in a variety of use cases.
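As an illustration of that “add without revealing” idea, here is a toy sketch of a Paillier-style additively homomorphic scheme. The primes are tiny and hardcoded, there is no padding or key management, and it is strictly illustrative—never use anything like this for real data:

```python
# Toy Paillier-style additively homomorphic encryption (illustrative only).
import math
import secrets

p, q = 17, 19                      # toy primes; real systems use ~1024-bit primes
n = p * q                          # public modulus
n_sq = n * n
lam = math.lcm(p - 1, q - 1)       # Carmichael's lambda(n)
mu = pow(lam, -1, n)               # modular inverse of lambda mod n (g = n + 1 variant)

def encrypt(m: int) -> int:
    """Encrypt m (0 <= m < n) as c = (1 + n)^m * r^n mod n^2."""
    r = secrets.randbelow(n - 2) + 2
    while math.gcd(r, n) != 1:
        r = secrets.randbelow(n - 2) + 2
    return (pow(1 + n, m, n_sq) * pow(r, n, n_sq)) % n_sq

def decrypt(c: int) -> int:
    """Recover m = L(c^lambda mod n^2) * mu mod n, where L(x) = (x - 1) // n."""
    x = pow(c, lam, n_sq)
    return ((x - 1) // n * mu) % n

c1, c2 = encrypt(5), encrypt(7)
c_sum = (c1 * c2) % n_sq           # multiplying ciphertexts adds the plaintexts
print(decrypt(c_sum))              # 12, computed without ever decrypting c1 or c2
```

The party that multiplied the two ciphertexts never saw 5, 7, or 12; only the keyholder who decrypts the result learns the sum.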

Your chosen threat modeling methodology may help you gain additional insight into and leverage over these issues and others. (See Chapter 3’s “Threat Modeling” section for details.) It’s also worth discussing with your cloud services provider, if your organization has concerns about internal lateral movements of information, even if all of those information assets are hosted within the same cloud system. You may also find it useful to investigate (ISC)2’s Certified Cloud Security Professional (CCSP) program, as a way to sharpen your knowledge and skills in dealing with issues like these.

Identify and Analyze Malicious Code and Activity

It’s well beyond the scope of this book to look at how to analyze and understand what any particular strain of malware might be trying to do. The Grey Hat Hacker series of books provides an effective introduction to the whole field of reverse engineering of software, malware or not, and if you’ve an interest or a need to become that kind of ethical hacker, they are a great place to start. (It does require a good sense of what software engineering is all about, and although you do not need an advanced degree to reverse engineer software, it might help.)

It’s also becoming harder to separate the effects that malware may have on your systems, servers, and endpoints from the effects of malicious activity that does not use malware at all. That said, with millions of new types of malware appearing in the wild every week, your users are bound to encounter it. As new malware types proliferate and become part of any number of kill chains, it’s probably more useful to group them by broader characteristics such as:

  • End-user interaction: Scareware, ransomware, and many phishing payloads may display screens, prompts, or other content that attempt to get end users to take some kind of action. In doing so, the user unwittingly provides the malware with the opportunity to take other steps, such as download or install more of its payload, copy files, or even start up a ransom-related file encryption process.
  • End-user or endpoint passive monitoring: Keystroke loggers, screen capture, webcam and microphone access, and other tools can gather data about the system, its surroundings, and even its geographic location, as well as gather data about its end user, all without needing the end user to take any enabling actions. These tools are often used as part of reconnaissance and surveillance activities by attackers.
  • Command-and-control functions: These payloads seek to install or create processes and subject (user) IDs that have elevated privileges or otherwise grant capabilities that allow them to take greater control of the system.

Malicious activities, by stark contrast, do not require the installation of any new software capabilities into your servers or endpoints. A disgruntled employee, for example—or the spouse, roommate, or significant other of an otherwise happy and productive employee—might abuse their login privileges to find ways in which they can perform tasks that are detrimental to the organization. Users, for example, are supposed to be able to delete their own files and make backup copies of them—but not delete everyone else’s files or make their own backup copies of everything in the company’s databases or systems files.

Let’s take a closer look.

Malware

Malware, or software that is malicious in intent and effect, is the general name for any type of software that comes into your system without your full knowledge and consent, performs some functions you would not knowingly authorize it to, and in doing so diverts compute resources from your organization. It may also damage your data, your installed software, or even your computer hardware in the process. Cleaning up after a malware infestation can also be expensive, requiring thoroughly scanning many systems, thoroughly decontaminating them, reloading them from known clean backup copies, and then re-accomplishing all of the productive work lost during the infection and its cleanup. Malware has its origins in what we now call white-hat hacking attempts by various programmers and computer scientists to experiment with software and its interactions with hardware, operating systems, and other computing technologies. The earliest and most famous example of this was the “Morris worm,” which Robert Tappan Morris released onto the Internet in 1988. Its spreading mechanism heralded the new era of massive replication of malware, leading to estimates as high as ten million dollars’ worth of damaged and disrupted systems. Among other things, it also led to the first felony conviction for computer crime under U.S. law.

Malware is best classified not by type but by the discrete functions that an attacker wants to accomplish. For example, attackers might use malware as one way of:

  • Providing undetected or “backdoor” access into a system
  • Creating new users, including privileged users, surreptitiously
  • Gathering data about the target system and its installed hardware, firmware and software, and peripherals
  • Using the target system to perform reconnaissance, eavesdropping, or other activities against other computers on the same LAN or network segment with it
  • Installing new services, device drivers, or other functions into operating systems, applications, or utility programs
  • Elevating the privilege of a task or a user login beyond what normal system controls would allow
  • Elevating a user or task to “root” or full, unrestricted systems administrative privilege levels
  • Bypassing data integrity controls so as to provide undetected ability to modify files
  • Altering or erasing data from log files associated with system events, resource access, security events, hardware status changes, or applications events
  • Copying, moving, or deleting files without being detected, logged, or restricted
  • Bypassing digital signatures, installing phony certificates, or otherwise nullifying cryptographic protections
  • Changing hardware settings, either to change device behavior or to cause it to damage or destroy itself (such as shutting off a CPU fan and associated over-temperature alarm events)
  • Surreptitiously collecting user-entered data, either during login events or during other activities
  • Recording and later transmitting records of system, user, or application activities
  • Allocating CPU, GPU, and other resources to support surreptitious execution of hacker-desired tasks
  • Generating and sending network or system traffic to other devices or to tasks on other systems
  • Launching malware-based or other attacks against other systems
  • Propagating itself or other malware payloads to other hosts on any networks it can reach
  • Harvesting contact information from documents, email systems, or other assets on the target system to use in propagating itself or other malware payloads to additional targets
  • Establishing web page connections and transacting activity at websites of the hacker’s choice
  • Encrypting files (data or program code) as part of ransomware attacks
  • Establishing hidden peer-to-peer or virtual private network connections with other systems, some of which may possibly be under the hacker’s control
  • Running tasks that disrupt, degrade, or otherwise impact normal work on that system
  • Controlling multimedia devices, such as webcams, microphones, and so forth, to eavesdrop on users themselves or others in the immediate area of the target computer
  • Monitoring a mobile device’s location and tracking its movement as part of stalking or tracking the human user or the vehicle they are using
  • Using a variety of multimedia or other systems functions to attempt to frighten, intimidate, coerce, or induce desired behavior in the humans using it or nearby it

In general, malware consists of a vehicle or package that gets introduced into the target system; it may then release or install a payload that functions separately from the vehicle. Trojan horse malware (classically named) disguises its nefarious payload within a wrapper or delivery “gift” that seems attractive, such as a useful program, a video or music file, or a purported update to another program. Other types of malware, such as viruses and worms, got their names from their similarities with the way such disease vectors can transmit sickness in animal or plant populations. Worms, for example, infect one target machine and then launch out to attack others; viruses look to find many files or programs within the target to infect, making their eradication from the host problematic.

The payloads that malware can bring with them have evolved as much as the “carrier” codes have. These payloads can provide hidden, unauthorized entry points into the system (such as a trapdoor or backdoor), facilitate the exfiltration of sensitive data, modify data (such as system event logs) to hide the malware’s presence and activities, destroy or corrupt user data, or even encrypt it to hold it for ransom. Malware payloads also form a part of target reconnaissance and characterization activities carried out by some advanced persistent threats, such as by installing keyloggers, spyware of various types, or scareware. Malware payloads can also transform your system into a launch platform from which attacks on other systems can be originated. Payloads can also just steal CPU cycles by performing parts of a distributed computation by means of your system’s CPUs and GPUs; other than slowing down your own work, such cycle-stealing usually does not harm the host system. Codebreaking and cryptocurrency mining are but two of the common uses of such cycle-stealing. Rootkits are a special class of malware that use a variety of privilege elevation techniques to insert themselves into the lowest-level (or kernel) functions in the operating system, which upon bootup get loaded and enabled before most anti-malware or antivirus systems get loaded and enabled. Rootkits, in essence, can give complete and almost undetectable control of your system to attackers and are a favorite of advanced persistent threats.

It’s interesting to note that many of the behaviors of common malware can resemble the behavior of otherwise legitimate software. This can lead to two kinds of errors. False positive errors are when the malware detection system marks a legitimate program as if it were malware or quarantines or blocks attempts to connect to a web page mistakenly “known” to be a malware source. False negative errors occur when actual malware is not detected as such and is allowed to pass unreported.

Malware can be introduced into a system by direct use of operating systems functions, such as mounting a removable disk drive; just as often, malware enters a system by users interacting with “applications” that are more than what they seem and that come with hidden side effects. Malware often needs to target operating systems functions in order to be part of a successful attack. Most of what you have to do as an SSCP to protect your infrastructure from malware intrusions must take place inside the infrastructure, even if the path into the system starts with or makes use of the application layer. (Some applications, such as office productivity suites, do have features that must be tightly controlled to prevent them from being misused to introduce malware into a system; this can be another undesirable side effect of enabling tools that can create otherwise useful shadow IT apps.)

Malicious Code Countermeasures

One way to think about employing malware countermeasures is to break the problem into its separate parts.

  • Whitelisting, providing control over what can be installed and executed on your systems in the first place
  • Protecting the software supply chain, by using strong configuration management and controls that enforce policies about using digitally signed code from known sources
  • Prevention measures that attempt to keep malware of any sort from entering your systems
  • Access control enforcement of device and subject health and integrity, by means of quarantine and remediation subnets

Many different strategies and techniques exist for dealing with each part of this active anti-malware countermeasure strategy shown here. For example, software and applications whitelisting can be enforced by scanners that examine all incoming email; other whitelisting approaches can restrict some classes of end users or subjects from browsing to or establishing connections (especially VPN connections) to sites that are not on the trusted-sites list maintained organizationally.

Most anti-malware applications are designed as host intrusion detection and prevention systems, and they tend to use signature and rule-based definitions as they attempt to determine whether a file contains suspect malware. Active anti-malware defenses running continuously on a host (whether that host is a server or an endpoint) can also detect behavior by an active process that seems suspect and then alert the security team, the end user, or both that something may be rotten in the state of the system, so to speak. Anti-malware scanners can be programmed to automatically examine every new file that is trying to come into the system, as well as periodically (or on demand) scan every file on any attached storage devices. These scanners can routinely search through high-capacity network storage systems (so long as the contents aren’t encrypted, of course), and should be (a minimal sketch of such scanning follows this list):

  • Scanning your system to check for files that may be malware-infected or malware in disguise
  • Inspecting the digital signatures of specific directories, such as boot sectors and operating system kernels to check for possible surreptitious changes that might indicate malware
  • Inspecting processes, services, and tasks in main memory (or in virtual page swap areas) to detect any infected executable code
  • Inspecting macros, templates, or other such files for suspicious or malicious code or values
  • Moving suspect files or objects to special quarantine areas, and preventing further movement or execution of them
  • Inspecting operating systems control parameter sets, such as the Windows Registry hives, for signatures or elements suggestive of known malware
  • Monitoring system behavior to detect possible anomalies, suggestive of malware in action
  • Monitoring incoming email or web traffic for possible malware
  • Monitoring connection requests against lists of blacklisted sites

Where this concept breaks down is that those hundreds of thousands of new species of malware that appear every day probably are not defined by the rule sets or recognized by the signature analysis capabilities of your anti-malware systems. There’s a great deal of interest in the marketplace for automated machine learning approaches that can generate new signatures—but these approaches tend to need dedicated software reverse engineering and test environments in which a suspect file can be subjected to hundreds of thousands of test cases, while the machine learning algorithm trains itself to separate friendly software from foe, if it can.

It’s probably wise to see anti-malware solutions as just one part of your overall defensive strategy. Your access control system, for example, cannot by itself recognize a piece of malware for what it is. Encrypted traffic moving laterally on your internal networks can defeat the best of anti-malware inspection and detection systems. The danger always exists, of course, that today’s malware definitions and signatures won’t recognize something as malware, and you’ll copy it into an encrypted backup or archive of a database, software library, or system image. It’s vitally important, therefore, to scan those images in an isolated sandbox, quarantine, or other clean system with the newest malware signatures and definition files before you reload your systems from that data. This will reduce the risk of your systems being infected by something from the past; nothing, of course, can take that risk to zero (other than not plugging them in and turning them on in the first place).

  Anti-Malware Under Another Name?

A growing number of security systems vendors are offering products with capabilities known as endpoint detection and response, advanced threat detection, and others. As Chapter 6 points out, many of these incorporate next generation anti-malware approaches. Some may even use machine learning techniques to rapidly generate new signatures to use in behavioral analysis–based whitelisting or blacklisting approaches. At its most basic level, detecting, preventing, and responding to attempted intrusion of malware into your systems is a fundamental requirement. The architecture of your systems should then help you determine the right solution for your organization’s needs, including what mix of capabilities you need, hosted on what hardware elements, and where those elements are in your systems.

Malicious Activity

Setting aside our malware sensitivities for a moment, let’s consider malicious activity, which at a minimum is by definition a violation of your organization’s policies regarding acceptable use of its information systems, its information assets, and its IT infrastructure. This is an important point: if the organization has not defined these limits of acceptable use in writing, it may find its hands are tied when it tries to deal with systems usage that disrupts normal business operations, compromises private or proprietary data, or otherwise leads to safety or security concerns. By definition, malicious activity is a set of tasks, processes, or actions invoked by a person or people with the intention of satisfying their own agenda or interests without regard to any potential harm their actions may cause to the organization, its employees, customers, or other stakeholders. There does not have to be intent to harm in order for an action to be malicious; nor does there have to be intent to profit or gain from the activity, if it in fact does cause harm to the organization. While this may sound like a two-edged legal argument, it also causes some problems when trying to detect such activity.

What Kinds of Activities?

Almost any action that users might be able to take as part of their normal, everyday, authorized and permitted work tasks can become harmful or self-serving (and therefore malicious) if it is performed in a different context. Built-in capabilities to generate surveys and mass email them out to subscribers, for example, can be misused to support an employee’s private business, personal, or political agendas. In doing so, the organization’s email address, IP address, URL, or other recognizable marks on such outgoing email may be misconstrued by recipients to mean that the organization endorses the content of that email. This can lead to significant loss of business and damage to the organization’s reputation if the email’s content is significantly at odds with the way the organization positions itself in the marketplace. The unauthorized copying of private or proprietary data and removing the copy from the organization’s systems, or exfiltrating it, is defined in many jurisdictions as a crime of theft. The damages to the organization can be severe enough that its directors face time in prison and the company can go out of business.

One factor in common is that in each instance management might recognize (in the breach) that the activity in question was a violation of privilege: the user (whoever they were) did not have permission to conduct the tasks in question. Whether the organization’s access control systems implemented such privilege restrictions and effectively enforced them, however, is the other, more technical, side of the story.

Actual attacks on your systems, such as distributed denial-of-service attacks via zombie botnets, are of course malicious in nature and intent; but they’re covered in greater depth in the section “Understand Network Attacks and Countermeasures” in Chapter 6.

  Beware Attackers Living Off the Land

In July 2017, Symantec’s research showed an increasing number of ransom attacks—not ransomware!—in which the attackers used social engineering and other surreptitious, non-malware-based means to gain initial entry into target systems; they then used built-in systems functions to prepare target file systems for encryption at their command. In many cases, these attacks create few if any files at all on the target system, making it extremely difficult for most anti-malware, software whitelisting or intrusion detection and prevention technologies to recognize them for what they are. The attackers can also use the same systems functions to cover their tracks.

Symantec’s bottom-line recommendation: multifactor user identification, combined with strong access control, is still the foundation of any well-managed IT security program.

Who’s Doing It?

Although this may be one of the last questions that gets answered during an incident response and investigation, thinking about it in categorical terms helps us look at how we might detect such malicious activity in the first place. Several options exist and are in use today.

  • User behavioral modeling: Typically using machine learning approaches, these systems attempt to understand each authorized user’s normal business behavior, such as what apps they use, with data from which parts of the system, and how that pattern varies by hour, day, week, or even across a longer time span. For example, a user who normally works in accounts receivable or in purchasing probably has no legitimate need to be copying numerous payroll records (data outside of their normal span of duties), but their after-hours extensive use of purchasing records could be legitimate overtime activity or an attempt to mask their own role in a false invoicing or other fraud.
  • Endpoint behavior modeling: Similar techniques can be applied to endpoint devices, in an attempt to identify when an endpoint’s activities are potentially suspicious. An endpoint that characteristically shows at most a handful of HTTPS sessions but suddenly attempts to open hundreds of them might be doing something worthy of investigation.
  • Access control: Access control accounting data may reveal patterns in attempts to access certain data objects by certain subjects, which may or may not reveal behavior that is of concern.
  • Security logs: These might indicate that a user ID is attempting to elevate its privilege too far or is attempting to modify the privilege constraints associated with objects in ways that may be suspicious.

The Insider Threat

People who are already granted access privileges to your systems, who then abuse those privileges in some deliberate way, are known as the insider threat. It’s also unfortunately true that many cases of insider-triggered malicious activity happen by accident or mistake—the user simply did not perform the right action, or attempted to perform it but did it incorrectly, without any malicious intention. Nonetheless, your trusted insiders can and do get things wrong from time to time. Without trying to go too deep into possible motivations, it may be possible in some circumstances to exploit user behavioral modeling in conjunction with biometric data to determine whether a particular member of your team is behaving in ways that you should view with caution if not outright alarm. (It’s not just apocryphal humor that warns us to look closely if our systems administrators are driving Lamborghinis or otherwise living a lifestyle far beyond the salary and benefits they’re earning from your organization.)

User behavioral analytics (UBA) refers to the use of statistical and machine learning approaches to aid in detecting a possible security risk involving a specific employee or other end user. Analytics algorithms seek patterns in the data collected about a modeled behavior, while looking for data that signals changes in behavior that diverge from a known and understood pattern. Different types of analytics approaches represent the different timeframes that management might need to appreciate as they consider potential security implications.

  • Descriptive analytics looks at what happened, using behavioral profiles or signatures of past (or other current) events for comparison.
  • Inquisitive analytics looks for the proximate causes of an event or series of events.
  • Predictive analytics models the behavior of a system (or a person) and seeks to forecast likely or possible courses of action that system or person will take during a specified future time period.
  • Prescriptive analytics combines these (and other) analytics approaches to synthesize recommended courses of action to deal with forecasts (from predictive analytics).

At certain gross levels, UBA might seem to be quite similar to role-based access control: both ought to be able to detect in real time when a user makes repeated attempts to access resources beyond what their assigned roles allow them to use. UBA, it is argued, can look at longer dwell times than RBAC can: if user 6079SmithW attempts several such accesses in the space of a few minutes, to the same objects, RBAC can probably detect it and alert the SOC. If, however, 6079SmithW spreads those access attempts across several days, interspersed with many other accesses (legitimate and perhaps also outside his realm of privilege), they may all be natural mistakes or a low and slow reconnaissance probe. UBA promises to be able to look deep across the recent past history (in descriptive analytic terms) to identify possible patterns of behaviors of interest (“this user is getting too inquisitive for their own good”).
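A descriptive-analytics toy makes the dwell-time point concrete. The event format, window, and threshold below are hypothetical; the idea is simply that individual days can look innocent while the 30-day accumulation does not:

```python
# Toy low-and-slow detector: count denied access attempts per user over a
# sliding 30-day window rather than per session. Threshold is hypothetical.
from collections import defaultdict
from datetime import datetime, timedelta

WINDOW = timedelta(days=30)
THRESHOLD = 20      # denied attempts within the window that warrant an analyst's look

def flag_low_and_slow(events: list[tuple[str, datetime]]) -> set[str]:
    """events: (user_id, time_of_denied_access) pairs. Returns users worth reviewing."""
    by_user = defaultdict(list)
    for user, when in events:
        by_user[user].append(when)
    flagged = set()
    for user, times in by_user.items():
        times.sort()
        start = 0
        for end, t in enumerate(times):
            while t - times[start] > WINDOW:
                start += 1                      # slide the window forward
            if end - start + 1 >= THRESHOLD:    # THRESHOLD denials within any 30 days
                flagged.add(user)
                break
    return flagged
```

An RBAC rule watching a single session sees only one or two denials at a time; the accumulated count across the window is what surfaces the pattern.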

UBA and related approaches are also being developed to generate the data needed to configure RBAC systems, which can be notoriously difficult to configure and maintain for very large organizations, especially if fine-grained definition of roles and objects is required for enhanced security.

UBA data monitoring can go beyond just the surveillance and recording of a user’s interactions with your IT systems and include “off-board” data from third-party systems, social media sites, and traffic. It’s not yet clear that the UBA algorithms and tools are reliable enough to detect that a user’s personal life is about to cause them to become a “disgruntled employee” who might strike out at the company via its information systems. Before you think that implementing a military-style personnel reliability program is advisable, talk with your organization’s experts in human resources and employment law. Many sources of stress that employees can be affected by are beyond what employers can rightly ask about or attempt to intervene in. In most cases, all that the organization can do is make sure that the required and permitted behavior is clearly defined and that unacceptable behavior (or inappropriate or prohibited information systems use) is communicated in writing. All employees need to be familiarized with these as general standards, as well as being trained on the standards that apply to their particular tasks. Consistent deviations from those standards present many different human resources and management challenges, only some of which might involve an insider threat to your information systems security.

Malicious Activity Countermeasures

Let’s take a moment to quickly review the most important countermeasures you’ll need as you attempt to limit the occurrence of malicious activities and the damage that they can inflict upon your information systems and your organization. Hearken back to that holistic framework, inspired by the OSI seven-layer Reference Model, that I shared earlier in this chapter (and have shared in different ways throughout this book).

  • Physical controls can mitigate introduction of malware, exfiltration of data or the entry of unauthorized persons into your premises and thereby into contact with your information systems infrastructure.
  • Logical controls implement the lioness’ share of the security policy decisions that the organization has made; these configure hardware, operating systems, network, and applications-level security features.
  • Administrative controls ensure that risk management decisions—and therefore information security decisions—that management and leadership have made are effectively pushed out to everyone in the organization and that they are used to drive how physical and logical security controls are put into effect.
  • Hardening strategies for systems, servers, applications, networks, and endpoints have been translated into procedures, and those procedures create the right mix of controls to ensure that identified vulnerabilities get fixed as and when prudent risk management dictates that they should be fixed.
  • Isolation, quarantine, and sandbox techniques are used when individual user, subject, or endpoint behavior suggests that something may be happening that deserves more in-depth investigation. Quarantine or isolation of an entire LAN segment and all systems on it may be tremendously disruptive to normal business operations and activities, but indicators of compromise may make that the least unpalatable of choices.

Last but definitely not least on your list should be to engage with your end users; motivate, educate, and train them to be as much a part of your information defensive posture as possible. Most of your teammates and co-workers do not want to live their lives as if they are suspicious of everything around them; they do not want to be the “Big Brother” eyes of the state (or the C-suite) and watch over their co-workers’ every move or utterance. Even so, you can go a long way toward creating a more effective security culture by properly inviting, welcoming, and receiving reports of potential security concerns from end users throughout the organization. In almost all cases, the majority of the end users of your systems do not work for you—you are not their boss, you do not write their performance evaluations—yet as part of “the establishment,” as part of the supervisory and control structure, you may still be perceived as part of the enforcement of cultural norms and expectations within the organization. Use that two-edged sword wisely.

Implement and Operate Endpoint Device Security

It’s at the endpoints of our systems that information inside those systems gets turned into human decisions and actions in the physical, real world; it’s also at the endpoints that sensors (such as mice, touchpads or keyboards, transducers, cameras, and microphones) capture real-world activity and turn it into data that our systems then use to model or represent that real world. Endpoints are where users access our systems and where data flows into and out of our systems. Endpoints can be the modem/router where our LAN joins the ISP’s service drop to the Internet; endpoints are the laptop or desktop computers, the smartphones, and the IoT devices that we use for SCADA and industrial process control. An ATM is an endpoint, which outputs both data and cash, and in some cases receives deposits of cash or checks.

The full spectrum of security measures must be thoughtfully applied to endpoints across your systems.

  • Physical: Are they protected from being tampered with? Can removable media or memory devices (such as USB drives) be plugged into them, and if so, what logical controls are in place to prevent unauthorized uploading or downloading of content or executable files?
  • Logical: Does your access control and network management capability know when a device is turned on, rebooted, and attempts to connect to your networks or systems? If it’s a mobile device, can you ascertain its location as part of the authentication process? Can you verify that its onboard firmware, software, and anti-malware protection (if applicable) are up-to-date? Can you remotely inventory all endpoints that are registered to be part of your systems and profile or inventory their current hardware, software, and data configuration?
  • Administrative: Does your organization have policies, guidelines, and procedures published and in force that identify acceptable use of company-owned or managed endpoints, employee-owned endpoints, and endpoints belonging to or in the control of visitors, guests, or others? How are these policies enforced?
  • Monitoring and analysis: What security, systems, application, and hardware information is captured in log files by your networks and systems as it pertains to endpoints, their access attempts to the networks and systems, and their usage?
  • Data movement: Are you able to track or log the movement of data, particularly from sensitive or restricted-access files or data sets and to and from endpoints? Can you enforce whether data moved to and from an endpoint is encrypted in transit, or in use or at rest on the endpoint? Can you remotely manage storage facilities on the endpoint and delete data on it that is no longer required to be there, or is otherwise at risk of loss or compromise?
  • Data commingling: Does your organization have acceptable use or other policies that set limits or constraints on the commingling of personal data and data belonging to the organization on the same endpoint, regardless of whether it is company-owned or employee-owned? How are these policies implemented and monitored for compliance?
  • Configuration management and control: Is each endpoint under configuration management and control? Can you validate that the only hardware, firmware, or software changes made to the device were authorized by the configuration management decision process and were properly implemented? (A minimal validation sketch follows this list.)
  • Backups: What requirements for providing backups of endpoint software and onboard data does your organization have? How are these backups carried out, verified to be correct and complete, stored, and then made available for use if and when an endpoint needs to be restored (or a replacement endpoint brought up to the same configuration)?
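The configuration-validation question above lends itself to automation. What follows is a minimal sketch, in Python, of comparing an endpoint’s reported configuration against an approved baseline; the attribute names and values are hypothetical, and in practice the report would come from your MDM or asset-management tooling rather than a hard-coded dictionary.

# Minimal sketch: compare an endpoint's reported configuration against an
# approved baseline. All field names and values here are hypothetical; in
# practice the report would come from your MDM or asset-management system.
APPROVED_BASELINE = {
    "os_version": "10.0.19044",
    "firmware_version": "1.12.3",
    "antimalware_signatures": "2023-06-01",
    "disk_encryption_enabled": True,
}

def check_endpoint(reported: dict) -> list:
    """Return a list of deviations from the approved baseline."""
    deviations = []
    for key, expected in APPROVED_BASELINE.items():
        actual = reported.get(key)
        if actual != expected:
            deviations.append(f"{key}: expected {expected!r}, found {actual!r}")
    return deviations

# Example report for one endpoint (hypothetical values).
report = {
    "os_version": "10.0.19044",
    "firmware_version": "1.11.0",         # out of date
    "antimalware_signatures": "2023-06-01",
    "disk_encryption_enabled": False,      # policy violation
}

for finding in check_endpoint(report):
    print("DEVIATION:", finding)

A deviation report like this is only useful if it feeds your incident and change management processes; the value is in closing the loop, not in the comparison itself.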

  When Is “The Cloud” an Endpoint?

It’s easy to see that a USB removable storage device (or a hacking device disguised as a USB storage device) can be a data exfiltration avenue, one that endpoint management can and should be able to exert control over. When bring-your-own-infrastructure (BYOI) approaches encourage or allow users to store company data in their own privately managed cloud storage areas, the risk of exfiltration skyrockets. Containing that risk requires carefully managed security policies applied to those shared infrastructures as well as to the source data itself; you should also closely examine the business logic and processes that seem to require what could be a complex web of contracts, agreements, and services.

Mobile device management (MDM) systems can support many of these needs for laptops, smartphones, and some other mobile devices. Other IT asset management systems can be used to control, manage, and validate the configuration, health, and integrity of fixed-base systems such as desktop or workstation endpoints, regardless of whether they are connected via wireless, cable, or fiber links to your networks.

Printers and multifunction printer/copier/fax/scanner units represent a special class of endpoint security challenges. If sensitive data can be sent to a printer, it can leave your control in many ways. Some organizations find it useful to take advantage of steganographic watermarking capabilities built into many printers and multifunction units; this can make it easier to determine whether a file came into the system or was printed off or sent out by a particular device. Sometimes, just the fact that such steganographic watermarking is being used may deter someone from using the device to exfiltrate sensitive data.

Endpoints act as clients as far as network-provided services are concerned; but even the thinnest of clients needs an onboard operating system that hosts whatever applications are installed on that device. Many endpoints have significant processing and memory capacities, which makes it practical to install various integrated security applications on them. As the threats evolve and the defensive technologies evolve as well, the boundary lines between different types of products blur. As we see with network-based systems, intrusion detection, prevention, and firewall capabilities are showing up as integrated product offerings.

Let’s look at a few, in a bit more detail.

HIDS

Host-based intrusion detection or prevention systems (HIDS or HIPS) fall into two broad classes: malware protection systems and access control systems. If the endpoint’s onboard operating system is capable of defining multiple user roles and managing users by ID and by group, it already has the makings of a robust onboard access control system. Your security needs may dictate that you install a more rigorous client-side access control application as part of your overall organizational access control approach as a way of detecting and alerting security control personnel that unauthorized local device access attempts or user privilege elevations are being attempted.
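As a simplified illustration of the detection side of this, the sketch below (Python, assuming a Linux host whose authentication events land in /var/log/auth.log; exact log formats vary by distribution, so the patterns are illustrative) counts failed logins and failed privilege elevations and raises an alert past a threshold. A production HIDS does far more, but the pattern-match-and-alert loop is the same idea.

import re
from collections import Counter

# Simplified host-based detection sketch: scan an authentication log for
# failed logins and failed privilege elevations, and alert when any single
# user exceeds a threshold. Path and patterns assume a typical Linux host.
LOG_PATH = "/var/log/auth.log"
THRESHOLD = 5

patterns = [
    re.compile(r"Failed password for (?:invalid user )?(?P<user>\S+)"),
    re.compile(r"pam_unix\(sudo:auth\): authentication failure.*user=(?P<user>\S+)"),
]

failures = Counter()
with open(LOG_PATH, errors="ignore") as log:
    for line in log:
        for pattern in patterns:
            match = pattern.search(line)
            if match:
                failures[match.group("user")] += 1

for user, count in failures.items():
    if count >= THRESHOLD:
        # In a real deployment this would raise a SIEM or ticketing alert.
        print(f"ALERT: {count} failed authentication events associated with '{user}'")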

Anti-malware systems have long been available as host-based installations. They are not truly applications, in that they don’t really run on top of the host operating system but rather integrate with it to make it more difficult for malware to take hold of the host. The anti-malware system is installed so that parts of it load along with the operating system kernel, almost immediately after the hardware bootstrap loader has turned control over to the boot software on the system disk or other boot device. In theory, those areas of a disk or boot device can be updated only by trusted installer user IDs or other superuser/systems administrator–controlled processes. Of course, if malware is already on the endpoint and installed in those boot or root areas of the boot device, the anti-malware installation probably cannot detect it.

Our trust in our anti-malware systems providers goes right to the very core of our information security posture. You may recall the near hysteria that resulted when some U.S. government sources started to allege in 2017 that Kaspersky Lab’s anti-malware products had backdoors built into them that allowed Russian intelligence agencies to insert malware of their choice onto systems supposedly under its protection. Even the European Union claimed, nearly a year later, that it had confirmed Kaspersky’s products were malicious. As of this writing, it’s not yet clear whether these allegations have any basis in fact; regardless, the situation strongly suggests that having more than one way to scan all of your systems, servers, and endpoints—using more than one vendor’s products and more than one set of signature and rule data to drive them—is a prudent if not necessary precaution.

Host-Based Firewalls

In a fully layered defensive system, each server and endpoint should be running its own host-based defenses, including its own firewall. Windows and Mac systems ship with a factory-installed firewall, which in the absence of anything else you should enable, configure, and use. Linux distributions may or may not come with a built-in firewall, but there are many reputable apps to choose from to add this to a server or endpoint. Android systems are left somewhat in the lurch: Google considers a firewall unnecessary for Android, especially if you restrict your computing to only those trusted apps on the Google Play store. This is similar to relying on a trusted software supply chain as part of your overall risk management process. The argument is also made that mobile smartphones (the marketplace for Android OS) are not intended to be used as always-on servers or even as peers in a P2P relationship and that therefore you still don’t need a firewall. As a quick perusal of the Play store reveals, however, there are hundreds of firewall apps to choose from, many from well-known firewall providers.

Regardless of device type and its native OS, configuring your chosen host firewall’s ACLs needs to harmonize with your overall approach to access control and privilege management. You’ll also need to ensure that any MDM solutions managing the host device are compatible with the host firewall and that the two work together seamlessly.

This suggests that your overall access control systems approach should ideally give you a one-stop-shopping experience: you use one management console interface to establish the constraints to check and the rules to use and then push those through to the MDM as well as to network and host-based systems such as firewalls, IPS and IDS, and even unified threat management and security information and event management (SIEM) systems.
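To make that “define once, push everywhere” idea concrete, here is a minimal Python sketch that renders a centrally defined rule set into per-host firewall commands. It assumes Linux endpoints running ufw, only prints the commands for review rather than executing them, and the rule set itself is hypothetical; a real deployment would push equivalent rules through your MDM or configuration management tooling.

# Minimal sketch: render a centrally defined access-control rule set into
# host firewall commands. Assumes Linux endpoints running ufw; commands are
# printed for review rather than executed. Rule contents are hypothetical.
CENTRAL_RULES = [
    {"action": "allow", "port": 443, "proto": "tcp", "comment": "HTTPS to proxy"},
    {"action": "allow", "port": 53,  "proto": "udp", "comment": "DNS"},
    {"action": "deny",  "port": 23,  "proto": "tcp", "comment": "block telnet"},
]

def render_ufw(rules):
    """Translate abstract rules into ufw command strings."""
    commands = ["ufw default deny incoming", "ufw default allow outgoing"]
    for rule in rules:
        commands.append(f"ufw {rule['action']} {rule['port']}/{rule['proto']}")
    return commands

for cmd in render_ufw(CENTRAL_RULES):
    print(cmd)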

Application Whitelisting

Application whitelisting defensive technologies work at the level of an individual executable file: if the file is on the whitelist (the list of known programs that are permitted to run on a given endpoint), then that program file is loaded and executed. If the file is not on the whitelist, the whitelisting tool can be configured to block it completely or to ask the user to input a justification. The justification and the attempted execution are sent to security personnel, who evaluate the request and the justification. In some environments, the whitelisting system can be configured to allow the user to grant permission to install or run an otherwise unknown piece of software.
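A hash-based allowlist check is at the heart of most of these tools. The short Python sketch below shows the core decision: hash the executable and compare it against the approved list. The hash values shown are placeholders, and a real product would also verify digital signatures, publishers, and file paths.

import hashlib

# Core of an application allowlisting check: hash the executable file and
# compare it against the list of approved program hashes. The hash values
# below are placeholders, not real application hashes.
APPROVED_HASHES = {
    "placeholder-sha256-for-approved-app-1",
    "placeholder-sha256-for-approved-app-2",
}

def sha256_of(path: str) -> str:
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()

def may_execute(path: str) -> bool:
    """Return True only if the file's hash appears on the allowlist."""
    if sha256_of(path) in APPROVED_HASHES:
        return True
    # In block mode: refuse. In audit mode: prompt for a justification and
    # forward the attempt to the security team for review.
    print(f"BLOCKED: {path} is not on the application allowlist")
    return False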

In principle, application whitelisting would keep any unauthorized program from being loaded and executed on any endpoint so configured. This can certainly prevent many malware processes from being installed and run. Malware that arrives as part of an encrypted file attached to an email or downloaded from a network resource, for example, might not have been detected by anti-malware systems scanning those interfaces. Whitelisting can also defend a host against new malware threats that are completely unknown to the anti-malware community. That in and of itself is worth its weight in gold!

In practice, however, there are some issues to consider. PowerShell, for example, is installed on every Windows machine; yet, it can give an individual user (or a script file that tries to access it) unparalleled capability to make mischief. Application whitelisting can also fail to provide solid management of systems where the user (or a group of users) has a legitimate need to be building and running new programs, even as shadow IT apps. Psychologically conditioning the user to hit “authorize” dozens of times a day can lead to pushing that button one too many times in the wrong circumstances.

There’s also the risk that a series of executions of known, recognized, and permitted programs could still invite disaster; this crosses over into endpoint and end-user behavioral modeling and requires that you identify business-normal patterns of application use. Endpoint and end-user behavior modeling is seeing a resurgence of interest as machine learning techniques make it more approachable and scalable. In most such cases, the machine learning is performed in an offline environment to generate and test new signatures, rule sets, or both; those are then loaded into production protective systems such as application whitelisting tools. Depending upon circumstances, that process can take as little as a day to generate a new signature.

It’s worthwhile investigating whether your organization’s security needs translate into policies that can directly drive the ways in which an application whitelisting system is put to use. Any whitelisting system you choose should be something you can integrate into your existing configuration management and control practices, particularly as they pertain to how your organization generates and uses endpoint golden images. You’ll also need to determine whether a server-based whitelisting approach suffices or whether the capability needs to be deployed to each endpoint (including each mobile endpoint).

Endpoint Encryption

Strong arguments can be made that all endpoints, regardless of type, should use robust encryption to completely lock down their internal or onboard storage. Some Windows- and Linux-based devices can in fact do this, using technologies such as Microsoft’s BitLocker to encrypt the end user’s data partition on an endpoint’s hard drive while allowing Windows itself to boot from an unencrypted space. Laptops, desktops, and other workstation-type endpoints with significant storage, processing, and RAM are prime candidates to have encryption applied to their hard drives.
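Full-disk products such as BitLocker work below the file system, but the underlying idea—data is only ever stored under a strong symmetric key—can be sketched at the file level. The Python below uses the third-party cryptography package’s Fernet construction to encrypt a file at rest; the file name is hypothetical, and key storage and recovery (the hard parts that BitLocker, a TPM, or an MDM server handle for you) are deliberately out of scope.

# File-level illustration of data-at-rest encryption using the third-party
# "cryptography" package (pip install cryptography). This sketches the idea
# behind endpoint encryption; it is not how BitLocker itself works, and key
# storage/escrow is deliberately omitted.
from cryptography.fernet import Fernet

key = Fernet.generate_key()          # in practice: protected by a TPM, MDM, or escrow
cipher = Fernet(key)

with open("quarterly_forecast.xlsx", "rb") as f:   # hypothetical file name
    plaintext = f.read()

with open("quarterly_forecast.xlsx.enc", "wb") as f:
    f.write(cipher.encrypt(plaintext))

# Later, on an authorized device that holds the key:
with open("quarterly_forecast.xlsx.enc", "rb") as f:
    recovered = cipher.decrypt(f.read())
assert recovered == plaintext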

Mobile phones, regardless of the operating system that they use, are also prime candidates for such encryption. It’s beyond our scope to survey the different encryption options available, but suffice it to say that headline news of a few years ago had Manhattan District Attorney Cyrus Vance Jr. adding his voice to the choir attempting to convince everyone that putting encryption on smartphones protects more criminals than it protects law-abiding citizens. The numbers suggest otherwise, as Wired’s Kevin Bankston pointed out in August 2015: millions of mobile phones and smartphones are stolen every year in the North American and European markets alone, while at best one or two headline-grabbing criminal cases involve encrypted smartphones that law enforcement cannot unlock.

There are some caveats, of course. If your MDM solutions or other device management approaches involve substantial device customization, such as flashing your own versions of ROMs in these devices, you may find that updated versions of the OS (such as Android) won’t support device-level encryption. You should also thoroughly investigate what type of encryption is used and what that suggests for organizational policies about passcodes or other authentication factors users will have to use to access encrypted data on the endpoint or use it via encrypted services with company systems. (A four-digit passcode does not strong security make after all.)

Data remanence is an issue with endpoints: data decrypted for use, display, or output to another device can remain accessible in some form even after the device is powered off and its battery or power supply removed.

MDM systems vendors have increasingly added encryption management, password management, and other security features into their product and service offerings. Some of these provide a corporate server-based encryption system that will not allow the endpoint to access, display, or use company data without the server and the endpoint being in a managed session together; this prevents offline use of confidential data, which is either a business constraint or a security blessing, depending on your point of view. Others look to full device encryption and locking, which may be appropriate for company-owned, company-managed devices.

If you have a personal smartphone and you’re not encrypting it, perhaps it’s time to learn more about how to do that.

Trusted Platform Module

Trusted platform modules (TPMs) are specialized hardware devices, incorporated into the motherboard of the computer, phone, or tablet, that provide enhanced cryptographic-based device and process security services. A TPM is provided in a sealed, tamper-resistant hardware package that combines cryptographic services, host computer state descriptor information, and other data. TPMs are embedded into the computer’s motherboard (so as to be nonremovable) and, in combination with device drivers and other software, achieve greater levels of security for that system. The Trusted Computing Group (TCG), a consortium of more than 120 manufacturers, software houses, and cybersecurity companies from around the world, develops and publishes standards that describe what TPMs should do and what they should not. The TCG defines trust in simple terms: a trusted device behaves in a particular, specified manner for a specified purpose. By storing key parameters about the host computer itself (chip-level serial numbers, for example), a TPM provides an extra measure of assurance that the computer system it is a part of is still behaving in the ways that its manufacturer intended. TPMs typically contain their own special-purpose, reduced instruction set computer; read-only memory for the control program; key, hash, and random number generators; and storage locations for configuration information, platform identity keys, and other data. TPMs are being incorporated into laptops, phones, and tablet systems, providing a world-class solution that is not strongly tied to or dominated by one manufacturer’s chip set, operating system, or hierarchy of trust implementation.

TPMs protect the hardware itself by making it less attractive to steal or less useful (easier to lock) when the host computer or phone is lost or mislaid. Although the TPM does not control any software tasks (system or application) running on the host, it can add to the security of processes designed to make use of it. It’s probably fair to consider a TPM as an additional hardware countermeasure to help make software and communications more secure.

In 2016, TCG published the latest edition of its standard, calling it TPM Main Specification 2.0; it was updated into the ISO/IEC 11889 standard later that year. TPM 2.0 was built with a “library approach” to allow greater flexibility, especially in implementing TPMs to serve the rapidly expanding world of IoT devices. TCG provided a brief overview of this in June 2019, https://trustedcomputinggroup.org/wp-content/uploads/2019_TCG_TPM2_BriefOverview_DR02web.pdf, which highlights the five key trust elements of TPM 2.0 and briefly sketches out the security features they can provide. Note that the first application suggested for the Discrete TPM is in control logic for automotive braking systems. Virtual TPMs can provide a cloud-based TPM element as part of a larger virtual systems design.

Mobile Device Management

Although mobile device management (MDM) was discussed in Chapter 6 as part of wireless network access control and security, it’s worth expanding on a few key points.

  • Regardless of whether the mobile endpoint is company-owned and managed, your users will complicate things greatly. Users may become excessively slow or uncooperative about keeping it fully updated and patched, for example. Windows 10 does allow end users to put off—sometimes for months—what otherwise might be an urgent push of an update. And in some instances, they are right to do so. Users who travel extensively may find that leaving a laptop powered up and operating while racing to an airport or a meeting is just too inconvenient; and if those users are senior managers or leaders, it’s really up to you and the IT department to make the user’s experience more effective, simpler, and even enjoyable. Horror stories abound of very senior officials whose government-issued smartphones, for example, went for more than a year without being patched or updated. At some point, that seems to cry out for an “office visit” by a technician who can completely swap out one device for a new one, with all of the user’s data already ported over. Support for your security measures by the powerful users (as opposed to the notion of power users) in the organization is vital to getting everyone on board with them.
  • Many factors are driving companies and organizations to deal with variations on the bring-your-own schemes, and as devices become more powerful, take on different form factors, and have personalization that appeals to individual end-user desires, more companies are finding that BYO variations are the approach that they need to take. MDM vendors clearly advertise and promote that their solutions can cope with an ever-increasing and bewildering array of smartphones, phablets, and laptops; at some point, smart watches and other device types will move out of the PAN and into broader use—presenting even more mobile devices to manage.

Already, many fitness watches can receive, display, store, and allow limited responses to emails, SMS, and other messaging; at some point, you may need to worry about whether the data flowing to such devices is potentially creating a data leak situation. It doesn’t always have to be gigabytes of data being exfiltrated that an adversary is after.

  Avoiding Covert Channels of the Mind

Mobile endpoint security considerations put another classic security problem in stark relief: in almost every organization, there are people who have to deal with sensitive or security-classified information that comes from multiple compartments. Consultants, for example, must take great pains to ensure that they do not reveal Client A’s significant challenges or new strategies to Client B, even by way of discussing those sensitive ideas or concerns with people within the consultant’s own organization. These unplanned cross-connects between sets of secure information are known as covert channels, since the flow of ideas and information tends to happen in ways that the responsible managers or owners of those sets of information are not aware of.

As more and more BYO-style mobile device situations arise, you may need to pay particular attention to how your mobile users are made aware of these types of concerns and to what extent familiarization, training, or even formal nondisclosure agreements (NDAs) are required.

Secure Browsing

Because our use of web browsers is such an integral part of the hierarchies of trust that we rely upon for secure e-commerce, e-business, and e-personal use of the Web and the Net, it’s important to consider two sides of the same coin: how well do our browsers protect our privacy while they purportedly are keeping us secure? The majority of web browser software is made freely available to users and systems builders alike; they come preinstalled by the original equipment manufacturers (OEMs) on many computer products for consumer or enterprise use. The development and support costs of these browsers are paid for by advertising (ads placed within web pages displayed to users), by analytics derived from users’ browsing history, or by other demographic data that browser providers, search engines, and websites can gather during their contact with users. Browsers support a variety of add-on functions, many of which can be used by websites to gather information about you and your system, leave session-specific or site-related information on your computer for later use, or otherwise gain more insight about what you’re doing while you are browsing than you might think possible or desirable.

Browsers, like many modern software packages, also gather telemetry data—data that supports analysis of the behavior and functioning of the browser while the user is interacting with it—and make that telemetry available to the vendor. (Many products say that users opt into this to “improve the user experience,” whether the user feels such improvement or not.) Whether we recognize this or not, this paradigm has transformed the web surfer from user-as-customer into user-as-product. In some circumstances, that can be of benefit to the user—it certainly provides the revenue stream that developers and infrastructure builders and maintainers need, at no additional direct cost to the user. But it can also be of potential harm to the user, be that user an individual or a business enterprise, if that aggregation of user-entered data, action history, and analytics violates the user’s reasonable expectation of privacy, for example.

Let’s start with a closer look at each side of that coin.

Private browsing is defined as using a web browser in such a way that the user’s identity, browsing history, and user-entered data when interacting with web pages are kept confidential. Browsers such as Mozilla Firefox or Microsoft Edge provide ways for users to open a new window (supported by a separate task and process stream) for private browsing, in which location tracking, identification, cookie handling, and various add-ons may change the way that they provide information back to websites or leave session-tracking information on the user’s computer. For most mainline browsers, telemetry is still gathered and made available to the browser’s authors.

To put “private browsing” into perspective, consider one data point: the unique identification of the system you’re browsing from. Fully nonrepudiable identification of your system would require every device on the Internet to have a unique key or ID assigned to it that was an amalgam of IP address, hardware identifiers, software identifiers, and even your user ID on that system. A suitable cryptographic hash of all of this data would produce such a unique ID, which could not be reversed to get back to your specific username, for example. But if the search engine or web page keeps a history of activity tagged to that system identification, then every time you browse, your unique history continues to be updated. If that concerns you, can’t you just avoid this by opening up a new private browser window, tab, or session? According to tests by the Electronic Frontier Foundation and others, no; so-called private browsing still produces a fingerprint of your hardware, software, and session that is effectively unique among a billion or more devices. And, of course, the browser telemetry is still going back “home” to its developers. In the meantime, private browsing usually does not prevent ads from being displayed or block pop-up windows, and some ad blockers and pop-up blockers are incompatible with private browsing modes.
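The fingerprinting idea just described is easy to demonstrate. The Python sketch below hashes an amalgam of identifiers into a single stable ID; the attribute values are hypothetical, and real browser fingerprinting draws on far more signals (installed fonts, canvas rendering, time zone, screen size, and so on).

import hashlib

# Toy device/session fingerprint: hash an amalgam of identifiers into one
# stable ID. All values are hypothetical; real fingerprinting uses many more
# signals than these.
attributes = {
    "ip_address": "203.0.113.42",          # documentation-range example IP
    "os_build": "Windows 10 Pro 19045",
    "browser": "Firefox 115.0",
    "hardware_id": "HYPOTHETICAL-SERIAL-0001",
    "user_id": "jdoe",
}

amalgam = "|".join(f"{k}={v}" for k, v in sorted(attributes.items()))
fingerprint = hashlib.sha256(amalgam.encode()).hexdigest()

print(fingerprint)   # same inputs always yield the same ID; the hash cannot
                     # be reversed, but it can be matched against a history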

Secure browsing is defined as using a web browser in such a way that it actively helps keep the user’s system secure, while more assertively or aggressively protecting the user’s privacy, data about the user’s system, and data about the user’s browsing history. Competition between the mainstream browsers as products (that is, as platforms for revenue generation for advertisers or for search engine providers) has driven some of them to incorporate more of these features, and so the line between “highly secure and safe” and “private” browsing continues to blur. Some of the more well-respected secure browsers, such as Waterfox or Pale Moon, are offshoots (or forks) from earlier points in the development of Mozilla Firefox. By eliminating many of the data-intensive add-in capabilities, telemetry gathering, and other features, these secure browsers are also relatively lightweight as compared to native Firefox (that is, they run faster and use fewer system resources to do so).

If you truly need private and secure browsing, consider using add-ons such as HTTPS Everywhere, which forces your sessions onto HTTPS wherever the destination supports it, and then routing your traffic through The Onion Router (TOR). TOR, incidentally, was designed by the U.S. Naval Research Laboratory as a way to provide anonymous communication and web use for social advocates, journalists, and ordinary people living or working in repressive or totalitarian countries. TOR takes every packet exchange and routes it through different members of its peer-to-peer backbone infrastructure; by the time the connection leaves TOR and goes to the requested URL, the only thing the distant server can see is that last TOR node’s IP address. This is very similar to using a VPN to hide your pathway, but with a serious twist: most VPNs bulk encrypt from your entry node to the landing node, providing anonymity and security, but try to minimize dynamic rerouting of your path for improved performance. TOR, on the other hand, dynamically reroutes to further mask your path and your identity, at the cost of sometimes significantly slower browsing.

In June 2019, Restore Privacy LLC updated its review and round-up of current secure browser offerings in the marketplace; https://restoreprivacy.com/secure-browser/ provides useful comparisons while also illuminating some of the issues and concerns that might bring people in your organization to need secure browsing as part of their business processes.

One final approach to secure and private browsing is a sandbox system—a separate computer, outside of your organization’s demilitarized zone (DMZ), that has no organizational or individual identifying data on it. The system is wiped (the disk is hard reformatted and re-imaged from a pristine image copy) after each session of use. Most businesses and many individuals do not have need of such a sandbox approach, but when the occasion warrants it, it works. Strict data hygiene practices must be in force when using such a sandbox; ensure that the bare minimum of information is put in by users as they interact with external systems and either prevent or thoroughly scan, test, and validate any data or program brought in via the sandbox from outside before introducing it into any other system in your infrastructure. (This is an excellent opportunity to consider the “write-down” and “read-up” restrictions in some of the classical access control models, as they apply to systems integrity and data confidentiality protection.)

  The Downside of a VPN

VPNs can do a marvelous job of keeping not only your data but the very fact of your connection to a specific web page confidential; it’s only on that “last hop” from the VPN’s landing site in the country of your choice to the website of interest that actual IP addresses get used as packets flow back and forth. Whether cookies make it back to your system or whether browser telemetry makes it from your system to the browser’s home may require additional tweaking of the VPN and browser settings.

If your connection requires rigorous security verification, however, you may need to turn off the VPN. This is particularly true if the server in question blacklists IP addresses originating in certain regions or countries. I discovered this some time ago during a ten-minute Skype conversation with PayPal security, conducted without using a VPN. PayPal security noted that my previous login and transaction attempts, moments before, seemed to move around among six different countries on four continents, including Iran and Chechnya, in as few as five minutes. This caused PayPal’s security algorithms to block the transaction attempts. Turning off the VPN allowed a “static” IP address (assigned by my ISP) to be used for the next entire session, which was successful. I cannot blame PayPal for being overly protective of my bank information in that regard.

IoT Endpoint Security

Since early 2019, quite a number of publications, vendor white papers, and products that address the growing need to secure Internet of Things devices and the networks that they interface with have hit the marketplace. The GSM Association, the trade body that represents more than 800 mobile phone systems operators and more than 300 other companies in this ecosystem, published in March 2019 an updated “Security Guidelines Endpoint Ecosystem V2.1,” which makes for some thoughtful reading. Find it at https://www.gsma.com/iot/wp-content/uploads/2019/04/CLP.13-v2.1.pdf. This offers advice and insight regarding many threats and hazards, such as:

  • Device cloning
  • Securing the Internet of Things (IoT) endpoint’s identity
  • Attacks against the trust anchor
  • Endpoint impersonation
  • Service or peer impersonation
  • Tampering with onboard firmware or software
  • Remote code execution
  • Unauthorized debugging or instrumentation of the device or the system it’s a part of
  • Side-channel attacks
  • Compromised endpoints as threat vectors
  • Securely deploying devices which lack back-end connections
  • User safety, privacy, and security

Another site, Endpoint Security Solutions Review, offers a list of ten best practices in the context of IoT security. Many are aspects of prudent information systems security planning and operations; others, such as “turn off IoT devices when not in use,” reflect the different nature of IoT devices as endpoints versus what we may think of as normal for laptops, smartphones, and other endpoint devices. See https://solutionsreview.com/endpoint-security/iot-endpoint-security-best-practices-enterprises/ for more ideas.

Amazon Web Services and other cloud hosting companies are entering the market for IoT device monitoring, management, and security services. Clearly, the incredible potential for high-payoff IoT use cases combined with a rightful fear of advanced persistent threats (as well as amateur hackers) have combined to generate some timely conceptual, operational, and product and services thinking across the industry. Who knows? Maybe our fear of our IoTs being taken out of our control will be nipped in the bud.

Operate and Configure Cloud Security

In many respects, configuring cloud-hosted business processes and the virtualized, scalable resources that support them is technically and operationally similar to what you’d be doing to secure an on-premises data center system. The same risk management and mitigation planning processes would lead to comparable choices about risk controls; overall security needs would still dictate monitoring, analysis, and alarm needs. The same compliance regimes in law, regulation, contract, and marketplace expectations will still dictate how your organization must maintain data and metadata about its use, both for inspection and for analysis during routine audits as well as in support of e-discovery orders from a court of law.

One key difference, of course, is administrative: your organization has an entirely different sort of supply chain relationship with a cloud services provider than it would have with suppliers, vendors, original equipment manufacturers, and maintenance organizations if it were running its own local data center. That contractual relationship is typically spelled out in service level agreements (SLAs), sometimes called terms of reference (TORs, not to be confused with The Onion Router). SLAs and TORs should establish with great clarity where the dividing lines are with respect to privacy, security, systems integrity, data integrity, backup and restore capabilities, business continuity and disaster recovery support, and support for investigations, audits, and e-discovery. SLAs also lay out specific constraints on the ethical penetration testing that customer organizations can do; pen testing can sometimes get out of hand, and your cloud provider understandably has many other customers to protect from your pen test activities.

(ISC)2 and the Cloud Security Alliance have combined the knowledge and experience of their memberships to create the Certified Cloud Security Professional (CCSP) credential, as a way of growing the common core of skills that business and industry need as they move further into the cloud. This may be a logical path for you to consider as part of your ongoing professional growth and development.

With all of that in mind, let’s take a brief look at the basic concepts of migrating your business processes to the cloud from the standpoint of keeping them safe, secure, private, and compliant.

Deployment Models

Cloud services are the data storage, processing, applications hosting, network connectivity, user experience, and security management services that corporate or individual end users would normally obtain from a local physical data center, but with a difference. Data center designs rely on specific hardware servers running copies or instances of applications, such as web apps that access a database. By contrast, cloud-hosted services use virtual machines—software-created instances of a complete operating system and application environment—that are created and run on shared computing hardware to meet moment-by-moment demand. In the traditional on-premises data center design, the customer organization had to buy or lease servers, storage units, connectivity, and everything else; if they sized that set of “bare metal” too small, they limited their overall throughput; if they sized it too large, they had a lot of expensive excess capacity sitting around idle. In the cloud service model, the owner or operator of the cloud provides all of the bare-metal servers, storage, networks, and interconnection, and they lease it out as virtual machines moment by moment to a wide variety of customers. Each customer pays for what they use, and no more. Maintenance, engineering support, environmental and physical infrastructure, and security of that cluster of processing and storage equipment are handled by the cloud services provider.

This concept started in the 1960s with a business model known as the service bureau; in fact, a book written at that time about the problems with the service bureau defined the business models of today’s cloud services in some detail. Several ownership and use cases are now in common use.

  • Private clouds operate on hardware owned or leased by one organization; they are used only to support the processing needs of that organization.
  • Public clouds are operated by a cloud services provider organization, which leases time to any and all customers who want to migrate their business processes into its cloud. Government customers, private businesses, nonprofits, and individuals can all lease resources in the same public cloud data center, without having to worry about who else is sharing the machines with them. Isolation between customers, their processing, and their storage is the responsibility of the cloud services company, and they’re usually quite diligent in how they do this.
  • Hybrid clouds can be a combination of private cloud services (owned or operated by the customer) and public cloud services. Increasingly, hybrid deployments are involving multiple cloud hosting service providers. Geographic diversity in data center locations, communications capabilities, and services can often mean that public cloud customers cannot get exactly what they want and need from one supplier alone.
  • Community clouds provide for affiliations and associations of local government or civic groups, and businesses that they work with, to operate in a somewhat federated fashion in a shared cloud services environment.
  • Govcloud is a highly secure cloud hosting environment, which meets stringent U.S. and Canadian government security standards; Govcloud users can be businesses working with government agencies on classified projects or projects that have an unusually demanding information security set of needs.

In cloud parlance, deployment and migration are terms for the whole set of managerial, technical, and physical tasks necessary to take existing or planned business processes, including all of their data, software, and people-facing procedural elements, and move them into an appropriate cloud environment. Deployment or migration should also include well-defined test and evaluation plans and of course very thoroughgoing security planning. Logistic support should also consider how existing employees or staff members will be trained on using the new cloud-hosted business systems. Backup and restore capabilities must also be planned, and business continuity and disaster recovery planning need to reflect this migration as well.

Service Models

For more than two decades there have been three main service models that define in general terms the architecture of what your organization is leasing from its cloud services provider. These models are considered “classical” since they derive from the mainframe-computer-based service bureaus that were the infrastructure of affordable computing from the 1960s through the 1980s. They also facilitate straightforward migrations of existing business logic, systems, and data into public and hybrid cloud environments as a result. Figure 7.1 illustrates these three hierarchies of services; above the stair-step line across the middle are the services and data that the customer organization provides and for which it has direct configuration management, control, and information security responsibilities. Below the stair-step line is the land of the cloud services provider; these providers own or lease, manage, maintain, and keep secure all of the hardware, systems software, cabling, power, air conditioning and environmental controls, and of course the physical, administrative, and logical security for these elements.

[Figure illustrating the three cloud service models: infrastructure as a service (IaaS), platform as a service (PaaS), and software as a service (SaaS).]

FIGURE 7.1 Cloud service models

Scaling up from the bare-iron server boards to greater levels of integrated functionality, we find the following:

  • Infrastructure as a service (IaaS): This is equivalent to leasing or renting a room full of computers but with nothing installed on them. Basic networking services (in all cases, software-defined networks) are available, as these provide the only way that one VM can talk to another or to the outside world. The underlying server architecture provides CPU and GPU core processors, RAM, and channel connectivity to the network and to storage subsystems. The servers run what is called a bare-metal hypervisor, and the user then specifies (and either provides or leases rights to) the operating systems to be loaded into each virtual machine definition. If, for example, they need Windows Server 2016 scalable across up to 100 VMs, then, as workload expands, the cloud resource balancing and dispatch functions will instance more copies of the same Windows Server–based VM, including its predefined applications, database connections, and user identities. IaaS is well suited when the customer organization has significant investment tied up in fully functional systems, from the operating system on up through user-facing data manipulation interfaces, that they just want to move to a more scalable, durable, and perhaps more affordable hosting environment.
  • Platform as a service (PaaS): This generally provides everything up through a database engine. The customer organization brings in its database definitions, back-end applications, stored queries, and other apps that they need to fully host their business processes. Most web app systems that are built around a back-end database-facing application will migrate to PaaS to gain much greater scalability; this shifts the database back-end load balancing, for example, more to the underlying PaaS environment’s scheduling and control services.
  • Software as a service (SaaS): This provides the customer organization with a fully integrated and supported application environment. Office 365 is a well-known example of this; the same user experience and the same manner of using files, integrating local endpoint and cloud-hosted storage, and inter-applications communication and data sharing are all supported and maintained via the SaaS application and the supporting platform and infrastructure underneath it. Salesforce is another example of a SaaS environment. In SaaS, the customer’s business logic is built into procedures, scripts, macros, forms, or other application-dependent structures and definitions.

It’s important to emphasize that in many respects, nothing technical changes from what you’re used to in your on-premises data center or LAN environment as you move business logic into the clouds or as you move from IaaS to PaaS to SaaS. The same security considerations you would use to segment your organizational networks on hardware you own and manage apply as you define the virtual machines you want to deploy to get work done. You still write a technical specification: what Machine X can talk to, and what should not be allowed to talk to it, query it, or attempt to connect to it, is a dialogue you have about each machine on your network now, and one you will still need to work through when you migrate to the cloud. It’s not mysterious.

However, much as each vendor’s network hardware, management tools, SIEMs or unified threat management systems, firewalls, and access control systems have their own quirks, special features, and traps for the unwary, so do the tools and capabilities that each cloud services provider can make available for you to use. The number-one cause of cloud-hosted services suffering data breaches, according to Amazon Web Services in 2018, was user error in configuring storage resources.

As your organization plans to migrate to the cloud—or if it’s already out there—get with your IT department and learn with them about how the chosen cloud hosting provider’s security services need to be configured. Cloud service providers offer many training options, starting with free courses (and after all, the better trained their customers are, the more services they safely use; so free training pays for itself quickly). Take as much as you can; learn everything about this new world.

Before we look at the new concepts in cloud computing—serverless services delivery—let’s take a deeper look at several key aspects of life in the cloud and at what you need to keep in mind as you secure what your organization has moved out there or is migrating out there.

Virtualization

Virtualization is a set of techniques that allow for far more efficient utilization of computing resources. Since the 1960s, many computer systems have been designed with virtual memory, which allows each program, task, or process to run as if it had the entire address space that the CPU could reach available to it. If the CPU had a 4GB address space, then every program would be built as if it had that entire space available to it. When loaded into memory, a combination of hardware and operating systems features would let the system assign blocks of RAM (called pages) to each task or process as it needed them. Since most processes do a tremendous amount of I/O to disk drives, networks, or people-facing input and output devices and these devices run much more slowly than the CPU does, many different processes can get a time slice of the CPU while others are waiting on their disks, mice, keyboards, or networks to respond.

Virtual Machines

A virtual machine takes that concept one giant step further: instead of just virtualizing one process or program, take a copy of the operating system, its virtual memory management, its network and disk subsystem managers and device drivers, and the applications you want to run on it, and make one complete image file of it, which you can reload into RAM and start executing any time you want. Hibernation does this in many OS environments today; it takes very little time to load an 8GB hibernation file into RAM, as compared to bootstrap loading the operating system, then the runtime environments, then the applications, and then initializing everything again. Each VM is defined by what you build into its virtual hard disk (VHD) image file. The hypervisor then loads as many copies of the VM as you want, onto as many CPUs as your runtime performance needs and load-balancing controls dictate; each VM is its own separate world, with its own separate address space and connection to a virtual network. Each copy of a VM that gets started up is an instance of that VM. Each instance can be separately stopped, restarted, shut down, hibernated, saved for later use, or deleted, without impacting other instances of the same VM and without altering the definition (or template) of that VM.
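The template-versus-instance distinction is worth modeling explicitly. The short Python sketch below uses a purely hypothetical hypervisor-style API (no real hypervisor exposes exactly these classes) to show that one VM definition can spawn many independent instances, each with its own lifecycle.

# Conceptual sketch only: a hypothetical hypervisor-style API used to show
# that one VM definition (template) can spawn many independent instances,
# each with its own lifecycle.
from dataclasses import dataclass
from itertools import count

@dataclass(frozen=True)
class VMTemplate:
    name: str
    vhd_image: str            # path to the virtual hard disk image
    vcpus: int
    ram_gb: int

@dataclass
class VMInstance:
    template: VMTemplate
    instance_id: int
    state: str = "running"

    def shut_down(self):
        self.state = "stopped"   # stopping one instance leaves the others alone

_ids = count(1)

def instantiate(template: VMTemplate) -> VMInstance:
    """Boot a new, independent copy of the template."""
    return VMInstance(template, next(_ids))

web_tier = VMTemplate("web-server", "/images/web.vhd", vcpus=4, ram_gb=16)
instances = [instantiate(web_tier) for _ in range(3)]   # scale out on demand
instances[0].shut_down()                                # the others keep running
print([(vm.instance_id, vm.state) for vm in instances])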

The hypervisor is a slimmed-down operating system that specializes in building VHDs of VMs, loading them upon request, supervising their use of the bare-metal processor, RAM, and system resources, and then shutting the VMs down and cleaning up after them as required. Hypervisors can run directly on the bare-metal server itself, as the booted-in operating system, or they can run as a special class of application on top of an installed and bootstrapped operating system.

Security for Virtual Machines

At its initial definition and creation, a VM cannot communicate with anything outside of itself and the devices it is logically attached to, such as the endpoint associated with the user running that VM and whatever virtual disk drives the VM’s definition declares to be a part of it. The hypervisor can allow it to see LAN segments that you define, by means of virtual firewalls, virtual or distributed routers, or almost any other type of network architecture and device imaginable. But the VM cannot just reach out and find these. VMs cannot share memory unless you’ve defined ways in which the VMs can do that by means of services. Much as you’d do for physical machines, you’ll need to take steps to allow each instance of a VM to have access to the outer world or to be visible to other VMs or to users on the Internet.

As a result, VMs are often containerized in sets related to a particular use. One container could hold the definition of a front-end web server, a back-end database application, and their supporting servers for DNS, networks, firewalls, email, and access control. That complete system has all the same functionality that the “real” deployed set of VMs would if they were connected to the Internet; but as a separate container, it can be run in isolation for software testing, for malware reverse engineering, for incident investigation and analysis, or for user or maintainer training. Containerizing VMs allows the system architects to choose exactly what they need in that environment and what they don’t need. What runs inside that container stays inside that container, unless steps are taken to connect it to other networks or to the Internet. Ethical penetration testing often starts small and local by testing against containerized, scaled-down segments of your overall systems architecture.

In sum, all of the security and performance options that you would normally select and configure for on-site hardware, software, and network systems are available to you via the cloud. The same range of flexibility, responsiveness, scalability, security, risk and reward are all there. The only real difference is that the wait time to take delivery of a new instance of a VM or a container full of VMs might be just a few minutes; instancing new copies of a container as processing demands increase is also a matter of just a minute or so, especially if automatic load balancing has been set up to manage this for you. If you only need 128 CPUs worth of web server for two very busy hours every week, you don’t need to buy or lease them for the other 166 hours and let them sit idle. And that’s really the only difference.

This is a two-edged sword: you can vastly increase the resources available to support your organization’s business needs in a handful of minutes—and increase its threat surface proportionately in the same few minutes. Achieving the same scale of expansion with real, physical systems takes longer; you have more capital tied up in it, so perhaps you work more deliberately. Perhaps you’re also under greater schedule pressure to turn them on and get them producing revenues. On the other hand, by letting the hypervisor do the repetitive work for you, you’ve got more time to be more thorough in your analysis, inspection and verification of configuration settings, and your testing. This is just one aspect of your organization’s choice to migrate to or expand their presence in the cloud.

Serverless Services

The classical cloud services model reflects the batch-mode, file-focused data processing models that were developed in the early days of computing. Those models gave birth to the concept of a database as a structured, centralized repository for information—the flow of data through the organization’s business processes all centered around that database. Typical of such application environments is a major airline ticketing and reservations system such as Apollo or Sabre, in which substantial portions of the business logic are implemented as individual application programs that work around that database. The database was bundled with a server application, which provided the programming interfaces to the business applications (such as “will you quote” or “book now”) that embodied the business logic. As these applications grew and demanded higher capacity, faster throughput, and greater levels of global access, the apps themselves would migrate from one operating system to another and from one set of hardware to another. Migrating such apps to the cloud continues to require focus by the end-user organization on the underlying infrastructure issues.

Serverless architectures break the thought model in two important ways. First, they challenge designers to think about data as events. A customer calls to request a flight; treat this as an event, rather than merely as a source of new data that needs to be put in a file (or a database) someplace. Instead of designing an integrated application that enters into the dialogue with the customer, the sales agent, and the database, the designer identifies functions or services that need to be performed on the new data as it comes in. By chaining a series of functions together, new business logic can be developed, hosted, and put into use more easily and more quickly. Because of this emphasis on small, lightweight functions, these cloud models are often referred to as functions as a service (FaaS). The second major change introduced by the serverless concept is that these smaller, lighter bits of applications logic do not need to be hosted and running in a VM while waiting for an event to give them new data to process. The functions take up storage space (in libraries) while not in use, but they are not running up RAM or CPU costs while idle.
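As a concrete example of treating data as events, here is a minimal function-as-a-service sketch in Python, written against the AWS Lambda-style handler signature (an event dictionary plus a context object). The event fields and the downstream booking step are hypothetical; the point is that the function runs only when an event arrives and holds no state of its own between invocations.

import json

# Minimal function-as-a-service sketch using the AWS Lambda-style handler
# signature. The event fields and the downstream booking step are
# hypothetical.
def handler(event, context):
    request = json.loads(event.get("body", "{}"))
    flight = request.get("flight")
    passenger = request.get("passenger")

    if not flight or not passenger:
        return {"statusCode": 400, "body": json.dumps({"error": "missing fields"})}

    # In a real chain, this function would emit a new event (for example,
    # onto a queue or event bus) for the next function, such as fare quoting
    # or seat assignment, rather than calling it directly.
    confirmation = {"flight": flight, "passenger": passenger, "status": "quoted"}
    return {"statusCode": 200, "body": json.dumps(confirmation)}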

Major cloud services providers have been offering functional or serverless architecture components and services for about ten years now.

Because they are so inexpensive to create and use, it is tempting for customer organizations to set up user accounts with default settings and then dive right in and start making work happen. This happens all too frequently in greenfield or startup entrepreneurial settings, which can be long on enthusiasm and short on in-depth, focused risk management. As with any default settings, most of these serverless architectures make early entry and exploration simple but inherently insecure. Default accounts and user IDs often have unrestricted privileges; users must actively set security restrictions, such as establishing access controls for data storage, function invocation (calls to execute them), and the redirection of function outputs to storage, to display, or to other functions. Because it is so easy to get up and running in a serverless architecture, naïve users can be lulled into thinking that they have “nothing to worry about” from a security perspective. Nothing, of course, is further from the truth. As with “classical” cloud computing, serverless architectures are nothing new in terms of security fundamentals. Yes, they each have different knobs and levers to activate the controls needed to establish vitally needed information security measures. But they don’t require us to rethink what security is or what the organizational information security requirements are, just how to carry them out.
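Actively setting those security restrictions usually starts with scoping down the identity a function runs as. The snippet below shows what a least-privilege policy might look like, expressed in the AWS IAM JSON policy format as a Python dictionary; the bucket name and actions are hypothetical and would be replaced with whatever your function genuinely needs.

import json

# A least-privilege policy for a single serverless function, expressed in the
# AWS IAM JSON policy format as a Python dict. The bucket name is
# hypothetical; the point is to grant only the specific actions on the
# specific resources the function actually needs, instead of default-allow.
least_privilege_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["s3:GetObject"],
            "Resource": "arn:aws:s3:::example-bookings-bucket/incoming/*",
        }
    ],
}

print(json.dumps(least_privilege_policy, indent=2))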

Tip

If you’d like to learn more about serverless architectures and security, check out Lynn Langit’s course at https://www.lynda.com/Developer-tutorials/Serverless-Architectures/774902-2.html.

Legal and Regulatory Concerns

Chapter 2 detailed many of the most significant legal and regulatory frameworks that apply to the use of information by businesses, nonprofit organizations, and public or government agencies. That chapter also examined how these laws drive us to develop and use our information systems in ways that assure the confidentiality, integrity, availability, nonrepudiation, authentication, privacy, and safety of the information in those systems and the actions our organizations take because of that information. These CIANA+PS imperatives apply whether our systems are on-premises hardware with no external network connections or are serverless processing frameworks hosted in hybrid public clouds. These legal, regulatory, and ethical requirements are platform-agnostic, not tied to any particular technology. Increasingly, they also apply in most jurisdictions worldwide.

These legal and regulatory frameworks focus on the following sets of issues and concerns:

  • Privacy-related data must be protected: It’s referred to by many names and acronyms—PII, NPI, personal data, and probably more. Used correctly, it provides verifiable ways to relate data about events or activities to a specific person or group of people, and in doing so, it creates value for those who hold that data. Legal and ethical principles have already established rules governing how such data can be gathered, aggregated, stored, used, shared, moved, displayed, and disposed of. These same principles also dictate what rights (in law and ethics) the individual at the heart of that data has over the use of that data and whether they have the right to review it. Laws also define procedures (in some jurisdictions) that allow such individuals to appeal for such data to be corrected or even expunged (witness the EU’s “right to be forgotten” concept, which is part of the GDPR).
  • Failure to apply known and recognized lessons-learned is no excuse: In law and ethics, the burdens of due care and due diligence require senior executives, directors, and owners of information systems to apply both common sense and current wisdom to all of their responsibilities. If a managing director had the opportunity to learn that their systems were compromised and yet failed to learn this and act to mitigate the harm, that managing director failed in their duties. We accept this for the captains of our ships of business and commerce; what we as SSCPs need to appreciate is that we, too, have the opportunity to act better based on what we know. That’s part of being a professional after all.
  • Recordkeeping systems—in fact, any information system, automated or paper-based—must be auditable: Senior executives and directors of organizations are held accountable for the decisions they’ve made regarding the assets, information, people, and objectives that have been entrusted to their care. Think of this as applying “trust nothing, verify everything” to the social decision-making networks that run your organization—this is zero-trust architecture applied to the total organization, not just to its IT infrastructure. Every decision needs its audit trail; audit trails depend upon maintaining the pedigree of the data in custody, the data that is used, and even the data that was not used in making certain decisions. Those needs for transparency, auditability, and accountability combine to produce strict requirements for information security and risk mitigation controls.
  • E-discovery is becoming the norm: Three trends are pushing us toward a future of more e-discovery, rather than less. The first is the growing complexity of collaborative business relationships and the intermeshing value chains and streams by which business gets done. The second is the more vigorous enforcement of the ever-increasing complexity of the legal and regulatory environment that surrounds them. The third trend is the increasing sophistication, reach, and impact of cyberattacks on businesses, nonprofits, and government activities. Whether it’s to better resolve a risk situation or properly identify the responsible parties, the use of digital discovery continues to grow.

The key to successfully navigating these legal and regulatory waters is administrative: written policy directives, implemented into operational procedures, must exist and be used to respond to situations involving these issues. These policies and procedures should be reviewed by the organization’s legal advisors, their compliance officers, or other professionals. Those procedures are also a great situational awareness checklist for you when you’re challenged by a situation or an issue that seems new and different. This is especially true when dealing with technologies that are new to the organization—or poorly understood by its information security specialists.

Jurisdiction and Electronic Discovery in the Cloud

Before we can address anything about legal and regulatory concerns, we must first consider what jurisdictions, plural, your organization may fall under. A purely local business in a town in Idaho, for example, is already answerable to at least three layers of U.S. government at the city, state, and federal levels. If your organization has customers, vendors, partners, or investors in other countries, those countries may have legal claims of jurisdiction over your organization’s activities. Shipping goods or information across state or national borders also exposes the organization to the legal, regulatory, cultural, and in some cases religious authorities of another country. In any given case, such as an employee’s inappropriate use of company-owned IT resources, it may not be immediately clear which country’s laws or other constraints may have the most power to dictate to the organization. Legal and social concepts about privacy, data ownership, data protection, and due diligence, for example, can take on entirely different meanings in different societies. As a member of the information security team, you cannot sort these issues out yourself—you must depend upon senior leadership and management to sort them out for the benefit of the organization as a whole.

Once your organization migrates any of its business processes and hence its information into the cloud, you must now contend with additional jurisdictional concerns. What jurisdictions are your cloud service providers’ data centers located in? What other countries might those data centers’ connections to the Internet transit through, on their way into the great public switched network? What cultural or religious authorities might believe that they, too, have an interest in controlling or constraining your information activities?

If your SLA provides you with geographically dispersed data center support from your cloud services provider, your data could be moving from country to country or continent to continent quite frequently. This could be as part of optimizing service delivery to your customers, for routine operational backups, or as part of your business continuity planning. Such movements of information may also bring with them jurisdictional constraints and considerations.

E-discovery (electronic discovery) orders encompass any kind of legal document, such as a search warrant or subpoena, that has legal authority to demand that information be retrieved, removed, or copied from IT systems and made available to the courts, investigators, or other officials. These legal orders can take many forms, and in some countries and societies, written court orders are not needed for government officials to search, collect, copy, or just remove information from your organization’s workplace, its computers, and computers it may be connected to that are located elsewhere. If you’re the information security person on watch when such a digital discovery order is served on the company and the officers of the law come to your workspaces, you cannot and should not attempt to block, resist, or deter them in their efforts to get what they’ve come for. However, unless your organization’s attorneys have advised you to the contrary, you should try to do the following:

  • Ask if you can notify your supervisor or manager and the organization’s compliance officer or legal advisor of the e-discovery order being served. If the officials say no, do not argue with them.
  • Ask to see identification, and make note of names, badge numbers, and the organization the officers are from.
  • Ask for a copy of the order.
  • Ask to remain in the area during the search, as an observer.
  • Make notes, if you can, of what they search and what they take with them.
  • Ask for an inventory of what they copy or take with them.

In almost all circumstances, it will do no good and in fact may cause grave damage if you attempt to argue with the officers serving the e-discovery orders; you may think that they are exceeding their authority, but your opinion won’t prevail in court (or keep you out of jail if worst comes to worst).

E-discovery orders might be served on your organization or on a cloud services provider that hosts your business processes and data. Such discovery orders served on the cloud hosting service could end up taking copies of portions of your data, even though your data was not the subject of the order, if it is on the same physical devices that are being seized under the order. Suitable use of encryption, by you and by your cloud hosting service, should protect you in this instance.
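One way to make that protection real is to encrypt sensitive files on your side before they ever reach the provider’s storage. The following is a minimal sketch using the third-party Python cryptography package (an illustrative choice); the part it deliberately leaves out, key management, is what actually determines whether seized media reveal anything useful.

```python
# Minimal client-side encryption sketch using the "cryptography" package (pip install cryptography).
# Key handling is out of scope here; in practice the key should come from a KMS or an HSM.
from cryptography.fernet import Fernet

def encrypt_file(path_in: str, path_out: str, key: bytes) -> None:
    cipher = Fernet(key)
    with open(path_in, "rb") as src:
        ciphertext = cipher.encrypt(src.read())
    with open(path_out, "wb") as dst:
        dst.write(ciphertext)

if __name__ == "__main__":
    key = Fernet.generate_key()   # illustrative only: fetch from your key management service
    encrypt_file("manifest.csv", "manifest.csv.enc", key)   # file names are hypothetical
```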

  Beware the Constraints on Discussing E-Discovery Processes

There are a number of circumstances in which you might be the on-scene organizational representative who is served with an order that says you are prohibited from discussing the discovery order or search warrant with anyone. In U.S. law, a national security letter (NSL) is a powerful demand for records issued by federal agencies such as the FBI, without prior judicial approval, and it typically carries a nondisclosure requirement with severe penalties: you may not reveal even the fact that you have received an NSL, except to those who must help you comply with it and to the legal counsel you consult about it.

Other constraints are less dramatic and perhaps seem obvious: discovery orders about a particular customer, client, or employee (even a senior officer of the organization) should not be disclosed to the subject of that discovery order, nor to anyone who might reasonably be expected to inform that subject. Penalties for such tipping off can be severe as well.

Talk with your company’s compliance officer, or its legal team, to seek any advice or information they think you should have before such a discovery process happens.

Cooperative E-Discovery for Regulatory, Audit, and Insurance Purposes

E-discovery (or even the old-fashioned request for paper or printed records) may come from many sources other than a court of law. Many of these result from processes that your organization willingly chooses to comply with and participate in, as part of doing a particular type of business. Many different regulatory, insurance, and contractual requirements will subject your organization to audits; those audits require that the auditors be granted access to information and your information systems, and you and other staff members may have to assist in making that information available to them. Civil litigation, such as claims for damages made against the company by an aggrieved party, may require e-discovery to support attorneys, auditors, or investigators for both sides.

Such cooperative e-discovery processes shouldn’t come to you, as one of the information security team, as surprises; your managers should be well aware that the audit or investigation is ongoing and that the organization has agreed to support it. Nonetheless, be careful, and be diligent: if such a surprise auditor comes to your work area and is asking for information for whatever purpose, take a moment and check with management to make sure that this is not something to be alarmed about. It might be an ethical penetration tester, or an intruder, or a disgruntled fellow employee who’s about to take a wrong turn.

Ownership, Control, and Custody of Data

Many different legal and cultural systems have tried to establish the meaning of these terms in clear and unambiguous ways. The EU General Data Protection Regulation (GDPR), for example, specifies that the person (or organization) that actually has the data in their possession is its custodian; custodians have responsibilities to protect such information from unlawful disclosure, or disclosure in violation of contracts that they are a party to. Data controllers are the ones who can dictate what processing must be done to the data, including its dissemination or destruction. The owner of the data is the one who has a legal right to require others to pay for using the data or otherwise has a right in that data that others must respect. Your company may create vast quantities of data, some of which it has an ownership interest or claim on; it may then contract with or direct others to process it, sell it, or pass it on to third parties in the course of doing business. E-discovery will ultimately require the custodian to grant access to the requested data, regardless of whether the data’s owners or controllers are tasked by that discovery order or process as well.

Privacy Considerations

Private data, whether personally identifying information (PII) or nonpublic personal information (NPI), may be a major component of your organization’s various databases and systems.

  • PII may be more familiar to you; it’s generally accepted that this means the set of information necessary and sufficient to identify, locate, or contact a specific, individual person. PII is therefore generally the basis of a claim of identity, such as by showing a government-issued passport, identity card, or driver’s license. Compromised or stolen PII is generally the starting point for identity theft.
  • NPI is a superset of PII. Defined in U.S. law by the Financial Services Modernization Act of 1999, also known as the Gramm-Leach-Bliley Act (GLBA), NPI is all of the information associated with an individual that is held by financial institutions or businesses as a part of their business records. This can include residential address histories, employment histories, financial transaction and account details, and more. Internationally, multiple regulatory frameworks such as GDPR, the Second Accord of the Basel Committee on Banking Supervision (commonly referred to as Basel II), and the EU Privacy Directive extend the protection requirements of privacy over what seems to be a never-ending list of information about any specific person.

You may have personally experienced this extension of NPI into your private history, if you deal with a bank or financial institution that asks you questions about the current balance, the last few transactions over a certain amount, or other authentication challenges that substantiate your claim to be the knowledgeable owner or user of the account in question.

As you might expect, different jurisdictions may have significantly different terms and conditions that apply to such data and its use, retention, and proper disposal procedures. The most common mistake businesses make in this regard is to assume that if they do not have an operating location in a jurisdiction, then that jurisdiction’s laws do not apply to the business. This impression is mistaken: the presence of even one customer or supplier in a city, state, or country is sufficient to apply that jurisdiction’s laws to the business. As private data from your systems transits national borders and enters or leaves data centers on foreign soil, your organization faces more potential legal complexities. Chapter 2 provides an in-depth examination of the legal and regulatory frameworks for protection of privacy-related information; those principles should guide you in your choices of access control methodologies (as covered in Chapter 1) to minimize the risks of compromise, loss, or theft of this class of sensitive data.

Blockchain, Immutable Ledgers, and the Right to Be Forgotten

In the next five years, the use of blockchain technologies to create unalterable, permanent records of event-level data is forecast to grow dramatically—from $706 million USD in 2017 to $60 billion by 2024, according to a recent report by Wintergreen Research.3 This is one of many technologies that has the potential to upset the classical paradigm of data processing focused around huge databases. Blockchain technologies are being applied to more than 200 different business use cases, including financial transaction processing, citizen services, asset management, payment processing, supply chain management, digital identity, healthcare, and many other activities. There are even recent patents on making editable blockchains, which does seem to be a contradiction in terms; yet, there is a real need for a systematic way to allow all parties to a blockchain to agree to fix errors in data stored in that blockchain. (The current paradigm is to record the corrections in the blockchain as edit transactions that add entries to the blockchain’s ledger, rather than go back and change data in a particular transaction or ledger entry.)
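The append-only behavior that makes corrections look like new entries is easy to see in a toy model. The following sketch is not any particular blockchain product; it simply chains entries with hashes so that altering history is detectable, while a correction is recorded as a later entry that refers back to the one it amends.

```python
# A toy append-only ledger: changing any stored entry breaks every hash that follows it,
# so corrections are appended rather than edited into place.
import hashlib
import json

class Ledger:
    def __init__(self):
        self.entries = []

    def append(self, record: dict) -> None:
        prev_hash = self.entries[-1]["hash"] if self.entries else "0" * 64
        payload = json.dumps(record, sort_keys=True)
        entry_hash = hashlib.sha256((prev_hash + payload).encode()).hexdigest()
        self.entries.append({"record": record, "prev_hash": prev_hash, "hash": entry_hash})

    def verify(self) -> bool:
        prev_hash = "0" * 64
        for entry in self.entries:
            payload = json.dumps(entry["record"], sort_keys=True)
            expected = hashlib.sha256((prev_hash + payload).encode()).hexdigest()
            if entry["prev_hash"] != prev_hash or entry["hash"] != expected:
                return False
            prev_hash = entry["hash"]
        return True

ledger = Ledger()
ledger.append({"txn": 1, "amount": 100})
ledger.append({"txn": 2, "corrects": 1, "amount": 90})  # correction recorded as a new entry
assert ledger.verify()
ledger.entries[0]["record"]["amount"] = 90              # tampering with history...
assert not ledger.verify()                              # ...is immediately detectable
```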

What may be missing around the use of this technology is the cultural, process-oriented thinking that helps customers, individuals, systems users, and systems builders know how to use a “never forgets” technology in ways that protect rights of privacy, rights to appeal incorrect or damaging data, and a right to be forgiven or forgotten. Law and regulation are still struggling with the implications of this technology’s widespread adoption.

Surveillance Data and the Cloud

Many organizations operate a growing variety of surveillance systems in and around their workplaces; these can be part of meeting safety or security needs or part of monitoring the workplace behavior and conduct of their employees. Different legal jurisdictions have very different interpretations as to when and how surveillance violates a person’s privacy, while also imposing different constraints on how that data can be used and shared, how long it can be retained, and how it must be disposed of. If your perimeter security surveillance system, for example, stores its data in the cloud, the jurisdiction that cloud is in may have an impact on your plans for using that data in the course of doing business.

Data Storage and Transmission

Possibly the greatest benefit that organizations can gain from migrating into the cloud is how easily the cloud can support the need for resilience in both processing power and data availability. Four major characteristics inherent to cloud structures contribute to this. Compared to a fixed-size traditional on-premises data center, cloud-hosted systems provide the following:

  • Extensive use of virtualization creates the flexibility and scalability to meet dynamically changing demands for throughput, data transfer rates, and service delivery.
  • Virtual networking allows for rapid rehosting of data and business logic, whether across the hardware in one cloud service data center or to data centers anywhere on the Internet.
  • Virtual storage definition and management allows for many different data migration, backup, restore, mirrored or parallel processing, and archiving strategies to meet users’ operational, continuity, or disaster recovery needs.
  • Extensive use of encryption of data at rest and in motion within the cloud data center, and to and from user locations, provides for very high isolation of one customer organization’s data from all other customers’ data at that center.

Taken all together, this means that your organization can tailor its needs for availability, continuity, security, throughput, and processing power to meet a continually changing mix of needs.

Third-Party/Outsourcing Requirements

Unless your organization is running a private or hybrid cloud, your cloud services are all being outsourced to third parties; and through them, no doubt, other layers of outsourcing are taking place. A typical cloud hosting data center might be operated by one corporate structure, in terms of whom you as customer contract with and interact with directly; that structure may only lease the underlying data center hardware, networks, operating systems, security, and communications capabilities, and have their maintenance and support subcontracted out to multiple providers. In a very real sense, your SLA—the service level agreement—provides the contractual “air gap” that insulates you from the management and contractual responsibilities of making that data center into an effective cloud for customers to use. In some cases, the cloud services provider may not actually own or operate a physical data center at all but instead are reselling someone else’s cloud services. Application platforms as integrated service models can often be provided this way. Educational and training organizations that use products such as Instructure’s Canvas system are paying for the integration of Instructure’s application design and functionality, layered on top of somebody’s cloud services. The customer doesn’t care whose cloud it is; their SLA with Instructure specifies their needs for geographic dispersion of data and processing to achieve their school’s needs for availability, access, throughput, reliability, and recovery. Instructure, as a business, has to translate the hundreds or thousands of SLAs it has with its customers into a master SLA with its actual cloud hosting services provider (although in reality, they start from their master SLA and parcel it out into productized service offerings to their customers).

Lifecycles in the Cloud

Applying a lifecycle model to your organization’s data, metadata, and business logic (as embodied in software, scripts, and people-facing procedures) leads to identifying two critical tasks that must be done in order to have the levels of information security (on all attributes, from confidentiality through safety) that you need.

  • Your needs must be documented in the SLA, clearly identifying which tasks or measures of merit are the service provider’s responsibilities and which ones are yours. This includes specifying how the service provider gives you visibility into their security services as they apply to your presence in their cloud and how they support you with monitoring data, alarms, and response capabilities.
  • Your IT and information security team must know how to configure, manage, monitor, and assess the effectiveness of all of the tasks that are your own responsibility, with respect to their deployment into the cloud, including all security-related monitoring, inspection, alert, and incident response functions.

Both of those are very tall orders for most organizations. Your organization is the one paying the bills to the cloud services provider; your own due care and due diligence responsibilities dictate that your organization must clearly and completely specify what it needs done, what standards will be used to measure how well that service is provided, what provisions allow and support you in auditing that service delivery and performance, and how misunderstandings, miscommunications, mistakes, and disputes will be settled between the two parties. This has to happen throughout the contractual life span of your agreement with that cloud provider, as well as across the lifecycle of the individual data sets, software, and service bundles that make up the business logic that is migrated into the cloud. This includes the following:

  • Audit support for your systems, data, and processing: You need to identify any special provisions that your SLA must incorporate so that your organization can comply with audits you have to respond to. This involves data and processes hosted in the cloud provider’s systems.
  • Audits of the cloud provider’s systems and processes: You need to determine what roles your organization will have, if any, when the cloud provider is audited. Know the audit requirements (standing and ad hoc or special) that they have to comply with and how frequently they are audited. Check if the SLA states whether these are guaranteed to be zero impact to your business operations or whether they specify a level of tolerable disruption or downtime. (In many respects, this is parallel to how your organization needs to have a working understanding with its own customers when they are subject to audits that might require involvement or support from your side of the relationship.)
  • Data and software disposal needs: These must be clearly specified, and the roles that other third-party service providers fulfill should be identified and understood. All software and data systems reach an end-of-useful-life point. Data retention in many instances is limited in law—companies must destroy such data after a certain number of years have passed, for example. Software and user-facing procedural scripting become obsolete and are ultimately retired (either the business function is no longer required, or something new has taken its place). Measures of merit or standards of performance should be specified via the SLA, and the SLA should also lay out how your organization can verify that such third-party services have been correctly and completely performed.
  • Nondisclosure agreements (NDAs): Most cloud services providers have nondisclosure terms in their standard SLAs or TORs that set limits on both parties as to what they can disclose to others about the business or technical operations of the two parties, as well as restricting the disclosure of information belonging to each party. This needs to be carefully reviewed as part of negotiating your SLA or TOR with such service providers, for it is a two-edged sword: it restricts your organization as well as the service provider, by requiring both of you to protect each other’s information, business practices, and many other types of information.

Shared Responsibility Model

That lifecycle perspective is known as the shared responsibility model. Look back at Figure 7.1; as you peel the layers off of each large grouping of functionality, it’s not hard to see that as a customer, your organization has a vital and abiding interest in knowing not only how things work in the cloud service provider’s domains of responsibility but also how well they work. Your organization’s due diligence responsibilities make it answerable to its external and internal auditors, to its stakeholders, to its employees, and to those whom it does business with. The more that your business moves into the cloud, the more that your cloud services provider becomes your strategic partner. By definition, a strategic partner is one that enables and empowers you to have the flexibility, depth, breadth, resilience, power, and presence to take on the objectives that keep your organization growing, thriving, and competitive. Treat it like one.

Or, if your organization uses a hybrid cloud deployment that marries you up with two or more cloud hosting service providers, treat them all as part of one big happy family. While there may be sound business reasons to play one services supplier off of another, at the end of the day it may be a greater win for your organization to find ways to grow those pairwise shared responsibility models into a multiway modus vivendi, a way of living and working together to everyone’s mutual benefit. Such public hybrids do impose a burden on the customer organization to be mindful of keeping within the nondisclosure aspects of their agreements with each provider.

Layered Redundancy as a Survival Strategy

The advocates of zero-trust architectures advise us to trust nothing and always verify. Migrating your organization’s business logic and information assets to a cloud services provider may be taking all of your most valuable eggs out of one basket—the one you hold—and moving them into another, solitary basket, managed and operated by that single cloud services provider. Once again, apply the OSI seven-layer thought model, but with an extra added dose of redundancy. Redundancy can be at the same layer of the stack or at different layers, of course.

Backup and archival storage of important information assets vital to your business survival are a case in point. Consider the following two headline news stories from the same week in June 2019 that indicate the need to have diversity, distribution, and useful, verifiable retrieval and re-activation tactics for the “secret sauce” that makes your organization stay alive and thrive:

  • Norsk Hydro, struck by a ransomware attack, has spent more than £45 million (about $60 million USD) recovering its business operations using archived paper copies of sales orders and processing manifests, an effort that relied on bringing retired workers back to help current staff understand these documents and use them to get Norsk Hydro’s 170 operating locations worldwide back into business.
  • Universal Studios’ archive vault fire in 2008, according to articles in the New York Times, destroyed not only the primary copies of studio recording sessions by many Universal artists but also the “safety backups” of those archive recordings, which had been kept in the same vault. Lawsuits filed shortly thereafter claim breach of contract and are seeking hundreds of millions of dollars in damages. Artist Sheryl Crow commented to the BBC that this “feels like we’re slowly erasing things that matter.”

It is, of course, too early to jump to too many conclusions about either of these incidents. Even so, they strongly suggest a possible frame of mind to use. Let’s start thinking about how alternate approaches can help us ensure the safety and security of what we value most. For example:

  • If you currently protect high-value business processes by having them deployed to a public cloud environment, are they important enough that keeping the paper-based, human-driven process knowledge safe in a fire-proof off-site vault is worthwhile?
  • If your organization is using an email-enabled transaction process, complete with strong nonrepudiation features built right in, are there some transactions, with some customers or clients, that are so important that an additional form of verification and validation helps protect both parties and enhances the relationship?

In many cases, it may not make sense to spread the risk by using alternative processing, storage, or reuse tactics in a loosely parallel fashion this way. Not everything is that critical to your most important organizational goals. But in a few cases, it might be worthwhile.

And the smaller and more entrepreneurial your organization is—and the more dependent upon the creative energy of its people it is—the greater the potential return on a little parallel or alternate investment such as these might become.

Operate and Secure Virtual Environments

It’s good to start with NIST Special Publication 800-125, which was released in January 2011. You might think that this means it is somewhat outdated, but you might want to think again. In its short 35 pages, it provides a succinct roundup of the key topics that every virtual systems user should pay attention to as they consider their information security and business continuity needs. These recommendations focus on the five primary functions that the hypervisor provides to make a cloud what you need it to become.

  • Execution isolation and monitoring of virtual machines
  • Device emulation, and access control of devices and to devices on your virtual systems and networks
  • Privileged mode execution by processes on VMs or on the hypervisor on behalf of those VMs, as well as privilege escalation and de-escalation by processes, tasks, or subjects
  • Overall management strategies for VMs, including mechanisms for creation, dispatch, termination, and resource sanitization and release after termination
  • Administration, including configuration and change management, of hypervisors and host operating systems and other virtual environment software, firmware, and hardware

  Privileged Accounts: Do You Have Them Under Control?

In the eyes of many information security professionals and a number of security solutions vendors, mismanagement of privileged accounts provides the wide-open door to more than 80 percent of the advanced threat attacks that take place every year. This problem existed in many organizations’ IT systems before they migrated to the cloud, and the lack of good privileged account hygiene seems to have migrated to the cloud as well.

The logical capabilities for controlling privileged accounts and for eliminating (not just reducing) their proliferation are built into most of the operating systems in use today. What seems to be missing is the organizational and administrative emphasis on controlling them from the start.

Best practice recommends an integrated logical and administrative approach that can routinely scan all connected servers, network devices, and endpoints, logging all of the privileged user or subject IDs and credentials, and then auditing this data against what your organizational access control AAA solution says should be the official list of privileged account holders. Ideally, this scanning would occur as part of initial connection authentication, at intervals throughout a connection session, and in the event of any other indicators of possible concern.
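A reconciliation of that kind does not need exotic tooling to get started. Here is a minimal sketch, assuming you can already export a list of accounts discovered with administrative rights on scanned hosts and the officially approved list of privileged-account holders from your IAM or AAA system; the file names and one-account-per-line format are illustrative only.

```python
# Minimal privileged-account reconciliation sketch. Input files are hypothetical exports:
# one discovered privileged account per line, and one approved account holder per line.
def load_accounts(path: str) -> set[str]:
    with open(path) as f:
        return {line.strip().lower() for line in f if line.strip()}

discovered = load_accounts("discovered_admins.txt")   # e.g., output of host/endpoint scans
approved = load_accounts("approved_admins.txt")       # e.g., export from the IAM/AAA system

unauthorized = discovered - approved   # privileged accounts nobody approved
dormant = approved - discovered        # approvals with no matching live account

for acct in sorted(unauthorized):
    print(f"INVESTIGATE: unapproved privileged account {acct}")
for acct in sorted(dormant):
    print(f"REVIEW: approved privilege not found on any scanned host: {acct}")
```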

Let’s look at some of these areas from NIST 800-125 in further detail:

  • Isolation and Monitoring of Guest Operating Systems Guest operating systems (those that are loaded and executing in virtual machines) need to be isolated from each other, and closely monitored, if security is to be achieved and maintained. A number of strategies should be examined.
    • Partitioning: This can include how you control the mix of physical and logical assets that are partitioned and made available to VMs to use during execution, as well as protecting virtual machines from break-out or escape attacks. Rogue tasks running on a VM should be prevented from affecting the execution of other VMs or from accessing or contaminating resources that are not allocated directly to them by the hypervisor; such escapes, if not controlled, could result in malicious code or data spreading across a whole fleet of VMs in a matter of minutes.
    • Covert channel isolation: When the hypervisor is run on top of a host operating system that controls the underlying hardware, it often provides access to guest tools that allow instances of other OSs, running in the VMs it is hosting, to share overhead and resources when accessing file systems, directories, the system clipboard, and many others. If not properly controlled, these can provide attack vectors or information leaks to hostile processes running in other VMs or across the virtual-to-real network connection. Bare-metal hypervisors eliminate the opportunity for guest tools by eliminating this type of sharing.
    • Side-channel attacks: These seek to determine characteristics of the underlying hardware, hypervisor, and the guest OS itself, with an eye to finding exploitable vulnerabilities.
  • System Image and Snapshot Management A system image or snapshot file captures the contents of a VM’s memory, CPU state, and other information regarding its connections to virtual and physical resources. Similar to the hibernation file on a modern endpoint, these images can be quickly reloaded into virtual memory and then activated, as well as replicated to create other instances of that VM as many times as are required. While this makes for resilient systems, it also means that security-sensitive data in that image can be easily exported from your systems by exfiltrating a copy of that image file.

    Managing and securing these images is becoming an acute problem, as the number of endpoint devices capable of supporting multiple virtual machines continues to grow. As images proliferate (or sprawl, in NIST’s colorful terminology), your security risks spread almost without limit. One VM on one workstation or phablet might contain an app with an exploitable vulnerability, and an image of that app can quickly be spread to other similar devices throughout your organization in short order. Since the VM is running under the control of a hypervisor, there’s nothing for an application’s whitelisting utility to discover or send up alarms about; the vulnerability itself may not have malware installed with it, so HIPS or HIDS may not be able to detect and prevent its sprawling across other devices either.

  • Containerize for Improved Security Containerized VMs are becoming much more necessary (not just desirable) in software development and test activities. This adds even more images to the management challenges you’re facing. Software testing, ethical penetration testing, security assessment, internal user training, and even customer or external partner training are just some of the business activities that are benefiting from shared use of VM images. While it’s simple to say “Bring each VM image or snapshot under configuration management and control,” there are thousands of such images to control, if not more, across a typical organizational IT architecture. (A minimal image-baseline sketch follows this list.)

    There is some good news in all of this, however. Since VM system images are an encapsulated copy of the system, those images can easily be further quarantined or isolated for malware scanning or forensic analysis if necessary. Scanning an image file with anti-malware tools can often spot rootkits or other malware that can otherwise obscure its existence when it is running. Stopping a VM and capturing a complete image of it is also a lot easier than capturing a systems image from a bare-metal server or endpoint system since the hypervisor can halt the VM’s CPU without causing any data or processor state information to be lost.

  • Endpoint and Desktop Virtualization Security Endpoint devices are becoming more and more capable and useful as VM hosts in many business use cases; in many respects, host-based hypervisors make using preconfigured VHD images or snapshots as easy as mounting a DVD and double-clicking a program. A common use case involves supporting special-purpose hardware that has been orphaned by its vendor and is no longer compatible with modern operating systems. The United Kingdom’s National Health Service, for example, got hit hard by the WannaCry ransomware attack in part because NHS as a system could not afford the costs to upgrade printers, fax machines, and clinical systems in tens of thousands of healthcare providers’ clinics across the country to bring them off of Windows XP or Windows 7 and onto Windows 10. Running one Windows 10 system in such a clinic and using a hosted hypervisor as the controller for these orphaned devices might have been a workable solution, but it was not explored in a timely way.

    Everything said previously regarding image and snapshot management applies for each endpoint that is hypervisor-capable in your system. This includes guest devices, mobile devices, and employee-owned endpoints (whether they are under MDM or not). Each hypervisor-capable endpoint and each user who has access to that endpoint dramatically increases your threat surface. This may put a significant burden onto your preexisting access control, network security, monitoring, and analysis capabilities; yet, if you do not step up to the security management challenge of dealing with widespread virtualization use on your endpoints, you may be leaving your systems wide open to a surreptitious entry and attack (or an outright obvious one, such as a ransom attack).
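As promised above, here is a minimal image-baseline sketch for VM image and snapshot files. It assumes the images live under one directory tree and that a previously recorded baseline of hashes exists (or is created on the first run); the paths are illustrative. It will not find a cleverly hidden image, but it does make new, changed, or sprawling images visible enough to pull them under configuration control.

```python
# Minimal VM image integrity-baseline sketch; directory and baseline paths are illustrative.
import hashlib
import json
import pathlib

IMAGE_DIR = pathlib.Path("/var/lib/vm-images")
BASELINE = pathlib.Path("image_baseline.json")

def hash_file(path: pathlib.Path) -> str:
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()

current = {str(p): hash_file(p) for p in IMAGE_DIR.rglob("*") if p.is_file()}
baseline = json.loads(BASELINE.read_text()) if BASELINE.exists() else {}

for path, digest in current.items():
    if path not in baseline:
        print(f"NEW image (bring under configuration control): {path}")
    elif baseline[path] != digest:
        print(f"CHANGED image (investigate): {path}")

BASELINE.write_text(json.dumps(current, indent=2))  # record the new baseline after review
```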

Software-Defined Networking

Stop and think a moment about software-defined networks. Once you rise up from the hard realities of the physical layer, everything you do in configuring, testing, managing, securing, and repairing your networks takes place via software. The GUIs in the routers, switches, HIPS and HIDS, and SIEMs that you use to manage these devices are implemented via software. More sophisticated network and systems management, configuration control, and security systems have you using additional software to define the network. All of the tools you use at the touch-the-box level of networking, such as NAT, PAT, and even DHCP and DNS, are all ways of taking a logical or administrative definition of your connectivity needs and mapping that onto the underlying hardware. The software you use to define your networks is, in effect, taking your picture of the network at the control, management, and data planes, and commanding the hardware elements to configure themselves to match. The software doesn’t move cables around at a patch panel, of course, and segmentation of networks that requires physically inserting a device between two segments will require some “hardware-defined” action (carried out by one of your carbon-based network and security assets).

Virtual networks take this a bit further, and while we don’t have the scope here to look at them in greater depth or detail, let’s look at the basics:

  • The physical substrate is the layer of physical routers, cables, switches, and servers that provide the interconnections. This substrate provides a full OSI 7-layer stack (unless you tailor things out of it). It provides the connection to physical gateway devices, such as your ISP’s point of presence or another on-site set of physically separate LAN segments. This layer has physical NICs talking to physical connections.
  • Each virtual network is defined on top of that substrate but not necessarily on top of each other. Each virtual network has to define its IP addressing scheme and its subnets and supply all of the capabilities that the virtual machines and users of that virtual network will need. Virtual routing and switching functions, virtual NIDS or NIPS, are all provided as software services that get invoked as the network stack on each virtual machine makes service requests of the virtual network (all via our old friends in the TCP/IP and OSI 7-Layer protocols). These definitions live as data sets, not as switch settings or physical connections between devices, much as NAT, PAT, or DHCP help map your internal LAN addressing and port structures into what the outside world interacts with. The NICs (and other hardware elements) are now reduced to software virtualizations of what they’d do in the real world.

Once again, you’re starting from the same information security requirements, and you may have already mapped them onto your current physical network infrastructure. Mapping them onto a new virtual network involves the same choices and the same issues; it’s just that you can replicate and expand a dynamically adjustable firewall or router capability in a VLAN by using a well-debugged script file, rather than having to finally go and buy equipment, install it, run cables, and then interact with its internal firmware and GUIs or web pages to configure it.

It’s important to stay focused: keeping your entire virtualized infrastructure safe and secure is really no different than keeping a physical infrastructure of servers, applications, networks, endpoints, and ISP connections secure. The same best practices apply, but with one important caution added: for some reason, many organizations that migrate to the clouds assume that their cloud services provider will handle all of their security needs, or certainly the toughest ones; they then go about the business of building and running their own virtual data center as if it was completely isolated from the outside world and has no need for special security measures to be put in place. This is akin to believing that if only we use enough cryptography, we can solve all of our security problems. Both are false and dangerous myths.

Hypervisor

This is another area that is naturally high on NIST SP 800-125’s recommended focus areas for virtual systems security. Bare-metal hypervisors and hosted hypervisors have dramatically different security profiles, and it’s probably not overgeneralizing to say that unless you have a specific use case for hosted hypervisor use, don’t. (One reasonable use case would be in teaching and training situations, wherein students are developing their own knowledge and skills in using hypervisors and building virtual machines. A bare-metal hypervisor doesn’t provide the kind of user-friendly environment that other learning tasks and resources might require. On the other hand, you ought to be able to keep such machines isolated more easily.) If you must run hosted hypervisors, take a hard look at severely limiting the number and types of applications you install and use in each VM type that you run on that hypervisor; this will also help limit or reduce your threat surface.

Key recommendations from NIST include the following:

  • Maintain the hypervisor as fully patched and updated using vendor-provided distribution kits. Most hypervisors contain features that let them hunt for updates and notify your systems administrator about them, and then install them automatically. If that doesn’t fit with your configuration management, build, and control processes, then implementing a centralized, managed push process is called for.
  • Tightly restrict and audit all access to the management interface of the hypervisor. This is the interface used to create, configure, and set control parameters for new VM types that you define, as well as to edit existing machines and to control the ways in which they get instantiated and are executed. NIST recommends that either a dedicated (out-of-band) communications interface be used to access hypervisor management interfaces or that a FIPS 140-2 validated authentication and encryption process is used to secure and protect in-band use of the management interface. Log all attempts at using the management interfaces, and audit those logs for anomalies that might indicate an intrusion attempt or attack. (A minimal log-audit sketch follows this list.)
  • Synchronize all hypervisors to a trusted, authoritative time source, ideally the same source you use to synchronize clocks on all machines in your system.
  • Disconnect unused physical hardware from the bare-metal server supporting the hypervisor; in particular, remove or disable unused network interfaces, USB ports, or other storage devices. External devices used for periodic backup or update should be attached and connected for that task and then disconnected from the system to reduce the threat surface.
  • Disable all sharing services between guest and host OSs unless they are mandatory for your use case and situation. If they are, ensure that they can be tightly controlled and closely monitored.
  • Use hypervisor-based security services to monitor the security of each guest OS that is launched under the hypervisor’s control.
  • Carefully monitor the hypervisor for any signs of compromise, and thoroughly assess all hypervisor activity logs on an ongoing basis.
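In support of the logging and auditing recommendation above, here is a minimal log-audit sketch. The log path and message format are illustrative (your hypervisor’s management interface will have its own), but the pattern of counting repeated authentication failures per source address is the kind of anomaly check being asked for.

```python
# Minimal management-interface log-audit sketch; log path, message format, and
# threshold are illustrative assumptions.
import re
from collections import Counter

LOG_PATH = "/var/log/hypervisor-mgmt.log"   # illustrative path
FAIL_PATTERN = re.compile(r"authentication failure.*from (\d+\.\d+\.\d+\.\d+)", re.IGNORECASE)
THRESHOLD = 5

failures = Counter()
with open(LOG_PATH) as log:
    for line in log:
        match = FAIL_PATTERN.search(line)
        if match:
            failures[match.group(1)] += 1

for source, count in failures.most_common():
    if count >= THRESHOLD:
        print(f"ALERT: {count} failed management-interface logins from {source}")
```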

Peel the cloud layer back far enough and all of that virtualization encounters the raw physicality of servers, memory, real NICs, and storage subsystems, all cabled together; they all eat power and generate heat as well as revenue. Even the smallest of virtualization labs in your software development or security testing work areas needs to be physically protected from all of the hazards of accident, sabotage, intrusion, theft, or misuse. These servers and everything that supports and interfaces with them need to be part of your overall information systems security plans, programs, and procedures. If all of your hypervisor use is via your cloud services providers, then double-check your SLA to make sure you understand what it commits your provider to do. Review, too, how that provider subjects itself to independent audit of its security measures, as well as how you as a customer can verify that they are living up to their end of the SLA.

Virtual Appliances

As applications have slimmed down into apps, so have applications and apps been further transformed into appliances. These are not the washing machine or vacuum cleaner type of appliances but software packaged into a form factor that simplifies its use in a very narrow way. Handheld point of sales devices epitomize this trend: a small box, not much bigger than a smart phone, that allows a retail salesperson to enter details of a customer purchase, swipe a credit or debit card for payment, and provide a receipt, with the order or purchase details dispatched to a fulfillment process that picks the items from inventory or prepares the customer’s meal for them. A virtual appliance (VA) is a virtual machine preconfigured with its guest OS, applications, and interfaces, tailored to do its specific set of functions. Whether the virtual appliance is then loaded onto a hardware platform for direct execution or onto a cloud-hosted hypervisor for rapid deployment at scale is dictated by the underlying business need.

As with all software and machine images or snapshots, configuration management and control, user and device access control, and intrusion detection and prevention capabilities are a must. The onboard applications should be tailored to as limited a set of function points as are absolutely necessary to meet the need for the appliance; every extra function widens the threat surface of the appliance.

Another use of VAs is to preconfigure and deploy more feature-rich applications for end users to interact with from their endpoints. This can simplify software maintenance and configuration control: if you have 200 end users needing to frequently use a financial modeling application, you can either install it on every endpoint, install it once in a VA and let end users access and run instances of that VA (cloud hosted or locally hosted on their endpoint), or look to the software vendor for a platform solution. Each has its sweet spot as you balance cost, complexity, and management effort.

Continuity and Resilience

To some systems planners, managers, and security specialists, continuity and resilience are two sides of the same coin. They both measure or express how well your systems can continue to survive and function correctly in the face of rapid and potentially disruptive change, but while continuity describes your ability to recover from upsets, accidents, attacks, or disasters, resilience measures your systems’ ability to cope with too much good news. Book publishing provides a case in point: printing, binding, and shipping books to retailers who stock their bookshelves with them and wait for customers to buy them can end up with a lot of unsold books if you print too many but can also result in many unhappy customers who still don’t buy the book if you print far too few. E-publishing or other print-on-demand strategies trade off some of that risk for the immediacy of a download.

Cloud systems and their inherent virtualization provide both of these capacities, as we’ve seen from several perspectives throughout this chapter. Your organization’s SLA with your cloud provider can give your business processes the geographic dispersion to survive most any natural disaster, terrorist action, or political upheaval, while providing the resiliency to scale dynamically to meet your needs.

Attacks and Countermeasures

Throughout this chapter—and in other places throughout this book—we’ve seen that our systems are exposed to attacks simply because they are connected to a public-facing Internet. Whether those systems run the seven-layer OSI stack on our private in-house data center or in a cloud hybrid of services from multiple cloud hosting providers doesn’t really make much difference. Your threat surface is overwhelmingly defined by what you specify, select, include, enable, configure, and use; your choices in systems hardening, your prioritization of which vulnerabilities to fix today and which ones can wait a while are really not affected by where your organization hosts its business logic.

So, what is different about being in the cloud versus in your own in-house systems, from an attack and defense perspective?

  • Shared responsibilities via your SLA, as a service model itself, can provide you with a far greater pool of experience and technology than your organization can generate, field, and afford to pay for by itself. The big ten cloud services providers spend billions of dollars in providing the best suite of security capabilities, tools, services, monitoring and alarm facilities, data isolation, and support that they can. These are core capabilities that are also the razor’s edge of their competitiveness; if the marketplace did not fundamentally see these companies as providing services that customers can fully secure to meet their various needs and use cases, these companies would not be as profitable as they are.
  • Shared constraints are defined by your SLA. If you own and operate your own data centers and networks, you have a greater degree of inherent flexibility and freedom to bring in the ethical penetration testers and have them go at your systems and your organization, be that as white-box, black-box, or gray-box testing. You will still need to do security assessment testing, including penetration testing; first, though, check with your cloud services provider.

Your SLA and your relationship with your cloud services provider (or providers, plural, if you’re doing a hybrid approach) is your most powerful tool when it comes to keeping your cloud-hosted, cloud-entangled business logic, processes, and information safe, secure, available, confidential, and private. You will pay for extra services that you ask for, such as the provider’s assistance in tailoring security controls or assisting you in investigating an incident. But you would probably have to pay to bring in outside expert help anyway.

As a roundup, consider some of the various “top ten” lists of security issues related to cloud deployments; they commonly list insider threat, malware injections, use of insecure APIs, abuse of cloud services, denial-of-service attacks, data loss, shared vulnerabilities, and failure of the user’s own due diligence processes. Account hijacking, privilege escalation, and catastrophic failure of backup and disaster recovery procedures add to these lists. Nothing is new there, just because your business has migrated into the clouds (plural, as more of you go hybrid in doing so).

The list of top countermeasures remains much the same:

  • Multifactor access control (a minimal verification sketch follows this list).
  • Rigorous management and auditing of privileged account use.
  • Enforcing access control protection mechanisms on data storage.
  • Intrusion detection and prevention for networks, hosts, and servers.
  • Thorough, effective configuration management and control for all hardware, firmware, software, and interconnections.
  • Engaging, informing, and involving all of your end users to be part of your active defenses.
  • Anti-malware and application whitelisting.
  • Secure business logic and process design, such as with separation of duties for critical or high-risk tasks.
  • Audit, inspect, review, log, analyze, and monitor.
  • Screen and filter against spam, phishing, and vishing attacks.
  • Trust nothing; verify and authenticate everything.
  • Endpoint security is paramount.
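To illustrate the first countermeasure in the list above, here is a minimal sketch of verifying a time-based one-time password as a second factor, using the third-party pyotp package (an illustrative choice). It assumes the password check has already succeeded and that the enrollment secret is stored as carefully as a password hash; the user name and issuer are hypothetical.

```python
# Minimal TOTP second-factor sketch using the "pyotp" package (pip install pyotp).
import pyotp

# Generated once at enrollment and stored with the user's record; protect it like a password hash.
enrollment_secret = pyotp.random_base32()
totp = pyotp.TOTP(enrollment_secret)

# The user loads this URI (usually as a QR code) into their authenticator app at enrollment.
print("Provisioning URI:", totp.provisioning_uri(name="alice@example.com", issuer_name="ExampleCo"))

def second_factor_ok(submitted_code: str) -> bool:
    # valid_window=1 tolerates one 30-second step of clock skew between client and server
    return totp.verify(submitted_code, valid_window=1)

print("Code accepted:", second_factor_ok(totp.now()))
```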

Shared Storage

Questions about shared storage in the cloud tend to reflect two broad sets of concerns.

  • Physical storage devices and subsystems at the cloud services provider almost always contain data for multiple customers. This is probably the most common concern raised and the most common situation at major cloud services provider data centers. From the service provider’s perspective, they simply cannot afford the legal complexities of allowing multiple customers’ data to be freely commingled on a single disk drive, with no protections, barriers, or isolating technologies used to keep it separate. What most providers do is to encrypt individual user data files, encrypt the logical storage volumes that they create for that user or customer, and then stripe those storage volume extents or granules out onto individual disk drives in their storage subsystems. Each drive will no doubt have multiple such stripes of data from multiple customers on it, but since the storage volume’s logical structure is separately encrypted for each customer, as well as each file being encrypted, someone who had access to the drive itself would need all of those decryption keys to start trying to read the data off of the disk drive. With proper key management security in place (such as a hardware security module being used to manage the keys), the likelihood of such cracking becoming successful is very, very low. Drives that are pulled from service due to malfunction or incipient fault are sanitized prior to being turned over to repair, salvage, or disposal. (If you’re dealing with a cloud services provider who does not provide this type of security, you are strongly cautioned to start shopping around. Quickly.)
  • Sharing data stored in your cloud-hosted systems environment with other users, such as external partners, vendors, or customers. This might involve a hybrid cloud system as well, if (for example) your main business processes and data are hosted in an Azure-based or AWS cloud environment, but you have groups of end users who routinely use Google Drive or Dropbox to share files with other parties. Your security requirements, in terms of function and performance, have not changed; this is no different than if you were hosting all of that data on an in-house server on your networks and providing these other users access to the data via SSO, perhaps mediated by specific applications.

In both cases, your organization’s existing information security classification guides and policies, and your security requirements, plans, and programs, should already be driving you to implement safeguards for your existing on-premises servers, networks, endpoints, and SSO or federated shared access. The shared responsibility model and your SLA may move some of these concerns (especially the first group) to your cloud services provider; even so, you still own the responsibility to see that your security needs are being met when these risks are transferred to your cloud host.

Summary

We’ve come full circle in this chapter, starting from looking at what makes software so insecure through what is or is not different about your information security needs and posture when you migrate your business processes and data into the cloud. We’ve tested the notion of using the OSI 7-layer reference model as a paradigm, not so much as just a protocol stack; we let it teach us to think about implementing and validating our systems security in layers from the physical to the personal. One clear observation is that it’s becoming impossible—or perhaps not terribly useful—to try to separate systems security from network security, or software and applications security from people, procedural, and communications security. Convergence has welded these together, and therein we can find what we need to keep our businesses, our personal information, and our societies a bit safer, one day at a time.

Notes

  1. Fazzini, Kate. 2019. Kingdom of Lies: Unnerving Adventures in the World of Cybercrime. ISBN 978-1250201348. St. Martin’s Press.
  2. Note that north-south data flows are the ones between servers and external clients or users.
  3. Wintergreen Research (2018). Blockchain Market Shares, Market Strategies, and Market Forecasts, 2018 to 2024. Available at https://www.ibm.com/downloads/cas/PPRR983X