CHAPTER 2
Modern Approach to Multi-Cloud Threat Hunting

Multi-Cloud Threat Hunting

According to Flexera's 2020 State of the Cloud Report (https://info.flexera.com/SLO-CM-REPORT-State-of-the-Cloud-2020), shown in Figure 2.1, demand for a multi-cloud strategy is higher than for single public or single private cloud strategies, and it has continued to grow since 2018.


Figure 2.1: Flexera's state of the cloud report

Multi-cloud environments usually refer to the distribution of cloud assets, software, applications, infrastructure, and resources across several cloud-hosting environments or providers. Typically, a multi-cloud architecture uses two or more public clouds as well as multiple cloud service providers (CSPs), with the aim of eliminating dependency on any single provider and achieving a higher level of resiliency within the environment.

Each of the CSPs is responsible for a particular activity; for instance, one serves as IaaS, another as PaaS, and another as SaaS. Because each of these CSPs is a third-party vendor, each carries its own risks, with different pros and cons. Figure 2.2 demonstrates a simplified vision of the multi-cloud environment.


Figure 2.2: Simplified multi-cloud environment

Multi-cloud computing enables organizations to reduce downtime and prevent data loss. It is always advisable to avoid a single point of failure, and this approach helps if one CSP fails, because the organization can continue to operate. There are many CSPs around the globe, and each has its own technological, geographical, and regulatory requirements. Using a multi-cloud environment also allows one to leverage competitive pricing.

There are several advantages to deploying a multi-cloud architecture, including reduced reliance on any single vendor, cost efficiencies, increased flexibility through choice, and adherence to local policies and compliance requirements that oblige organizations to host certain types of data in a specific country or geographical location.

Due to advanced and integrated technologies and the need to comply with a variety of customers and regulators, it could be claimed that many of the CSPs' security postures are more reliable and mature compared to many legacy organizations. However, it is important to perform due diligence and due care to find the gaps and work together to resolve them.

One of the challenges with a multi-cloud environment is obtaining a single source of truth of current security controls and asset inventories, different workflow management and dashboards, and provisioning or deprovisioning the assets where and when needed.

Another challenge that organizations face when migrating to a multi-cloud environment is that data will be shared among different parties, which may raise data protection and privacy concerns.

Roles and responsibilities should be clearly defined between each party of CSPs and the organizations with respect to each security implementation, maintaining the compliance state, and for governing the security posture regularly.

To meet these requirements, CSPs need to provide a high level of transparency, so that an organization's stakeholders (such as the CISO and CIO) can confidently engage with them. Lack of sufficient transparency can be one of the challenges an organization will face during this migration. In a multi-cloud environment, the key challenge for security teams is to have an integrated view of the full IT environment. CISOs need to ensure that threat-detection systems can operate on all platforms and preferably send information to one integrated dashboard. There are other multi-cloud challenges that are beyond the scope of this chapter.

After migrating to the cloud, an organization should deploy required security controls and precautionary action, including but not limited to the following:

  • Asset inventory: Identifying asset inventories on the cloud, especially crown jewels to help threat hunters identify the threat actors, their TTPs, and defined use cases.
  • Configuration management and patching: Threat hunters should monitor misconfigured VMs and workloads. Special use cases should be defined to identify those new assets that have recently joined the network, especially if they have high targeted vulnerabilities or are being used by relevant threat actors.

    Another area that threat hunters need to invest more time in is identifying exposed instances or APIs that integrate multiple clouds. This section requires extensive threat modeling to know what APIs are in place, what the mitigation controls are, and where they can fail, as well as which area requires additional monitoring by threat hunting.

  • Authentication and authorization: Threat hunters should also monitor administrator and privileged-user activity. Because cloud assets are accessed remotely, secure remote connections and monitoring are crucial. Being remotely accessible invites curious devils and insiders to start digging for a hole to get access.
  • Threat-hunting solutions: After deploying all related security hygiene, the SecOps (Security Operations) team should plan for readiness of the threat hunting to kick off, as discussed in Chapter 1. While we have discussed threat hunting mainly in the on-premise environment, the SecOps team should choose their threat-hunting solution to resolve the challenges that they may face.

While the stored data in CSPs may contain other customer data as well, the threat-hunting solution should be able to filter the unnecessary info and only obtain the metadata for data processing purposes.
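The filtering described above can be sketched as a small allowlist step that runs before records reach the hunting platform. This is a minimal illustration, not a product feature: the field names, the `tenant_id` key, and the `METADATA_FIELDS` allowlist are all assumptions made for the example.

```python
from typing import Iterable, Iterator, Optional

# Hypothetical allowlist of metadata fields; adjust it to your own log schema.
METADATA_FIELDS = {"timestamp", "tenant_id", "source_ip", "user", "action", "status"}


def to_metadata(record: dict, own_tenant: str) -> Optional[dict]:
    """Keep allowlisted metadata for our own tenant; drop everything else."""
    if record.get("tenant_id") != own_tenant:
        return None  # another customer's record: never forward it
    return {key: value for key, value in record.items() if key in METADATA_FIELDS}


def filter_stream(records: Iterable[dict], own_tenant: str) -> Iterator[dict]:
    """Yield metadata-only records that belong to own_tenant."""
    for record in records:
        metadata = to_metadata(record, own_tenant)
        if metadata is not None:
            yield metadata
```

An allowlist is used rather than a blocklist so that any new, unreviewed field (for example, a request body) is dropped by default.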

Multi-Tenant Cloud Environment

Threat hunting helps reduce operational overload, improves threat identification in an on-premises environment, and helps prioritize remediation planning.

Cloud computing is more common than ever. Organizations are not only using different cloud services but also using multiple CSPs, which is known as a multi-cloud environment. In other words, one organization or community uses several CSPs in one place. These CSPs could be private, public, or even hybrid clouds. Conversely, when one cloud provider serves a variety of organizations, each organization or customer is known as a tenant, and this is called a multi-tenant environment.

In this section, we discuss the common risks in multi-cloud and multi-tenant environments and how threat hunting helps to address these risks.

Multi-tenant is defined as multiple customers or organizations accessing the same CSP, whereby different systems (IaaS, PaaS, SaaS) from different companies are hosted on one pool of physical servers. Multi-tenant architectures apply to both public cloud and private cloud environments, allowing each tenant's data to be separated from each other. There are several multi-tenancy model types, all with varying levels of complexity and costs. Multi-tenancy is one of the main criteria in defining cloud computing, where different customers access the secured and isolated, abstracted system without interfering with each other or even noticing other users. One benefit of multi-tenanting is that many users can get access to one server and use the resources at maximum level with lower costs, all at the same time.

In a multi-tenant environment, customers enjoy affordable costs and integration with other applications via APIs. Most asset maintenance, updating, and upgrading falls under the CSP's roles and responsibilities. It is important that the shared responsibility model be discussed at the contract level. On the other hand, multi-tenancy offers limited customization and limited control over security changes, and it brings upgrade dependencies and vendor lock-in.

In a nutshell, if customers need a particular customization or security features to be applied, it depends on CSP approval and their roadmap. Other CSPs might provide the same customization, maybe with a lower cost on demand without waiting for the CSP roadmap.

In addition to these operational challenges, there are a number of cyber risks and compliance issues that make many organizations hesitant to adopt this new approach and migrate to the cloud. Many of the cyber risks in the multi-tenant (single CSP) environment also exist in a multi-cloud environment, which are discussed in the next section.

Such challenges drive many customers and organizations to look into the multi-cloud environments that address such limitations.

Threat Hunting in Multi-Cloud and Multi-Tenant Environments

An organization should know the risks of migrating to the cloud in advance, be it multi-cloud or multi-tenant cloud. These include data security and privacy, the shared responsibility model, increased dependency on vendors, poor configuration and access management (especially for admin and privileged users), unpatched VMs and platforms/infrastructure, and many more. Of course, many security controls have been put in place, such as separation of duties, network security tools, integration into a single management console and dashboard, IDS, IPS, encryption, and other next-generation fantasy stuff.

However, if proper monitoring is not in place, then regardless of how secure the CSPs and on-premise platforms are, it will be difficult to reduce cyber risks, identify them before an incident takes place, or identify APT attacks that could already be in the environment.

As discussed earlier in this chapter, threat hunting and threat modeling should be applied in the cloud environment as well.

As organizations migrate from a physical infrastructure/on-premise environment to a cloud environment, threat identification will be more challenging due to difficulties in compliance and configuration transparency, remote data sources and infrastructures, core security capabilities, and the number of APIs. In a nutshell, as the attack surface is expanding, threat hunting requires more attention, so the challenges related to threat hunting in a multi-cloud and/or multi-tenant environment must be discussed.

Now consider the challenges of a multi-tenant environment, where different customers' data is hosted together: the threat hunter could breach data privacy if data protection controls and role-based access controls (RBAC) are not in place.
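One way to picture the RBAC control is a deny-by-default check that decides which fields of which tenant a given role may see. The sketch below is only illustrative; the role names, tenant IDs, and field names are hypothetical, and a real deployment would rely on the CSP's or SIEM's own access-control mechanism.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Grant:
    tenant_id: str
    fields: frozenset


# Hypothetical role definitions; anything not granted here is denied.
ROLES = {
    "tier1_analyst": [
        Grant("tenant-a", frozenset({"timestamp", "action", "status"})),
    ],
    "threat_hunter": [
        Grant("tenant-a", frozenset({"timestamp", "action", "status", "user", "source_ip"})),
    ],
}


def is_allowed(role: str, tenant_id: str, field_name: str) -> bool:
    """Deny by default; allow only when an explicit grant covers the field."""
    for grant in ROLES.get(role, []):
        if grant.tenant_id == tenant_id and field_name in grant.fields:
            return True
    return False


def redact(role: str, event: dict) -> dict:
    """Return only the fields of the event that the role is allowed to see."""
    tenant_id = event.get("tenant_id")
    return {key: value for key, value in event.items()
            if is_allowed(role, tenant_id, key)}
```

With these example grants, a hunter who queries an event from a tenant that is not granted gets an empty record back rather than another customer's data.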

On the other hand, threat-hunting complexity increases on a multi-cloud environment when you have heterogeneous infrastructures, services, vendors with different security postures, unbalanced security postures, different security controls in place where proper integration might not be in place, and more. Poor visibility on data at rest, in transit, and in process prevents the threat hunters from performing the threat hunting at all levels, and this is limited based on the access provided by the CSP and customers.

While customers can define application logs to be pumped into the SIEM or log collectors for threat hunting and incident management correlation, a number of the logs such as hypervisor logs, packet details, or system logs might not be shared with the customers.

Another challenge that threat hunters usually face in a modern cloud environment is that many organizations are moving to containerization and serverless approaches, in which instances are constantly started and stopped as needed. This is typically the case where the organization is heavily invested in DevOps tools.

One of the main benefits of the cloud is you pay as you go. While the customer only pays for the amount of resources used, such as bandwidth, data, and processing resources, threat hunters should choose the required data carefully, as it will impact cost.
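To make the cost trade-off concrete, the following sketch estimates monthly ingestion cost per log source and selects sources in priority order until a budget is reached. The price per GB, the daily volumes, and the 30-day month are made-up assumptions for illustration; real pricing depends on the CSP and SIEM vendor.

```python
from typing import Dict, List, Tuple


def monthly_ingest_cost(sources: Dict[str, float], price_per_gb: float) -> Dict[str, float]:
    """Estimate monthly cost per source from GB/day, assuming a 30-day month."""
    return {name: round(gb_per_day * 30 * price_per_gb, 2)
            for name, gb_per_day in sources.items()}


def select_within_budget(sources: Dict[str, float], priority: List[str],
                         price_per_gb: float, budget: float) -> Tuple[List[str], float]:
    """Pick sources in the hunters' priority order while the budget allows."""
    costs = monthly_ingest_cost(sources, price_per_gb)
    chosen, spent = [], 0.0
    for name in priority:
        if spent + costs[name] <= budget:
            chosen.append(name)
            spent += costs[name]
    return chosen, round(spent, 2)


# Hypothetical volumes (GB/day) and price; not real vendor figures.
volumes = {"auth_logs": 2.0, "dns_logs": 10.0, "netflow": 50.0}
selected, total = select_within_budget(
    volumes, ["auth_logs", "netflow", "dns_logs"], price_per_gb=0.5, budget=500.0)
```

In this example the high-volume `netflow` source is skipped because it alone would exceed the budget, while the two smaller sources are kept.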

There are different types of threat-hunting commercial products and services offered by different vendors and service providers that could meet these challenges. Security-as-a-Service is one such approach that allows customers to subscribe and deploy into their multi-tenant cloud environment to boost their incident response and threat-hunting capabilities.

During threat-hunting maturity readiness, it is advisable to build a successful threat-hunting program by starting small. As the maturity level increases, you then expand the scope of threat hunting to avoid overwhelming your hunters with thousands of events. In a multi-cloud or multi-tenant environment, the number of alerts received by log collectors and hunters increases tremendously over time. Using a reliable threat source, you will have an opportunity to use automation in threat hunting for building and maintaining various use cases and hypotheses in a timely manner.

Building Blocks for the Security Operations Center

First, let's understand the two main concepts in cybersecurity—SOC and SIEM—and discuss the main differences between the two.

A Security Operations Center (SOC) is an organizational function or a centralized unit with various security roles, such as Security Analyst, Threat Hunter, Red Team/Blue Team members, etc. It defends and protects the organization against various security-related issues using a variety of tools. One of the main tools used by the SOC team is Security Information and Event Management (SIEM).

So, the SIEM tool collects, normalizes, and analyzes application logs and events against the set of correlation rules. When these rules are triggered, they create a series of events that human analysts analyze and respond to.
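The collect, normalize, and correlate flow can be sketched in a few lines. This is a toy model rather than how any particular SIEM is implemented: the two raw record shapes, the field names, and the threshold of five failures in ten minutes are assumptions chosen for the example.

```python
from collections import defaultdict, deque
from datetime import datetime, timedelta


def normalize(raw: dict) -> dict:
    """Map two hypothetical source schemas onto one common event schema."""
    if "eventTime" in raw:  # hypothetical cloud audit record
        return {
            "time": datetime.fromisoformat(raw["eventTime"]),
            "user": raw["userName"],
            "outcome": "failure" if raw.get("errorCode") else "success",
        }
    # hypothetical on-premise authentication record
    return {
        "time": datetime.fromisoformat(raw["ts"]),
        "user": raw["account"],
        "outcome": raw["result"],
    }


def failed_login_rule(events, threshold=5, window=timedelta(minutes=10)):
    """Correlation rule: alert when a user has `threshold` failures within `window`."""
    recent = defaultdict(deque)  # user -> timestamps of recent failures
    alerts = []
    for event in sorted(events, key=lambda e: e["time"]):
        if event["outcome"] != "failure":
            continue
        failures = recent[event["user"]]
        failures.append(event["time"])
        # Drop failures that have fallen out of the sliding window.
        while failures and event["time"] - failures[0] > window:
            failures.popleft()
        if len(failures) >= threshold:
            alerts.append({"rule": "multiple_failed_logins", "user": event["user"],
                           "time": event["time"], "count": len(failures)})
            failures.clear()  # avoid re-alerting on the same burst
    return alerts
```

Each alert produced this way is what a human analyst would then pick up, investigate, and respond to.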

Usually, you will not find an SOC without an SIEM but may find an organization with an IT team with a mature security understanding using an SIEM tool without a dedicated SOC unit. You will also often find that many organizations typically outsource their SOC capability to a third-party vendor that specializes in providing such cybersecurity services.

An SOC is composed of many entities used for cybersecurity incident and event monitoring within an organization's technology environment, such as people, processes, procedures, security tools, and software like firewalls, CASB, proxies, WAF, etc.

To build an SOC, you must understand that there are many moving parts and you must think of them as sections and treat each as a threat modeling exercise. At the end of the day, the main objective of having an SOC is to manage the threat.

Setting up a modern SOC is not an easy task and choosing the right people, processes, and technologies can be a real challenge. Table 2.1 compares the SIEM, SOC, and threat-hunting processes.

Table 2.1: Comparing SIEM, SOC, and Threat Hunting

| | SIEM | SOC | THREAT HUNTING |
| --- | --- | --- | --- |
| Use case | Use SIEM for basic security control and for automating log analyses. | SOC can extend and maximize security control above and beyond an SIEM. One key differentiator is the processes that run within an SOC. | Threat hunting is used to further identify highly sophisticated attacks, patterns, and indicators of compromise. |
| Example | An SIEM can detect multiple failed login attempts due to brute force attacks. | If an SOC team detects a failed login attempt, they take further action by calling the employee, locking the account, and further investigating the reason for the lockout. | Threat hunting is an extended capability within SOC and uses the same infrastructure. Threat hunting uses behavior analysis to identify the pattern, such as: Was any admin account created after multiple login failures? Is the geolocation of the user the same as the login attempt logs? Is there a mismatch in the user's normal time of login? |
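The three hunting questions in the table can be expressed as simple checks over normalized events. The sketch below is an illustration only; the event fields (`action`, `user`, `country`, `time`), the action names, and the per-user baselines are hypothetical, and a real hunt would typically run as a query in the SIEM.

```python
from datetime import datetime, timedelta


def admin_created_after_failures(events, min_failures=3, window=timedelta(hours=1)):
    """Find admin-account creations preceded by repeated login failures for the same user."""
    hits = []
    for event in events:
        if event["action"] != "create_admin":
            continue
        failures = [
            f for f in events
            if f["action"] == "login_failed"
            and f["user"] == event["user"]
            and timedelta(0) <= event["time"] - f["time"] <= window
        ]
        if len(failures) >= min_failures:
            hits.append(event)
    return hits


def geo_mismatch(event, usual_country):
    """True when the user has a known baseline country and the event differs from it."""
    baseline = usual_country.get(event["user"])
    return baseline is not None and event.get("country") != baseline


def off_hours(event, usual_hours):
    """True when the event falls outside the user's usual (start_hour, end_hour) range."""
    start, end = usual_hours.get(event["user"], (0, 24))
    return not (start <= event["time"].hour < end)
```

Any one of these signals alone may be benign; hunters usually look at events where several of them coincide.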

Let's divide the decision-making process into logical steps, i.e., elements of a modern SOC.

The elemental pillars include the people, process, and technology aspects required to support and defend the organization (see Figure 2.3). By utilizing these elements in SOC, you can improve existing functions and develop those that are lacking, creating both opportunity and advantages for the SOC that end in desired results for the organization.


Figure 2.3: Elements of a modern SOC

Scope and Type of SOC

There is no one-size-fits-all approach when it comes to establishing and creating an SOC. An SOC should be scaled to either the global footprint of the organization or to the span of control for the business sector that operates the SOC.

Many large corporations have a main SOC, called a Global Security Operations Center (GSOC), which could be supported by smaller (child) SOCs around the globe.

These smaller SOCs also provide alerts and threat feeds to the main GSOC. There is always some redundancy built into SOCs, so if one is offline, another one can manage the load and sustain operations.

Services, Not Just Monitoring

Define all your SOC requirements and then develop a roadmap. Which services do you want to create within your SOC? It's not just about monitoring a few logs and events from your infrastructure, which used to be the case in a traditional SOC.

Modern SOCs provide a number of proactive services, which are resource-intensive operation tasks. This includes:

  • Security management and event monitoring
  • Security orchestration, automation, and response (SOAR)
  • Log management
  • Incident response
  • Vulnerability management and penetration testing
  • Managed defense and red team/blue team service
  • Threat hunting and threat monitoring
  • ICS and SCADA security monitoring
  • Business resiliency
  • Controls and compliance reporting based on certain regulatory requirements, such as PCI-DSS, HIPAA, SOX, and FISMA

SOC Model

The SOC model can include an onsite (in-house) dedicated SOC, outsourced (remote) SOC, or both.

Today's cybersecurity market is full of small, medium, large, and boutique managed service providers who offer you specific services for your security monitoring based on your specific needs. This includes threat hunting, penetration testing, red team/blue team, etc.

Each model has its own pros and cons. You can choose remote monitoring and analysis, a dedicated SOC operated on your premises, or, for maximum security and cost effectiveness, a hybrid solution that combines both.

Define a Process for Identifying and Managing Threats

This is the most crucial step when building an SOC, and it is essentially a threat modeling exercise. Threat modeling entails answering the following questions:

  • Which threats are applicable to my organization and do I care the most about?
  • What do these threats look like and for which sets of assets?
  • How does SOC identify, detect, and block these threats?
  • What run book and playbook do I need to create?

Once these questions are answered, playbooks are built in order to document how to respond, set severity, and escalate these specific threat types. Other important processes to consider are shift time and models.
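A playbook of this kind can be captured as structured data so that severity and escalation are decided consistently. The sketch below is one possible shape, not a standard format; the threat types, steps, and escalation contacts are invented for illustration.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import List


class Severity(IntEnum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    CRITICAL = 4


@dataclass
class Playbook:
    threat_type: str
    severity: Severity
    steps: List[str]
    escalate_to: str
    escalate_at: Severity = Severity.HIGH  # escalate at this severity or above

    def needs_escalation(self) -> bool:
        return self.severity >= self.escalate_at


# Hypothetical playbooks; the contents are placeholders, not recommendations.
PLAYBOOKS = {
    pb.threat_type: pb
    for pb in [
        Playbook("brute_force", Severity.MEDIUM,
                 ["Lock the account", "Contact the employee", "Review login sources"],
                 escalate_to="tier2"),
        Playbook("ransomware", Severity.CRITICAL,
                 ["Isolate the host", "Preserve evidence", "Start incident response"],
                 escalate_to="incident_manager"),
    ]
}


def plan_response(threat_type: str) -> dict:
    """Look up the playbook; fall back to manual triage for unknown threat types."""
    playbook = PLAYBOOKS.get(threat_type)
    if playbook is None:
        return {"severity": Severity.MEDIUM, "steps": ["Triage manually"],
                "escalate_to": "soc_lead"}
    return {
        "severity": playbook.severity,
        "steps": list(playbook.steps),
        "escalate_to": playbook.escalate_to if playbook.needs_escalation() else None,
    }
```

Keeping playbooks as data makes it easier to review them during each threat modeling cycle and to feed them into automation later.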

Tools and Technologies to Empower SOC

The tooling in the SOC (see Figure 2.4) is a mixture of centralized breadth capabilities and specialized tools that enable high-quality alerts and an end-to-end investigation and remediation experience.


Figure 2.4: SOC tooling

One key element for managing an SOC is to ensure that the technology and platforms sync well with the information systems. You want to build a tool chest of software that can perform security audits, log analysis, penetration tests, and port scans. There are many commercial systems that can provide intrusion prevention, intrusion detection, and analyses.

You should have a good service/help desk ticketing system, documentation system, and inventory system. You should also stay on top of all the security trends by connecting to websites and security threat feeds that will update you on current events and on the threat landscape.

People (Specialized Teams)

You need to organize the SOC with specialized teams, allowing them to better develop and apply deep expertise, which supports the overall goals of reducing time to acknowledge and remediate threats.

Figure 2.5 represents the key SOC functions within Microsoft: threat intelligence, incident management, and SOC analyst tiers.


Figure 2.5: SOC teams reference model

You need a highly trained team of security analysts who are familiar with security-based alerts and scenarios. As security threats are constantly changing, analysts need to adapt and think outside the box when it comes to solving problems. Attacks can come in a variety of different forms and types, so having people who can learn on the fly is important.

Finding the right people can be challenging. You may need to evaluate other options, such as outsourcing (via managed security service providers, known as MSSPs) or even hiring specialists to provide surge incident response (IR) support. I believe a hybrid of these options functions quite effectively.

In a SANS Incident Response report, 61% of respondents called upon their own surge staff to manage serious incidents and 58% had a specialized response team.

Cyberthreat Detection, Threat Modeling, and the Need for Proactive Threat Hunting Within SOC

This section covers cyberthreat detection models, threat modeling in general, and the need for proactive threat hunting within an SOC. Let's start with a look at the history of cyberthreat detection.

Cyberthreat Detection

The first cybercrime took place many decades ago, when Bob Thomas developed the first computer worm in 1971. It displayed a message on victims' computers that read “I'm the creeper, catch me if you can.” Following that, cybercrimes started to take place one after another. The first Denial of Service (DoS) attack happened in 1988, when a worm (the Morris worm) spread across the Internet and slowed it for a few days. The estimated damage was between $100,000 and $10 million.

During those early days, organizations were always looking for a way to fix unknown and unpredicted incidents. Gradually, software and security vendors came up with patches and hot fixes, and anti-hacking, antivirus, and anti-malware software and hardware were invented.

Over time, with advancing technologies, threat detection improved, using known attacks and viruses, malware signatures, and indicators of compromise (IOCs). All companies invested in signature-based detection control mechanisms. A lot of security product vendors and service providers made a great deal of money by selling the latest signatures.

Following that, the speed and accuracy of identifying the known signature and IOCs became the winning point. That is where Intrusion Detection and Prevention Systems (IDPS) came in. Soon, organizations realized that while signature-based methods are perfectly accurate and fast enough (thanks to high-speed processors and memories), they could only mitigate against known signatures and IOCs. Hackers (and even script kiddies with minor changes in their code) could manage to penetrate the network and perform their malicious activity. See Figure 2.6.
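The limitation of signature matching is easy to demonstrate. The sketch below uses a SHA-256 file hash as a stand-in for a signature, which is a simplification (real IDPS signatures are often byte patterns or rules), and the sample payloads are harmless placeholder strings.

```python
import hashlib

# Hypothetical signature database: hashes of samples that are already known to be bad.
KNOWN_BAD_SHA256 = {hashlib.sha256(b"malicious payload v1").hexdigest()}


def signature_match(sample: bytes) -> bool:
    """Return True only if the sample's hash is in the known-bad set."""
    return hashlib.sha256(sample).hexdigest() in KNOWN_BAD_SHA256


known_sample = b"malicious payload v1"
modified_sample = b"malicious payload v2"  # a one-character change by the attacker

detected_known = signature_match(known_sample)
detected_modified = signature_match(modified_sample)
```

Here `detected_known` is `True` while `detected_modified` is `False`: the lookup is fast and exact, but a trivial change to the payload is enough to evade it.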

Thereafter, threat detection from signature-based options changed to behavior and anomaly based detection. Technologies expanded and a variety of detection methods were introduced by security researchers.
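As a minimal illustration of anomaly-based detection, the sketch below flags an observation that deviates strongly from a user's historical baseline using a z-score. This is one simple statistical technique among many; the threshold of 3 and the sample login counts are assumptions, and production tools use far richer behavioral models.

```python
from statistics import mean, pstdev
from typing import Sequence


def is_anomalous(history: Sequence[float], observed: float, z_threshold: float = 3.0) -> bool:
    """Flag `observed` if it is more than `z_threshold` standard deviations from the baseline."""
    if len(history) < 2:
        return False  # not enough data to form a baseline
    baseline = mean(history)
    spread = pstdev(history)
    if spread == 0:
        return observed != baseline  # any change from a constant baseline is unusual
    return abs(observed - baseline) / spread > z_threshold


# Hypothetical daily login counts for one user.
daily_logins = [10, 12, 11, 9, 10, 11, 12, 10]
normal_day = is_anomalous(daily_logins, 11)
suspicious_day = is_anomalous(daily_logins, 40)
```

Unlike a signature, this check needs no prior knowledge of the attack, only a baseline of normal behavior, which is why it can surface activity that has never been seen before.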

Threat detection is no longer about identifying IOCs; in fact, now it's all about hunting the threats in the wild. Not to mention that regardless of the amount of investments and complex technologies and creative and strategic security defense controls, intruders can still manage to break into your network and remain undetected. Threat hunting allows you to proactively look for known and unknown threats and the modus operandi to mitigate the risk and prevent another headline in the news. Threat hunters constantly look for suspicious activity, codes, and unauthorized activity prior to data exfiltration, ransomware attacks, or Advanced Persistent Threats (APTs).

To keep hunters from being consumed by day-to-day operational activity, operations deadlines, and SLAs, this role should be completely separate from operational duties and focus mainly on data collection and hunting. It should define and refine new procedures, then automate and maintain them to increase efficiency.

Threat hunters may be part of the SecOps organization, depending on the size of the organization. They may also sit between the SecOps team, the enterprise and security architecture teams, and the business units.


Figure 2.6: SOC reference architecture

Threat-Hunting Goals and Objectives

Threat hunting is like fishing in a big ocean. You need to know which ocean you are in and how much you know about it: how deep you can go, what type of fish you plan to catch, what equipment you have, how skillful you are, and more. Just as an angler needs to know which fish to look for, you need to know what type of threats you are hunting for in your environment.

Threat hunting has been introduced to meet different goals and objectives. For instance, based on my experience, there is no organization that can claim they are running with fully 100% patched systems free of any vulnerabilities or misconfigurations. Not to mention that there are, from time to time, unknown assets in organizations that just pop up like mushrooms. On the other hand, there are many companies running on legacy infra and apps that are EOL and EOS. Businesses have accepted the risk and are running the operation by relying on existing mitigation controls.

One of the main objectives of the CISO should be to get to know the crown jewels and vulnerable systems and prioritize them based on system criticality. Then, based on threat modeling, prepare the hunting objectives and scope. Scoping is discussed in more detail in "Threat Modeling and SOC" later in the chapter.

The next objective is to reduce the false positives and noise in the SOC and prioritize the events by relying on threat hunting and threat intelligence IOCs. Security Operations Centers usually receive thousands of false and true positive events and alerts. Threat hunting helps the SOC team prioritize which events to investigate and respond to. In complex and large organizations with heterogeneous software, applications and middleware, databases, operating systems and mainframes, and hardware and network devices, there are tons of patches to be deployed. Threat hunting, with the aid of reliable IOCs and TTPs, helps the threat and vulnerability management team prioritize which security patch should be deployed first and which vulnerability should be remediated.
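The patch prioritization idea can be sketched as a scoring function that combines vulnerability severity, asset criticality, and whether threat intelligence reports active exploitation. The weighting below is an arbitrary assumption for illustration, not an established formula, and the CVE identifiers and hosts are placeholders.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set


@dataclass
class Finding:
    host: str
    cve: str
    cvss: float              # base severity score, 0.0 to 10.0
    asset_criticality: int   # 1 (low) to 5 (crown jewel), assigned by the organization


def prioritize(findings: Iterable[Finding], exploited_cves: Set[str]) -> List[Finding]:
    """Sort findings so that exploited vulnerabilities on critical assets come first."""
    def score(finding: Finding) -> float:
        # Assumed weighting: double the score when intel reports active exploitation.
        boost = 2.0 if finding.cve in exploited_cves else 1.0
        return finding.cvss * finding.asset_criticality * boost
    return sorted(findings, key=score, reverse=True)


# Placeholder data; the CVE IDs below are not real.
findings = [
    Finding("test-server", "CVE-0000-0001", cvss=9.8, asset_criticality=1),
    Finding("payments-db", "CVE-0000-0002", cvss=7.5, asset_criticality=5),
    Finding("hr-portal", "CVE-0000-0003", cvss=7.5, asset_criticality=5),
]
ranked = prioritize(findings, exploited_cves={"CVE-0000-0002"})
```

In this example the highest raw CVSS score ends up last, because it sits on a low-criticality asset and is not known to be exploited.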

Threat hunting helps the incident response and digital forensics team identify where the first compromise was initiated, what other machines are involved in this series of exploits, and when, how, and what the modus operandi was.

CISOs and security operations teams should take into account the lesson-learned strategy and define the goal of threat hunting. There is a saying that all companies have been compromised, the only difference is whether they know it or not.

Research and news reports show that in many cyberattacks, the cybercriminals had a presence in the organization's network months in advance. While some attackers act noisily and launch ransomware and DDoS attacks, either destroying the assets or being caught by security control systems, many APTs successfully evade such security controls and silently coexist in the environment without raising a flag or alarm. Threat hunters, using TTPs, known vulnerabilities, and recent attack news and intelligence, should look for possible incidents and inform the CISO so that precautionary action can be taken.

The output of such goals helps the remediation team to know, among reams of the missing patches, which patch and vulnerability should be prioritized, which system configuration should be fixed first, and identify the gap in process documents and standards. Hence, threat hunting is not just another checkbox for regulatory requirements or best practices to check. It requires a certain level of skill, organization security defense maturity, and the right equipment in place.

Senior management should also understand that the effectiveness of threat hunting is based on the available visibility, maturity level, and defined scope. Hence, it's best to define the scope based on the crown jewels and relevant threat actor present in their industry. As maturity levels improve, the visibility and coverage should be expanded. In that case, the ROI will be more justifiable.

Threat Modeling and SOC

Threat modeling is an important task when planning and building a successful SOC. Think about threat modeling as a practice for understanding your adversaries, their methods for attacking your organization, and how you will identify and respond to the attacks. Threat modeling will help you define scope, determine interesting log sources, select appropriate tools and technologies, and address many other aspects that make an SOC successful.

Make threat modeling exercises a standard practice to model and remodel threats. In the absence of proper threat modeling, you will end up wasting significant time, money, and energy in collecting irrelevant logs and investigating events that don't matter much. Threat modeling will also help you create appropriate use cases and filter out noise. Threat modeling details are covered in the following sections.

The Need for a Proactive Hunting Team Within SOC

Cybersecurity can often feel like a game of whack-a-mole. As our tools get better at stopping one type of attack, our adversaries innovate new tactics. Sophisticated cybercriminals burrow their way into network caverns, avoiding detection for weeks or even months, as they gather information and escalate privileges. If you wait until these advanced persistent threats become visible, it can be costly and time-consuming to address. It's crucial to augment reactive approaches to cybersecurity with proactive ones. Human-led threat hunting, supported by machine learning–powered tools like Microsoft Azure Sentinel, can help you root out infiltrators before they access sensitive data.

Assume Breach and Be Proactive

Traditional cybersecurity is reactive. Endpoint detection tools identify potential incidents, blocking some and handing off others to people to investigate and mitigate. This works for many of the routine, automated, and well-known attacks—of which there are many. However, our most sophisticated adversaries understand how these security solutions work and continuously evolve their tactics to get around them. The goal of the attackers is to remain undetected so they can gain access to your most sensitive information. To stop them, first you must find them.

Threat hunting is a proactive approach to cybersecurity, predicated on an “assume breach” mindset. Assume breach is a mindset that guides security investments, design decisions, and operational security practices. Assume breach limits the trust placed in applications, services, identities, and networks by treating them all—both internal and external—as not secure and probably already compromised. Just because a breach isn't visible via traditional security tools and detection mechanisms doesn't mean it hasn't occurred. Your threat-hunting team doesn't react to a known attack, but rather tries to uncover indications of attack (IOA) that have yet to be detected. Their job is to outthink the attacker.

Invest in People

Because threat hunting is concerned with emerging threats rather than known attack methods, people take the lead. It's therefore important that they have the time and authority to research and pursue hypotheses. This isn't possible if they are bogged down with security alerts. Many SOCs, including those at Microsoft, establish a three-tier model to address known and unknown threats. Tier-1 and Tier-2 analysts respond to alerts. Tier-3 analysts conduct research focused on revealing undiscovered adversaries. See Figure 2.7. You can learn more about how Microsoft organizes its SOC in “Lessons Learned From The Microsoft SOC—Part 2a: Organizing People.”

Snapshot of SOC using a three-tier approach: Tier 1 addresses high-speed remediation, Tier 2 performs deeper analysis and remediation, and Tier 3 conducts proactive hunts.

Figure 2.7: SOC using a three-tier approach: Tier 1 addresses high-speed remediation, Tier 2 performs deeper analysis and remediation, and Tier 3 conducts proactive hunts.

Develop an Informed Hypothesis

Threat hunting starts with a hypothesis. Threat hunters may generate a hypothesis based on external information, such as threat reports, blogs, and social media. For example, your team may learn about a new form of malware in an industry blog and hypothesize that an adversary has used that malware in an attack against your organization. Internal data and intelligence from past incidents also inform hypothesis development.

Once the team has a hypothesis, they examine various techniques and tactics to uncover artifacts that were left behind. A great tool for helping with hypothesis development and research is the MITRE ATT&CK (adversarial tactics, techniques, and common knowledge) framework.

These adversary tactics and techniques are grouped within a matrix and include the following categories:

  • Initial access: Techniques used by the adversary to obtain a foothold within a network, such as targeted spear-phishing, exploiting vulnerabilities, or configuration weaknesses in public-facing systems.
  • Execution: Techniques that result in an adversary running their code on a target system. For example, an attacker may run a PowerShell script to download additional attacker tools and/or scan other systems.
  • Persistence: Techniques that allow an adversary to maintain access to a target system, even following reboots and credential changes. An example of a persistence technique would be an attacker creating a scheduled task that runs their code at a specific time or on reboot.
  • Privilege escalation: Techniques leveraged by an adversary to gain higher-level privileges on a system, such as local administrator or root.
  • Defense evasion: Techniques used by attackers to avoid detection. Evasion techniques include hiding malicious code within trusted processes and folders, encrypting or obfuscating adversary code, or disabling security software.
  • Credential access: Techniques deployed on systems and networks to steal usernames and credentials for re-use.
  • Discovery: Techniques used by adversaries to obtain information about systems and networks that they are looking to exploit or use for their tactical advantage.
  • Lateral movement: Techniques that allow an attacker to move from one system to another within a network. Common techniques include “Pass-the-Hash” methods of authenticating users and the abuse of the remote desktop protocol.
  • Collection: Techniques used by an adversary to gather and consolidate the information they were targeting as part of their objectives.
  • Command and control: Techniques leveraged by an attacker to communicate with a system under their control. One example is that an attacker may communicate with a system over an uncommon or high-numbered port to evade detection by security appliances or proxies.
  • Exfiltration: Techniques used to move data from the compromised network to a system or network fully under control of the attacker.
  • Impact: Techniques used by an attacker to impact the availability of systems, networks, and data. Methods in this category would include denial of service attacks and disk- or data-wiping software.
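One way to put these categories to work is to tag every hunt hypothesis with the tactic it targets and then check which tactics have no hypotheses yet. The following is a minimal sketch under stated assumptions: the `Hypothesis` record, the `coverage_by_tactic` helper, and the sample backlog are hypothetical and are not part of the MITRE ATT&CK framework or any vendor tool; only the tactic names come from the list above.

```python
from collections import Counter
from dataclasses import dataclass

# The tactic names exactly as listed in this chapter.
TACTICS = (
    "Initial access", "Execution", "Persistence", "Privilege escalation",
    "Defense evasion", "Credential access", "Discovery", "Lateral movement",
    "Collection", "Command and control", "Exfiltration", "Impact",
)


@dataclass(frozen=True)
class Hypothesis:
    """A hunt hypothesis tagged with the tactic it targets (illustrative record)."""
    statement: str
    tactic: str
    source: str  # where the idea came from, e.g. "threat report"

    def __post_init__(self):
        # Reject tags that are not one of the listed tactics.
        if self.tactic not in TACTICS:
            raise ValueError(f"unknown tactic: {self.tactic}")


def coverage_by_tactic(hypotheses):
    """Count hypotheses per tactic, including tactics with no hypotheses yet."""
    counts = Counter(h.tactic for h in hypotheses)
    return {tactic: counts.get(tactic, 0) for tactic in TACTICS}


# Made-up backlog, loosely based on the examples in the list above.
backlog = [
    Hypothesis("A scheduled task runs attacker code on reboot", "Persistence", "threat report"),
    Hypothesis("PowerShell is downloading additional tools", "Execution", "industry blog"),
    Hypothesis("Pass-the-Hash is being used between servers", "Lateral movement", "past incident"),
    Hypothesis("Remote desktop is being abused to reach finance hosts", "Lateral movement", "past incident"),
]

coverage = coverage_by_tactic(backlog)
untested = [tactic for tactic, count in coverage.items() if count == 0]
```

The `untested` list gives the team a simple view of the tactics for which no hypothesis has been written so far.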

Cyber Resiliency and Organizational Culture

Cyber resilience is the ability to prepare for, respond to, and recover from cyberattacks. It helps organizations protect themselves from cyber risks, defend against and limit the severity of attacks, and ensure that business operations continue to function. See Figure 2.8.

Cyber resiliency should be part of a holistic approach to security that takes all aspects of the business into consideration, from employees and partners to the board of directors. Improving security is not a one-time project, but a program of continuous improvement.

The U.S. National Institute of Standards and Technology (NIST) defines cyber resiliency as “the ability to anticipate, withstand, recover from, and adapt to adverse conditions, stresses, attacks, or compromises on systems that include cyber resources.” More realistically, cyber resiliency is also about establishing a policy and process that help an organization survive and continue to execute its long-term strategy in the face of evolving security threats.

To become cyber resilient, enterprises must strike a balance between protecting critical assets, detecting compromises, and responding to incidents. Making the IT landscape cyber resilient requires investments in areas such as infrastructure, design, and development of systems, applications, and networks. At the same time, organizations must create and foster a resilience-conscious culture, of which security, security operations, and threat management are all essential parts.

Cybersecurity resiliency includes starting with the right mindset, technology, approach, focus on hygiene, and measurement of success. First and foremost, security initiatives and priorities must be aligned with the organization's business strategy to avoid wasting effort on unrelated activities and neglecting critical business assets.

Resiliency starts with the right “assume breach” mindset. Organizations must first accept the fact that attackers will successfully compromise resources in their environment. If an organization falsely assumes that they can be fully immune to all attacks, their investment choices are typically much less effective. The same cloud technologies that are inspiring business transformations can also be used to transform security strategy and operations.

Schematic illustration of cyber resilience which is the ability to prepare for, respond to, and recover from cyberattacks.

Figure 2.8: Cyber resilience is the ability to prepare for, respond to, and recover from cyberattacks.

Security organizations aim to increase their resilience by tapping into vast resources, investments, and knowledge using cloud technology (including threat intelligence). They rapidly provision new security capabilities from the cloud provider to enable rapid adaptation to attacker innovations.

Organizations made decisions about how to architect and operate their IT environments prior to cybersecurity being a significant priority. These legacy decisions represent a “technical debt” that organizations must pay down over time. By identifying these security hygiene risks, prioritizing them, and investing in remediating them, an organization can significantly lower their risk to both known attacks as well as likely future attacks.

Organizations should focus on measuring how difficult/expensive it is to attack them (especially for well-known attack patterns) as well as their ability to rapidly boot out attackers that successfully attack them.

Skillsets Required for Threat Hunting

To avoid misconceptions, bear in mind that threat hunting is not a purely automated process that you can set up once and then rely on reactively. It requires constant tuning, remediation, and removal of false positives, which in turn require security analyst experience and proactive investigation.

Manual and automated penetration testing tools require experienced penetration testers to fine-tune them and increase their efficiency. Furthermore, threat hunters are not incident responders who jump in and fix incidents.

Rotate SOC analysts into the threat-hunting team for learning and development purposes.

A threat hunter's main job is to obtain events, alerts, packets, and other relevant feeds to understand what has happened or could be happening. The volume of data varies with the size of the company. In small organizations, or at the early stage of setting up the foundation of threat hunting, spreadsheets could be enough; as the maturity level increases, the size of the dataset increases drastically as well.

That is where data analytics tools and skills become more important. It is important not to define the threat-hunting scope so broadly that it covers the entire organization at this early stage. Instead, expand the coverage as the maturity level improves. Always start small and grow over time.

The fact that threat hunting involves data analysis does not mean that data scientists can automatically become threat hunters. Their knowledge of data analysis is useful, but it must be paired with security expertise: either train your security analysts in data analysis or train your data analysts in security analysis.

Good threat hunters have an eye for detail and a sharp analytics mindset. They are proactive and out-of-the-box thinkers and have patience to look at the bigger picture.

As part of the maturity assessment and planning process when developing a threat-hunting program, the security analyst team requires a wide range of skillsets and knowledge.

Security Analysis

Threat hunters should be able to understand and work with network packets, parse IOC feeds, work with log correlation tools and SIEM, use security appliances such as firewalls and IDS, reverse engineer malware, and know about exploits. They must constantly update their knowledge of threat actors, attack tools, and tactics. As in any other security analyst role, they must know how different operating systems work, what their critical files and processes are, and which basic network protocols and services they run. For instance, a threat hunter should be able to profile every department's normal behavior, such as working hours, usual activities, and required tools, especially administrative tools such as PowerShell. By knowing what is considered “normal,” anomalies can be identified with a minimum of false positives.
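The idea of profiling each department's normal behavior can be illustrated with a small baseline. This is only a sketch, assuming that historical activity is available as (department, tool, hour) tuples; the `Profile` class, the helper functions, and the sample records are hypothetical, and a real baseline would draw on far richer telemetry.

```python
from dataclasses import dataclass, field


@dataclass
class Profile:
    """Observed 'normal' for one department: tools used and working-hour range."""
    tools: set = field(default_factory=set)
    first_hour: int = 23
    last_hour: int = 0


def build_profiles(history):
    """Build one profile per department from (department, tool, hour) records."""
    profiles = {}
    for dept, tool, hour in history:
        profile = profiles.setdefault(dept, Profile())
        profile.tools.add(tool)
        profile.first_hour = min(profile.first_hour, hour)
        profile.last_hour = max(profile.last_hour, hour)
    return profiles


def explain_anomaly(profiles, dept, tool, hour):
    """Return the reasons an event deviates from the baseline (empty list = normal)."""
    profile = profiles.get(dept)
    if profile is None:
        return ["no baseline for department"]
    reasons = []
    if tool not in profile.tools:
        reasons.append(f"{tool} is not a usual tool for {dept}")
    if not profile.first_hour <= hour <= profile.last_hour:
        reasons.append(f"hour {hour} is outside observed working hours")
    return reasons


# Made-up history: IT uses PowerShell during the day, Finance does not use it at all.
history = [
    ("IT", "powershell.exe", 9),
    ("IT", "powershell.exe", 17),
    ("Finance", "excel.exe", 8),
    ("Finance", "excel.exe", 18),
]
profiles = build_profiles(history)

normal = explain_anomaly(profiles, "IT", "powershell.exe", 10)
suspicious = explain_anomaly(profiles, "Finance", "powershell.exe", 3)
```

Here `normal` is empty, while `suspicious` carries two reasons: PowerShell is not a usual Finance tool, and 3 a.m. falls outside the hours observed for that department.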

Data Analysis

Threat hunters should know how to combine different structured and unstructured data. They need data visualization, machine learning tools, search platforms such as Elasticsearch, and the relevant skills. Due to the high volume of data that hunters deal with, experience and knowledge in machine learning (ML) helps them train ML tools on normal behavior, cluster activities into known, bad, and questionable, and finally group them for further profiling and investigation.

Programming Languages

As with most IT and security roles, basic skills in programming and scripting are always in demand. However, if you expect your team to be able to customize and automate procedures, reverse engineer, and perform data analysis, knowledge of programming and scripting languages such as Python, Perl, and C/C++ is mandatory.

Analytical Mindset

Hunters should have an analytical mindset to be able to develop different hypotheses and build various use cases. They need to analyze the output and tune their processes and procedures.

Soft Skills

Threat hunters need to talk with different technical and nontechnical coworkers from time to time. Soft skills help them build efficient relationships and obtain the required information and support. Threat hunters also present reports to management about the current status. Knowing how to write technical and nontechnical reports tailored for different audiences is highly desirable. The depth of knowledge and practical hands-on skills required varies with the seniority of the role.

Outsourcing

Many organizations, due to a lack of security talent pools, prefer to outsource managed security services. While managed services provide advanced technologies and rich intel feeds, human factors play a critical role in threat hunting. CISOs need to ensure that threat hunting is not a one-day contract kind of job. Consultants can't simply walk in, set up the platforms, run the exercise, wash their hands, and walk away. It requires the constant presence of hunters who know the business and IT environment and work on cases and artifacts 24/7.

To summarize, based on my experience, threat hunters require the following core skills:

  • A mindset of curiosity
  • Log analysis and general analytical skills
  • Understanding of normal network behavior
  • Understanding of normal endpoint user and application behavior
  • Understanding the threat landscape and the use of CTI
  • System administrator experience across Windows/Linux/common security products

Threat-Hunting Process and Procedures

There are different methods and processes for finding malicious users in your organization. Most threat-hunting teams leverage three common methods in their day-to-day jobs, all of which require skilled human-based analysis with the aid of relevant tools and services. See Figure 2.9.

Schematic illustration of threat-hunting data collection steps

Figure 2.9: Threat-hunting data collection steps

  • Hypothesis-based method: This is one of the most common and most preferred approaches. Threat hunters should always assume that hackers are already in their organization and should try to identify them by coming up with different hypotheses and testing them.
  • IOC- and TTP-based method: Another approach relies on threat intelligence feeds, such as indicators of compromise (IOCs) and tactics, techniques, and procedures (TTPs), to search existing data. This approach depends greatly on the quality and accuracy of your threat intel feed.
  • Data-driven method: The third approach identifies suspicious leads using data science techniques, including but not limited to machine learning and data visualization, so that experienced threat hunters can investigate them further.

Regardless of the method you choose, these steps help formalize the threat-hunting process and lead to repeatable and reliable output. That stability helps you reach a higher maturity level.
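The IOC-based method described above amounts to searching existing data for known indicators. The following is a minimal sketch, assuming the feed arrives as a list of dictionaries with `type` and `value` keys and that log events are dictionaries too; the field names, the helper functions, and the sample values (documentation-range IP addresses and example domains) are all hypothetical.

```python
def load_iocs(feed):
    """Index a feed of {'type': ..., 'value': ...} entries by indicator type."""
    index = {}
    for ioc in feed:
        # Lowercase values so matching is case-insensitive.
        index.setdefault(ioc["type"], set()).add(ioc["value"].lower())
    return index


def match_events(events, ioc_index, field_to_type):
    """Return one hit per event field whose value appears in the IOC index."""
    hits = []
    for event in events:
        for field_name, ioc_type in field_to_type.items():
            value = str(event.get(field_name, "")).lower()
            if value and value in ioc_index.get(ioc_type, set()):
                hits.append({"host": event["host"], "field": field_name, "indicator": value})
    return hits


# Made-up intel feed and log events for illustration only.
feed = [
    {"type": "ip", "value": "203.0.113.7"},
    {"type": "domain", "value": "Bad.Example.com"},
]
events = [
    {"host": "web01", "dest_ip": "203.0.113.7", "dest_domain": "cdn.example.net"},
    {"host": "hr-laptop", "dest_ip": "198.51.100.20", "dest_domain": "bad.example.com"},
    {"host": "db01", "dest_ip": "198.51.100.21"},
]

ioc_index = load_iocs(feed)
hits = match_events(events, ioc_index, {"dest_ip": "ip", "dest_domain": "domain"})
```

Exact matching like this is only as good as the feed behind it, which is why the quality and accuracy of the threat intel matter so much for this method.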

Metrics for Assessing the Effectiveness of Threat Hunting

No information security program can be effective and successful without proper metrics and tracking. Metrics help management strategize their planning, prioritize their investments, and keep things accountable.

Metrics should be defined based on key risk indicators (KRIs), key performance indicators (KPIs), and service level agreements (SLAs) mapped to organizational policy and standards. For instance, an organization may require that critical vulnerabilities on critical systems be remediated within a certain time. The KRI should be defined according to the SLA stated in the company's information security policy, with thresholds approved by senior management. Metrics can be presented quantitatively or qualitatively. Finally, defined KPIs are required to ensure that performance meets the defined requirements.

If you cannot measure it, you cannot manage it, and consequently you cannot secure it. Defining strong and comprehensive metrics helps management ensure that the return on security investment (ROSI) is justifiable and that organizational goals and objectives have been achieved. In a successful program, a number of incidents are detected proactively, hunts reveal where existing controls have limitations, coverage increases over time, the number of false positives the SOC receives decreases, the number of automated procedures increases, and findings are reviewed and addressed in a timely manner.

Foundational Metrics

  • Scope: One of the key metrics to define is the total number of organizational assets and the number of those assets included in the threat-hunting scope. You can break this metric down into critical systems, very important systems, and others, so that management always has a proper overview of its crown jewels. To demonstrate how the scope improves and expands over time, always keep trend line charts.
  • Visibility: Visibility metrics are important. A drop in the number of visible assets below what has been defined could be a red flag. Threat hunters need to keep that number in mind at all times and take the required action if it changes.
  • Functionality metrics: Installed security tools and sensors should function correctly and maintain a healthy state on all assets. For instance, say the antivirus or EDR agent is not working properly, meaning it either failed to report back to the console, didn't get the latest update signature, or couldn't send back the logs to the server or execute the administrative commands. This metric is one of the important ones, because security tools not functioning could be a sign of compromise as well. Many hackers not only bypass security tools, but they also sometimes disable them to avoid making any noise at the SOC level.

A low compliance rate in any of these metrics leads to inaccurate threat-hunting reports.
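The three foundational metrics can be computed from an asset inventory. This is a minimal sketch, assuming each asset record carries flags for criticality, scope, recent telemetry, and agent health; the `Asset` fields, the metric names, and the sample inventory are hypothetical and would need to be mapped to whatever your asset management and EDR consoles actually export.

```python
from dataclasses import dataclass


@dataclass
class Asset:
    name: str
    critical: bool        # part of the "crown jewels"
    in_scope: bool        # included in the threat-hunting scope
    reporting: bool       # telemetry received recently (visibility)
    agent_healthy: bool   # security agent updated and responding (functionality)


def pct(part, whole):
    """Percentage rounded to one decimal place; 0.0 when there is nothing to measure."""
    return round(100.0 * part / whole, 1) if whole else 0.0


def foundational_metrics(assets):
    in_scope = [a for a in assets if a.in_scope]
    critical = [a for a in assets if a.critical]
    return {
        "scope_pct": pct(len(in_scope), len(assets)),
        "critical_scope_pct": pct(sum(a.in_scope for a in critical), len(critical)),
        "visibility_pct": pct(sum(a.reporting for a in in_scope), len(in_scope)),
        "functionality_pct": pct(sum(a.agent_healthy for a in in_scope), len(in_scope)),
        # Agents that are not functioning may themselves be a sign of compromise.
        "unhealthy_agents": [a.name for a in in_scope if not a.agent_healthy],
    }


# Made-up inventory for illustration only.
inventory = [
    Asset("db01", critical=True, in_scope=True, reporting=True, agent_healthy=True),
    Asset("web01", critical=True, in_scope=True, reporting=True, agent_healthy=False),
    Asset("hr-laptop", critical=False, in_scope=True, reporting=False, agent_healthy=False),
    Asset("dev01", critical=False, in_scope=False, reporting=False, agent_healthy=False),
    Asset("pay01", critical=True, in_scope=False, reporting=False, agent_healthy=False),
]

metrics = foundational_metrics(inventory)
```

Recording these values at a regular interval would provide the data points for the trend charts mentioned under Scope.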

Operational Metrics

  • Number of hunted items vs. number of incidents
  • Number of open hunting investigations vs. number closed, based on defined SLAs
  • Number of hunted items based on environment and business criticality
  • Detection time
  • Total number of hypotheses vs. number of verified hypotheses
  • Number of hunts based on the threat intelligence feeds
  • Number of automated procedures vs. manual procedures
  • Duration of each hunting process end to end (categorized based on automated and manual)
  • Total relevant threat actors specific to the industry or directly targeting the organization vs. number of defined procedures and use cases
  • Total number of reported hunts vs. number of open and closed issues based on hunting (remediation)
  • Duration of remediation from the time the hunt has been reported
  • Technique used for hunting (% of technique's effectiveness)
  • Data source used for each hunt
  • Type of finding and root cause analysis (e.g., broken process, system malfunction, human error, misconfiguration, data breach, and other cyber incident categories)
  • Type of vulnerabilities
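Several of the operational metrics above can be derived from a simple log of hunts. The sketch below assumes each hunt is recorded with its open and close dates, whether its hypothesis was verified, and whether it was automated; the `Hunt` record, the metric names, the 10-day SLA, and the sample data are hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from statistics import mean
from typing import Optional


@dataclass
class Hunt:
    hypothesis_verified: bool
    automated: bool
    opened: date
    closed: Optional[date]  # None while the investigation is still open


def operational_metrics(hunts, sla_days, today):
    closed = [h for h in hunts if h.closed is not None]
    open_hunts = [h for h in hunts if h.closed is None]
    durations = [(h.closed - h.opened).days for h in closed]
    return {
        "total_hunts": len(hunts),
        "verified_hypotheses": sum(h.hypothesis_verified for h in hunts),
        "automated": sum(h.automated for h in hunts),
        "manual": sum(not h.automated for h in hunts),
        "mean_days_to_close": mean(durations) if durations else None,
        "closed_within_sla": sum(d <= sla_days for d in durations),
        "open_past_sla": sum((today - h.opened).days > sla_days for h in open_hunts),
    }


# Made-up hunt log for illustration only.
hunts = [
    Hunt(True, True, date(2024, 1, 1), date(2024, 1, 5)),     # closed in 4 days
    Hunt(False, False, date(2024, 1, 3), date(2024, 1, 15)),  # closed in 12 days
    Hunt(True, False, date(2024, 1, 10), None),               # open for 21 days
    Hunt(False, True, date(2024, 1, 28), None),               # open for 3 days
]

report = operational_metrics(hunts, sla_days=10, today=date(2024, 1, 31))
```

With this sample data, two of four hypotheses are verified, one closed hunt met the SLA, and one open investigation has exceeded it.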

These metrics can be used for different purposes. In addition to providing oversight and monitoring of the threat-hunting program, they show how effectively the program is serving the organization. They can also give the CISO strategic oversight. For instance, if most of the reports show a number of security agents malfunctioning, that is an alarming message for the infrastructure and SecOps teams. The following section explains how threat hunting can support the effectiveness of other compliance, functional, and operational reports.

Likewise, the total number of reported hunts compared with how promptly they are remediated shows how proactively other teams are acting. Reporting hunts does not stop cybercriminals; remediation does.

The total number of use cases and procedures demonstrates how the threat-hunting team operates—passively or proactively. Similar to SOCs, many of the defined use cases are outdated and obsolete with invalid signatures and nonexistent IPs, focusing on low-risk items.

To meet the investigation SLA, the threat-hunting lead, SOC, and CISO should determine whether automation is in place and whether resources and staffing are adequate to meet the objective.

The patch compliance report shows the overall patch deployment state. Comparing it with the compromises identified through hunting can demonstrate the accuracy and effectiveness of the patch management program. The same strategy can be applied to security configuration or vulnerability management reports.

If the findings show a high number of human errors, such as cyber incidents initiated through social engineering or phishing attacks, the information security awareness program may not be proactive enough and should be revised (see Table 2.2).

Table 2.2: Example of Threat-Hunting Metrics

METRIC DESCRIPTION | METRIC TYPE
Number of incidents identified proactively (vs. reactively) | Trend, Comparison
Number of vulnerabilities identified proactively (vs. vulnerability assessments) | Trend, Comparison
Dwell time of proactively discovered incidents (vs. reactively) | Trend, Comparison
Containment time of proactively discovered incidents (vs. reactively) | Trend, Comparison
Effort per remediation of proactively discovered incidents (vs. reactively) | Trend, Comparison
Data coverage (data types and coverage of estate) | Percentage
Hypotheses per MITRE ATT&CK tactic | Pie Chart
Hunts per MITRE ATT&CK tactic | Pie Chart
Incidents per MITRE ATT&CK tactic | Pie Chart
Percentage of successful hunts that result in a new detection analytic or rule | Service Level
Sensitivity and specificity of analytics or rules derived from hunts (true & false positive rates) | Service Level
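The sensitivity and specificity in the last row of Table 2.2 follow the standard definitions: sensitivity is the share of truly malicious events that a rule flagged, and specificity is the share of benign events that it left alone. The sketch below assumes that analysts have labeled a sample of events as malicious or benign after investigation; the function names and sample data are hypothetical.

```python
def sensitivity(true_pos, false_neg):
    """True positive rate: share of malicious events the rule flagged."""
    total = true_pos + false_neg
    return true_pos / total if total else 0.0


def specificity(true_neg, false_pos):
    """True negative rate: share of benign events the rule did not flag."""
    total = true_neg + false_pos
    return true_neg / total if total else 0.0


def rule_quality(labels, alerts):
    """labels[i] is True if event i was malicious; alerts[i] is True if the rule fired."""
    tp = sum(l and a for l, a in zip(labels, alerts))
    fn = sum(l and not a for l, a in zip(labels, alerts))
    fp = sum(a and not l for l, a in zip(labels, alerts))
    tn = sum(not l and not a for l, a in zip(labels, alerts))
    return {
        "sensitivity": sensitivity(tp, fn),
        "specificity": specificity(tn, fp),
        "false_positive_rate": 1.0 - specificity(tn, fp),
    }


# Made-up labeled sample: 3 malicious events, 5 benign events.
labels = [True, True, True, False, False, False, False, False]
alerts = [True, True, False, True, False, False, False, False]

quality = rule_quality(labels, alerts)
```

In this sample the rule catches two of three malicious events and raises one false alarm across five benign events, giving a sensitivity of about 0.67 and a specificity of 0.8.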

Ultimately, the value of any metric is how useful it is to the recipient, often a senior manager such as a CISO, so all metrics should be developed in collaboration between the threat-hunting team and its relevant senior managers.

Adopt organizationally relevant metrics, such as those in Table 2.2, to drive improvements and show the return on security investment (ROSI) over time.

Threat-Hunting Program Effectiveness

As discussed, many elements help a company's threat-hunting efforts meet their objectives and goals and become successful. Prior to establishing any threat-hunting program, the CISO and the SecOps team need to take these items into account, especially the maturity and readiness of the organization.

In a nutshell, successful threat hunting requires the elements outlined in Figure 2.10.

Schematic illustration of threat hunting components

Figure 2.10: Threat hunting components

Educating stakeholders and staff is particularly important, because lack of proper training at each level could lead to a cyber incident. The training could be awareness programs for staff and senior management and cyber drills to refresh the readiness of the incident responders. It should include educating privileged users with admin accounts, high-risk staff who have access to sensitive information, and IT people who are responsible for setting up the IT environment for business.

Summary

  • Threat hunting over a multi-cloud architecture is a complex activity, so it's important to understand the scope of the cloud services involved.
  • Sophisticated and modern attacks use stealthy or novel techniques designed to bypass automated monitoring and detection. Continuous threat hunting is the best way to detect and prevent sophisticated or persistent attacks.
  • While technology is clearly critical in the fight to detect and stop intrusions, the end user remains a crucial link in the chain to stop breaches.
  • Well-trained staff can be an asset in combating the continued threat of phishing and related social engineering techniques.
  • Continue to train your threat-hunting team and rotate your analyst role within your security operation.