According to Flexera's State of the Cloud report from 2020 (https://info.flexera.com/SLO-CM-REPORT-State-of-the-Cloud-2020), shown in Figure 2.1, demand for a multi-cloud strategy has been high compared to single public and single private strategies, and the trend has continued upward since 2018.
Multi-cloud environments usually refer to the distribution of cloud assets, software, application, infrastructure, and resources across several cloud-hosting environments/providers. Typically a multi-cloud architecture utilizes two or more public clouds as well as multiple cloud service providers (CSPs) with the aim of eliminating the dependency and achieving a higher level of resiliency within the environment.
Each of the CSPs is responsible for a particular activity; for instance, one serves as an IaaS, another one as a PaaS, and another one is a SaaS service. Considering that each of these CSPs is a third-party vendor, they carry a number of risks with different pros and cons. Figure 2.2 demonstrates a simplified vision of the multi-cloud environment.
Multi-cloud computing enables organizations to reduce downtime and/or prevent data loss. It is always advisable to avoid a single point of failure, and this approach helps if one CSP fails, because the organization can still operate resiliently. There are many CSPs around the globe, and each has its own technological, geographical, and regulatory requirements. Using a multi-cloud environment also allows one to leverage competitive pricing.
There are several advantages of deploying a multi-cloud architecture, including reduced reliance on any single vendor, cost efficiencies, increased flexibility through choice, adherence to local policies, and compliance with regulations that require organizations to host certain types of data in a specific country or geographical location.
Due to advanced and integrated technologies and the need to comply with a variety of customers and regulators, it could be claimed that many of the CSPs' security postures are more reliable and mature compared to many legacy organizations. However, it is important to perform due diligence and due care to find the gaps and work together to resolve them.
One of the challenges with a multi-cloud environment is obtaining a single source of truth of current security controls and asset inventories, different workflow management and dashboards, and provisioning or deprovisioning the assets where and when needed.
Another challenge that organizations face during cloud migration to a multi-cloud environment is that data will be shared among different parties, which may concern the data protection and privacy issue.
Roles and responsibilities should be clearly defined between each party of CSPs and the organizations with respect to each security implementation, maintaining the compliance state, and for governing the security posture regularly.
To meet these requirements, CSPs need to provide a high level of transparency, so that an organization's stakeholders (such as the CISO and CIO) can confidently engage the CSPs. Lack of sufficient transparency can be one of the challenges an organization will face during this migration. In a multi-cloud environment, the key challenge for security teams is to have an integrated vision of the full IT environment. CISOs need to ensure that threat-detection systems can operate on all platforms and preferably send information to one integrated dashboard. There are other multi-cloud challenges that are beyond the scope of this chapter.
After migrating to the cloud, an organization should deploy required security controls and precautionary action, including but not limited to the following:
Another area that threat hunters need to invest more time in is identifying exposed instances or APIs that integrate multiple clouds. This section requires extensive threat modeling to know what APIs are in place, what the mitigation controls are, and where they can fail, as well as which area requires additional monitoring by threat hunting.
While the stored data in CSPs may contain other customer data as well, the threat-hunting solution should be able to filter the unnecessary info and only obtain the metadata for data processing purposes.
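The filtering idea above can be sketched as a simple allowlist projection. This is a minimal illustration, not a CSP feature: the field names, the `ALLOWED_FIELDS` set, and the `to_metadata` helper are all assumptions made for the example.

```python
from typing import Any, Dict, Optional

# Hypothetical allowlist of metadata fields the hunting pipeline may ingest.
ALLOWED_FIELDS = {"timestamp", "tenant_id", "source_ip", "event_type", "resource_id"}


def to_metadata(event: Dict[str, Any], tenant_id: str) -> Optional[Dict[str, Any]]:
    """Return only allowlisted metadata for events owned by tenant_id.

    Events that belong to another customer are dropped entirely (None).
    """
    if event.get("tenant_id") != tenant_id:
        return None
    return {k: v for k, v in event.items() if k in ALLOWED_FIELDS}


# Illustrative records; the payload field stands in for customer data.
raw_events = [
    {"timestamp": "2021-03-01T10:00:00", "tenant_id": "acme", "source_ip": "203.0.113.7",
     "event_type": "object_read", "resource_id": "bucket-1", "payload": "customer record"},
    {"timestamp": "2021-03-01T10:00:05", "tenant_id": "other-co", "source_ip": "198.51.100.9",
     "event_type": "object_read", "resource_id": "bucket-9", "payload": "customer record"},
]

hunt_events = []
for raw in raw_events:
    meta = to_metadata(raw, "acme")
    if meta is not None:
        hunt_events.append(meta)

print(hunt_events)
```

Only the first record survives, and its `payload` field is stripped before it reaches the hunting dataset.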
Threat hunting helps reduce the operation overload as well as boosts identifying threats in an on-premises environment and prioritizes remediation planning.
Cloud computing is more common than ever, with organizations not only using different cloud services but also using multiple CSPs, which is known as a multi-cloud environment. In other words, you have one organization or community using several CSPs in one place. These CSPs could be private, public, or even hybrid clouds. By contrast, when one cloud provider serves a variety of organizations, each organization or customer is known as a tenant; this is called a multi-tenant environment.
In this section, we discuss the common risks in multi-cloud and multi-tenant environments and how threat hunting helps to address these risks.
Multi-tenant is defined as multiple customers or organizations accessing the same CSP, whereby different systems (IaaS, PaaS, SaaS) from different companies are hosted on one pool of physical servers. Multi-tenant architectures apply to both public cloud and private cloud environments, allowing each tenant's data to be separated from each other. There are several multi-tenancy model types, all with varying levels of complexity and costs. Multi-tenancy is one of the main criteria in defining cloud computing, where different customers access the secured and isolated, abstracted system without interfering with each other or even noticing other users. One benefit of multi-tenanting is that many users can get access to one server and use the resources at maximum level with lower costs, all at the same time.
In a multi-tenant environment, customers enjoy affordable costs and integration via APIs with other applications. Most of the asset maintenance and updating/upgrading falls under the CSP's roles and responsibilities. It is important that the shared responsibility model be discussed at the contract level. On the other hand, multi-tenancy offers limited customization and security changes, and it brings upgrade dependencies and the vendor lock-in issue.
In a nutshell, if customers need a particular customization or security features to be applied, it depends on CSP approval and their roadmap. Other CSPs might provide the same customization, maybe with a lower cost on demand without waiting for the CSP roadmap.
In addition to these operational challenges, there are a number of cyber risks and compliance issues that led to many organizations being hesitant to adopt this new approach and migrate to the cloud. Many of the cyber risks in the multi-tenant (single CSP) environment also exist in a multi-cloud environment, which is discussed in the next section.
Such challenges drive many customers and organizations to look into the multi-cloud environments that address such limitations.
An organization should know the risks of migrating to the cloud in advance, be it multi-cloud or multi-tenant cloud. These include data security and privacy, the shared responsibility model, increased dependencies on vendors, poor configuration and access management (especially for admin and privileged users), unpatched VMs and platforms/infra, and many more. Of course, many security controls have been put in place, such as separation of duties, network security tools, integration and a single management console and dashboard, IDS, IPS, encryption, and other next-generation fantasy stuff.
However, if there is no proper monitoring considered, regardless of how secure the CSPs and on-premise platforms are, it will be difficult to reduce the cyber risk and identify them before an incident takes place or identify the APT attacks that could potentially be in the environment.
As discussed earlier in this chapter, threat hunting and threat modeling should be applied in the cloud environment as well.
As organizations migrate from a physical infrastructure/on-premise environment to a cloud environment, threat identification will be more challenging due to difficulties in compliance and configuration transparency, remote data sources and infrastructures, core security capabilities, and the number of APIs. In a nutshell, as the attack surface is expanding, threat hunting requires more attention, so the challenges related to threat hunting in a multi-cloud and/or multi-tenant environment must be discussed.
Now consider the challenges of a multi-tenant environment whereby different customer data types are hosted; the threat hunter could possibly breach the data privacy if certain data protection controls as well as role-based access controls (RBAC) are not in place.
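A tenant-scoped RBAC check of the kind mentioned above could look like the following sketch. The role names, permission names, and the `Principal` type are hypothetical; real CSPs define their own roles and policy languages.

```python
from dataclasses import dataclass

# Hypothetical role-to-permission mapping, for illustration only.
ROLE_PERMISSIONS = {
    "threat_hunter": {"read_metadata", "read_alerts"},
    "tenant_admin": {"read_metadata", "read_alerts", "read_payload"},
}


@dataclass(frozen=True)
class Principal:
    name: str
    role: str
    tenants: frozenset  # tenants this principal is assigned to


def is_allowed(principal: Principal, action: str, tenant: str) -> bool:
    """Allow an action only inside assigned tenants and only if the role grants it."""
    if tenant not in principal.tenants:
        return False
    return action in ROLE_PERMISSIONS.get(principal.role, set())


hunter = Principal("alice", "threat_hunter", frozenset({"acme"}))

print(is_allowed(hunter, "read_metadata", "acme"))      # hunter may read metadata
print(is_allowed(hunter, "read_payload", "acme"))       # but not customer payloads
print(is_allowed(hunter, "read_metadata", "other-co"))  # and nothing in other tenants
```

The tenant check comes first so that a broad role can never reach across tenant boundaries.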
On the other hand, threat-hunting complexity increases on a multi-cloud environment when you have heterogeneous infrastructures, services, vendors with different security postures, unbalanced security postures, different security controls in place where proper integration might not be in place, and more. Poor visibility on data at rest, in transit, and in process prevents the threat hunters from performing the threat hunting at all levels, and this is limited based on the access provided by the CSP and customers.
While customers can define application logs to be pumped into the SIEM or log collectors for threat hunting and incident management correlation, a number of the logs such as hypervisor logs, packet details, or system logs might not be shared with the customers.
Another challenge that threat hunters usually face in a modern cloud environment is that many organizations are moving to containerization and serverless approaches, whereby instances are constantly started and stopped as needed. This is typically the case where the organization is heavily invested in DevOps tools.
One of the main benefits of the cloud is you pay as you go. While the customer only pays for the amount of resources used, such as bandwidth, data, and processing resources, threat hunters should choose the required data carefully, as it will impact cost.
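To make the cost trade-off concrete, the arithmetic can be sketched as follows. The log volumes and the per-gigabyte price are invented for illustration and do not reflect any vendor's pricing.

```python
def monthly_ingest_cost(gb_per_day: float, price_per_gb: float, days: int = 30) -> float:
    """Estimate the monthly cost of ingesting one log source."""
    return gb_per_day * price_per_gb * days


# Hypothetical daily volumes (GB) per candidate data source.
sources = {"application_logs": 5.0, "dns_logs": 20.0, "netflow": 60.0}
PRICE_PER_GB = 2.50  # hypothetical ingestion price

costs = {name: monthly_ingest_cost(gb, PRICE_PER_GB) for name, gb in sources.items()}
total = sum(costs.values())

for name, cost in sorted(costs.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: {cost:.2f}")
print(f"total: {total:.2f}")
```

Listing sources by cost makes it easier to ask whether a high-volume feed is really needed for the current hunting scope.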
There are different types of threat-hunting commercial products and services offered by different vendors and service providers that could meet these challenges. Security-as-a-Service is one such approach that allows customers to subscribe and deploy into their multi-tenant cloud environment to boost their incident response and threat-hunting capabilities.
During threat-hunting maturity readiness, it is advisable to build a successful threat-hunting program by starting small. As the maturity level increases, you then expand the scope of threat hunting to avoid overwhelming your hunters with thousands of events. In a multi-cloud or multi-tenant environment, the number of alerts received by log collectors and hunters increases tremendously over time. With a reliable threat source, you have an opportunity to use automation in threat hunting to build and maintain various use cases and hypotheses in a timely manner.
First, let's understand the two main concepts in cybersecurity—SOC and SIEM—and discuss the main differences between the two.
A Security Operations Center (SOC) is an organization function or a centralized unit with various security roles, such as Security Analyst, Threat Hunter, Red Team/Blue Team members, etc. It deals with defending and protecting the organization against various security-related issues using a variety of tools. One of the main tools used by the SOC team is Security Information and Event Management (SIEM).
So, the SIEM tool collects, normalizes, and analyzes application logs and events against the set of correlation rules. When these rules are triggered, they create a series of events that human analysts analyze and respond to.
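The collect, normalize, and correlate flow can be sketched in a few lines. This is a toy illustration under stated assumptions, not how any particular SIEM product works: the field names, the `normalize` helper, and the threshold of five failures in five minutes are all hypothetical.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def normalize(raw: dict) -> dict:
    """Map source-specific field names onto one common schema."""
    return {
        "time": datetime.fromisoformat(raw.get("ts") or raw["timestamp"]),
        "user": raw.get("user") or raw.get("username"),
        "action": (raw.get("action") or raw.get("event")).lower(),
    }


def brute_force_alerts(events, threshold=5, window=timedelta(minutes=5)):
    """Correlation rule: users with at least `threshold` failed logins inside `window`."""
    failures = defaultdict(list)
    for e in sorted(events, key=lambda e: e["time"]):
        if e["action"] == "login_failed":
            failures[e["user"]].append(e["time"])
    alerts = []
    for user, times in failures.items():
        # Slide over the sorted timestamps looking for a dense enough run.
        for i in range(len(times) - threshold + 1):
            if times[i + threshold - 1] - times[i] <= window:
                alerts.append(user)
                break
    return alerts


# Two log sources with different field names (illustrative data).
raw_logs = [
    {"ts": f"2021-03-01T10:00:0{i}", "user": "bob", "action": "LOGIN_FAILED"}
    for i in range(5)
] + [
    {"timestamp": "2021-03-01T10:01:00", "username": "carol", "event": "login_failed"},
]

events = [normalize(r) for r in raw_logs]
alerts = brute_force_alerts(events)
print(alerts)
```

Here `bob` triggers the rule and `carol` does not; a human analyst would then pick up the resulting alert.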
Usually, you will not find an SOC without an SIEM but may find an organization with an IT team with a mature security understanding using an SIEM tool without a dedicated SOC unit. You will also often find that many organizations typically outsource their SOC capability to a third-party vendor that specializes in providing such cybersecurity services.
An SOC is composed of many entities used for cybersecurity incident and event monitoring within an organization's technology environment, such as people, processes, procedures, and security tools and software like firewalls, CASBs, proxies, and WAFs.
To build an SOC, you must understand that there are many moving parts and you must think of them as sections and treat each as a threat modeling exercise. At the end of the day, the main objective of having an SOC is to manage the threat.
Setting up a modern SOC is not an easy task and choosing the right people, processes, and technologies can be a real challenge. Table 2.1 compares the SIEM, SOC, and threat-hunting processes.
Table 2.1: Comparing SIEM, SOC, and Threat Hunting
| | SIEM | SOC | THREAT HUNTING |
|---|---|---|---|
Use case | Use SIEM for basic security control and for automating log analyses. | SOC can extend and maximize security control above and beyond an SIEM. One key differentiator is the processes that run within an SOC. | Threat hunting is used to further identify highly sophisticated attacks, patterns, and indicators of compromise. |
Example | An SIEM can detect multiple failed login attempts due to brute force attacks. | If an SOC team detects a failed login attempt, they take further action by calling the employee, locking the account, and further investigating the reason for the lockout. | Threat hunting is an extended capability within SOC and uses the same infrastructure. Threat hunting uses behavior analysis to identify the pattern, such as: Was any admin account created after multiple login failures? Is the geolocation of the user the same as the login attempt logs? Is there a mismatch in the user's normal time of login? |
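
Two of the hunting questions in the last column of Table 2.1 can be expressed as small queries over normalized events. The following is a sketch only; the event schema, the action names, and the per-user working hours are assumptions made for the example.

```python
from datetime import datetime, timedelta


def admin_created_after_failures(events, min_failures=3, window=timedelta(hours=1)):
    """Find admin-account creations preceded by repeated login failures by the same user."""
    hits = []
    for e in events:
        if e["action"] != "admin_account_created":
            continue
        failures = [
            f for f in events
            if f["action"] == "login_failed"
            and f["user"] == e["user"]
            and timedelta(0) <= e["time"] - f["time"] <= window
        ]
        if len(failures) >= min_failures:
            hits.append(e)
    return hits


def outside_normal_hours(event, normal_hours):
    """True when the event hour falls outside the user's usual login hours."""
    start, end = normal_hours.get(event["user"], (0, 24))
    return not (start <= event["time"].hour < end)


t0 = datetime(2021, 3, 1, 2, 0)  # 02:00, illustrative data
hunt_log = [
    {"time": t0 + timedelta(minutes=i), "user": "svc-admin", "action": "login_failed"}
    for i in range(3)
] + [
    {"time": t0 + timedelta(minutes=4), "user": "svc-admin", "action": "login_success"},
    {"time": t0 + timedelta(minutes=10), "user": "svc-admin", "action": "admin_account_created"},
]

normal_hours = {"svc-admin": (8, 18)}  # hypothetical baseline: 08:00 to 18:00

hits = admin_created_after_failures(hunt_log)
for hit in hits:
    print(hit["user"], hit["time"], "off-hours:", outside_normal_hours(hit, normal_hours))
```

The geolocation question would follow the same pattern, given a trustworthy source of location data for each login.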
Let's divide the decision-making process into logical steps, i.e., elements of a modern SOC.
The elemental pillars include the people, process, and technology aspects required to support and defend the organization (see Figure 2.3). By utilizing these elements in SOC, you can improve existing functions and develop those that are lacking, creating both opportunity and advantages for the SOC that end in desired results for the organization.
There is no one-size-fits-all approach when it comes to establishing and creating an SOC. An SOC should be scaled to either the global footprint of the organization or to the span of control for the business sector that operates the SOC.
Many large corporations have a main SOC, called a Global Security Operations Center (GSOC), which could be supported by smaller (child) SOCs around the globe.
These smaller SOCs also provide alerts and threat feeds to the main GSOC. There is always some redundancy built into SOCs, so if one is offline, another one can manage the load and sustain operations.
Define all your SOC requirements and then develop a roadmap. Which services do you want to create within your SOC? It's not just about monitoring a few logs and events from your infrastructure, which used to be the case in a traditional SOC.
Modern SOCs provide a number of proactive services, which are resource-intensive operation tasks. This includes:
The SOC model can include an onsite (in-house) dedicated SOC, outsourced (remote) SOC, or both.
Today's cybersecurity market is full of small, medium, large, and boutique managed service providers who offer you specific services for your security monitoring based on your specific needs. This includes threat hunting, penetration testing, red team/blue team, etc.
Each model has its own pros and cons. You choose from remote monitoring and analysis, a dedicated SOC center operated on your premises, or, for maximum security and cost effectiveness, a hybrid solution, which combines both.
This is the most crucial step and a key process one must embark upon when building an SOC, which is a threat modeling exercise. Threat modeling entails answering the following questions:
Once these questions are answered, playbooks are built in order to document how to respond, set severity, and escalate these specific threat types. Other important processes to consider are shift time and models.
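A playbook entry that records response steps, severity, and escalation can be represented as plain data. The structure below is one possible sketch; the threat types, steps, tier names, and escalation targets are illustrative assumptions rather than a standard.

```python
from dataclasses import dataclass, field
from enum import IntEnum
from typing import List


class Severity(IntEnum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    CRITICAL = 4


@dataclass
class Playbook:
    threat_type: str
    severity: Severity
    response_steps: List[str] = field(default_factory=list)
    escalate_to: str = "SOC Tier 2"  # hypothetical default escalation target

    def needs_escalation(self) -> bool:
        """Escalate anything rated HIGH or above."""
        return self.severity >= Severity.HIGH


# Hypothetical playbook catalog keyed by threat type.
PLAYBOOKS = {
    "ransomware": Playbook(
        "ransomware",
        Severity.CRITICAL,
        ["Isolate affected hosts", "Preserve forensic evidence", "Notify incident manager"],
        escalate_to="CISO",
    ),
    "phishing": Playbook(
        "phishing",
        Severity.MEDIUM,
        ["Quarantine the email", "Reset exposed credentials"],
    ),
}


def route(threat_type: str) -> str:
    """Return who should handle an alert of the given threat type."""
    playbook = PLAYBOOKS[threat_type]
    return playbook.escalate_to if playbook.needs_escalation() else "SOC Tier 1"


print(route("ransomware"))
print(route("phishing"))
```

Keeping playbooks as data makes them easy to review, version, and update as the threat model changes.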
The tooling in the SOC (see Figure 2.4) is a mixture of centralized breadth capabilities and specialized tools that enable high-quality alerts and an end-to-end investigation and remediation experience.
One key element for managing an SOC is to ensure that the technology and platforms sync well with the information systems. You want to build a tool chest of software that can perform security audits, log analysis, penetration tests, and port scans. There are many commercial systems that can provide intrusion prevention, intrusion detection, and analyses.
You should have a good service/help desk ticketing system, documentation system, and inventory system. You should also stay on top of all the security trends by connecting to websites and security threat feeds that will update you on current events and on the threat landscape.
You need to organize the SOC with specialized teams, allowing them to better develop and apply deep expertise, which supports the overall goals of reducing time to acknowledge and remediate threats.
Figure 2.5 represents the key SOC functions within Microsoft: threat intelligence, incident management, and SOC analyst tiers.
You need a highly trained team of security analysts who are familiar with security-based alerts and scenarios. As security threats are constantly changing, analysts need to adapt and think outside the box when it comes to solving problems. Attacks can come in a variety of different forms and types, so having people who can learn on the fly is important.
Finding the right people can be challenging. You may need to evaluate other options, such as outsourcing (via managed security service providers, known as MSSPs) or even hiring specialists to provide surge incident response (IR) support. I believe a hybrid of these options functions quite effectively.
In a SANS Incident Response report, 61% of respondents called upon their own surge staff to manage serious incidents and 58% had a specialized response team.
This section covers cyberthreat detection models, threat modeling in general, and the need for proactive threat hunting within an SOC. Let's start with a look at the history of cyberthreat detection.
The first cybercrime took place many decades ago, when Bob Thomas developed the first computer worm in 1971. It displayed a message on victims' computers that read “I'm the creeper, catch me if you can.” Following that, cybercrimes started to take place one after another. The first Denial of Service (DoS) attack happened in 1988, when a worm spread across the Internet and slowed it for a few days. The estimated damage ranged from about $100,000 to $10 million.
During those early days, organizations were always looking for a way to fix unknown and unpredicted incidents. Gradually, software and security vendors came up with patches and hot fixes, and anti-hacking, antivirus, and anti-malware software and hardware were invented.
Over time, with advancing technologies, threat detection improved, using known attacks and viruses, malware signatures, and indicators of compromise (IOCs). All companies invested in signature-based detection control mechanisms. A lot of security product vendors and service providers made a great deal of money by selling the latest signatures.
Following that, the speed and accuracy of identifying the known signature and IOCs became the winning point. That is where Intrusion Detection and Prevention Systems (IDPS) came in. Soon, organizations realized that while signature-based methods are perfectly accurate and fast enough (thanks to high-speed processors and memories), they could only mitigate against known signatures and IOCs. Hackers (and even script kiddies with minor changes in their code) could manage to penetrate the network and perform their malicious activity. See Figure 2.6.
Thereafter, threat detection from signature-based options changed to behavior and anomaly based detection. Technologies expanded and a variety of detection methods were introduced by security researchers.
Threat detection is no longer about identifying IOCs; in fact, now it's all about hunting the threats in the wild. Not to mention that regardless of the amount of investments and complex technologies and creative and strategic security defense controls, intruders can still manage to break into your network and remain undetected. Threat hunting allows you to proactively look for known and unknown threats and the modus operandi to mitigate the risk and prevent another headline in the news. Threat hunters constantly look for suspicious activity, codes, and unauthorized activity prior to data exfiltration, ransomware attacks, or Advanced Persistent Threats (APTs).
So that hunters are not consumed by day-to-day operational activity, operations deadlines, and SLAs, this role should be kept entirely separate from operational duties and focus mainly on data collection and hunting. It should define and refine new procedures, then automate and maintain them to increase efficiency.
Threat hunters may be part of the SecOps organization, depending on the size of the organization. They may also sit between the SecOps team, the enterprise and security architecture teams, and the business units.
Threat hunting is like fishing in a big ocean. You need to know which ocean you are in and how much you know about it—how deep can you go, what type of fish are you planning to fish, what is your existing equipment, how skillful are you, and more. However, knowing what fish you are looking for is as important as knowing what type of threats you are looking for in your environment to hunt.
Threat hunting has been introduced to meet different goals and objectives. For instance, based on my experience, there is no organization that can claim they are running with fully 100% patched systems free of any vulnerabilities or misconfigurations. Not to mention that there are, from time to time, unknown assets in organizations that just pop up like mushrooms. On the other hand, there are many companies running on legacy infra and apps that are EOL and EOS. Businesses have accepted the risk and are running the operation by relying on existing mitigation controls.
One of the main objectives of the CISO should be to get to know the crown jewels and vulnerable systems and prioritize them based on system criticality. Then, based on threat modeling, prepare the hunting objectives and scope. Scoping is discussed in more detail in "Threat Modeling and the SOC" later in the chapter.
The next objective is to reduce the false positives and noise in the SOC and prioritize the events by relying on threat hunting and threat intelligence IOCs. Security Operations Centers usually receive thousands of false and true positive events and alerts. Threat hunting helps the SOC team prioritize which events to investigate and respond to. In complex and large organizations with heterogeneous software, applications and middleware, databases, operating systems and mainframes, and hardware and network devices, there are tons of patches to be deployed. Threat hunting, with the aid of reliable IOCs and TTPs, helps the threat and vulnerability management team prioritize which security patch should be deployed first and which vulnerability should be remediated.
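One way to picture this prioritization is a simple score that combines asset criticality, vulnerability severity, and a threat-intelligence signal. The scoring formula, the 1 to 5 criticality scale, and the placeholder vulnerability IDs below are all assumptions for illustration; real programs tune their own weighting.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    asset: str
    vuln_id: str      # placeholder identifier, not a real advisory
    criticality: int  # 1 (low) to 5 (crown jewel), hypothetical scale
    severity: float   # a CVSS-style base score from 0.0 to 10.0


def priority(finding: Finding, actively_exploited: set) -> float:
    """Score a finding; double it when threat intel reports active exploitation."""
    boost = 2.0 if finding.vuln_id in actively_exploited else 1.0
    return finding.criticality * finding.severity * boost


findings = [
    Finding("hr-portal", "VULN-A", criticality=2, severity=9.8),
    Finding("payment-db", "VULN-B", criticality=5, severity=7.5),
    Finding("test-vm", "VULN-C", criticality=1, severity=9.0),
]

# Hypothetical output of hunting and threat intelligence: IDs seen exploited in the wild.
actively_exploited = {"VULN-B"}

ranked = sorted(findings, key=lambda f: priority(f, actively_exploited), reverse=True)
for f in ranked:
    print(f.asset, f.vuln_id, priority(f, actively_exploited))
```

Note that the highest raw severity (`VULN-A`) does not come first: the crown-jewel asset with an actively exploited flaw outranks it.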
Threat hunting helps the incident response and digital forensics team identify where the first compromise was initiated, what other machines are involved in this series of exploits, and when, how, and what the modus operandi was.
CISOs and security operations teams should take into account the lesson-learned strategy and define the goal of threat hunting. There is a saying that all companies have been compromised, the only difference is whether they know it or not.
Research and news reports show that in many cyberattacks, the attackers had a presence in the organization's network months in advance. While some attackers act noisily and launch ransomware and DDoS attacks, either destroying the assets or being caught by security control systems, many APTs successfully evade such security controls and silently coexist in the environment without raising a flag or alarm. Threat hunters, using TTPs, known vulnerabilities, and recent attack news and intel, should look for possible incidents and inform the CISO so that precautionary action can be taken.
The output of such goals helps the remediation team to know, among reams of the missing patches, which patch and vulnerability should be prioritized, which system configuration should be fixed first, and identify the gap in process documents and standards. Hence, threat hunting is not just another checkbox for regulatory requirements or best practices to check. It requires a certain level of skill, organization security defense maturity, and the right equipment in place.
Senior management should also understand that the effectiveness of threat hunting is based on the available visibility, maturity level, and defined scope. Hence, it's best to define the scope based on the crown jewels and relevant threat actor present in their industry. As maturity levels improve, the visibility and coverage should be expanded. In that case, the ROI will be more justifiable.
Threat modeling is an important task when planning and building a successful SOC. Think about threat modeling as a practice for understanding your adversaries, their methods for attacking your organization, and how you will identify and respond to the attacks. Threat modeling will help you define scope, determine interesting log sources, select appropriate tools and technologies, and address many other aspects that make an SOC successful.
Make threat modeling exercises a standard practice to model and remodel threats. In the absence of proper threat modeling, you will end up wasting significant time, money, and energy in collecting irrelevant logs and investigating events that don't matter much. Threat modeling will also help you create appropriate use cases and filter out noise. Threat modeling details are covered in the following sections.
Cybersecurity can often feel like a game of whack-a-mole. As our tools get better at stopping one type of attack, our adversaries innovate new tactics. Sophisticated cybercriminals burrow their way into network caverns, avoiding detection for weeks or even months, as they gather information and escalate privileges. If you wait until these advanced persistent threats become visible, it can be costly and time-consuming to address. It's crucial to augment reactive approaches to cybersecurity with proactive ones. Human-led threat hunting, supported by machine learning–powered tools like Microsoft Azure Sentinel, can help you root out infiltrators before they access sensitive data.
Traditional cybersecurity is reactive. Endpoint detection tools identify potential incidents, blocking some and handing off others to people to investigate and mitigate. This works for many of the routine, automated, and well-known attacks—of which there are many. However, our most sophisticated adversaries understand how these security solutions work and continuously evolve their tactics to get around them. The goal of the attackers is to remain undetected so they can gain access to your most sensitive information. To stop them, first you must find them.
Threat hunting is a proactive approach to cybersecurity, predicated on an “assume breach” mindset. Assume breach is a mindset that guides security investments, design decisions, and operational security practices. Assume breach limits the trust placed in applications, services, identities, and networks by treating them all—both internal and external—as not secure and probably already compromised. Just because a breach isn't visible via traditional security tools and detection mechanisms doesn't mean it hasn't occurred. Your threat-hunting team doesn't react to a known attack, but rather tries to uncover indications of attack (IOA) that have yet to be detected. Their job is to outthink the attacker.
Because threat hunting is concerned with emerging threats rather than known attack methods, people take the lead. It's therefore important that they have the time and authority to research and pursue hypotheses. This isn't possible if they are bogged down with security alerts. Many SOCs, including those at Microsoft, establish a three-tier model to address known and unknown threats. Tier-1 and Tier-2 analysts respond to alerts. Tier-3 analysts conduct research focused on revealing undiscovered adversaries. See Figure 2.7. You can learn more about how Microsoft organizes its SOC in “Lessons Learned From The Microsoft SOC—Part 2a: Organizing People.”
Threat hunting starts with a hypothesis. Threat hunters may generate a hypothesis based on external information, such as threat reports, blogs, and social media. For example, your team may learn about a new form of malware in an industry blog and hypothesize that an adversary has used that malware in an attack against your organization. Internal data and intelligence from past incidents also inform hypothesis development.
Once the team has a hypothesis, they examine various techniques and tactics to uncover artifacts that were left behind. A great tool for helping with hypothesis development and research is the MITRE ATT&CK (adversarial tactics, techniques, and common knowledge) framework.
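A hypothesis can be captured as a small record that links the statement to ATT&CK techniques and to the data needed to test it. The `Hypothesis` type and the `coverage_gaps` helper are hypothetical; the technique IDs shown (T1110 Brute Force, T1136 Create Account) are from the public ATT&CK Enterprise matrix.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class Hypothesis:
    statement: str
    source: str                                               # where the idea came from
    techniques: Dict[str, str] = field(default_factory=dict)  # ATT&CK ID -> name
    data_sources: List[str] = field(default_factory=list)     # logs needed to test it


def coverage_gaps(hypothesis: Hypothesis, available: Set[str]) -> List[str]:
    """List the data sources the hunt needs but the team cannot currently query."""
    return [d for d in hypothesis.data_sources if d not in available]


hypothesis = Hypothesis(
    statement="An adversary brute-forced a login and then created an account to persist.",
    source="external threat report",
    techniques={"T1110": "Brute Force", "T1136": "Create Account"},
    data_sources=["authentication logs", "identity provider audit logs"],
)

# Hypothetical inventory of what the SOC already collects.
gaps = coverage_gaps(hypothesis, {"authentication logs"})
print(gaps)
```

Checking coverage before the hunt starts shows early whether the hypothesis can be tested with the visibility the team actually has.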
These adversary tactics and techniques are grouped within a matrix and include the following categories:
Cyber resilience is the ability to prepare for, respond to, and recover from cyberattacks. It helps organizations protect themselves from cyber risks, defend against and limit the severity of attacks, and ensure that business operations continue to function. See Figure 2.8.
Cyber resiliency should be part of a holistic approach to security that takes all aspects of the business into consideration, from employees and partners to the board of directors. Improving security is not a one-time project, but a program of continuous improvement.
The U.S. National Institute of Standards and Technology (NIST) defines cyber resiliency as “the ability to anticipate, withstand, recover from, and adapt to adverse conditions, stresses, attacks, or compromises on systems that include cyber resources.” More realistically, cyber resiliency is also about establishing a policy and process that help an organization survive and continue to execute its long-term strategy in the face of evolving security threats.
To become cyber resilient, enterprises must strike a balance between protecting critical assets, detecting compromises, and responding to incidents. Making the IT landscape cyber resilient requires investments in areas such as infrastructure, design, and development of systems, applications, and networks. At the same time, organizations must create and foster a resilience-conscious culture, of which security, security operations, and threat management are all essential parts.
Cybersecurity resiliency includes starting with the right mindset, technology, approach, focus on hygiene, and measurement of success. First and foremost, security initiatives and priorities must be aligned with the organization's business strategy to avoid wasting effort on unrelated activities and neglecting critical business assets.
Resiliency starts with the right “assume breach” mindset. Organizations must first accept the fact that attackers will successfully compromise resources in their environment. If an organization falsely assumes that they can be fully immune to all attacks, their investment choices are typically much less effective. The same cloud technologies that are inspiring business transformations can also be used to transform security strategy and operations.
Security organizations aim to increase their resilience by tapping into vast resources, investments, and knowledge using cloud technology (including threat intelligence). They rapidly provision new security capabilities from the cloud provider to enable rapid adaptation to attacker innovations.
Organizations made decisions about how to architect and operate their IT environments before cybersecurity became a significant priority. These legacy decisions represent a “technical debt” that organizations must pay down over time. By identifying these security hygiene risks, prioritizing them, and investing in remediating them, an organization can significantly lower its exposure to both known attacks and likely future attacks.
Organizations should focus on measuring how difficult/expensive it is to attack them (especially for well-known attack patterns) as well as their ability to rapidly boot out attackers that successfully attack them.
To avoid similar misconceptions, bear in mind that threat hunting is not a purely automated process that you can set up once and then rely on reactively. It requires constant tuning, remediation, and removal of false positives, which in turn requires security analyst experience and proactive investigation.
Manual and automated penetration testing tools require experienced penetration testers to fine-tune them and increase their efficiency. Furthermore, threat hunters are not incident responders who jump in and fix incidents.
Rotate SOC analysts into the threat-hunting team for learning and development purposes.
Their main job is to obtain events, alerts, packets, and other relevant feeds to understand what was or could potentially be happening. Depending on the size of the company, the data size might vary. In small organizations and/or at the early stage of setting up the foundation of threat hunting, spreadsheets could be enough; as the maturity level increases and the amount of data increases, the size of the dataset will increase drastically as well.
That is where more advanced data analytics tools and skill sets may be required. It is important not to define the threat-hunting scope so broadly that it covers the entire organization at this early stage. Instead, expand the coverage as the maturity level improves. Always start small and grow over time.
The fact that threat hunting involves data analysis does not mean that data scientists can automatically become threat hunters. While their knowledge of data analysis is useful, it is better to train your security analysts in data analysis or to train your data analysts in security analysis tools and techniques.
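At the early, spreadsheet-scale stage described above, a few lines of scripting are often enough to summarize alert data. The following is a minimal sketch, assuming a hypothetical CSV export with `timestamp`, `host`, and `alert` columns; the column names, hostnames, and alert names are illustrative and not taken from any specific product.

```python
import csv
import io
from collections import Counter

# Hypothetical alert export; in practice this would be a file
# exported from a spreadsheet or a log collection tool.
ALERTS_CSV = """timestamp,host,alert
2024-03-01T02:14:00,ws-014,powershell_encoded_command
2024-03-01T02:20:00,ws-014,new_scheduled_task
2024-03-01T09:05:00,srv-db1,failed_login_burst
2024-03-02T02:11:00,ws-014,powershell_encoded_command
"""


def alerts_per_host(csv_text):
    """Count how many alerts each host produced."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(row["host"] for row in reader)


counts = alerts_per_host(ALERTS_CSV)

# List hosts from noisiest to quietest so the hunter knows where to look first.
for host, n in counts.most_common():
    print(host, n)
```

As the dataset grows beyond what a script like this can comfortably handle, the same counting logic would typically move into a SIEM query or a dedicated analytics tool.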
Good threat hunters have an eye for detail and a sharp analytics mindset. They are proactive and out-of-the-box thinkers and have patience to look at the bigger picture.
As part of the maturity assessment and planning process when developing a threat-hunting program, the security analyst team requires a wide range of skillsets and knowledge.
Threat hunters should be able to understand and work with network packets, parse IOC feeds, work with log correlation tools and SIEM, use security appliances such as firewalls and IDS, reverse engineer malware, and know about exploits. They must constantly update their knowledge of threat actors, attack tools, and tactics. Similar to any other security analyst role, they must know how different operations work, what their critical files and processes are, and which basic network protocols and services they run. For instance, a threat hunter should be able to profile every department's normal behavior, such as working hours, usual activities, and required tools, especially administrative tools such as PowerShell. By knowing what is considered “normal,” anomalies can be identified with a minimum of false positives.
Threat hunters should know how to combine structured and unstructured data from different sources. They need visualization tools, machine learning tools, search platforms, and the relevant skills to use them. Because hunters deal with a high volume of data, experience and knowledge in machine learning (ML) helps them train ML tools on normal behavior, cluster activities into known, bad, and questionable groups, and finally group them for further profiling and investigation.
As with most IT and security roles, basic skills in programming and scripting are always in demand. However, if you expect your team to customize and automate procedures, reverse engineer malware, and perform data analysis, knowledge of scripting and programming languages such as Python, Perl, and C/C++ is mandatory.
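The idea of profiling each department's normal behavior can be sketched in a few lines. This is a simplified illustration, assuming hypothetical events that carry only a department, an hour of day, and a tool name; a real baseline would use far richer data and tolerate gradual change.

```python
from collections import defaultdict


def build_baseline(history):
    """Collect the hours and tools observed for each department."""
    baseline = defaultdict(lambda: {"hours": set(), "tools": set()})
    for event in history:
        profile = baseline[event["department"]]
        profile["hours"].add(event["hour"])
        profile["tools"].add(event["tool"])
    return baseline


def find_anomalies(baseline, events):
    """Return (event, reasons) pairs for events that deviate from the baseline."""
    anomalies = []
    for event in events:
        profile = baseline.get(event["department"])
        if profile is None:
            anomalies.append((event, ["unknown department"]))
            continue
        reasons = []
        if event["hour"] not in profile["hours"]:
            reasons.append("unusual hour")
        if event["tool"] not in profile["tools"]:
            reasons.append("unusual tool")
        if reasons:
            anomalies.append((event, reasons))
    return anomalies


# Hypothetical historical activity used to learn what "normal" looks like.
history = [
    {"department": "finance", "hour": 9, "tool": "excel.exe"},
    {"department": "finance", "hour": 14, "tool": "excel.exe"},
    {"department": "it", "hour": 10, "tool": "powershell.exe"},
    {"department": "it", "hour": 22, "tool": "powershell.exe"},
]

# Hypothetical new activity to check against the baseline.
new_events = [
    {"department": "finance", "hour": 3, "tool": "powershell.exe"},
    {"department": "it", "hour": 22, "tool": "powershell.exe"},
]

baseline = build_baseline(history)
anomalies = find_anomalies(baseline, new_events)
for event, reasons in anomalies:
    print(event["department"], event["tool"], reasons)
```

In this sketch, PowerShell at 22:00 is normal for the IT department but is flagged for finance at 03:00, which is the kind of context that keeps false positives low.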
Hunters should have an analytical mindset to be able to develop different hypotheses and build various use cases. They need to analyze the output and tune their processes and procedures.
Threat hunters need to talk with different technical and nontechnical coworkers from time to time. Soft skills help them build an efficient relationship and obtain the required info and support. Threat hunters also present reports to management about the latest states. Knowing how to write technical and nontechnical reports tailored for different audiences is highly desirable. Based on the seniority level of this role, deep knowledge and practical hands-on skills may vary.
Many organizations, due to a lack of security talent pools, prefer to outsource managed security services. While managed services provide advanced technologies and rich intel feeds, human factors play a critical role in threat hunting. CISOs need to ensure that threat hunting is not a one-day contract kind of job. Consultants can't simply walk in, set up the platforms, run the exercise, wash their hands, and walk away. It requires the constant presence of hunters who know the business and IT environment and work on cases and artifacts 24/7.
To summarize, based on my experience, threat hunters require the following core skills:
There are different methods and processes for finding malicious users in your organization. Most threat-hunting teams rely on three common methods in their day-to-day jobs, all of which require skilled, human-based analysis with the aid of relevant tools and services. See Figure 2.9.
Regardless of the method you choose, these steps help formalize the threat-hunting process and lead to a repeatable and reliable expected output. The stability of this level helps you to reach a higher maturity level.
No information security program can be effective and successful without proper metrics and tracking. Metrics help management strategize their planning, prioritize their investments, and keep things accountable.
Metrics should be defined based on key risk indicators (KRIs), key performance indicators (KPIs), and service level agreements (SLAs) mapped to organizational policy and standards. For instance, an organization may require that critical vulnerabilities on critical systems be remediated within a certain time. The KRI should then be defined according to the SLA stated in the company's Information Security Policy, with its thresholds approved by senior management. Metrics can be presented quantitatively or qualitatively. Finally, KPIs are needed to confirm that performance meets the defined requirements.
If you cannot measure it, you cannot manage it, and consequently you cannot secure it. Defining strong and comprehensive metrics helps management ensure that the return on security investment (ROSI) is justifiable and that organizational goals and objectives have been achieved. In a successful program, a number of incidents are detected before they happen, hunts target areas where existing controls have limitations, coverage increases over time, the number of false positives the SOC receives decreases, the number of automated procedures increases, and findings are reviewed and addressed in a timely manner.
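The remediation-time example above can be expressed as a simple calculation. The following is a rough sketch, assuming a hypothetical 14-day SLA and a 90 percent KRI threshold; both values, the field names, and the sample records are invented for illustration and would in practice come from the organization's approved policy and vulnerability data.

```python
from datetime import date

SLA_DAYS = 14          # hypothetical remediation SLA for critical vulnerabilities
KRI_THRESHOLD = 0.90   # hypothetical minimum acceptable compliance rate


def sla_compliance(vulns, sla_days, today):
    """Fraction of measurable vulnerabilities remediated within the SLA.

    Open vulnerabilities still inside the SLA window are excluded,
    because they can neither meet nor breach the SLA yet.
    """
    compliant = breached = 0
    for v in vulns:
        end = v["remediated"] or today
        age = (end - v["found"]).days
        if v["remediated"] is not None and age <= sla_days:
            compliant += 1
        elif age > sla_days:
            breached += 1
    total = compliant + breached
    return compliant / total if total else 1.0


# Hypothetical critical vulnerabilities on critical systems.
vulns = [
    {"found": date(2024, 1, 1), "remediated": date(2024, 1, 10)},   # 9 days
    {"found": date(2024, 1, 5), "remediated": date(2024, 1, 30)},   # 25 days
    {"found": date(2024, 1, 10), "remediated": None},               # open, overdue
    {"found": date(2024, 2, 10), "remediated": None},               # open, in window
    {"found": date(2024, 2, 1), "remediated": date(2024, 2, 8)},    # 7 days
]

rate = sla_compliance(vulns, SLA_DAYS, today=date(2024, 2, 15))
status = "within threshold" if rate >= KRI_THRESHOLD else "KRI breached"
print(f"SLA compliance: {rate:.0%} ({status})")
```

How open items inside the SLA window are treated is a design choice; excluding them, as done here, avoids penalizing the team for work that is not yet due.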
Low compliance rate of any of these metrics leads to inaccurate threat-hunting reports.
These metrics can be used for different purposes. In addition to providing oversight and monitoring of the threat-hunting program, they show how effectively the program is serving the organization. They can also help the CISO obtain strategic oversight. For instance, if the reports show a high number of malfunctioning security agents, that is an alarming message for the infrastructure and SecOps teams. The following section explains how threat hunting can help the effectiveness of other compliance functional and operational reports.
Similarly, comparing the total number of reported hunts with the number remediated on time shows how proactively other teams are acting. Reporting hunts does not stop cybercriminals; remediation does.
The total number of use cases and procedures demonstrates whether the threat-hunting team operates passively or proactively. As with SOCs, many of the defined use cases may be outdated or obsolete, with invalid signatures and nonexistent IPs, and may focus on low-risk items.
To meet the investigation SLA, the threat-hunting lead, SOC, and CISO should determine whether automation is in place and whether the resources and staff are adequate to meet the objective.
The patch compliance report shows the overall patch deployment state. Comparing it with the compromises that hunts identify can indicate how accurate and effective the patch management program is. The same strategy can be applied to security configuration or vulnerability management reports.
If hunts identify a high number of incidents caused by human error, such as cyber incidents initiated through social engineering or phishing attacks, the information security awareness program may not be proactive enough and should be revised (see Table 2.2).
Table 2.2: Example of Threat-Hunting Metrics
METRIC DESCRIPTION | METRIC TYPE |
---|---|
Number of incidents identified proactively (vs. reactively) | Trend, Comparison |
Number of vulnerabilities identified proactively (vs. vulnerability assessments) | Trend, Comparison |
Dwell time of proactively discovered incidents (vs. reactively) | Trend, Comparison |
Containment time of proactively discovered incidents (vs. reactively) | Trend, Comparison |
Effort per remediation of proactively discovered incidents (vs. reactively) | Trend, Comparison |
Data coverage (data types and coverage of estate) | Percentage |
Hypotheses per MITRE ATT&CK tactic | Pie Chart |
Hunts per MITRE ATT&CK tactic | Pie Chart |
Incidents per MITRE ATT&CK tactic | Pie Chart |
Percentage of successful hunts that result in a new detection analytic or rule | Service Level |
Sensitivity and specificity of analytics or rules derived from hunts (true & false positive rates) | Service Level |
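Two of the metrics in Table 2.2 lend themselves to a short worked example: the sensitivity and specificity of a hunt-derived rule, and the dwell time of proactively versus reactively discovered incidents. The sketch below uses standard definitions (sensitivity is the true positive rate, specificity is the true negative rate); all counts and incident records are made-up sample values.

```python
from statistics import mean


def sensitivity(tp, fn):
    """True positive rate: share of real malicious events the rule caught."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """True negative rate: share of benign events the rule correctly ignored."""
    return tn / (tn + fp)


def mean_dwell_by_discovery(incidents):
    """Average dwell time in days, grouped by how the incident was discovered."""
    groups = {}
    for incident in incidents:
        groups.setdefault(incident["discovery"], []).append(incident["dwell_days"])
    return {kind: mean(days) for kind, days in groups.items()}


# Hypothetical review results for one detection rule derived from a hunt.
rule_sensitivity = sensitivity(tp=45, fn=5)
rule_specificity = specificity(tn=900, fp=100)

# Hypothetical incident records.
incidents = [
    {"discovery": "proactive", "dwell_days": 4},
    {"discovery": "proactive", "dwell_days": 6},
    {"discovery": "reactive", "dwell_days": 30},
    {"discovery": "reactive", "dwell_days": 50},
    {"discovery": "reactive", "dwell_days": 40},
]
dwell = mean_dwell_by_discovery(incidents)

print(f"sensitivity={rule_sensitivity:.2f} specificity={rule_specificity:.2f}")
print(dwell)
```

Tracked over successive reporting periods, these values give the trend and comparison views that the table calls for.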
Ultimately, the value of any metric is how useful it is to the recipient, often a senior manager such as a CISO, so all metrics should be developed in collaboration between the threat-hunting team and its relevant senior managers.
Adopt organizationally relevant metrics, such as those in Table 2.2, to drive improvements and show the return on security investment (ROSI) over time.
As discussed, many elements help a company's threat-hunting efforts meet their objectives and goals and become successful. Before establishing any threat-hunting program, the CISO and the SecOps team need to take these items into account, especially the maturity and readiness of the organization.
In a nutshell, successful threat hunting requires the elements outlined in Figure 2.10.
Educating stakeholders and staff is particularly important, because lack of proper training at each level could lead to a cyber incident. The training could be awareness programs for staff and senior management and cyber drills to refresh the readiness of the incident responders. It should include educating privileged users with admin accounts, high-risk staff who have access to sensitive information, and IT people who are responsible for setting up the IT environment for business.