Inevitably, at some point, your organization will experience some form of security breach (incident) within a layer of its infrastructure. This could be the result of a simple misconfiguration within a deployment that creates a vulnerability, or of a malicious attacker external to your organization trying to obtain confidential data and compromise your systems. Either way, how you respond to a security incident as soon as it has been identified is critical to minimizing the blast radius of the attack quickly and effectively, thereby reducing its effect on the rest of your infrastructure.

Unfortunately, it is not possible to stop all security incidents from arising. As technology changes, new vulnerabilities, threats, and risks are introduced. Combine that with human error and incidents will undoubtedly occur. Because of these factors, there is a need to implement an incident response (IR) policy and various surrounding processes.

In this chapter, you'll learn how to prepare for such incidents and the necessary response actions you can use to isolate the issue. As a result, we will cover the following topics:

Where to start when implementing effective IR
Making use of AWS features
Responding to an incident

Technical requirements

This chapter is largely theory and provides some best practices and recommendations for when a security incident occurs. As a result, there are no technical requirements for following this chapter.

Where to start when implementing effective IR

To respond well to a security incident, the business must first train its staff and educate them on the corporate practices, procedures, and processes that are in place to safeguard the infrastructure. Security incidents come in all shapes and sizes, from small and unlikely threats to immediate and imminent attacks in which sensitive customer data could be stolen and extracted from your systems. With this in mind, training is an ongoing pursuit, and you should adopt runbooks that provide clear instructions and actions to carry out for particular security incidents that can be foreseen.

Over time, these security incident runbooks will be modified and adapted as new processes, techniques, and technologies are applied within your organization. Staff should be familiar with the security tools and features offered by AWS to help them prepare and manage the security incident. Again, this all begins with sufficient and adequate training.

You should also ensure that your security and support teams, as well as anyone else who will be involved in responding to an incident, have the correct permissions. You might need to review your access controls, including federated access, the ability to assume cross-account roles, and general IAM permissions. You might also want to create a number of roles with more privileged access that can only be assumed by specific users during incident response.
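As a minimal sketch of the last point, the following builds an IAM trust policy that lets only named principals assume a privileged incident response role. The function name, the example ARNs, and the choice to require MFA are assumptions made for illustration; they are not taken from AWS guidance quoted in this chapter.

```python
import json


def build_ir_trust_policy(allowed_principal_arns, require_mfa=True):
    """Return a trust policy document for an incident response role.

    allowed_principal_arns: ARNs of the specific users (or roles) that may
    assume the role. require_mfa is an illustrative hardening choice.
    """
    principals = list(allowed_principal_arns)
    if not principals:
        raise ValueError("at least one principal ARN is required")

    statement = {
        "Effect": "Allow",
        "Principal": {"AWS": principals},
        "Action": "sts:AssumeRole",
    }
    if require_mfa:
        # Only allow the role to be assumed from an MFA-authenticated session.
        statement["Condition"] = {"Bool": {"aws:MultiFactorAuthPresent": "true"}}

    return {"Version": "2012-10-17", "Statement": [statement]}


if __name__ == "__main__":
    # Hypothetical responder; replace with your own account ID and user.
    policy = build_ir_trust_policy(["arn:aws:iam::111122223333:user/alice"])
    print(json.dumps(policy, indent=2))
```

The resulting JSON document could be supplied as the trust policy when the role is created. The permissions the role grants are a separate policy and are not shown here.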

If you are not familiar with the AWS Cloud Adoption Framework (CAF), which is a framework that has been designed by AWS to help you transition and migrate solutions into the AWS cloud based on best practices and recommendations, then I suggest that you review its contents. The following resource focuses on the security aspects of this framework: https://d0.awsstatic.com/whitepapers/AWS_CAF_Security_Perspective.pdf

As stated in the preceding link, this framework addresses four primary control areas:

Directive controls: Establish the governance, risk, and compliance models the environment will operate within
Preventive controls: Protect your workloads and mitigate threats and vulnerabilities
Detective controls: Provide full visibility and transparency over the operation of your deployments in AWS
Responsive controls: Drive the remediation of potential deviations from your security baselines

By following the recommendations highlighted by AWS CAF, you can start from a strong foundation when it comes to performing effective IR across your infrastructure and AWS accounts.

In addition to AWS CAF, I highly recommend that you read the AWS Security Incident Response Guide (published in June 2020), which can be found here: https://d1.awsstatic.com/whitepapers/aws_security_incident_response.pdf

On top of this, being familiar with the AWS shared responsibility model should be mandatory for all security engineers involved with IR. It can't be highlighted enough that you need to understand where the boundary lies in terms of what you, as the customer, are responsible for, and what AWS is responsible for from a security perspective. You may remember from Chapter 2, AWS Shared Responsibility Model, that AWS has three different shared responsibility models – Infrastructure, Container, and Abstract – all of which have varying levels of responsibility between cloud customers and AWS. So, depending on your chosen service within AWS and which model it falls within, your responsibility for managing the security around that service will vary.

Now that we have a basic understanding of where to start, we can start looking at the different AWS services that will help us build an effective incident response strategy.

Making use of AWS features

AWS offers a wide range of features and capabilities to help you manage incident response, from investigative measures to proactive monitoring. In this section, we will quickly look at these features, some of which we've already learned about or will cover in detail in upcoming chapters. Here, we will look at them from the IR perspective.

Logging

AWS has numerous services that offer logging capabilities that capture meaningful and vital information when it comes to analyzing the source of a threat and how to prevent it. Where possible, when using your chosen services, you should enable logging. This is often overlooked, which can be a huge regret for organizations should the worst happen. With active logging, you will have a much higher chance of being able to rectify an incident quickly and efficiently, or even prevent it from occurring by spotting patterns and trends.

Logging offers you the opportunity to baseline your infrastructure, establishing what is normal and what can be considered abnormal operation. This helps you identify and isolate anomalies easily, especially when combined with third-party logging and analysis tools.
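To make the idea of a baseline concrete, here is a small sketch that flags time periods whose event counts sit well above a historical norm. The three-standard-deviation threshold and the sample numbers are assumptions chosen for illustration; real analysis tools use more robust methods.

```python
from statistics import mean, pstdev


def find_anomalies(baseline_counts, observed_counts, threshold=3.0):
    """Return (index, count) pairs for observations above the baseline limit.

    baseline_counts: event counts per period during normal operation.
    observed_counts: event counts per period for the window under review.
    threshold: how many standard deviations above the mean counts as abnormal.
    """
    limit = mean(baseline_counts) + threshold * pstdev(baseline_counts)
    return [(i, count) for i, count in enumerate(observed_counts) if count > limit]


if __name__ == "__main__":
    # Hypothetical hourly counts of an API call taken from logs.
    normal_hours = [10, 12, 11, 9, 10, 11, 12, 10]
    todays_hours = [11, 13, 40, 9]
    print(find_anomalies(normal_hours, todays_hours))
```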

Again, having logs captured continuously and automatically by the supported AWS services allows you to view the state of your environment before, during, and after an incident. This helps you gather intelligence and insight into where in your infrastructure the incident occurred and how to prevent it from happening again in the future.

Some examples of logging in AWS include Amazon CloudWatch Logs, AWS CloudTrail logs, Amazon S3 access logs, VPC Flow Logs, AWS Config logs, and Amazon CloudFront logs. There are many more examples of logging within AWS, and the list will grow as AWS itself evolves. The main point is that logging is a great way to help you resolve a security incident when it occurs, and these logs should be readily accessible whenever you are responding to an incident as part of your IR policy.

We'll cover logging as a whole in more detail in Chapter 12, Implementing Logging Mechanisms, where we will dive into each of these areas and services in great depth.

Threat detection and management

It is no surprise that AWS has a wide range of security services designed to help protect and guard your infrastructure. Within that scope, AWS also offers threat detection as another tool you can use to help minimize security incidents.

Amazon GuardDuty is a regional managed service powered by machine learning, specifically designed to be an intelligent threat detection service. It monitors logs from other AWS services and features, including VPC Flow Logs, DNS logs, and AWS CloudTrail event logs. Amazon GuardDuty looks at these logs to detect unexpected and unusual behavior and cross-references its analysis with a number of threat detection and security feeds, which helps identify potentially malicious activity and anomalies.

As I already stated, the service itself is powered by machine learning and, by its very nature, this allows the service to continually learn the patterns within your infrastructure and its operational behavior, which will, of course, evolve over time. Having a "big brother" approach allows GuardDuty to spot unusual patterns and potential threats, ranging from unexpected API calls and references that are not normally initiated to unexpected communications between resources. All of this could be the first sign of a compromised environment, and having insight into this through early detection is invaluable in reducing the impact of an incident.
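As a rough illustration of how findings from a service such as GuardDuty might be triaged, the sketch below sorts finding records by their numeric severity and keeps only those at or above a chosen level. The dictionaries mirror the `Id`, `Type`, and `Severity` fields of GuardDuty findings, but the sample data is invented, and the severity bands (low below 4.0, medium below 7.0, high from 7.0) are an assumption you should check against the current GuardDuty documentation.

```python
# Ordering used to compare severity labels.
_ORDER = {"LOW": 0, "MEDIUM": 1, "HIGH": 2}


def severity_label(severity):
    """Map a numeric severity to a label (bands are an assumption)."""
    if severity >= 7.0:
        return "HIGH"
    if severity >= 4.0:
        return "MEDIUM"
    return "LOW"


def triage(findings, minimum="HIGH"):
    """Return findings at or above the minimum label, most severe first."""
    floor = _ORDER[minimum]
    urgent = [f for f in findings if _ORDER[severity_label(f["Severity"])] >= floor]
    return sorted(urgent, key=lambda f: f["Severity"], reverse=True)


if __name__ == "__main__":
    # Invented sample findings for illustration only.
    sample = [
        {"Id": "a", "Type": "Recon:EC2/PortProbeUnprotectedPort", "Severity": 2.0},
        {"Id": "b", "Type": "UnauthorizedAccess:EC2/SSHBruteForce", "Severity": 8.0},
    ]
    for finding in triage(sample, minimum="LOW"):
        print(finding["Id"], severity_label(finding["Severity"]), finding["Type"])
```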

From a security management point of view, we have AWS Security Hub, which integrates with other AWS services, such as Amazon GuardDuty, in addition to Amazon Inspector and Amazon Macie, plus a wide variety of AWS partner products and toolsets.

This breadth of integration allows AWS Security Hub to act as a single-pane-of-glass view across your infrastructure, bringing all your security data into a single place and presenting it in a series of tables and graphs. If you are managing multiple AWS accounts, then Security Hub can operate across all of them, with one account acting as the master (administrator) account and the others as member accounts. The service is always on, meaning it continuously runs and processes data in the background, which allows it to automatically identify any discrepancies against best practices. The data received from the integrated services is checked against industry standards, such as the Center for Internet Security (CIS) benchmarks, enabling the service to identify potential vulnerabilities and weak spots across multiple accounts and against specific resources. Early detection of weaknesses and non-compliance is valuable in ensuring that you safeguard your data.

One of the features of Security Hub is its insights. An insight is essentially a grouping of findings that match a specific set of filter criteria. By using insights, you can easily highlight specific information that requires attention. AWS has created a number of managed insights, all of which can be found here: https://docs.aws.amazon.com/securityhub/latest/userguide/securityhub-managed-insights.html

The following are some examples of these managed insights:

AWS users with the most suspicious activity
S3 buckets with public write or read permissions
EC2 instances that have missing security patches for important vulnerabilities
EC2 instances with general unusual behavior
EC2 instances that are open to the internet
EC2 instances associated with adversary reconnaissance
AWS resources associated with malware

In addition to these managed insights, you can also configure your own insights so that they meet criteria that might be specific to your own business security concerns.
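As a hedged sketch of what a custom insight could look like, the function below assembles the `Name`, `Filters`, and `GroupByAttribute` arguments that the Security Hub `create_insight` API call accepts. The function name, the insight name, and the particular filters chosen are assumptions for illustration.

```python
def build_insight_params(name, resource_type, severity_label):
    """Build keyword arguments for a Security Hub custom insight.

    The insight groups active findings of one severity for one resource
    type by the affected resource.
    """
    def equals(value):
        # Security Hub string filters pair a value with a comparison operator.
        return [{"Value": value, "Comparison": "EQUALS"}]

    return {
        "Name": name,
        "Filters": {
            "ResourceType": equals(resource_type),
            "SeverityLabel": equals(severity_label),
            "RecordState": equals("ACTIVE"),
        },
        "GroupByAttribute": "ResourceId",
    }


if __name__ == "__main__":
    params = build_insight_params(
        "EC2 instances with active high-severity findings",  # hypothetical name
        "AwsEc2Instance",
        "HIGH",
    )
    print(params)
    # With boto3 installed and credentials configured, these could be passed as:
    #   boto3.client("securityhub").create_insight(**params)
```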

For more information related to Amazon GuardDuty and AWS Security Hub, please refer to Chapter 14, Automating Security Detection and Remediation.

Responding to an incident

Now that you have some background information regarding useful services, features, and how to ensure that you have a good foundation for your infrastructure by following the Cloud Adoption Framework and other best practices, let's look at some of the actions that you could take when an incident occurs.

Forensic AWS account

Having a separate AWS account for forensic investigations is ideal for helping you diagnose and isolate the affected resource. By using a separate account, you can architect the environment in a more secure manner that's appropriate to its forensic use. You could even use AWS Organizations to provision these accounts quickly and easily, together with a preconfigured, approved, and tested CloudFormation template to build out the required resources and configuration. This allows you to build the account and environment from a known configuration without relying on a manual process, which is susceptible to errors and undesirable in the early stages of a forensic investigation. While performing your investigations, you should ensure that your steps and actions are auditable through the logging mechanisms provided by managed AWS services, in addition to services such as AWS CloudTrail.

Another benefit of moving the affected resource to a separate account is that it minimizes the chance of further compromise of, and effects on, other resources in its original source account.

Collating log information

Earlier in this chapter, I mentioned the significance of logs and the part they play in incident response. During an incident, it's critical that you are able to access your logs and that you know the process and methods for extracting and searching for data within them. You must be able to look at, for example, an S3 access log or AWS CloudTrail log and understand the syntax, parameters, and fields that are presented in order to process the information being shown. You may have third-party tools to do this analysis for you, but if you don't have access to those systems for any reason, you need to be able to decipher the logs manually.
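To show what reading a log manually can involve, here is a minimal sketch that parses the JSON of a CloudTrail log file and pulls out the fields most often needed during an investigation. The field names (`eventTime`, `eventSource`, `eventName`, `sourceIPAddress`, `userIdentity`, `errorCode`) are standard CloudTrail record fields; the function name and the sample record are invented for illustration. CloudTrail delivers its files to S3 gzip-compressed, so this assumes the file has already been decompressed.

```python
import json


def summarize_cloudtrail(log_text, source_ip=None):
    """Return one summary row per CloudTrail record, optionally for one IP."""
    records = json.loads(log_text).get("Records", [])
    rows = []
    for record in records:
        if source_ip and record.get("sourceIPAddress") != source_ip:
            continue
        identity = record.get("userIdentity", {})
        rows.append({
            "time": record.get("eventTime"),
            "service": record.get("eventSource"),
            "action": record.get("eventName"),
            "ip": record.get("sourceIPAddress"),
            # Fall back to the identity type when no ARN is recorded.
            "who": identity.get("arn") or identity.get("type"),
            "error": record.get("errorCode"),
        })
    return rows


if __name__ == "__main__":
    # Invented sample record for illustration only.
    sample = json.dumps({"Records": [{
        "eventTime": "2020-06-01T12:00:00Z",
        "eventSource": "ec2.amazonaws.com",
        "eventName": "RunInstances",
        "sourceIPAddress": "203.0.113.10",
        "userIdentity": {"type": "IAMUser",
                         "arn": "arn:aws:iam::111122223333:user/alice"},
        "errorCode": "Client.UnauthorizedOperation",
    }]})
    for row in summarize_cloudtrail(sample):
        print(row)
```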

If you have multiple AWS accounts, determine which logs can be shared with other accounts. To help with log sharing, you should configure cross-account data sharing using Amazon CloudWatch Logs and Amazon Kinesis. Cross-account data sharing allows multiple accounts to send their log data to a centralized Amazon Kinesis stream, from which your security analytics systems can read, analyze, and process it.
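One piece of this cross-account setup is the access policy on the CloudWatch Logs destination in the central account, which controls which accounts may subscribe their log groups to it. The sketch below builds such a policy; the function name, account IDs, and destination ARN are hypothetical.

```python
import json


def build_destination_policy(destination_arn, sender_account_ids):
    """Allow the listed accounts to subscribe log groups to a destination."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Sid": "AllowSubscriptionFromSenderAccounts",
            "Effect": "Allow",
            "Principal": {"AWS": sorted(sender_account_ids)},
            "Action": "logs:PutSubscriptionFilter",
            "Resource": destination_arn,
        }],
    }


if __name__ == "__main__":
    policy = build_destination_policy(
        "arn:aws:logs:us-east-1:999999999999:destination:central-ir-logs",
        ["111122223333", "444455556666"],
    )
    print(json.dumps(policy, indent=2))
    # With boto3, the policy could then be attached using:
    #   boto3.client("logs").put_destination_policy(
    #       destinationName="central-ir-logs",
    #       accessPolicy=json.dumps(policy))
```

The destination itself, the Kinesis stream behind it, and the subscription filters in each sending account still need to be created separately.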

For more information on Amazon S3 access logs, AWS CloudTrail logs, and Amazon CloudWatch, please refer to Chapter 12, Implementing Logging Mechanisms, and Chapter 13, Auditing and Governance.

Resource isolation

Let's assume you have an EC2 instance that is initiating unexpected API behavior. This has been identified as an anomaly and is considered to be an abnormal operation. As a result, this instance is showing signs of being a potentially compromised resource. Until you have identified the cause, you must isolate the resource to minimize the effect, impact, and potential damage that could occur to other resources within your AWS account. This action should be undertaken immediately. By isolating the instance, you are preventing any further connectivity to and from the instance, which will also minimize the chances of data being removed from it.

The quickest and best way to isolate an instance is to replace its associated security group with one that prevents any access to or from the instance. As an additional precaution, you should also remove any roles associated with the instance.
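The two isolation actions described here can be sketched as follows. The function takes an EC2 client object (for example, one created with `boto3.client("ec2")`) so that it can be exercised without AWS access. It assumes a quarantine security group, with no inbound rules and its default outbound rule removed, already exists in the instance's VPC; that group and the function name are assumptions for illustration.

```python
def isolate_instance(ec2, instance_id, quarantine_sg_id):
    """Swap the instance's security groups and remove its instance profile.

    Returns the IDs of the instance profile associations that were removed.
    """
    # Replace every security group on the instance with the quarantine group.
    ec2.modify_instance_attribute(InstanceId=instance_id, Groups=[quarantine_sg_id])

    # Find and remove any IAM instance profile (role) attached to the instance.
    response = ec2.describe_iam_instance_profile_associations(
        Filters=[{"Name": "instance-id", "Values": [instance_id]}]
    )
    removed = []
    for association in response.get("IamInstanceProfileAssociations", []):
        ec2.disassociate_iam_instance_profile(
            AssociationId=association["AssociationId"]
        )
        removed.append(association["AssociationId"])
    return removed
```

A call might look like `isolate_instance(boto3.client("ec2"), "i-0123456789abcdef0", "sg-0123456789abcdef0")`, where both IDs are placeholders.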

To perform a forensic investigation of the affected instance, you will want to move the EC2 instance to your forensic account (discussed previously). However, it is not possible to move the same instance to a different AWS account. Instead, you will need to perform the following high-level steps:

First, you must create an AMI from the affected EC2 instance.
Then, you need to share the newly created AMI with your forensic account by modifying the AMI permissions.
From within your forensic account, you need to locate the AMI from within the EC2 console or AWS CLI.
Finally, you must create a new instance from the shared AMI.

For detailed instructions on how to carry out each of these steps, please visit the following AWS documentation: https://aws.amazon.com/premiumsupport/knowledge-center/account-transfer-ec2-instance/
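The high-level steps above can be sketched with two small functions, one run against the source account and one against the forensic account. Each takes an EC2 client object (such as `boto3.client("ec2")` created with that account's credentials). The function names and the use of `NoReboot=True` are illustrative choices, and the sketch assumes the instance's volumes are unencrypted or that any KMS keys have already been shared.

```python
def share_instance_as_ami(ec2, instance_id, forensic_account_id, name):
    """Create an AMI from the instance and share it with the forensic account."""
    # NoReboot avoids restarting the instance; this is an illustrative choice.
    image_id = ec2.create_image(
        InstanceId=instance_id, Name=name, NoReboot=True
    )["ImageId"]

    # Wait until the AMI is ready before changing its permissions.
    ec2.get_waiter("image_available").wait(ImageIds=[image_id])

    ec2.modify_image_attribute(
        ImageId=image_id,
        LaunchPermission={"Add": [{"UserId": forensic_account_id}]},
    )
    return image_id


def launch_from_shared_ami(forensic_ec2, image_id, instance_type, subnet_id):
    """Launch one instance from the shared AMI inside the forensic account."""
    response = forensic_ec2.run_instances(
        ImageId=image_id,
        InstanceType=instance_type,
        SubnetId=subnet_id,
        MinCount=1,
        MaxCount=1,
    )
    return response["Instances"][0]["InstanceId"]
```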

Copying data

Following on from the previous example of a compromised EC2 instance, let's also assume that the instance was backed by EBS storage. You may just want to isolate and analyze the storage of this instance from within your forensic account, and this can be achieved through the use of EBS snapshots. These snapshots are essentially incremental backups of your EBS volumes.

Creating a snapshot of your EBS volumes is a simple process:

From within your AWS Management Console, select the EC2 service from the Compute category.
Select Volumes from under the ELASTIC BLOCK STORE menu heading on the left.
Select your volume from the list of volumes displayed.
Select the Actions menu and select Create Snapshot.
Add a description and any tags that are required.
Select Create Snapshot. At this point, you will get a message stating that the requested snapshot has succeeded.
Click on Close.
You can now ensure that your snapshot has been created by selecting Snapshots from under the ELASTIC BLOCK STORE menu on the left. From here, you will see your newly created snapshot.

As you can see, it's a very simple process to create an EBS snapshot of your volumes.

As with AMIs, you must modify the permissions of your EBS snapshots so that you can share them with another account. For more information on how to do this, please visit the following link: https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ebs-modifying-snapshot-permissions.html

Once the snapshot has been shared with the forensic account, incident response engineers will be able to recreate the EBS volume from the snapshot.
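For readers who prefer scripting to the console, creating and sharing a snapshot can be sketched as below. The function takes an EC2 client object for the source account (for example, `boto3.client("ec2")`). The function name and description text are assumptions, and the sketch assumes an unencrypted volume; sharing encrypted snapshots also involves sharing the KMS key.

```python
def snapshot_and_share(ec2, volume_id, forensic_account_id, description):
    """Snapshot an EBS volume and let the forensic account create volumes from it."""
    snapshot_id = ec2.create_snapshot(
        VolumeId=volume_id, Description=description
    )["SnapshotId"]

    # Wait for the snapshot to complete before sharing it.
    ec2.get_waiter("snapshot_completed").wait(SnapshotIds=[snapshot_id])

    # Grant the forensic account permission to create volumes from the snapshot.
    ec2.modify_snapshot_attribute(
        SnapshotId=snapshot_id,
        Attribute="createVolumePermission",
        OperationType="add",
        UserIds=[forensic_account_id],
    )
    return snapshot_id
```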

Forensic instances

Another option you can implement within your forensic account is forensic instances. These are instances that are specifically built to help you with your investigations and are loaded with forensic analysis tools and features. For example, if you had a compromised EBS volume, you could take a snapshot, copy the snapshot to your forensic account, build a new EBS volume from it, and attach it to your forensic instance, ready for investigation.
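The last part of that example, building a volume from the shared snapshot and attaching it to the forensic instance, might look like the following sketch. It takes an EC2 client object for the forensic account (for example, `boto3.client("ec2")`). The function name and the `/dev/sdf` device name are assumptions, and the Availability Zone must match that of the forensic instance.

```python
def attach_evidence_volume(ec2, snapshot_id, forensic_instance_id,
                           availability_zone, device="/dev/sdf"):
    """Create a volume from a snapshot and attach it to the forensic instance."""
    volume_id = ec2.create_volume(
        SnapshotId=snapshot_id, AvailabilityZone=availability_zone
    )["VolumeId"]

    # The volume must be available before it can be attached.
    ec2.get_waiter("volume_available").wait(VolumeIds=[volume_id])

    ec2.attach_volume(
        VolumeId=volume_id, InstanceId=forensic_instance_id, Device=device
    )
    return volume_id
```

Once attached, the volume would typically be mounted read-only inside the forensic instance so that the evidence is not altered during analysis.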

You can create a forensic instance in a few simple steps:

Select an AMI to be used for your forensic instance.
Launch an EC2 instance from this AMI. Be sure to use an EC2 instance that has a sufficient amount of processing power so that it can be used within your investigations.
Install all the latest security patches.
Remove all unnecessary software and applications from your operating system and implement audit and monitoring controls.
Harden your operating system as per best practices.
Install any software that you will be using to help you perform your analysis during your forensic investigations.
Once your packages have been installed, stop your instance and create an AMI from it.
Repeat this process regularly, building a new AMI each time, to ensure that the latest security fixes are in place.

In this section, we looked at the importance of implementing a forensic AWS account to isolate resources and help diagnose issues and incidents. We also looked at why we need to isolate the resource from the production network; that is, to minimize the blast radius and the chance of other resources being compromised. We then looked at the techniques we can use to obtain the data from the affected resource, before analyzing it with a forensic resource.

For more information on launching your forensic instance, please review the AWS White Paper entitled AWS Security Incident Response Guide: https://d1.awsstatic.com/whitepapers/aws_security_incident_response.pdf

A common approach to an infrastructure security incident

Before we come to the end of this chapter, I just want to quickly highlight a common approach to how you might respond to an infrastructure-related security incident involving an EC2 instance:

Capture: You should try and capture any metadata from the instance before you proceed and make any further changes related to your environment.
Protect: To prevent the EC2 instance from being accidentally terminated, enable termination protection while you continue to investigate.
Isolate: You should then isolate the instance by modifying the security group or updating the NACL to deny all traffic destined for the IP address of the instance.
Detach: Remove the affected instance from any Auto Scaling groups.
Deregister: If the instance is associated with any ELBs, you must deregister it from them.
Snapshot: Take a copy of any EBS volumes via a snapshot so that you can investigate further without affecting the original volumes.
Tag: Using tags, you should highlight the instance that has been prepared for forensic investigation.
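The sequence above can be sketched as an ordered plan of API operations that is built first and executed afterward, which makes it easy to review or log before anything changes. The operation and parameter names are those of the EC2, Auto Scaling, and Classic Load Balancer APIs as exposed by boto3; the function names, the tag keys, the case ID, and the pre-existing quarantine security group are assumptions for illustration. Load balancers that use target groups would need a different deregistration call, which is not shown.

```python
def build_response_plan(instance_id, quarantine_sg_id, volume_ids,
                        asg_name=None, classic_elb_names=(), case_id="IR-0001"):
    """Return an ordered list of (service, operation, parameters) steps."""
    plan = [
        # Capture: record the instance metadata before changing anything.
        ("ec2", "describe_instances", {"InstanceIds": [instance_id]}),
        # Protect: enable termination protection.
        ("ec2", "modify_instance_attribute",
         {"InstanceId": instance_id, "DisableApiTermination": {"Value": True}}),
        # Isolate: replace the security groups with the quarantine group.
        ("ec2", "modify_instance_attribute",
         {"InstanceId": instance_id, "Groups": [quarantine_sg_id]}),
    ]
    if asg_name:
        # Detach: remove from the Auto Scaling group, letting it launch a replacement.
        plan.append(("autoscaling", "detach_instances",
                     {"InstanceIds": [instance_id],
                      "AutoScalingGroupName": asg_name,
                      "ShouldDecrementDesiredCapacity": False}))
    for elb_name in classic_elb_names:
        # Deregister: remove from each Classic Load Balancer.
        plan.append(("elb", "deregister_instances_from_load_balancer",
                     {"LoadBalancerName": elb_name,
                      "Instances": [{"InstanceId": instance_id}]}))
    for volume_id in volume_ids:
        # Snapshot: copy each EBS volume for later analysis.
        plan.append(("ec2", "create_snapshot",
                     {"VolumeId": volume_id,
                      "Description": f"{case_id} evidence from {instance_id}"}))
    # Tag: mark the instance as prepared for forensic investigation.
    plan.append(("ec2", "create_tags",
                 {"Resources": [instance_id],
                  "Tags": [{"Key": "Status", "Value": "Quarantined"},
                           {"Key": "CaseId", "Value": case_id}]}))
    return plan


def run_plan(clients, plan):
    """Execute each step using a dict of clients keyed by service name."""
    return [getattr(clients[service], operation)(**params)
            for service, operation, params in plan]
```

With boto3, `clients` could be built as `{name: boto3.client(name) for name in ("ec2", "autoscaling", "elb")}`.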

You will not be expected to know the commands to carry out the preceding steps via the AWS CLI, but should you wish to learn how to do this, please review the steps provided in the AWS Security Incident Response Guide White Paper: https://d1.awsstatic.com/whitepapers/aws_security_incident_response.pdf.

Summary

In this chapter, we looked at some of the recommendations regarding how to prepare for when a security incident occurs and some of the methods, services, and techniques that can be used to identify, isolate, and minimize the blast radius of damage across your environment.

Should you ever be contacted by AWS regarding a security incident, you must follow their instructions immediately and implement your own level of incident response in coordination with AWS's requirements.

The key to a successful incident response plan is planning and preparation. If you have worked through this chapter and prepared accordingly, you stand a far higher chance of gaining control of an incident more quickly and effectively. Preparation is, in fact, the first element of the incident response life cycle within NIST Special Publication 800-61. For this reason, you must prepare for incidents and ensure you have your logging, auditing, monitoring, and detection services and features configured. You also need a way of isolating and removing affected resources from your production environment, as well as the ability to investigate, analyze, and recover your affected systems.

In the next chapter, you'll learn how to secure connectivity to your AWS environment from your corporate data center using AWS virtual private networks (VPNs) and AWS Direct Connect.

Questions

As we conclude, here is a list of questions for you to test your knowledge regarding this chapter's material. You will find the answers in the Assessments section of the Appendix:

Which framework has been designed by AWS to help you transition and migrate solutions into AWS Cloud that's based on best practices and recommendations?
Which AWS service is a regional-based managed service that's powered by machine learning, specifically designed to be an intelligent threat detection service?
Which AWS service acts as a single-pane-of-glass view across your infrastructure, thus bringing all of your security statistical data into a single place and presented in a series of tables and graphs?
True or False: Having a separate AWS account to be used for forensic investigations is essential to helping you diagnose and isolate any affected resource.
