In Chapter 2, Deep Learning AMIs, we used AWS Deep Learning AMIs (DLAMIs) to set up an environment inside an EC2 instance where we could train and evaluate a deep learning model. In this chapter, we will take a closer look at AWS Deep Learning Containers (DLCs), which can run consistently across multiple environments and services. In addition to this, we will discuss the similarities and differences between DLAMIs and DLCs.
The hands-on solutions in this chapter focus on the different ways we can use DLCs to solve several pain points when working on machine learning (ML) requirements in the cloud. For example, container technologies such as Docker allow us to make the most of our running EC2 instances since we’ll be able to run different types of applications inside containers, without having to worry about whether their dependencies would conflict or not. In addition to this, we would have more options and solutions available when trying to manage and reduce costs. For one thing, if we were to use the container image support of AWS Lambda (a serverless compute service that lets us run our custom backend code) to deploy our deep learning model inside a serverless function, we would be able to significantly reduce the infrastructure costs associated with having an inference endpoint running 24/7. At the same time, with a serverless function, all we need to worry about is the custom code inside the function since AWS will take care of the infrastructure where this function would run.
In the scenario discussed in the Understanding how AWS pricing works for EC2 instances section of the previous chapter, we were able to reduce the cost of running a 24/7 inference endpoint to about $69.12 per month using an m6i.large instance. It is important to note that this value would more or less remain constant, even if this inference endpoint is not receiving any traffic. In other words, we might be paying $69.12 per month for a resource that could be either underutilized or unused. If we were to set up a staging environment that is configured the same as the production environment, this cost would double and it’s pretty much guaranteed that the staging environment resources would be severely underutilized. At this point, you might be wondering, Is it possible for us to reduce this cost further? The good news is that this is possible, so long as we can design a more optimal architecture using the right set of tools, services, and frameworks.
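The arithmetic behind these figures can be sketched in a few lines of Python. This is only an illustration: it assumes a 30-day month, derives the hourly rate from the $69.12 figure quoted above rather than from a price list, and uses a hypothetical function name.

```python
HOURS_PER_MONTH = 24 * 30  # assumption: a 30-day month


def always_on_monthly_cost(hourly_rate, environments=1):
    """Monthly cost of instances that run 24/7, whether or not they receive traffic."""
    return round(hourly_rate * HOURS_PER_MONTH * environments, 2)


# Hourly rate implied by the $69.12 per month figure quoted in the text.
implied_hourly_rate = round(69.12 / HOURS_PER_MONTH, 3)

production_only = always_on_monthly_cost(implied_hourly_rate)
production_and_staging = always_on_monthly_cost(implied_hourly_rate, environments=2)

print(f"Implied hourly rate: ${implied_hourly_rate}")
print(f"Production only: ${production_only}")
print(f"Production + staging: ${production_and_staging}")
```

Because the cost depends only on the hours the instances are running, adding an identical staging environment doubles the bill even if it receives no requests at all.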
We will start the hands-on section of this chapter by training a PyTorch model inside a DLC. This model will be uploaded to a custom container image that will then be used to create an AWS Lambda function. After that, we will create an API Gateway HTTP API that accepts an HTTP request and triggers the AWS Lambda function with an event containing the input request data. The AWS Lambda function will then load the model we trained to perform ML predictions.
In this chapter, we will cover the following topics:
While working on the hands-on solutions of this chapter, we will cover several serverless services such as AWS Lambda and Amazon API Gateway, which allow us to run applications without having to manage the infrastructure ourselves. At the same time, the cost of using these resources scales automatically, depending on the usage of these resources. In a typical setup, we may have an EC2 instance running 24/7 where we will be paying for the running resource, regardless of whether it is being used. With AWS Lambda, we only need to pay when the function code runs. If it only runs for a few seconds per month, then we may pay close to zero for that month!
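To make the pay-per-use idea more concrete, here is a rough estimate of how a usage-based bill could be computed. The two rates below are illustrative assumptions only (they vary by region and architecture and may change), so confirm them on the AWS Lambda pricing page. The function name is hypothetical and any free tier is ignored.

```python
# Illustrative rates only (assumptions): confirm current values on the AWS Lambda pricing page.
PRICE_PER_GB_SECOND = 0.0000166667
PRICE_PER_MILLION_REQUESTS = 0.20


def lambda_monthly_cost(invocations, avg_duration_seconds, memory_mb):
    """Estimate a month's Lambda charges from usage, ignoring any free tier."""
    gb_seconds = invocations * avg_duration_seconds * (memory_mb / 1024)
    compute_cost = gb_seconds * PRICE_PER_GB_SECOND
    request_cost = invocations / 1_000_000 * PRICE_PER_MILLION_REQUESTS
    return compute_cost + request_cost


idle_month = lambda_monthly_cost(invocations=0, avg_duration_seconds=1, memory_mb=1024)
light_month = lambda_monthly_cost(invocations=1_000, avg_duration_seconds=1, memory_mb=1024)

print(f"No invocations: ${idle_month:.4f}")
print(f"1,000 one-second invocations: ${light_month:.4f}")
```

Under these assumed rates, a month with no invocations costs nothing, and a month with light traffic costs a few cents.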
With these points in mind, let’s begin this chapter with a quick introduction to how AWS DLCs work.
Before we start, we must have the following ready:
The Jupyter notebooks, source code, and other files used for each chapter are available in this book’s GitHub repository at https://github.com/PacktPublishing/Machine-Learning-Engineering-on-AWS.
Important Note
It is recommended that you use an IAM user with limited permissions instead of the root account when running the examples in this book. We will discuss this, along with other security best practices, in detail in Chapter 9, Security, Governance, and Compliance Strategies. If you are just starting to use AWS, you may proceed with using the root account in the meantime.
Containers allow developers, engineers, and system administrators to run processes, scripts, and applications inside consistent isolated environments. This consistency is guaranteed since these containers are launched from container images, similar to how EC2 instances are launched from Amazon Machine Images (AMIs).
It is important to note that we can run different isolated containers at the same time inside an instance. This allows engineering teams to make the most of the computing power available to the existing instances and run different types of processes and workloads, similar to what we have in the following diagram:
Figure 3.1 – Running multiple containers inside a single EC2 instance
One of the most popular container management solutions available is Docker. It is an open source containerization platform that allows developers and engineers to easily build, run, and manage containers. It involves the usage of a Dockerfile, which is a text document containing instructions on how to build container images. These container images are then managed and stored inside container registries so that they can be used at a later time.
Note
Docker images are used to create containers. Docker images are like ZIP files that package everything needed to run an application. When a Docker container is run from a container image (using the docker run command), the container acts like a virtual machine, with its environment isolated and separate from the server where the container is running.
Now that we have a better idea of how containers and container images work, let’s proceed by discussing what DLCs are and how these are used to speed up the training and deployment of ML models. One of the key benefits when using AWS DLCs is that most of the relevant ML packages, frameworks, and libraries are installed in the container images already. This means that ML engineers and data scientists no longer need to worry about installing and configuring the ML frameworks, libraries, and packages. This allows them to proceed with preparing the custom scripts used for training and deploying their deep learning models.
Since DLC images are simply prebuilt container images, these can be used in any AWS service where containers and container images can be used. These AWS services include Amazon EC2, Amazon Elastic Container Service (ECS), Amazon Elastic Kubernetes Service (EKS), Amazon SageMaker, AWS CodeBuild, AWS Lambda, and more.
With these in mind, let’s proceed with training and deploying a deep learning model using AWS Deep Learning Containers!
In this section, we will ensure that the following prerequisites are ready before proceeding with the training steps:
In the first part of this chapter, we will run our Deep Learning Container inside an EC2 instance, similar to what’s shown in the following diagram:
Figure 3.2 – Running a Deep Learning Container inside an EC2 instance
This container will serve as the environment where the ML model is trained using a script that utilizes the PyTorch framework. Even if PyTorch is not installed in the EC2 instance, the training script will still run successfully since it will be executed inside the container environment where PyTorch is preinstalled.
Note
If you are wondering what PyTorch is, it is one of the most popular open source ML frameworks available. You may check out https://pytorch.org/ for more information.
In the next set of steps, we will make sure that our Cloud9 environment is ready:
Figure 3.3 – Navigating to the Cloud9 console
Here, we can see that the region is currently set to Oregon (us-west-2). Make sure that you change this to where you created the Cloud9 instance in Chapter 1, Introduction to ML Engineering on AWS.
Note
If you skipped the first chapter, make sure that you complete the Creating your Cloud9 environment and Increasing the Cloud9 storage sections of that chapter before proceeding.
mkdir -p ch03
cd ch03
We will use this directory as our current working directory for this chapter.
Now that we have our Cloud9 environment ready, let’s proceed with downloading the training dataset so that we can train our deep learning model.
The training dataset we will use in this chapter is the same dataset we used in Chapter 2, Deep Learning AMIs. It has two columns that correspond to the continuous x and y variables. Later in this chapter, we will also generate a regression model using this dataset. The regression model is expected to accept an input x value and return a predicted y value.
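For readers who prefer to inspect the dataset from Python rather than the Terminal, the following sketch reads a two-column CSV file into (x, y) pairs. It makes a few assumptions that are not confirmed by the text: the column order (x first, then y) is a guess that can be changed through the parameters, rows that are not numeric (such as a possible header) are skipped, and the function name is hypothetical.

```python
import csv


def load_xy_pairs(path, x_column=0, y_column=1):
    """Read (x, y) float pairs from a CSV file, skipping rows that are not numeric."""
    pairs = []
    with open(path, newline="") as csv_file:
        for row in csv.reader(csv_file):
            try:
                pairs.append((float(row[x_column]), float(row[y_column])))
            except (ValueError, IndexError):
                continue  # for example, a header row or a blank line
    return pairs
```

Once the file has been downloaded, calling load_xy_pairs("data/training_data.csv")[:5] from the ch03 directory should return the first few pairs.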
In the next set of steps, we will download the training dataset into our Cloud9 environment:
mkdir -p data
wget https://bit.ly/3h1KBx2 -O data/training_data.csv
head data/training_data.csv
This should give us rows of (x,y) pairs, similar to what is shown in the following screenshot:
Figure 3.4 – The first few rows of the training_data.csv file
Since we started this section inside the ch03 directory, it is important to note that the training_data.csv file should be inside the ch03/data directory.
Now that we have the prerequisites ready, we can proceed with the training step.
At this point, you might be wondering what makes a deep learning model different from other ML models. Deep learning models are networks of interconnected nodes that communicate with each other, similar to how networks of neurons communicate in a human brain. These models make use of multiple layers in the network, similar to what we have in the following diagram. Having more layers and more neurons per layer gives deep learning models the ability to process and learn complex non-linear patterns and relationships:
Figure 3.5 – Deep learning model
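The role of layers and non-linearity can be illustrated with a toy network written in plain Python. This is not the model trained in this chapter: the weights are hand-picked for illustration, and all names are hypothetical. With one hidden layer of two neurons and a ReLU activation, the network computes the absolute value of its input, which no single linear function can do.

```python
def relu(values):
    """Non-linear activation: negative values become zero."""
    return [max(0.0, v) for v in values]


def dense(inputs, weights, biases):
    """Fully connected layer: one weighted sum (plus bias) per output neuron."""
    return [
        sum(w * x for w, x in zip(neuron_weights, inputs)) + bias
        for neuron_weights, bias in zip(weights, biases)
    ]


# Hand-picked parameters (an assumption for illustration, not trained values).
HIDDEN_WEIGHTS = [[1.0], [-1.0]]  # two hidden neurons, one input each
HIDDEN_BIASES = [0.0, 0.0]
OUTPUT_WEIGHTS = [[1.0, 1.0]]     # one output neuron, two inputs
OUTPUT_BIASES = [0.0]


def forward(x):
    """Pass a single number through the hidden layer and then the output layer."""
    hidden = relu(dense([x], HIDDEN_WEIGHTS, HIDDEN_BIASES))
    return dense(hidden, OUTPUT_WEIGHTS, OUTPUT_BIASES)[0]


print(forward(3.0))   # 3.0
print(forward(-3.0))  # 3.0
```

In a real deep learning model, the weights and biases are not chosen by hand but learned from data during training.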
Deep learning has several practical applications in natural language processing (NLP), computer vision, and fraud detection. In addition to these, here are some of its other applications and examples as well:
These past couple of years, the training and deployment of deep learning models have been greatly simplified with deep learning frameworks such as PyTorch, TensorFlow, and MXNet. AWS DLCs speed things up further by providing container images that already come preinstalled with everything you need to run these ML frameworks.
Note
You can view the list of available DLC images here: https://github.com/aws/deep-learning-containers/blob/master/available_images.md. Note that these container images are categorized by (1) the installed ML framework (PyTorch, TensorFlow, or MXNet), (2) the job type (training or inference), and (3) the installed Python version.
In the next set of steps, we will use the DLC image that’s been optimized to train PyTorch models:
wget https://bit.ly/3KcsG3v -O train.py
Before we proceed, let’s check the contents of the train.py file by opening it from the File tree:
Figure 3.6 – Opening the train.py file from the File tree
We should see a script that makes use of the training data stored in the data directory to train a deep learning model. This model gets saved in the model directory after the training step has been completed:
Figure 3.7 – The main() function of the train.py script file
Here, we can see that the main() function of our train.py script performs the following operations:
The last block of code in the preceding screenshot simply runs the main() function if train.py is being executed directly as a script.
Note
You can find the complete train.py script here: https://github.com/PacktPublishing/Machine-Learning-Engineering-on-AWS/blob/main/chapter03/train.py.
mkdir -p model
Later, we will see that the model output gets saved inside this directory.
sudo apt install tree
tree
This should yield a tree-like structure, similar to what we have in the following screenshot:
Figure 3.8 – Results after using the tree command
It is important to note that the train.py script is in the ch03 directory, which is where the data and model directories are located as well.
wget https://bit.ly/3Iz7zaV -O train.sh
If we check the contents of the train.sh file, we should see the following lines:
aws ecr get-login-password --region us-west-2 | docker login --username AWS --password-stdin 763104351884.dkr.ecr.us-west-2.amazonaws.com
TRAINING_IMAGE=763104351884.dkr.ecr.us-west-2.amazonaws.com/pytorch-training:1.8.1-cpu-py36-ubuntu18.04
docker run -it -v `pwd`:/env -w /env $TRAINING_IMAGE python train.py
The train.sh script first authenticates with Amazon Elastic Container Registry (a fully managed Docker container registry where we can store our container images) so that we can successfully download the training container image. This container image has PyTorch 1.8.1 and Python 3.6 preinstalled already.
Important Note
The code in the train.sh script assumes that we will run the training experiment inside an EC2 instance (where the Cloud9 environment is running) in the Oregon (us-west-2) region. Make sure that you replace us-west-2 with the appropriate region code. For more information on this topic, feel free to check out https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/using-regions-availability-zones.html.
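Since the region code appears in both the registry hostname and the image URI, it can help to see how these strings are put together. The sketch below rebuilds the values used in train.sh for a given region. The helper names are hypothetical, and the sketch assumes that the registry account ID shown in train.sh also applies to your region, which should be checked against the list of available DLC images.

```python
DLC_ACCOUNT_ID = "763104351884"  # registry account ID that appears in train.sh


def dlc_registry(region):
    """Hostname of the ECR registry that train.sh logs in to."""
    return f"{DLC_ACCOUNT_ID}.dkr.ecr.{region}.amazonaws.com"


def dlc_image_uri(region, repository="pytorch-training", tag="1.8.1-cpu-py36-ubuntu18.04"):
    """Full image URI in the form registry/repository:tag."""
    return f"{dlc_registry(region)}/{repository}:{tag}"


print(dlc_image_uri("us-west-2"))
print(dlc_image_uri("ap-southeast-1"))
```

Note that the region must be changed in three places in train.sh: the --region flag, the registry hostname passed to docker login, and the TRAINING_IMAGE value.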
The docker run command first downloads the specified container image (if it is not yet available locally) and creates a running container process using that image. The -v flag mounts the current working directory (ch03) to the /env directory of the container, so the files in ch03 are accessible inside the container without being copied, and any files the container writes to /env appear in ch03 as well. The -w flag then sets the working directory inside the container to where our files were mounted (/env). Finally, the train.py script is executed inside the environment of the running container.
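The way these flags fit together can be sketched by assembling the same command as a list of arguments in Python. This is only an illustration of the command's structure: the function name is hypothetical, and nothing is executed here.

```python
import os
import shlex


def docker_run_args(image, host_dir, container_dir="/env", command=("python", "train.py")):
    """Assemble the arguments of a docker run call like the one in train.sh."""
    return [
        "docker", "run", "-it",
        "-v", f"{host_dir}:{container_dir}",  # mount the host directory into the container
        "-w", container_dir,                  # run the command from the mounted directory
        image,
        *command,                             # what to execute inside the container
    ]


args = docker_run_args(
    image="763104351884.dkr.ecr.us-west-2.amazonaws.com/pytorch-training:1.8.1-cpu-py36-ubuntu18.04",
    host_dir=os.getcwd(),
)
print(shlex.join(args))
```

Everything after the image name is the command that runs inside the container, which is why python train.py works even when PyTorch is not installed on the host.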
Note
Check out https://docs.docker.com/engine/reference/run/ for more information on how to use the docker run command.
chmod +x train.sh
./train.sh
This should yield a set of logs, similar to the following:
Figure 3.9 – Logs generated while running the train.sh script
Here, the train.sh script ran a container that invoked the train.py (Python) script to train the deep learning model. In the preceding screenshot, we can see the logs that were generated by the train.py script as it iteratively updates the weights of the neural network to improve the quality of the output model (that is, reducing the loss per iteration so that we can minimize the error). It is important to note that this train.py script makes use of PyTorch to prepare and train a sample deep learning model using the data provided.
This is the reason why we’re using a deep learning container image that has PyTorch 1.8.1 and Python 3.6 preinstalled already.
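The idea of iteratively updating weights to reduce the loss can be shown without any ML framework. The following sketch is not the train.py script: it swaps in a much simpler model (a straight line) and plain batch gradient descent, with hypothetical names and made-up sample data, purely to show how the loss shrinks over iterations.

```python
def train_linear_model(pairs, learning_rate=0.1, epochs=500):
    """Fit y = weight * x + bias with batch gradient descent on mean squared error."""
    weight, bias = 0.0, 0.0
    losses = []
    n = len(pairs)
    for _ in range(epochs):
        errors = [(weight * x + bias) - y for x, y in pairs]
        losses.append(sum(e * e for e in errors) / n)
        # Gradients of the mean squared error with respect to each parameter.
        grad_weight = 2 * sum(e * x for e, (x, _) in zip(errors, pairs)) / n
        grad_bias = 2 * sum(errors) / n
        weight -= learning_rate * grad_weight
        bias -= learning_rate * grad_bias
    return weight, bias, losses


# Made-up sample data that follows y = 2x + 1.
sample = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]
weight, bias, losses = train_linear_model(sample)

print(f"first loss: {losses[0]:.4f}, last loss: {losses[-1]:.8f}")
print(f"weight: {weight:.3f}, bias: {bias:.3f}")
```

A deep learning framework such as PyTorch automates the gradient computation and scales the same loop to networks with many layers and parameters.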
Note
This step may take 5 to 10 minutes to complete. Feel free to get a cup of coffee or tea while waiting!
tree
This should yield a tree-like structure, similar to the following:
Figure 3.10 – Verifying whether the model was saved successfully
This model.pth file contains the serialized model we have trained using the train.py script. This file was created using the torch.save() method after the model training step was completed. Feel free to check out https://pytorch.org/tutorials/beginner/saving_loading_models.html for more information.
Note
The generated model.pth file allows us to use the parameters of the model to make predictions (after the model has been loaded from the file). For example, if our model makes use of an equation such as ax^2 + bxy + cy^2 = 0, the a, b, and c values are the model parameters. With this, if we have x (which is the independent variable), we can easily compute the value of y. That said, we can say that determining a, b, and c is the task of the training phase, and that determining y given x (and given a, b, and c) is the task of the inference phase. By loading the model.pth file, we can proceed with the inference phase and compute the predicted value of y given an input x value.
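The split between the training phase (producing parameters) and the inference phase (loading and using them) can be sketched as follows. This example deliberately uses a simpler linear model and a JSON file in place of torch.save() and model.pth. The parameter values, file name, and function names are all hypothetical.

```python
import json
import os
import tempfile


def save_parameters(parameters, path):
    """End of the training phase: persist the learned parameters."""
    with open(path, "w") as parameter_file:
        json.dump(parameters, parameter_file)


def load_parameters(path):
    """Start of the inference phase: restore the parameters from the file."""
    with open(path) as parameter_file:
        return json.load(parameter_file)


def predict(parameters, x):
    """Inference: compute y from x using the stored parameters."""
    return parameters["weight"] * x + parameters["bias"]


trained = {"weight": 2.0, "bias": 1.0}  # hypothetical values standing in for training output

with tempfile.TemporaryDirectory() as model_dir:
    path = os.path.join(model_dir, "parameters.json")
    save_parameters(trained, path)
    restored = load_parameters(path)

prediction = predict(restored, 42)
print(prediction)  # 85.0
```

The code that makes predictions never needs the training data, only the saved parameters.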
Wasn’t that easy? With the training step complete, we will proceed with the deployment step in the next section.
Now that we have the model.pth file, what do we do with it? The answer is simple: we will deploy this model in a serverless API using an AWS Lambda function and an Amazon API Gateway HTTP API, as shown in the following diagram:
Figure 3.11 – Serverless ML deployment with an API Gateway and AWS Lambda
As we can see, the HTTP API should be able to accept GET requests from “clients” such as mobile apps and other web servers that interface with end users. These requests then get passed to the AWS Lambda function as input event data. The Lambda function then loads the model from the model.pth file and uses it to compute the predicted y value using the x value from the input event data.
Our AWS Lambda function code needs to utilize PyTorch functions and utilities to load the model. To get this setup working properly, we will build a custom container image from an existing DLC image optimized for PyTorch inference requirements. This custom container image will be used for the environment where our AWS Lambda function code will run through AWS Lambda’s container image support.
Note
For more information on AWS Lambda’s container image support, check out https://aws.amazon.com/blogs/aws/new-for-aws-lambda-container-image-support/.
It is important to note that a variety of DLC images are available for us to choose from. These images are categorized based on their job type (training versus inference), installed framework (PyTorch versus TensorFlow versus MXNet versus other options), and installed Python version (3.8 versus 3.7 versus 3.6 versus other options). Since we are planning to use a container where a PyTorch model can be loaded and used to perform predictions, we will be choosing a PyTorch DLC optimized for inference as the base image when building the custom Docker image.
The following steps focus on building a custom container image from an existing DLC image:
wget https://bit.ly/3pt5mGN -O dlclambda.zip
unzip dlclambda.zip
This ZIP file contains the files and scripts needed to build the custom container image.
tree
This should yield a tree-like structure, similar to the following:
Figure 3.12 – Results after running the tree command
Here, several new files have been extracted from the dlclambda.zip file:
We will discuss each of these files in detail as we go through the steps in this chapter.
Figure 3.13 – app.py Lambda handler implementation
This file contains the AWS Lambda handler implementation code, which (1) loads the model, (2) extracts the input x value from the event data, (3) computes the predicted y value using the model, and (4) returns the output y value as a string.
In the Completing and testing the serverless API setup section near the end of this chapter, we will set up an HTTP API that accepts a value for x via the URL query string (for example, https://<URL>/predict?x=42). Once the request comes in, Lambda will call a handler function that contains the code to handle the incoming request. It will load the deep learning model and use it to predict the value of y using the value of x.
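The overall shape of such a handler can be sketched as follows. This is not the contents of app/app.py: the predict_y() function is a hypothetical stand-in for the trained PyTorch model so that the example runs without PyTorch installed.

```python
def predict_y(x):
    """Hypothetical stand-in for the trained model; the real handler would use the PyTorch model."""
    return 1.0 * x + 0.5


def handler(event, context):
    """Extract x from the event, compute the prediction, and return it as a string."""
    query = event.get("queryStringParameters") or {}
    # Query string values arrive as strings through an HTTP API, while a test
    # event may contain a number; float() accepts both.
    x = float(query["x"])
    return str(predict_y(x))


print(handler({"queryStringParameters": {"x": "42"}}, None))  # 42.5
```

The first argument (event) carries the request data, while the second (context) provides runtime information that this sketch does not use.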
Note
You can find the complete app/app.py file here: https://github.com/PacktPublishing/Machine-Learning-Engineering-on-AWS/blob/main/chapter03/app/app.py.
cp model/model.pth app/model/model.pth
Important Note
Make sure that you only load ML models from trusted sources. Inside app/app.py, we are loading the model using torch.load(), which can be exploited by attackers with a model containing a malicious payload. Attackers can easily prepare a model (with a malicious payload) that, when loaded, would give the attacker access to your server or resource running the ML scripts (for example, through a reverse shell). For more information on this topic, you may check the author’s talk on how to hack and secure ML environments and systems: https://speakerdeck.com/arvslat/pycon-apac-2022-hacking-and-securing-machine-learning-environments-and-systems?slide=8.
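To see why loading an untrusted file is risky, recall that torch.load() relies on Python's pickle module by default in the PyTorch version used in this chapter. The harmless sketch below shows that unpickling can call a function chosen by whoever created the file. The Payload class is hypothetical, and the expression it evaluates is deliberately benign.

```python
import pickle


class Payload:
    """An object that tells pickle how to 'rebuild' it: by calling a function."""

    def __reduce__(self):
        # pickle stores "call eval('6 * 7')" as the recipe for recreating this object.
        return (eval, ("6 * 7",))


untrusted_bytes = pickle.dumps(Payload())

# Loading the bytes runs the stored call; no Payload object comes back.
result = pickle.loads(untrusted_bytes)
print(result)  # 42
```

An attacker would replace the benign expression with code that opens a reverse shell or exfiltrates credentials, which is why model files should only come from trusted sources.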
chmod +x *.sh
cat build.sh
This should yield a single line of code, similar to what we have in the following code block:
docker build -t dlclambda .
The docker build command builds a Docker container image using the instructions specified in the Dockerfile in the current directory. What does this mean? This means that we are building a container image using the relevant files in the directory and we’re using the instructions in the Dockerfile to install the necessary packages as well. This process is similar to preparing the DNA of a container, which can be used to create new containers with an environment configured with the desired set of tools and packages.
Since we passed dlclambda as the argument to the -t flag, our custom container image will have the dlclambda:latest name and tag after the build process completes. Note that we can replace the latest tag with a specific version number (for example, dlclambda:3), but we will stick with using the latest tag for now.
Note
For more information on the docker build command, feel free to check out https://docs.docker.com/engine/reference/commandline/build/.
Note
A multi-stage build is a process that helps significantly reduce the size of the Docker container image by having multiple FROM instructions within a single Dockerfile. Each of these FROM instructions corresponds to a new build stage where artifacts and files from previous stages can be copied. With a multi-stage build, the last build stage produces the final image (which ideally does not include the unused files from the previous build stages).
The expected final output would be a container image that can be used to launch a container, similar to the following:
Figure 3.14 – Lambda Runtime Interface Client
If this container is launched without any additional parameters, the following command will execute:
/opt/conda/bin/python -m awslambdaric app.handler
This will run the Runtime Interface Client and use the handler() function of our app.py file to process AWS Lambda events. This handler() function will then use the deep learning model we trained in the Using AWS Deep Learning Containers to train an ML model section to make predictions.
Note
You can find the Dockerfile here: https://github.com/PacktPublishing/Machine-Learning-Engineering-on-AWS/blob/main/chapter03/Dockerfile.
Before running the build.sh script, make sure that you replace all instances of us-west-2 in the Dockerfile with the appropriate region code.
./build.sh
docker images | grep dlclambda
We should see that the container image size of dlclambda is 4.61 GB. It is important to note that there is a 10 GB limit when using container images for Lambda functions. The image size of our custom container image needs to be below 10 GB if we want it to be used in AWS Lambda.
At this point, our custom container image is ready. The next step is to test the container image locally before using it to create an AWS Lambda function.
We can test the container image locally using the Lambda Runtime Interface Emulator. This will help us check whether our container image will run properly when it is deployed to AWS Lambda later.
In the next couple of steps, we will download and use the Lambda Runtime Interface Emulator to check our container image:
cat download-rie.sh
This should print the following block of code as output in the Terminal:
mkdir -p ~/.aws-lambda-rie && curl -Lo ~/.aws-lambda-rie/aws-lambda-rie https://github.com/aws/aws-lambda-runtime-interface-emulator/releases/latest/download/aws-lambda-rie && chmod +x ~/.aws-lambda-rie/aws-lambda-rie
The download-rie.sh script simply downloads the Lambda Runtime Interface Emulator binary and makes it executable using the chmod command.
sudo ./download-rie.sh
cat run.sh
We should see a docker run command with several parameter values, similar to what we have in the following code block:
docker run -v ~/.aws-lambda-rie:/aws-lambda -p 9000:8080 --entrypoint /aws-lambda/aws-lambda-rie dlclambda:latest /opt/conda/bin/python -m awslambdaric app.handler
Let’s quickly check the parameter values that were passed to each of the flags:
This docker run command uses the --entrypoint flag to override the default ENTRYPOINT command so that the Lambda Runtime Interface Emulator binary, aws-lambda-rie, is run instead. This will then start a local endpoint at http://localhost:9000/2015-03-31/functions/function/invocations.
Note
For more information on the docker run command, feel free to check out https://docs.docker.com/engine/reference/commandline/run/.
./run.sh
Figure 3.15 – Creating a new Terminal tab
Note that the run.sh script should be kept running while we are opening a New Terminal tab.
cd ch03
cat invoke.sh
This should show us what is inside the invoke.sh script file. It should contain a one-liner script, similar to what we have in the following block of code:
curl -XPOST "http://localhost:9000/2015-03-31/functions/function/invocations" -d '{"queryStringParameters":{"x":42}}'
This script simply makes use of the curl command to send a sample POST request containing the x input value to the local endpoint that was started by the run.sh script earlier.
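The same request can also be sent from Python using only the standard library. The sketch below mirrors the curl call in invoke.sh. The function names are hypothetical, and invoke() only works while the run.sh script is running in another Terminal tab.

```python
import json
import urllib.request

LOCAL_ENDPOINT = "http://localhost:9000/2015-03-31/functions/function/invocations"


def build_invoke_request(x, url=LOCAL_ENDPOINT):
    """Prepare a POST request whose body is the event passed to the Lambda handler."""
    payload = json.dumps({"queryStringParameters": {"x": x}}).encode("utf-8")
    return urllib.request.Request(url, data=payload, method="POST")


def invoke(x, url=LOCAL_ENDPOINT, timeout=60):
    """Send the request to the local emulator and return the response body as text."""
    with urllib.request.urlopen(build_invoke_request(x, url), timeout=timeout) as response:
        return response.read().decode("utf-8")
```

With the emulator running, print(invoke(42)) should print the same output as the invoke.sh script.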
./invoke.sh
This should yield a value close to "42.4586". Feel free to change the input x value in the invoke.sh script to see how the output value changes as well.
Given that we were able to successfully invoke the app.py Lambda function handler inside the custom container image using the Lambda Runtime Interface Emulator, we can now proceed with pushing our container image to Amazon ECR and using it to create an AWS Lambda function.
Amazon Elastic Container Registry (ECR) is a container registry service that allows us to store and manage Docker container images. In this section, we will create an ECR repository and then push our custom container image to this ECR repository.
Let’s start by creating an ECR repository:
Figure 3.16 – Navigating to the Cloud9 console
This should open the Cloud9 console, where we can find all the created Cloud9 environments.
Figure 3.17 – Creating an ECR repository
Optionally, you can enable Tag immutability, similar to what is shown in the preceding screenshot. This will help ensure that we do not accidentally overwrite existing container image tags.
Figure 3.18 – View push commands
Click the View push commands button to open the Push commands for <ECR repository name> popup window.
Figure 3.19 – Push commands
This command will be used to authenticate the Docker client in our Cloud9 environment to Amazon ECR. This will give us permission to push container images to, and pull container images from, Amazon ECR.
Figure 3.20 – Running the client authentication command
We should get a Login Succeeded message. Without this step, we wouldn’t be able to push container images to, or pull container images from, Amazon ECR.
Figure 3.21 – Copying the docker tag command
This time, we will be copying the docker tag command from the Push commands window to the clipboard. The docker tag command is used to create and map named references to Docker images.
Note
The docker tag command is used to specify and add metadata (such as the name and the version) to a container image. A container image repository stores different versions of a specific image, and the docker tag command helps the repository identify which version of the image will be updated (or uploaded) when the docker push command is used. For more information, feel free to check out https://docs.docker.com/engine/reference/commandline/tag/.
docker tag dlclambda:latest <ACCOUNT ID>.dkr.ecr.us-west-2.amazonaws.com/dlclambda:latest
The command should be similar to what we have in the following code block after the latest tag has been replaced with 1:
docker tag dlclambda:latest <ACCOUNT ID>.dkr.ecr.us-west-2.amazonaws.com/dlclambda:1
Make sure that the <ACCOUNT ID> value is correctly set to the account ID of the AWS account you are using. The docker tag command that you copied from the Push commands window should already have the <ACCOUNT ID> value set correctly.
docker images
This should return all the container images, including the dlclambda container images, as shown in the following screenshot:
Figure 3.22 – Running the docker images command
It is important to note that both container image tags shown in the preceding screenshot have the same image ID. This means that they point to the same image, even if they have different names and tags.
docker push <ACCOUNT ID>.dkr.ecr.us-west-2.amazonaws.com/dlclambda:1
Make sure that you replace the value of <ACCOUNT ID> with the account ID of the AWS account you are using. You can get the value for <ACCOUNT ID> by checking the numerical value before .dkr.ecr.us-west-2.amazonaws.com/dlclambda after running the docker images command in the previous step.
Note
Note that the image tag value is a 1 (one) instead of the letter l after the container image name and the colon.
Figure 3.23 – Private repositories
This should redirect us to the details page, where we can see the different image tags, as shown in the following screenshot:
Figure 3.24 – Repository details page
Once our container image with the specified image tag has been reflected in the corresponding Amazon ECR repository details page, we can use it to create AWS Lambda functions using Lambda’s container image support.
Now that our custom container image has been pushed to Amazon ECR, we can prepare and configure the serverless API setup!
AWS Lambda is a serverless compute service that allows developers and engineers to run event-driven code without having to provision or manage infrastructure. Lambda functions can be invoked by resources from other AWS services such as API Gateway (a fully managed service for configuring and managing APIs), Amazon S3 (an object storage service where we can upload and download files), Amazon SQS (a fully managed message queuing service), and more. These functions are executed inside isolated runtime environments that have a defined max execution time and max memory limits, similar to what we have in the following diagram:
Figure 3.25 – AWS Lambda isolated runtime environment
There are two ways to deploy Lambda function code and its dependencies:
When using a container image as the deployment package, the custom Lambda function code can use what is installed and configured inside the container image. That said, if we were to use the custom container image that was built from AWS DLC, we would be able to use the installed ML framework (that is, PyTorch) in our function code and run ML predictions inside an AWS Lambda execution environment.
Now that we have a better understanding of how AWS Lambda’s container image support works, let’s proceed with creating our AWS Lambda function:
Figure 3.26 – Using the container image support of AWS Lambda
Selecting the Container image option means that we will use a custom container image as the deployment package. This deployment package is expected to contain the Lambda code, along with its dependencies.
Figure 3.27 – Selecting the container image
Under Amazon ECR image repository, select the container image we have pushed to Amazon ECR (dlclambda:1).
Note
This step may take 3 to 5 minutes to complete. Feel free to get a cup of coffee or tea while waiting!
Figure 3.28 – Editing the general configuration
Here, we can see that the AWS Lambda function is configured with a default max memory limit of 128 MB and a timeout of 3 seconds. An error is raised if the Lambda function exceeds one or more of the configured limits during execution.
Figure 3.29 – Modifying the memory and timeout settings
Note that increasing the memory and timeout limits here will influence the compute power and total running time available for the Lambda function, as well as the overall cost of running predictions using the service. For now, let’s focus on getting the AWS Lambda function to work using these current configuration values for Memory and Timeout. Once we can get the initial setup running, we can play with different combinations of configuration values to manage the performance and cost of our setup.
Note
We can use the AWS Compute Optimizer to help us optimize the overall performance and cost of AWS Lambda functions. For more information on this topic, check out https://aws.amazon.com/blogs/compute/optimizing-aws-lambda-cost-and-performance-using-aws-compute-optimizer/.
Figure 3.30 – Configuring the test event
Make sure that you specify the following test event value inside the code editor, similar to what is shown in the preceding screenshot:
{ "queryStringParameters": { "x": 42 } }
This test event value gets passed to the event (first) parameter of the AWS Lambda handler() function when a test execution is performed.
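To make this concrete, here is a minimal sketch of how a handler might read the x value from the event. It is not the app.py file from the book's repository: the predict() function is a hypothetical placeholder that simply echoes its input rather than loading the deep learning model, and the response shape is an assumption.

```python
import json


def predict(x):
    """Hypothetical stand-in for the deep learning model's inference call."""
    return float(x)  # placeholder: echoes the input instead of running a model


def handler(event, context):
    # The test event (or the HTTP API) supplies query parameters under this key.
    params = event.get("queryStringParameters") or {}
    # Fall back to 0 when no x value is provided.
    x = float(params.get("x", 0))
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"x": x, "y": predict(x)}),
    }
```

Calling handler({"queryStringParameters": {"x": 42}}, None) locally mimics what the console does when the test event is submitted.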
Figure 3.31 – Successful execution result
After a few seconds, we should see that the execution results succeeded, similar to what we have in the preceding screenshot.
Important Note
During an AWS Lambda function’s first invocation, it may take a few seconds for its function code to be downloaded and for its execution environment to be prepared. This delay is commonly referred to as a cold start. When the function is invoked a second time (within the same minute, for example), it runs without this delay. For example, the first invocation of a Lambda function may take around 30 to 40 seconds to complete, while succeeding requests take a second or less. Succeeding invocations complete significantly faster because the execution environment that was prepared during the first invocation is frozen and reused. If the function is not invoked for some time (for example, around 10 to 30 minutes of inactivity), the execution environment is deleted and a new one needs to be prepared the next time the function is invoked. There are different ways to manage this so that the function performs consistently without the effects of a cold start. One strategy is to use Provisioned Concurrency, which helps ensure predictable function start times. Check out https://aws.amazon.com/blogs/compute/operating-lambda-performance-optimization-part-1/ for more information on this topic.
With our AWS Lambda function ready to perform ML predictions, we can proceed with creating the serverless HTTP API that will trigger our Lambda function.
The AWS Lambda function we created needs to be triggered by an event source. One of the possible event sources is an API Gateway HTTP API configured to receive an HTTP request. After receiving the request, the HTTP API will pass the request data to the AWS Lambda function as an event. Once the Lambda function receives the event, it will use the deep learning model to perform inference, and then return the predicted output value to the HTTP API. After that, the HTTP API will return the HTTP response to the requesting resource.
There are different ways to create an API Gateway HTTP API. In the next couple of steps, we will create this HTTP API directly from the AWS Lambda console:
Figure 3.32 – Add trigger
The Add trigger button should be on the left-hand side of the Function overview pane, as shown in the preceding screenshot.
Figure 3.33 – Trigger configuration
Here’s the trigger configuration that we have:
This will create and configure an HTTP API that accepts a request and sends the request data as an event to the AWS Lambda function.
Important Note
Note that this configuration needs to be secured before the setup is used in production. For more information on this topic, check out https://docs.aws.amazon.com/apigateway/latest/developerguide/security.html.
Figure 3.34 – Updating the Payload format version
After updating the Payload format version, navigate back to our AWS Lambda browser tab and then click the API endpoint link (which should open a new tab). Since we did not specify an x value in the URL, the Lambda function will use 0 as the default x value when performing a test inference.
Note
You may want to trigger an exception instead if there is no x value specified when a request is sent to the API Gateway endpoint. Feel free to change this behavior by modifying line 44 of app.py: https://github.com/PacktPublishing/Machine-Learning-Engineering-on-AWS/blob/main/chapter03/app/app.py.
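As one possible approach, the input parsing could be made strict so that a missing or non-numeric x value results in an error instead of a default. This is a sketch only: parse_x() and error_response() are hypothetical helpers that are not part of the repository's app.py, and returning a 400 status code is an assumed design choice.

```python
import json


def parse_x(event):
    """Return x as a float, or raise ValueError if it is missing or invalid."""
    # The key may be absent or null when the URL has no query string.
    params = event.get("queryStringParameters") or {}
    if "x" not in params:
        raise ValueError("missing required query parameter: x")
    try:
        return float(params["x"])
    except (TypeError, ValueError):
        raise ValueError(f"x must be numeric, got {params['x']!r}")


def error_response(exc):
    """Build an HTTP 400 response describing the problem."""
    return {"statusCode": 400, "body": json.dumps({"error": str(exc)})}
```

Inside a handler, the call to parse_x(event) would be wrapped in a try block, returning error_response(exc) when a ValueError is caught.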
https://<API ID>.execute-api.us-west-2.amazonaws.com/default/dlclambda?x=42
Make sure that you press the Enter key to invoke a Lambda function execution with 42 as the input x value:
Figure 3.35 – Testing the API endpoint
This should return a value close to 42.4586, as shown in the preceding screenshot. Feel free to test different values for x to see how the predicted y value changes.
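The endpoint can also be called from a script instead of the browser. The sketch below uses only the Python standard library. The build_url() and call_endpoint() helper names are hypothetical, and the default region, stage, and route simply mirror the URL format shown earlier. Replace the API ID argument with the value from your own API endpoint link.

```python
import urllib.parse
import urllib.request


def build_url(api_id, x, region="us-west-2", stage="default", route="dlclambda"):
    """Assemble the endpoint URL with x as a query string parameter."""
    query = urllib.parse.urlencode({"x": x})
    return (
        f"https://{api_id}.execute-api.{region}.amazonaws.com"
        f"/{stage}/{route}?{query}"
    )


def call_endpoint(url, timeout=60):
    """Send a GET request and return the response body as text."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

For example, print(call_endpoint(build_url("<API ID>", 42))) would send the same request as the browser test once the placeholder is replaced. A generous timeout is used here since the first request may be affected by a cold start.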
Important Note
Make sure that you delete the AWS Lambda and API Gateway resources once you are done configuring and testing the API setup.
At this point, we should be proud of ourselves as we were able to successfully deploy our deep learning model in a serverless API using AWS Lambda and Amazon API Gateway! Before the release of AWS Lambda’s container image support, it was tricky to set up and maintain serverless ML inference APIs using the same tech stack we used in this chapter. Now that we have this initial setup working, it should be easier to prepare and configure similar serverless ML-powered APIs. Note that we also have the option to create a Lambda function URL to generate a unique URL endpoint for the Lambda function.
Figure 3.36 – Cost of running the serverless API versus an API running inside an EC2 instance
Before we end this chapter, let’s quickly check what the costs would look like if we were to use AWS Lambda and API Gateway for the ML inference endpoint. As shown in the preceding diagram, the expected cost of running this serverless API depends on the traffic passing through it. This means that the cost would be minimal if no traffic is passing through the API. As more traffic passes through this HTTP API endpoint, the cost gradually increases as well. In contrast, as shown in the chart on the right, the expected cost of an HTTP API deployed inside an EC2 instance stays the same, regardless of whether any traffic is passing through it.
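To get a rough feel for this trade-off, the following sketch estimates the monthly cost of the serverless setup for a given number of requests and compares it with the $69.12 per month EC2 figure mentioned at the start of this chapter. The unit prices below are assumptions based on published list prices for a single Region at one point in time, and the free tier, data transfer, and storage costs are ignored. Always verify the numbers against the current AWS pricing pages before relying on them.

```python
# Assumed list prices (USD); these vary by Region and change over time.
LAMBDA_PRICE_PER_GB_SECOND = 0.0000166667
LAMBDA_PRICE_PER_REQUEST = 0.20 / 1_000_000
HTTP_API_PRICE_PER_REQUEST = 1.00 / 1_000_000

EC2_MONTHLY_COST = 69.12  # fixed monthly figure from the chapter introduction


def serverless_monthly_cost(requests, memory_mb, avg_duration_seconds):
    """Estimate the monthly Lambda plus HTTP API cost for a request volume."""
    gb_seconds = requests * avg_duration_seconds * (memory_mb / 1024)
    compute = gb_seconds * LAMBDA_PRICE_PER_GB_SECOND
    request_fees = requests * (LAMBDA_PRICE_PER_REQUEST + HTTP_API_PRICE_PER_REQUEST)
    return compute + request_fees


def break_even_requests(memory_mb, avg_duration_seconds):
    """Requests per month at which the serverless cost matches the EC2 cost."""
    per_request = serverless_monthly_cost(1, memory_mb, avg_duration_seconds)
    return EC2_MONTHLY_COST / per_request
```

Under these assumptions, a function configured with 1,024 MB of memory and an average duration of 1 second costs nothing at zero traffic, and it would need a few million requests per month before it costs as much as the always-on instance. Different memory settings and durations shift this break-even point considerably.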
Choosing the architecture and setup to use for your API depends on a variety of factors. We will not discuss this topic in detail, so feel free to check out the resources available here: https://aws.amazon.com/lambda/resources/.
In this chapter, we were able to take a closer look at AWS Deep Learning Containers (DLCs). Similar to AWS Deep Learning AMIs (DLAMIs), AWS DLCs already have the relevant ML frameworks, libraries, and packages installed. This significantly speeds up the process of building and deploying deep learning models. At the same time, container environments are guaranteed to be consistent since these are run from pre-built container images.
One of the key differences between DLAMIs and DLCs is that multiple AWS DLCs can run inside a single EC2 instance. These containers can also be used across the AWS services that support containers, including AWS Lambda, Amazon ECS, Amazon EKS, and Amazon EC2, to name a few.
In this chapter, we were able to train a deep learning model using a DLC. We then deployed this model to an AWS Lambda function through Lambda’s container image support. After that, we tested the Lambda function to see whether it’s able to successfully load the deep learning model to perform predictions. To trigger this Lambda function from an HTTP endpoint, we created an API Gateway HTTP API.
In the next chapter, we will focus on serverless data management and use a variety of services to set up and configure a data warehouse and a data lake. We will be working with the following AWS services, capabilities, and features: Redshift Serverless, AWS Lake Formation, AWS Glue, and Amazon Athena.
For more information on the topics covered in this chapter, feel free to check out the following resources: