What are the use cases for serverless architectures, and what kinds of architectures and patterns are useful? We’re often asked about use cases as people learn about a serverless approach to the design of systems. We find that it’s helpful to look at how others have applied technology and what kinds of use cases, designs, and architectures they’ve produced. Our discussion will center on these use cases and sample architectures. This chapter will give you a solid understanding of where serverless architectures are a good fit and how to think about design of serverless systems.
Serverless technologies and architectures can be used to build entire systems, create isolated components, or implement specific, granular tasks. The scope for use of serverless design is large, and one of its advantages is that it’s possible to use it for small and large tasks alike. We’ve designed serverless systems that power web and mobile applications for tens of thousands of users, and we’ve built simple systems to solve specific, minute problems. It’s worth remembering that serverless is not just about running code in a compute service such as Lambda. It’s also about using third-party services and APIs to cut down on the amount of work you must do.
In this book you’re going to build a back end for a media-sharing, YouTube-like application. It will allow users to upload video files, transcode these files to different playable formats, and then allow other users to view them. You’ll construct an entirely serverless back end for a fully featured web application with a database and a RESTful API. And we’re going to show that serverless technologies are appropriate for building scalable back ends for all kinds of web, mobile, and desktop applications.
Technologies such as AWS Lambda are relatively new, but we’ve already seen large serverless back ends that power entire businesses. Our serverless platform, called A Cloud Guru (http://acloud.guru), supports many thousands of users collaborating in real time and streaming hundreds of gigabytes of video. Another example is Instant (http://instant.cm), which is a serverless content management system for static websites. And yet another example is a hybrid-serverless system built by EPX Labs. We’ll discuss all of these systems later in the chapter.
Apart from web and mobile applications, serverless is a great fit for IoT applications. Amazon Web Services (AWS) has an IoT platform (https://aws.amazon.com/iot-platform/how-it-works/) that combines the following:
The rules engine, for example, can save files to Amazon’s Simple Storage Service (S3), push data to an Amazon Simple Queue Service (SQS) queue, and invoke AWS Lambda functions. Amazon’s IoT platform makes it easy to build scalable IoT back ends for devices without having to run a server.
A serverless application back end is appealing because it removes a lot of infrastructure management, has granular and predictable billing (especially when a serverless compute service such as Lambda is used), and can scale well to meet uneven demand.
A common use for serverless technologies is data processing, conversion, manipulation, and transcoding. We’ve seen Lambda functions built by other developers for processing of CSV, JSON, and XML files; collation and aggregation of data; image resizing; and format conversion. Lambda and AWS services are well suited for building event-driven pipelines for data-processing tasks.
In chapter 3, you’ll build the first part of your application, which is a powerful pipeline for converting videos from one format to another. This pipeline will set file permissions and generate metadata files. It will run only when a new video file is added to a designated S3 bucket, meaning that you’ll pay only for execution of Lambda when there’s something to do and not while the system is idling. More broadly, however, we find data processing to be an excellent use case for serverless technologies, especially when we use a Lambda function in concert with other services.
Ingestion of data—such as logs, system events, transactions, or user clicks—can be accomplished using services such as Amazon Kinesis Streams (see appendix A for more information on Kinesis). Lambda functions can react to new records in a stream, and can process, save, or discard data quickly. A Lambda function can be configured to run when a specific number (batch size) of records is available for processing, so that it doesn’t have to execute for every individual record added to the stream.
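To make the batching behavior concrete, here's a minimal sketch (in Python, one of Lambda's supported runtimes) of a handler that consumes a Kinesis batch. The event shape follows what Lambda delivers for Kinesis event sources; the click/view record types and the `save` stub are invented for illustration:

```python
import base64
import json

SAVED = []  # stand-in for a real data store such as DynamoDB or S3

def save(item):
    # Placeholder: a real function might write to DynamoDB, S3, or Firehose.
    SAVED.append(item)

def handler(event, context):
    """Process a batch of Kinesis records in a single invocation.

    Lambda hands over up to `batch size` records at a time; each record's
    payload arrives base64-encoded under record['kinesis']['data'].
    """
    processed = 0
    for record in event["Records"]:
        payload = json.loads(base64.b64decode(record["kinesis"]["data"]))
        if payload.get("type") == "click":   # keep what matters,
            save(payload)                    # discard the rest
        processed += 1
    return {"processed": processed}
```

The batch size itself is configured on the event source mapping, not in the function, which is why the handler simply iterates over whatever `event["Records"]` contains.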
Kinesis streams and Lambda functions are a good fit for applications that generate a lot of data that needs to be analyzed, aggregated, and stored. When it comes to Kinesis, the number of Lambda functions executing concurrently is the same as the number of shards (one function invocation per shard). Furthermore, if a Lambda function fails to process a batch, it will retry that same batch. Retries can continue for up to 24 hours (which is how long Kinesis retains data before it expires) if processing fails each time. But even with these gotchas (which you now know), the combination of Kinesis streams and Lambda is powerful if you want to do real-time processing and analytics.
One innovative use case of the Amazon API Gateway and Lambda (which we’ve seen a few times) is what we refer to as the legacy API proxy. Here, developers use API Gateway and Lambda to create a new API layer over legacy APIs and services to make them easier to use. The API Gateway is used to create a RESTful interface, and Lambda functions are used to transpose request/response and marshal data to formats that legacy services understand. This approach makes legacy services easier to consume for modern clients that may not support older protocols and data formats.
Lambda functions can run on a schedule, which makes them effective for repetitive tasks like data backups, imports and exports, reminders, and alerts. We’ve seen developers use Lambda functions on a schedule to periodically ping their websites to see if they’re online and send an email or a text message if they’re not. There are Lambda blueprints available for this (a blueprint is a template with sample code that can be selected when creating a new Lambda function). And we’ve seen developers write Lambda functions to perform nightly downloads of files off their servers and send daily account statements to users. Repetitive tasks such as file backup and file validation can also be done easily with Lambda thanks to the scheduling capability that you can set and forget.
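A sketch of such a scheduled health check might look like the following. The `fetch` and `notify` parameters are injected stand-ins for a real HTTP probe and a real alert channel (an SNS publish or an SES email), and the site list in the event is hypothetical:

```python
import urllib.request
import urllib.error

def ping(url):
    """Return the HTTP status for `url`, or None if it's unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=5) as resp:
            return resp.status
    except (urllib.error.URLError, OSError):
        return None

def handler(event, context, fetch=ping, notify=print):
    """Scheduled Lambda body: ping each site and alert on anything down.

    `notify` stands in for a real alert channel (SNS, SES, a text
    message); a scheduled rule would invoke this every few minutes.
    """
    down = []
    for url in event.get("sites", []):
        status = fetch(url)
        if status is None or status >= 400:
            down.append(url)
            notify(f"{url} appears down (status={status})")
    return {"down": down}
```

Injecting `fetch` and `notify` keeps the function trivially testable, which matters more than usual when the production trigger is a schedule you can't easily replay.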
Another popular use of Lambda functions and serverless technologies is to build bots (a bot is an app or a script that runs automated tasks) for services such as Slack (a popular chat system—https://slack.com). A bot made for Slack can respond to commands, carry out small tasks, and send reports and notifications. We, for example, built a Slack bot in Lambda to report on the number of online sales made each day via our education platform. And we’ve seen developers build bots for Telegram, Skype, and Facebook’s messenger platform.
Similarly, developers write Lambda functions to power Amazon Echo skills. Amazon Echo is a hands-free speaker that responds to voice commands. Developers can implement skills to extend Echo’s capabilities even further (a skill is essentially an app that can respond to a person’s voice; for more information, see http://amzn.to/2b5NMFj). You can write a skill to order a pizza or quiz yourself on geography. Amazon Echo is driven entirely by voice, and skills are powered by Lambda.
The two overarching architectures that we’ll discuss in this book are compute as back end (that is, back ends for web and mobile applications) and compute as glue (pipelines built to carry out workflows). These two architectures are complementary. It’s highly likely that you’ll build and combine these architectures if you end up working on any kind of real-world serverless system. Most of the architectures and patterns described in this chapter are specializations and variations of these two to some extent.
The compute-as-back-end architecture describes an approach where a serverless compute service such as Lambda and third-party services are used to build a back end for web, mobile, and desktop applications. You may note in figure 2.1 that the front end links directly to the database and an authentication service. This is because there’s no need to put every service behind an API Gateway if the front end can communicate with them in a secure manner (for example, using delegation tokens; chapters 5 and 9 discuss this in more detail). One of the aims of this architecture is to allow the front end to communicate with services, encompass custom logic in Lambda functions, and provide uniform access to functions via a RESTful interface.
In chapter 1, we described our principles of serverless architectures. Among them we mentioned thicker front ends (principle 4) and encouraged the use of third-party services (principle 5). These two principles are particularly relevant if you’re building a serverless back end rather than event-driven pipelines. We find that good serverless systems try to minimize the scope and footprint of Lambda functions so that these functions do only the bare minimum (call them nano functions, if you will) and focus primarily on tasks that must not be done in the front end because of privacy or security concerns. Nevertheless, finding the right level of granularity for a function can be challenging. Make functions too granular and you’ll end up with a sprawling back end that’s painful to debug and maintain over time. Ignore granularity and you’ll risk building mini-monoliths that nobody wants (one helpful lesson we’ve learned is to minimize the number of data transformations in a Lambda function to keep complexity under control).
A Cloud Guru (https://acloud.guru) is an online education platform for solution architects, system administrators, and developers wanting to learn Amazon Web Services. The core features of the platform include (streaming) video courses, practice exams and quizzes, and real-time discussion forums. A Cloud Guru is also an e-commerce platform that allows students to buy courses and watch them at their leisure. Instructors who create courses for A Cloud Guru can upload videos directly to an S3 bucket; the videos are immediately transcoded to a number of different formats (1080p, 720p, HLS, WebM, and so on) and made available for students to view. The Cloud Guru platform uses Firebase as its primary client-facing database, which allows clients to receive updates in near real time without refreshing or polling (Firebase uses web sockets to push updates to all connected devices at the same time). Figure 2.2 shows a cut-down version of the architecture used by A Cloud Guru.
Note the following about the Cloud Guru architecture given in figure 2.2:
Instant (http://instant.cm) is a startup that helps website owners add content management facilities—including inline text editing and localization—to their static websites. The founders, Marcel Panse and Sander Nagtegaal, describe it as an instant content management system. Instant works by adding a small JavaScript library to a website and making a minor change to its HTML. This allows developers and administrators to edit text elements directly via the website’s user interface. Draft edits made to the text are stored in DynamoDB (see appendix A on DynamoDB). The final, production version of the text (that the end user sees) is served as a JSON file from an S3 bucket via Amazon CloudFront (figure 2.3).
A simplified version of the Instant architecture is shown in figure 2.4. Note the following about the Instant architecture:
Marcel and Sander make a few points about their system:
The use of Lambda functions leads to an architecture of microservices quite naturally. Every function is completely shielded from the rest of the code. It gets better: the same Lambda function can fire in parallel in almost infinite numbers—and this is all done completely automatically.
In terms of cost, Marcel and Sander share the following:
With our serverless setup, we primarily pay for data transfer through CloudFront, a tiny bit for storage and for each millisecond that our Lambda functions run. Since we know on average what a new customer uses, we can calculate the costs per customer exactly. That’s something we couldn’t do in the past, when multiple users were shared across the same infrastructure.
Overall, Marcel and Sander find that adopting an entirely serverless approach has been a winner for them primarily from the perspectives of operations, performance, and cost.
The legacy API proxy architecture is an innovative example of how serverless technologies can solve problems. As we mentioned in section 2.1.4, systems with outdated services and APIs can be difficult to use in modern environments. They might not conform to modern protocols or standards, which might make interoperability with current systems harder. One way to alleviate this problem is to use the API Gateway and Lambda in front of those legacy services. The API Gateway and Lambda functions can transform requests made by clients and invoke legacy services directly, as shown in figure 2.5.
The API Gateway can transform requests (to an extent) and issue requests against other HTTP endpoints (see chapter 7). But it handles only fairly basic (and limited) use cases where simple JSON transformation is all that’s needed. In more complex scenarios, a Lambda function is needed to convert data, issue requests, and process responses. Take a Simple Object Access Protocol (SOAP) service as an example. You’d need to write a Lambda function to connect to the SOAP service and then map its responses to JSON. Thankfully, there are libraries that can take care of much of the heavy lifting in a Lambda function (for example, there are SOAP clients that can be downloaded from the npm registry for this purpose; see https://www.npmjs.com/package/soap).
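As a rough illustration of the marshaling such a proxy function performs, here's a sketch that builds a SOAP envelope by hand and flattens a SOAP response into JSON-friendly data. The operation name, fields, and namespace are invented, and a production function would use a real SOAP client library rather than string assembly:

```python
import xml.etree.ElementTree as ET

def json_to_soap(operation, params, ns="http://example.com/legacy"):
    """Wrap a client's JSON-style request in a SOAP 1.1 envelope.

    `operation`, `params`, and the namespace are illustrative only.
    """
    body_parts = "".join(f"<{k}>{v}</{k}>" for k, v in params.items())
    return (
        '<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">'
        f'<soap:Body><{operation} xmlns="{ns}">{body_parts}</{operation}>'
        "</soap:Body></soap:Envelope>"
    )

def soap_to_json(xml_text):
    """Flatten the first element inside the SOAP body into a dict."""
    nsmap = {"soap": "http://schemas.xmlsoap.org/soap/envelope/"}
    root = ET.fromstring(xml_text)
    body = root.find("soap:Body", nsmap)
    result = list(body)[0]  # the operation (or response) element
    return {child.tag.split("}")[-1]: child.text for child in result}
```

The Lambda function sits between these two steps: it receives JSON from API Gateway, posts the envelope to the legacy endpoint, and returns the flattened response to the modern client.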
As we mentioned in chapter 1, serverless technologies and architectures are not an all-or-nothing proposition. They can be adopted and used alongside traditional systems. The hybrid approach may work especially well if a part of the existing infrastructure is already in AWS. We’ve also seen adoption of serverless technologies and architectures in organizations with developers initially creating standalone components (often to do additional data processing, database backups, and basic alerting) and over time integrating these components into their main systems; see figure 2.6.
EPX Labs (http://epxlabs.com) proudly state that the “future of IT Operations and Application Development is less about servers and more about services.” They specialize in serverless architectures, with one of their recent solutions being a hybrid serverless system designed to carry out maintenance and management jobs on a distributed server-based infrastructure running on Amazon’s Elastic Compute Cloud (EC2) (figure 2.7).
Evan Sinicin and Prachetas Prabhu of EPX Labs describe the system they had to work with as a “multi-tenant Magento (https://magento.com) application running on multiple frontend servers. Magento requires certain processes to run on the servers such as cache clearing and maintenance operations. Additionally, all site management operations such as build, delete, and modify require a mix of on-server operations (building out directory structures, modifying configuration files, etc.) as well as database operations (creating new database, modifying data in database, and so on).” Evan and Prachetas created a scalable serverless system to assist with these tasks. Here’s how they describe how the system is built and the way it works:
GraphQL (http://graphql.org) is a popular data query language developed by Facebook in 2012 and released publicly in 2015. It was designed as an alternative to REST (Representational State Transfer) because of REST’s perceived weaknesses (multiple round-trips, over-fetching, and problems with versioning). GraphQL attempts to solve these problems by providing a hierarchical, declarative way of performing queries from a single end point (for example, api/graphql); see figure 2.8.
GraphQL gives power to the client. Instead of specifying the structure of the response on the server, it’s defined on the client (http://bit.ly/2aTjlh5). The client can specify what properties and relationships to return. GraphQL aggregates data from multiple sources and returns it to the client in a single round trip, which makes it an efficient system for retrieving data. According to Facebook, GraphQL serves millions of requests per second from nearly 1,000 different versions of its application.
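As a toy illustration of that single-endpoint, client-specified-fields idea (this is not the GraphQL query language itself, just a sketch of a resolver assembling a response from multiple hypothetical sources in one trip):

```python
# Hypothetical data sources the resolver aggregates from
USERS = {"u1": {"name": "Ada", "courseIds": ["c1", "c2"]}}
COURSES = {"c1": {"title": "Intro to Lambda"}, "c2": {"title": "Kinesis Basics"}}

FIELD_RESOLVERS = {
    "name": lambda user: user["name"],
    "courses": lambda user: [COURSES[c]["title"] for c in user["courseIds"]],
}

def handler(event, context):
    """Single endpoint: the client names exactly the fields it wants,
    and the function assembles only those from the relevant sources."""
    query = event["query"]
    user = USERS[query["user"]]
    return {field: FIELD_RESOLVERS[field](user) for field in query["fields"]}
```

The client that asks only for `name` never pays for the course lookup, which is the over-fetching problem GraphQL is designed to avoid.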
In serverless architectures, GraphQL is usually hosted and run from a single Lambda function, which can be connected to an API Gateway (there are also hosted solutions of GraphQL like scaphold.io). GraphQL can query and write to multiple data sources, such as DynamoDB tables, and assemble a response that matches the request. A serverless GraphQL is a rather interesting approach you might want to look at next time you need to design an interface for your API and query data. Check out the following articles if you want to implement GraphQL in a serverless architecture:
The compute-as-glue architecture shown in figure 2.9 describes the idea that we can use Lambda functions to create powerful execution pipelines and workflows. This often involves using Lambda as glue between different services, coordinating and invoking them. With this style of architecture, the focus of the developer is on the design of their pipeline, coordination, and flow of data. The parallelism of serverless compute services like Lambda helps to make these architectures appealing. The example you’re going to build in this book uses this pattern to create an event-driven pipeline that transcodes videos (chapter 3, in particular, focuses on creating pipelines and applying this pattern to solve a complex task rather easily).
EPX Labs has built a system to process large real estate XML feeds (figure 2.10). Evan Sinicin and Prachetas Prabhu say that the goal of their system is “to pull the feed, separate the large file into single XML documents, and process them in parallel. Processing includes parsing, validation, hydration, and storing.”
They go on to describe how the system works in more detail:
As discussed in section 2.1.3, Amazon Kinesis Streams is a technology that can help process and analyze large amounts of streaming data. This data can include logs, events, transactions, social media feeds—virtually anything you can think of—as shown in figure 2.11. It’s a good way to continuously collect data that may change over time. Lambda is a perfect tool for Kinesis Streams because it scales automatically in response to how much data there is to process.
With Kinesis Streams you can accomplish the following:
Patterns are reusable architectural solutions to common problems in software design. They’re also an excellent communication tool for developers working together on a solution. It’s far easier to find an answer to a problem if everyone in the room understands which patterns are applicable, how they work, and their advantages and disadvantages. The patterns presented in this section are useful for solving design problems in serverless architectures, but they aren’t exclusive to serverless. They were used in distributed systems long before serverless technologies became viable. Apart from the patterns presented in this chapter, we recommend that you become familiar with patterns relating to authentication (see chapter 4 for a discussion of the federated identity pattern), data management (CQRS, event sourcing, materialized views, sharding), and error handling (retry pattern). Learning and applying these patterns will make you a better software engineer, regardless of the platform you choose to use.
With the GraphQL architecture (section 2.2.4), we discussed the fact that a single end point can be used to cater to different requests with different data (a single GraphQL endpoint can accept any combination of fields from a client and create a response that matches the request). The same idea can be applied more generally. You can design a system in which a specific Lambda function controls and invokes other functions. You can connect it to an API Gateway or invoke it manually and pass messages to it to invoke other Lambda functions.
In software engineering, the command pattern (figure 2.12) is used to “encapsulate a request as an object, thereby letting you parameterize clients with different requests, queue or log requests, and support undoable operations” because of the “need to issue requests to objects without knowing anything about the operation being requested or the receiver of the request” (http://bit.ly/29ZaoWt). The command pattern allows you to decouple the caller of the operation from the entity that carries out the required processing.
In practice, this pattern can simplify the API Gateway implementation, because you may not want or need to create a RESTful URI for every type of request. It can also make versioning simpler: the command Lambda function could support different versions of your clients and invoke whichever Lambda function each client needs.
This pattern is useful if you want to decouple the caller and the receiver. Having a way to pass arguments as an object, and allowing clients to be parameterized with different requests, can reduce coupling between components and help make the system more extensible. Be careful with this approach if you need to return a response through the API Gateway, though, because adding another function to the path will increase latency.
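A minimal sketch of the routing at the heart of the command pattern follows. The command names are invented, and the plain callables stand in for `Invoke` calls against separate Lambda functions:

```python
# In a real system, each entry would call lambda_client.invoke(...) on a
# separate function; plain callables keep this sketch self-contained.
HANDLERS = {
    "createUser": lambda p: {"created": p["name"]},
    "sendEmail":  lambda p: {"queued": p["to"]},
}

def command_handler(event, context):
    """Single entry point: the message names the operation, and the
    command function routes it to the right worker without the caller
    knowing anything about the receiver."""
    command = event.get("command")
    if command not in HANDLERS:
        return {"error": f"unknown command: {command}"}
    return HANDLERS[command](event.get("payload", {}))
```

Versioning then becomes a routing concern: a `createUser.v2` entry can coexist with the old one while clients migrate.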
Messaging patterns, shown in figure 2.13, are popular in distributed systems because they allow developers to build scalable and robust systems by decoupling functions and services from direct dependence on one another and allowing storage of events/records/requests in a queue. The reliability comes from the fact that if the consuming service goes offline, messages are retained in the queue and can still be processed at a later time.
This pattern features a message queue with a sender that can post to the queue and a receiver that can retrieve messages from the queue. In terms of implementation in AWS, you can build this pattern on top of the Simple Queue Service. Unfortunately, at the moment Lambda doesn’t integrate directly with SQS, so one approach to addressing this problem is to run a Lambda function on a schedule and let it check the queue every so often.
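The scheduled-polling approach can be sketched as follows. `receive` and `delete` are injected stand-ins for SQS's receive-message and delete-message operations, so the example stays self-contained:

```python
def drain_queue(receive, process, delete, max_batches=10):
    """Body of a scheduled Lambda function that polls a queue.

    `receive` and `delete` stand in for SQS's receive_message and
    delete_message calls; `process` is your business logic.
    """
    handled = 0
    for _ in range(max_batches):          # bound the run so the function
        batch = receive()                 # stays inside its time limit
        if not batch:
            break
        for msg in batch:
            process(msg["Body"])          # delete only after a message
            delete(msg["ReceiptHandle"])  # has been handled successfully
            handled += 1
    return handled
```

Deleting only after successful processing matters: a message that fails mid-flight reappears on the queue once its visibility timeout expires, which is what gives the pattern its reliability.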
Depending on how the system is designed, a message queue can have a single sender/receiver or multiple senders/receivers. SQS queues typically have one receiver per queue. If you needed to have multiple consumers, a straightforward way to do it is to introduce multiple queues into the system (figure 2.14). A strategy you could apply is to combine SQS with Amazon SNS. SQS queues could subscribe to an SNS topic; pushing a message to the topic would automatically push the message to all of the subscribed queues.
Kinesis Streams is an alternative to SQS, although it lacks some of SQS’s features, such as dead-letter queues (http://amzn.to/2a3HJzH). Kinesis Streams integrates with Lambda, provides an ordered sequence of records, and supports multiple consumers.
This is a popular pattern used to handle workloads and data processing. The queue serves as a buffer, so if the consuming service crashes, data isn’t lost. It remains in the queue until the service can restart and begin processing it again. A message queue can make future changes easier, too, because there’s less coupling between functions. In an environment that has a lot of data processing, messages, and requests, try to minimize the number of functions that are directly dependent on other functions and use the messaging pattern instead.
A great benefit of using a platform such as AWS and serverless architectures is that capacity planning and scalability are more of a concern for Amazon’s engineers than for you. But in some cases, you may want to control how and when messages get dealt with by your system. This is where you might need to have different queues, topics, or streams to feed messages to your functions. Your system might go one step further and have entirely different workflows for messages of different priority. Messages that need immediate attention might go through a flow that expedites the process by using more expensive services and APIs with more capacity. Messages that don’t need to be processed quickly can go through a different workflow, as shown in figure 2.15.
This pattern might involve the creation and use of entirely different SNS topics, Kinesis Streams, SQS queues, Lambda functions, and even third-party services. Try to use this pattern sparingly, because additional components, dependencies, and workflows will result in more complexity.
This pattern works when you need to have a different priority on processing of messages. Your system can implement workflows and use different services and APIs to cater to many types of needs and users (for example, paying versus nonpaying users).
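In its simplest form, the priority decision is just a lookup with a sensible fallback. The queue names below are illustrative; in AWS they'd be SQS queue URLs, SNS topic ARNs, or Kinesis stream names:

```python
# Illustrative destinations for the two workflows
QUEUES = {"high": "urgent-queue", "normal": "standard-queue"}

def route(message, queues=QUEUES):
    """Send a message down the workflow matching its priority,
    falling back to the normal path for anything unrecognized."""
    return queues.get(message.get("priority"), queues["normal"])
```

Keeping the routing decision in one small function makes it cheap to add or retire a priority tier later without touching the workflows themselves.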
Fan-out is a type of messaging pattern that’s familiar to many users of AWS. Generally, the fan-out pattern is used to push a message out to all listening/subscribed clients of a particular queue or a message pipeline. In AWS, this pattern is usually implemented using SNS topics that allow multiple subscribers to be invoked when a new message is added to a topic. Take S3 as an example. When a new file is added to a bucket, S3 can invoke a single Lambda function with information about the file. But what if you need to invoke two, three, or more Lambda functions at the same time? The original function could be modified to invoke other functions (like the command pattern), but that’s a lot of work if all you need is to run functions in parallel. The answer is to use the fan-out pattern using SNS; see figure 2.16.
SNS topics are communications/messaging channels that can have multiple publishers and subscribers (including Lambda functions). When a new message is added to a topic, it forces invocation of all subscribers in parallel, thus causing the event to fan out. Going back to the S3 example discussed earlier, instead of invoking a single Lambda function, you can configure S3 to push a message onto an SNS topic to invoke all subscribed functions at the same time. It’s an effective way to create event-driven architectures and perform operations in parallel. You’ll implement this yourself in chapter 3.
This pattern is useful if you need to invoke multiple Lambda functions at the same time. An SNS topic will retry invoking your Lambda functions if it fails to deliver the message or if a function fails to execute. Furthermore, the fan-out pattern can be used for more than just invocation of multiple Lambda functions. SNS topics support other subscribers, such as email and SQS queues. Adding a new message to a topic can invoke Lambda functions, send an email, and push a message onto an SQS queue, all at the same time.
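The fan-out behavior can be illustrated in-process. The `Topic` class below is a stand-in for SNS, and the subscriber callables mimic transcoding-style tasks; the video key and output formats are invented for the example:

```python
class Topic:
    """Minimal in-process stand-in for an SNS topic: publishing one
    message invokes every subscriber with the same payload."""
    def __init__(self):
        self.subscribers = []

    def subscribe(self, fn):
        self.subscribers.append(fn)

    def publish(self, message):
        # SNS invokes subscribers in parallel; a simple loop stands in
        # for that here.
        return [fn(message) for fn in self.subscribers]

# Three "Lambda functions" reacting to the same S3 upload notification
topic = Topic()
topic.subscribe(lambda m: f"transcoding {m['key']} to 720p")
topic.subscribe(lambda m: f"transcoding {m['key']} to 1080p")
topic.subscribe(lambda m: f"generating metadata for {m['key']}")
```

One S3 notification published to the topic now drives all three tasks, with no function needing to know the others exist.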
The purpose of the pipes and filters pattern is to decompose a complex processing task into a series of manageable, discrete services organized in a pipeline (figure 2.17). Components designed to transform data are traditionally referred to as filters, whereas connectors that pass data from one component to the next component are referred to as pipes. Serverless architecture lends itself well to this kind of pattern. This is useful for all kinds of tasks where multiple steps are required to achieve a result.
We recommend that every Lambda function be written as a granular service or a task with the single-responsibility principle in mind. Inputs and outputs should be clearly defined (that is, there should be a clear interface) and any side effects minimized. Following this advice will allow you to create functions that can be reused in pipelines and more broadly within your serverless system. You might notice that this pattern is similar to the compute-as-glue architecture we described previously. The compute-as-glue architecture is closely inspired by this pattern.
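Following that advice, a pipeline is essentially function composition. The sketch below uses trivial filters (parse, validate, sum) to show the shape; in a serverless system each filter would be its own Lambda function, and the pipes would be events or queues rather than direct calls:

```python
from functools import reduce

def pipeline(*filters):
    """Connect single-purpose filters so that each one's output is
    piped into the next."""
    return lambda data: reduce(lambda acc, f: f(acc), filters, data)

# Each filter does one thing, with a clear input and output
parse    = lambda text: [int(x) for x in text.split(",")]
validate = lambda nums: [n for n in nums if n >= 0]
total    = sum

process = pipeline(parse, validate, total)
```

Because each filter has a clear contract, `validate` can be reused in an entirely different pipeline without modification, which is the payoff the single-responsibility advice is aiming for.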
Whenever you have a complex task, try to break it down into a series of functions (a pipeline) and apply the following rules:
This chapter focused on use cases, architectures, and patterns. These are critical to understand and consider before embarking on a journey to build your system. The architectures we discussed include the following:
In terms of patterns, we covered these:
Throughout the rest of this book, we’re going to apply elements we explored in this chapter, with a particular focus on creating compute-as-back-end and compute-as-glue architectures. In the next chapter, you’ll begin building your serverless application by implementing the compute-as-glue architecture and trying the fan-out pattern.