WordPress is a content management system that powers substantial parts of the internet. Like many other applications, WordPress uses a relational database to store articles, comments, users, and many other data. It is fair to say that relational databases are the de facto standard for storing and querying structured data, and many applications are built on top of a relational database system such as MySQL. Typically, relational databases focus on data consistency and guarantee ACID (atomicity, consistency, isolation, and durability) database transactions. A typical task is storing and querying structured data, such as the accounts and transactions in an accounting application. If you want to use a relational database on AWS, you have two options:
Use the managed relational database service Amazon RDS, which is offered by AWS.
Operate a relational database yourself on top of virtual machines.
The Amazon Relational Database Service (Amazon RDS) offers ready-to-use relational databases such as PostgreSQL, MySQL, MariaDB, Oracle Database, and Microsoft SQL Server. Beyond that, AWS offers its own engine called Amazon Aurora, which is MySQL and PostgreSQL compatible. As long as your application uses one of these database systems, it is not a big deal to switch to RDS. The trickiest part is migrating the data, which you will learn about in this chapter as well.
RDS is a managed service. The managed service provider—in this case, AWS—is responsible for providing a defined set of services—in this case, operating a relational database system. Table 10.1 compares using an RDS database and hosting a database yourself on virtual machines.
You’d need considerable time and know-how to build a comparable relational database environment based on virtual machines, so we recommend using Amazon RDS for relational databases whenever possible to decrease operational costs and improve quality. That’s why we won’t cover hosting your own relational database on VMs in this book. Instead, we’ll introduce Amazon RDS in detail.
In this chapter, you’ll launch a MySQL database with the help of Amazon RDS. Chapter 2 introduced a WordPress setup like the one shown in figure 10.1 and described next; you’ll use this example in this chapter, focusing on the database part:
The load balancer distributes the incoming request to a fleet of virtual machines.
Each virtual machine runs a web server, which connects to a MySQL database as well as a network filesystem.
After the MySQL database is up and running, you’ll learn how to import, back up, and restore data. More advanced topics like setting up a highly available database and improving the performance of the database will follow.
In the following section, you will launch the infrastructure required to run WordPress on AWS. In this chapter, we will focus on the MySQL database provided by RDS, but you can easily transfer what you learn to other database engines such as Aurora, PostgreSQL, MariaDB, Oracle, and Microsoft SQL Server, as well as to applications other than WordPress.
When you follow the official “How to Install WordPress” tutorial (see http://mng.bz/G1vV), one of the first steps is setting up a MySQL database. Formerly, you might have installed the database system on the same virtual machine that also runs the web server. However, operating a database system is not trivial. You have to implement a solid backup and recovery strategy, for example. Also, when running a database on a single virtual machine, you are introducing a single point of failure into your system that could cause downtimes of your website.
To overcome these challenges, you will use a fully managed MySQL database provided by RDS. AWS provides backup and restore functionality and offers database systems distributed among two data centers as well as the ability to recover from failure automatically.
Launching a database consists of two steps:
Next, you’ll use the same CloudFormation template you used in chapter 2 to spin up the cloud infrastructure for WordPress. The template can be found on GitHub and on S3. You can download a snapshot of the repository at https://github.com/AWSinAction/code3/archive/main.zip. The file we’re talking about is located at chapter10/template .yaml. On S3, the same file is located at http://s3.amazonaws.com/awsinaction-code3/chapter10/template.yaml.
Execute the following command to create a CloudFormation stack containing an RDS database instance with a MySQL engine and web servers serving the WordPress application:
$ aws cloudformation create-stack --stack-name wordpress --template-url ➥ https:/ /s3.amazonaws.com/awsinaction-code3/chapter10/template.yaml ➥ --parameters "ParameterKey=WordpressAdminPassword,ParameterValue=test1234" ➥ --capabilities CAPABILITY_IAM
You’ll have to wait several minutes while the CloudFormation stack is created in the background, which means you’ll have enough time to learn the details of the RDS database instance while the template is launching. The next listing shows parts of the CloudFormation template used to create the wordpress
stack. Table 10.2 shows the attributes you need when creating an RDS database instance using CloudFormation or the Management Console.
Size (also known as instance type) of the underlying virtual machine | |
Database engine (Aurora, PostgreSQL, MySQL, MariaDB, Oracle Database, or Microsoft SQL Server) you want to use | |
It is possible to configure a database instance as publicly accessible, but we generally do not recommend enabling access from the internet to your database, to prevent unwanted access. Instead, as shown in listing 10.1, an RDS instance should be accessible only within the VPC.
To connect to an RDS database instance, you need an EC2 instance running in the same VPC. First, connect to the EC2 instance. From there, you can then connect to the database instance.
Resources: # [...] DatabaseSecurityGroup: ① Type: 'AWS::EC2::SecurityGroup' Properties: GroupDescription: 'awsinaction-db-sg' VpcId: !Ref VPC SecurityGroupIngress: - IpProtocol: tcp FromPort: 3306 ② ToPort: 3306 SourceSecurityGroupId: !Ref WebServerSecurityGroup ③ Database: Type: 'AWS::RDS::DBInstance' ④ DeletionPolicy: Delete ⑤ Properties: AllocatedStorage: 5 ⑥ BackupRetentionPeriod: 0 ⑦ DBInstanceClass: 'db.t2.micro' ⑧ DBName: wordpress ⑨ Engine: MySQL ⑩ MasterUsername: wordpress ⑪ MasterUserPassword: wordpress ⑫ VPCSecurityGroups: - !Sub ${DatabaseSecurityGroup.GroupId} ⑬ DBSubnetGroupName: !Ref DBSubnetGroup ⑭ DependsOn: VPCGatewayAttachment DBSubnetGroup: Type: 'AWS::RDS::DBSubnetGroup' ⑮ Properties: Description: DB subnet group SubnetIds: - Ref: SubnetA ⑯ - Ref: SubnetB
① Security group for the database instance, allowing incoming traffic on the MySQL default port for web servers
② The default MySQL port is 3306.
③ References the security group of EC2 instances running a web server
④ Creates a database instance with Amazon RDS
⑤ Disables backups. (Set to “Snapshot” for production workloads.)
⑥ The database provides 5 GB of storage.
⑦ Disables backups. (Turn this on in production.)
⑧ Creates a default database named wordpress
⑨ Uses MySQL as the database engine
⑩ The size of the database instance is t2.micro, the smallest available size.
⑪ The username for the admin user of the MySQL database
⑫ The password for the admin user of the MySQL database
⑬ References the security group for the database instance
⑭ Defines the subnets the RDS database instance will launch into
⑯ ... consists of subnet A and B, so RDS will distribute database instances among those subnets.
See if the CloudFormation stack named wordpress
has reached the state CREATE
_COMPLETE
with the following command:
Search for StackStatus
in the output, and check whether the status is CREATE_ COMPLETE
. If not, you need to wait a few minutes longer (it can take up to 15 minutes to create the stack) and rerun the command. If the status is CREATE_COMPLETE
, you’ll find the key OutputKey
in the output section. The corresponding OutputValue
contains the URL for the WordPress blogging platform. The following listing shows the output in detail. Open this URL in your browser; you’ll find a running WordPress setup.
$ aws cloudformation describe-stacks --stack-name wordpress { "Stacks": [{ "StackId": "[...]", "Description": "AWS in Action: chapter 10", "Parameters": [...], "Tags": [], "Outputs": [ { "Description": "WordPress URL", "OutputKey": "URL", "OutputValue": "http://[...].us-east-1.elb.amazonaws.com" ① } ], "CreationTime": "2017-10-19T07:12:28.694Z", "StackName": "wordpress", "NotificationARNs": [], "StackStatus": "CREATE_COMPLETE", ② "DisableRollback": false }] }
① Opens this URL in your browser to open the WordPress application
② Waits for state CREATE_COMPLETE for the CloudFormation stack
Launching and operating a relational database like MySQL is that simple. Of course, you can also use the Management Console (https://console.aws.amazon.com/rds/) to launch an RDS database instance instead of using a CloudFormation template. RDS is a managed service, and AWS handles most of the tasks necessary to operate your database in a secure and reliable way. You need to do only two things:
Monitor your database’s available storage, and make sure you increase the allocated storage as needed.
Monitor your database’s performance, and make sure you increase I/O and computing performance as needed.
Both tasks can be handled with the help of CloudWatch monitoring, as you’ll learn later in the chapter.
The CloudFormation stack created an RDS database instance with the MySQL engine. Each database instance offers an endpoint for clients. Clients send their SQL queries to this endpoint to query, insert, delete, or update data. For example, to retrieve all rows from a table, an application sends the following SQL request: SELECT
*
FROM
table
. You can request the endpoint and detailed information of an RDS database instance with the describe-db-instances
command:
$ aws rds describe-db-instances --query "DBInstances[0].Endpoint" { "HostedZoneId": "Z2R2ITUGPM61AM", "Port": 3306, ① "Address": "wdwcoq2o8digyr.cqrxioeaavmf.us-east-1.rds.amazonaws.com" ② }
① Port number of database endpoint
② Host name of database endpoint
The RDS database is now running, but what does it cost?
What does it cost to host WordPress on AWS? We discussed this question in chapter 2 in detail. Here, we want to focus on the costs for RDS.
Databases on Amazon RDS are priced according to the size of the underlying virtual machine and the amount and type of allocated storage. Compared to a database running on a plain EC2 VM, the hourly price of an RDS instance is higher. In our opinion, the Amazon RDS service is worth the extra charge because you don’t need to perform typical DBA tasks like installation, patching, upgrades, migration, backups, and recovery.
Table 10.3 shows a pricing example for a medium-sized RDS database instance with a standby instance for high availability. All prices in USD are for US East (N. Virginia) as of March 11, 2022. Get the current prices at https://aws.amazon.com/rds/pricing/.
You’ve now launched an RDS database instance for use with a WordPress web application. You’ll learn about importing data to the RDS database in the next section.
Imagine you are already running WordPress on a virtual machine in your on-premises data center, and you have decided to move the application to AWS. To do so, you need to move the data from the on-premises MySQL database to RDS. You will learn how to do that in this section.
A database without data isn’t useful. In many cases, you’ll need to import data into a new database by importing a dump from the old database. This section will guide you through the process of importing a MySQL database dump to an RDS database with a MySQL engine. The process is similar for all other database engines (Aurora, PostgreSQL, MySQL, MariaDB, Oracle Database, and Microsoft SQL Server).
To import a database from your on-premises environment to Amazon RDS, follow these steps:
Start a virtual machine in the same region and VPC as the RDS database.
Run an import of the database dump to the RDS database on the virtual server.
We’ll skip the first step of exporting a MySQL database, because the RDS instance we created in our example is empty and you may not have access to an existing WordPress database. The next sidebar gives you hints on how to create a database dump in case you need that for your real-world systems later.
Theoretically, you could import a database to RDS from any machine from your on-premises or local network, but the higher latency over the internet or VPN connection will slow down the import process dramatically. Because of this, we recommend adding a second step: upload the database dump to a virtual machine running in the same AWS region and VPC, and import the database into RDS from there.
To do so, we’ll guide you through the following steps:
Download a database dump from S3 to the VM. (If you are using your own database dump, we recommend uploading it to S3 first.)
Import the database dump into the RDS database from the virtual machine.
Fortunately, you already started two virtual machines that you know can connect to the MySQL database on RDS, because they’re running the WordPress application. Go through the following steps to open a terminal session:
Open the EC2 service via AWS Management Console: https://console.aws.amazon.com/ec2/.
Because you are connected to a virtual machine with access to the RDS database instance, you are ready to import the database dump. First, change into the home directory of the ssm-user
as follows:
We prepared a MySQL database dump of a WordPress blog as an example. The dump contains a blog post and a few comments. Download this database dump from S3 using the following command on the virtual machine:
Next, you’ll need the port and hostname, also called the endpoint, of the MySQL database on RDS. Don’t remember the endpoint? The following command will print it out for you. Run this on your local machine:
Run the following command on the VM to import the data from the file wordpress-import.sql into the RDS database instance; replace $DBAddress
with the Address
you printed to the terminal with the previous command. The Address
will look similar to wdtq7tf5caejft.cd0o57zo3ohr.us-east-1.rds.amazonaws.com
. Also, type in wordpress
when asked for a password:
Point your browser to the WordPress blog again, and you’ll now find many new posts and comments there. If you don’t remember the URL, run the following command on your local machine to fetch it again:
$ aws cloudformation describe-stacks --stack-name wordpress ➥ --query "Stacks[0].Outputs[0].OutputValue" --output text
Over the years, your WordPress site has accumulated hundreds of blog posts and comments from the community. That’s a valuable asset. Therefore, it is key that you back up the data.
Amazon RDS is a managed service, but you still need backups of your database in case something or someone harms your data and you need to restore it, or you need to duplicate a database in the same or another region. RDS offers manual and automated snapshots for recovering RDS database instances. In this section, you’ll learn how to use RDS snapshots to do the following:
Configuring the retention period and time frame for automated snapshots
Restoring snapshots by starting new database instances based on a snapshot
Copying a snapshot to another region for disaster recovery or relocation
The RDS database you started in section 10.1 can automatically create snapshots if the BackupRetentionPeriod
is set to a value between 1
and 35
. This value indicates how many days the snapshot will be retained (the default is 1
). Automated snapshots are created once a day during the specified time frame. If no time frame is specified, RDS picks a random 30-minute time frame during the night. A new random time frame will be chosen each night.
Creating a snapshot requires all disk activity to be briefly frozen. Requests to the database may be delayed or even fail because of a timeout, so we recommend that you choose a time frame for the snapshot that has the least effect on applications and users (e.g., late at night). Automated snapshots are your backup in case something unexpected happens to your database. This could be a query that deletes all your data accidentally or a hardware failure that causes data loss.
The following command changes the time frame for automated backups to 05:00–06:00 UTC and the retention period to three days. Use the terminal on your local machine to execute it:
$ aws cloudformation update-stack --stack-name wordpress --template-url ➥ https:/ /s3.amazonaws.com/awsinaction-code3/chapter10/ ➥ template-snapshot.yaml ➥ --parameters ParameterKey=WordpressAdminPassword,UsePreviousValue=true ➥ --capabilities CAPABILITY_IAM
The RDS database will be modified based on a slightly modified CloudFormation template, as shown in the following listing.
Database: Type: 'AWS::RDS::DBInstance' DeletionPolicy: Delete Properties: AllocatedStorage: 5 BackupRetentionPeriod: 3 ① PreferredBackupWindow: '05:00-06:00' ② DBInstanceClass: 'db.t2.micro' DBName: wordpress Engine: MySQL MasterUsername: wordpress MasterUserPassword: wordpress VPCSecurityGroups: - !Sub ${DatabaseSecurityGroup.GroupId} DBSubnetGroupName: !Ref DBSubnetGroup DependsOn: VPCGatewayAttachment
① Keeps snapshots for three days
② Creates snapshots automatically between 05:00 and 06:00 UTC
If you want to disable automated snapshots, you need to set the retention period to 0
. As usual, you can configure automated backups using CloudFormation templates, the Management Console, or SDKs. Keep in mind that automated snapshots are deleted when the RDS database instance is deleted. Manual snapshots stay. You’ll learn about them next.
You can trigger manual snapshots whenever you need, for example, before you update to the latest WordPress version, migrate a schema, or perform some other activity that could damage your database. To create a snapshot, you have to know the instance identifier. The following command extracts the instance identifier from the first RDS database instance:
The next command creates a manual snapshot called wordpress-manual-snapshot
. Replace $DBInstanceIdentifier
with the output of the previous command:
$ aws rds create-db-snapshot --db-snapshot-identifier ➥ wordpress-manual-snapshot ➥ --db-instance-identifier $DBInstanceIdentifier
In case you get a “Cannot create a snapshot because the database instance .. is not currently in the available state.” error, retry after five minutes—your database is still initializing.
It will take a few minutes for the snapshot to be created. You can check the current state of the snapshot with this command:
RDS doesn’t delete manual snapshots automatically; you need to delete them yourself if you don’t need them any longer. You’ll learn how to do this at the end of the section.
Imagine a scary scenario: you have accidentally deleted all of the blog posts from your WordPress site. Of course, you want to restore the data as fast as possible. Lucky for you, RDS has you covered.
If you restore a database from an automated or manual snapshot, a new database will be created based on the snapshot. As figure 10.2 shows, you can’t restore a snapshot to an existing database.
Instead, a new database is created when you restore a database snapshot, as figure 10.3 illustrates.
To create a new database in the same VPC as the WordPress platform you started in section 10.1, you need to find out the existing database’s subnet group. Execute this command to do so:
$ aws cloudformation describe-stack-resource ➥ --stack-name wordpress --logical-resource-id DBSubnetGroup ➥ --query "StackResourceDetail.PhysicalResourceId" --output text
You’re now ready to create a new database based on the manual snapshot you created at the beginning of this section. Execute the following command, replacing $SubnetGroup
with the output of the previous command:
$ aws rds restore-db-instance-from-db-snapshot ➥ --db-instance-identifier awsinaction-db-restore ➥ --db-snapshot-identifier wordpress-manual-snapshot ➥ --db-subnet-group-name $SubnetGroup
You might get an “DBSnapshot must have state available but actually has creating” error if your snapshot has not been created yet. In this case, retry the command after five minutes.
A new database named awsinaction-db-restore is created based on the manual snapshot. In theory, after the database is created, you could switch the WordPress application to the new endpoint by modifying the /var/www/html/wp-config.php file on both virtual machines.
If you’re using automated snapshots, you can also restore your database from a specified moment, because RDS keeps the database’s change logs. This allows you to jump back to any point in time from the backup retention period to the last five minutes.
Execute the following command, replacing $DBInstanceIdentifier
with the output of the earlier describe-db-instances
command, $SubnetGroup
with the output of the earlier describe-stack-resource
command, and $Time
with a UTC timestamp from five minutes ago (e.g., 2022-03-11T09:30:00Z
):
$ aws rds restore-db-instance-to-point-in-time ➥ --target-db-instance-identifier awsinaction-db-restore-time ➥ --source-db-instance-identifier $DBInstanceIdentifier ➥ --restore-time $Time --db-subnet-group-name $SubnetGroup
A new database named awsinaction-db-restore-time is created based on the source database from five minutes ago. In theory, after the database is created, you could switch the WordPress application to the new endpoint by modifying the /var/www/ html/wp-config.php file on both virtual machines.
When you created the cloud infrastructure for WordPress, you assumed that most readers will come from the United States. It turns out, however, that most readers access your site from Europe. Therefore, you decide to move your cloud infrastructure to reduce latency for the majority of your readers.
Copying a database to another region is possible with the help of snapshots as well. The main reasons you might do so follow:
Disaster recovery—You can recover from an unlikely region-wide outage.
Relocating—You can move your infrastructure to another region so you can serve your customers with lower latency.
The second command copies the snapshot named wordpress-manual-snapshot
from the region us-east-1
to the region eu-west-1
. You need to replace $SourceSnapshotArn
with the Amazon Resource Name (ARN) of the snapshot. Use the following command to get the ARN of your manual snapshot:
$ aws rds describe-db-snapshots ➥ --db-snapshot-identifier wordpress-manual-snapshot ➥ --query "DBSnapshots[0].DBSnapshotArn" --output text
Compliance Moving data from one region to another may violate privacy laws or compliance rules. Make sure you’re allowed to copy the data to another region if you’re working with real data.
$ aws rds copy-db-snapshot ➥ --source-db-snapshot-identifier $SourceSnapshotArn ➥ --target-db-snapshot-identifier wordpress-manual-snapshot ➥ --region eu-west-1
After the snapshot has been copied to the region eu-west-1
, you can restore a database from it as described in the previous section.
Snapshots are billed based on the storage they use. You can store snapshots up to the size of your database instance for free. In our WordPress example, you can store up to 5 GB of snapshots for free. On top of that, you pay per GB per month of used storage. As we’re writing this book, the cost is $0.095 for each GB every month.
|
Every day, you can read about WordPress sites that got hacked. One essential aspect of protecting your WordPress site is controlling access to your cloud infrastructure and database.
The shared-responsibility model applies to the RDS service as well as to AWS services in general. AWS is responsible for the security of the cloud in this case—for example, for the security of the underlying OS. You, the customer, need to specify the rules controlling access to your data and RDS database.
Figure 10.4 shows the following three layers that control access to an RDS database:
Access to the RDS service is controlled using the IAM service. IAM is responsible for controlling access to actions like creating, updating, and deleting an RDS database instance. IAM doesn’t manage access inside the database; that’s the job of the database engine. IAM policies define which configuration and management actions an identity is allowed to execute on RDS. You attach these policies to IAM users, groups, or roles to control what actions they can perform on the database.
The following listing shows an IAM policy that allows access to all RDS configuration and management actions. You could use this policy to limit access by attaching it only to trusted IAM users and groups.
{ "Version": "2012-10-17", "Statement": [{ "Effect": "Allow", ① "Action": "rds:*", ② "Resource": "*" ③ }] }
① Allows the specified actions on the specified resources
② All possible actions on RDS service are specified (e.g., changes to the database configuration).
③ All RDS databases are specified.
Only people and machines that really need to make changes to RDS databases should be allowed to do so. The following listing shows an IAM policy that denies all destructive actions to prevent data loss by human failure.
{ "Version": "2012-10-17", "Statement": [{ "Effect": "Allow", ① "Action": "rds:*", ② "Resource": "*" ③ }, { "Effect": "Deny", ④ "Action": ["rds:Delete*", "rds:Remove*"], ⑤ "Resource": "*" ⑥ }] }
② ... to all actions related to RDS ...
⑤ ... to all destructive actions on the RDS service (e.g., delete database instance) ...
As discussed in chapter 5, when introducing IAM, a Deny
statement overrides any Allow
statement. Therefore, a user or role with the IAM policy attached does have limited access to RDS, because all actions are allowed except the ones that are destructive.
An RDS database is linked to security groups. Each security group consists of rules for a firewall controlling inbound and outbound database traffic. You already know about using security groups in combination with virtual machines.
The next listing shows the configuration of the security group attached to the RDS database in our WordPress example. Inbound connections to port 3306 (the default port for MySQL) are allowed only from virtual machines linked to the security group called WebServerSecurityGroup
.
DatabaseSecurityGroup: ① Type: 'AWS::EC2::SecurityGroup' Properties: GroupDescription: 'awsinaction-db-sg' VpcId: !Ref VPC SecurityGroupIngress: - IpProtocol: tcp FromPort: 3306 ② ToPort: 3306 SourceSecurityGroupId: !Ref WebServerSecurityGroup ③
① The security group for the database instance, allowing incoming traffic on the MySQL default port for web servers
② The default MySQL port is 3306.
③ References the security group for web servers
Only machines that really need to connect to the RDS database should be allowed to do so on the network level, such as EC2 instances running your web server or application server. See chapter 5 if you’re interested in more details about security groups (firewall rules).
A database engine also implements access control itself. User management of the database engine has nothing to do with IAM users and access rights; it’s only responsible for controlling access to the database. For example, you typically define a user for each application and grant rights to access and manipulate tables as needed. In the WordPress example, a database user called wordpress is created. The WordPress application authenticates itself to the database engine (MySQL, in this case) with this database user and a password.
Limiting write access to a few database users (e.g., only for an application)
Limiting access to specific tables to a few users (e.g., to one department in the organization)
Limiting access to tables to isolate different applications (e.g., hosting multiple applications for different customers on the same database)
User and access management varies between database systems. We don’t cover this topic in this book; refer to your database system’s documentation for details.
The availability of our blog cloudonaut.io is key to our business success. That’s the case for your WordPress site as well. Therefore, you should avoid downtimes when possible. This chapter is about increasing the availability of your database.
The database is typically the most important part of a system. Applications won’t work if they can’t connect to the database, and the data stored in the database is mission critical, so the database must be highly available and store data durably.
Amazon RDS lets you launch highly available (HA) databases. Compared to a default database consisting of a single database instance, an HA RDS database consists of two database instances: a primary and a standby database instance. You will be paying for both instances. All clients send requests to the primary database. Data is replicated between the primary and the standby database synchronously, as shown in figure 10.5.
We strongly recommend using high-availability deployment for all databases that handle production workloads. If you want to save money, you can choose not to deploy a highly available database for your test systems.
If the primary database becomes unavailable due to hardware or network failures, RDS starts the failover process. The standby database then becomes the primary database. As figure 10.6 shows, the DNS name is updated and clients begin to use the former standby database for their requests.
RDS detects the need for a failover automatically and executes it without human intervention.
Warning Starting a highly available RDS database will incur charges, about USD $0.017000 per hour. See https://aws.amazon.com/rds/pricing/ if you want to find out the current hourly price.
Execute the following command at your local terminal to enable high-availability deployment for the RDS database you started in section 10.1:
$ aws cloudformation update-stack --stack-name wordpress --template-url ➥ https:/ /s3.amazonaws.com/awsinaction-code3/ ➥ chapter10/template-multiaz.yaml ➥ --parameters ParameterKey=WordpressAdminPassword,UsePreviousValue=true ➥ --capabilities CAPABILITY_IAM
The RDS database is updated based on a slightly modified CloudFormation template as shown in the next listing.
Database: Type: 'AWS::RDS::DBInstance' DeletionPolicy: Delete Properties: AllocatedStorage: 5 BackupRetentionPeriod: 3 PreferredBackupWindow: '05:00-06:00' DBInstanceClass: 'db.t2.micro' DBName: wordpress Engine: MySQL MasterUsername: wordpress MasterUserPassword: wordpress VPCSecurityGroups: - !Sub ${DatabaseSecurityGroup.GroupId} DBSubnetGroupName: !Ref DBSubnetGroup MultiAZ: true ① DependsOn: VPCGatewayAttachment
① Enables high-availability deployment for the RDS database
It will take several minutes for the database to be deployed in HA mode. There is nothing more you need to do—the database is now highly available.
In addition to the fact that a high-availability deployment increases your database’s reliability, it offers another important advantage: reconfiguring or maintaining a single-mode database causes short downtimes. High-availability deployment of an RDS database solves this problem because AWS switches to the standby database during maintenance.
When search engines decide in which order to present the search results, the loading speed of a website is an important factor. Therefore, it is important to optimize the performance of your WordPress site. You will learn how to make sure the MySQL database is not slowing down your website in this section.
The easiest way to scale a RDS database, or an SQL database in general, is to scale vertically. Scaling a database vertically means increasing the following resources of your database instance:
Keep in mind that you can’t scale vertically (which means increasing resources) without limits. One of the largest RDS database instance types comes with 32 cores and 244 GiB memory. In comparison, an object store like S3 or a NoSQL database like DynamoDB can be scaled horizontally without limits, because they add more machines to the cluster if additional resources are needed.
When you start an RDS database, you choose an instance type. The instance type defines the computing power and memory of your virtual machine (such as when you start an EC2 instance). Choosing a bigger instance type increases computing power and memory for RDS databases.
You started an RDS database with instance type db.t2.micro
, the smallest available instance type. You can change the instance type using a CloudFormation template, the CLI, the Management Console, or AWS SDKs. You may want to increase the instance type if performance is inadequate. You will learn how to measure performance in section 10.7. Listing 10.8 shows how to change the CloudFormation template to increase the instance type from db.t2.micro
with one virtual core and 615 MB memory to db.m3.large
with two faster virtual cores and 7.5 GB memory. You’ll do this only in theory. Don’t do this to your running database because it is not covered by the Free Tier and will incur charges. Keep in mind that modifying the instance type causes a short downtime.
Database: Type: 'AWS::RDS::DBInstance' DeletionPolicy: Delete Properties: AllocatedStorage: 5 BackupRetentionPeriod: 3 PreferredBackupWindow: '05:00-06:00' DBInstanceClass: 'db.m3.large' ① DBName: wordpress Engine: MySQL MasterUsername: wordpress MasterUserPassword: wordpress VPCSecurityGroups: - !Sub ${DatabaseSecurityGroup.GroupId} DBSubnetGroupName: !Ref DBSubnetGroup MultiAZ: true DependsOn: VPCGatewayAttachment
① Increases the size of the underlying virtual machine for the database instance from db.t2.micro to db.m3.large
Because a database has to read and write data to a disk, I/O performance is important for the database’s overall performance. RDS offers the following three different types of storage, as you already know from reading about the block storage service EBS:
You should choose general-purpose (SSD) or even provisioned IOPS (SSD) storage for production workloads. The options are exactly the same as when using EBS for virtual machines. If you need to guarantee a high level of read or write throughput, you should use provisioned IOPS (SSD). The general-purpose (SSD) option offers moderate baseline performance with the ability to burst. The throughput for general purpose (SSD) depends on the amount of initialized storage size. Magnetic storage is an option if you need to store data at a low cost, or if you don’t need to access it in a predictable, performant way. The next listing shows how to enable general-purpose (SSD) storage using a CloudFormation template.
Database: Type: 'AWS::RDS::DBInstance' DeletionPolicy: Delete Properties: AllocatedStorage: 5 BackupRetentionPeriod: 3 PreferredBackupWindow: '05:00-06:00' DBInstanceClass: 'db.m3.large' DBName: wordpress Engine: MySQL MasterUsername: wordpress MasterUserPassword: wordpress VPCSecurityGroups: - !Sub ${DatabaseSecurityGroup.GroupId} DBSubnetGroupName: !Ref DBSubnetGroup MultiAZ: true StorageType: 'gp2' ① DependsOn: VPCGatewayAttachment
① Uses general-purpose (SSD) storage to increase I/O performance
A database suffering from too many read requests can be scaled horizontally by adding additional database instances for read traffic and enabling replication from the primary (writable) copy of the database instance. As figure 10.7 shows, changes to the database are asynchronously replicated to an additional read-only database instance. The read requests can be distributed between the primary database and its read-replication databases to increase read throughput. Be aware that you need to implement the distinction between read and write requests on the application level.
Tweaking read performance with replication makes sense only if the application generates many read requests and few write requests. Fortunately, most applications read more than they write.
Creating a read-replication database
Amazon RDS supports read replication for MySQL, MariaDB, PostgreSQL, Oracle, and SQL Server databases. To use read replication, you need to enable automatic backups for your database, as shown in section 10.3.
Warning Starting an RDS read replica will incur charges. See https://aws.amazon.com/rds/pricing/ if you want to find out the current hourly price.
Execute the following command from your local machine to create a read-replication database for the WordPress database you started in section 10.1. Replace the $DBInstanceIdentifier
with the value from aws
rds
describe-db-instances
--query
"DBInstances[0].DBInstanceIdentifier"
--output
text
:
$ aws rds create-db-instance-read-replica ➥ --db-instance-identifier awsinaction-db-read ➥ --source-db-instance-identifier $DBInstanceIdentifier
RDS automatically triggers the following steps in the background:
Creating a snapshot from the source database, also called the primary database instance
Activating replication between the primary and read-replication database instances
Creating an endpoint for SQL read requests to the read-replication database instances
After the read-replication database is successfully created, it’s available to answer SQL read requests. The application using the SQL database must support the use of read-replication databases. WordPress, for example, doesn’t support read replicas by default, but you can use a plug-in called HyperDB to do so; the configuration is tricky, so we’ll skip this part. You can get more information here: https://wordpress.org/plugins/hyperdb/. Creating or deleting a read replica doesn’t affect the availability of the primary (writable) database instance.
Promoting a read replica to a standalone database
Imagine your WordPress site became really popular. On the one hand, that’s a great success. On the other hand, you are struggling with handling all that load with your cloud infrastructure. You are thinking about adding read replicas to your database to decrease the load on the primary database.
If you create a read-replication database to migrate a database from one region to another, or if you have to perform heavy and load-intensive tasks on your database, such as adding an index, it’s helpful to switch your workload from the primary database to a read-replication database. The read replica must become the new primary database.
The following command promotes the read-replica database you created in this section to a standalone primary database. Note that the read-replication database will perform a restart and be unavailable for a few minutes:
The RDS database instance named awsinaction-db-read will accept write requests after the transformation is successful.
|
You’ve gained experience with the AWS relational database service in this chapter. We’ll end the chapter by taking a closer look at the monitoring capabilities of RDS.
To avoid downtime of your WordPress site, it is imperative that you monitor all important parts of your cloud infrastructure. The database is definitely one of the key components. That’s why you will learn about RDS monitoring in this section.
RDS is a managed service. Nevertheless, you need to monitor some metrics yourself to make sure your database can respond to all requests from applications. RDS publishes several metrics for free to AWS CloudWatch, a monitoring service for the AWS cloud. You can watch these metrics through the Management Console, as shown in figure 10.8, and define alarms for when a metric reaches a threshold.
Table 10.4 shows the most important metrics; we recommend that you monitor them by creating alarms.
We recommend that you monitor these metrics in particular, to make sure your database isn’t the cause of application performance problems.
|
In this chapter, you’ve learned how to use the RDS service to manage relational databases for your applications. The next chapter will focus on in-memory caches.
RDS is a managed service that provides relational databases.
You can choose between PostgreSQL, MySQL, MariaDB, Oracle Database, and Microsoft SQL Server databases. Aurora is the database engine built by Amazon.
The fastest way to import data into an RDS database is to copy it to a virtual machine in the same region and load it into the RDS database from there.
RDS comes with built-in backup and restore functionality allowing you to create and restore snapshots on demand as well as to restore a database to a certain point in time.
You can control access to data with a combination of IAM policies and firewall rules and on the database level.
You can restore an RDS database to any time in the retention period (a maximum of 35 days).
RDS databases can be highly available. You should launch RDS databases in Multi-AZ mode for production workloads.
Read replication can improve the performance of read-intensive workloads on an SQL database.
CloudWatch metrics allow you to monitor your database, for example, to debug performance problems or decide when to increase or decrease the size of your database instance.
3.19.30.232