© Shimon Ifrah 2021
S. Ifrah, Getting Started with Containers in Google Cloud Platform, https://doi.org/10.1007/978-1-4842-6470-6_9

9. Backup and Restore

Shimon Ifrah, Melbourne, VIC, Australia

In this chapter, we will learn how to back up and restore workloads in Google Cloud Platform (GCP). Backing up and restoring workloads in any infrastructure (self-hosted or cloud-hosted) is one of the most critical operations you will need to make sure you have under control. Many organizations neglect this part of the operation, only to find out when it is too late how important having a backup is.

The second important thing in any environment is to make sure your backups are working by testing your restore process. Over my sixteen years in IT, I have heard too many horror stories about companies that backed up workloads for years, only to find out the backups did not work because no one had tested the restore process and everyone assumed it would work.

In this chapter, we will cover the following topics:
  • Back Up and Restore Compute Engine VM Instances

  • Back Up GKE Persistent Storage Disks

  • Manage Cloud Storage and File Store

Back Up Compute Engine VM Instances

I would like to start with VM instances, which will help you understand the concept of backups in GCP.

Snapshots

In Google Cloud Platform, backups are done using snapshots of persistent disk volumes that are attached to resources like VM instances and GKE hosts. When it comes to VM instances, you run a backup by taking snapshots either manually or by using a scheduler, which is the recommended method.

When you run your first snapshot, GCP will take a full disk backup; however, after the first snapshot, any further backups will only contain the changes that were made to the disk since the previous snapshot. Using this method, backup size is smaller and so are the costs of running backups and keeping them for a long period of time.

Create Snapshot

Go ahead and create your first snapshot to learn by example and understand how backup and restore work in GCP. In my case, I have a running VM instance of which I will take a snapshot.

To create a snapshot of an existing machine, use the following process:
  • Open the GCP console and navigate to the Compute Engine console.

  • In the Compute Engine console’s left-hand navigation menu, click on Snapshots.

Figure 9-1 shows the Snapshots page.
Figure 9-1. Snapshots

  • On the Snapshots page, click on Create Snapshot.

  • On the Create Snapshot page, fill in the following details:
    • Name: This is the name of the backup set.

    • Source Disk: This is the persistent disk of the VM instance you are backing up, so make sure you get this value right. If you are not sure, click on the VM instance, scroll down to the Boot Disk section, and note the name of your disk or disks.

    • Location: This is very important. By default, GCP will back up your data to a multi-regional location for high-availability reasons; however, the cost is higher since the data is kept in more than one region, plus there are network traffic charges.

      If the workloads belong to a development environment, it might be smarter to use a single region. If you select a single region, make sure the backup is in the same region as the VM; otherwise, you will pay for network traffic.

  • When you are ready, click on Create to start the backup process.

Figure 9-2 shows the Create a Snapshot screen.
Figure 9-2. Create a snapshot

After you click Create, GCP will start the snapshotting process, regardless of whether the VM is running. When the process is complete, you will see the snapshot on the Snapshots page, as shown in Figure 9-3.
Figure 9-3. Snapshot

If you look at Figure 9-3, my snapshot has completed, and it is 369.86 MB in size.

Create a Snapshot Using Cloud SDK and gcloud

You can also create a VM instance snapshot using Cloud SDK and gcloud. The process is the same as running any gcloud command from Cloud Shell and Cloud SDK. The following code will take a snapshot of my dockerhost VM instance:
$ gcloud compute disks snapshot dockerhost --project=web-project-269903 --description="Linux Docker host backup" --snapshot-names=dockerhost --zone=us-central1-a --storage-location=us-central1
To view existing snapshots, use the following command:
$ gcloud compute snapshots list

Schedule Snapshots

To avoid having to remember to take backups, it is smart to automate the entire process. Using a snapshot schedule, you can configure the backup process to take place on specific days and times.

The process to use the snapshot schedule is simple. First, you create a schedule in the Snapshot Schedule console. Second, you configure the VM to use the schedule. Let’s see how this process works.

From the Snapshot Scheduler tab, located in the Snapshots console, click on Create Snapshot Schedule, as shown in Figure 9-4.
Figure 9-4. Create a schedule

On the Create a Snapshot Schedule page, you need to name the schedule and configure the following settings:
  • Name: Use a meaningful name that will help identify what type of backup it is.

  • Region: Select a region where the schedule will be located.

  • Snapshot location: Select a region where the backup data will be stored.

  • Schedule options: This is where we configure the schedule details (day and time).

When you finish, save the configuration. You can see the Create a Snapshot Schedule page in Figure 9-5.
Figure 9-5. Create a schedule
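The same schedule can also be created from Cloud Shell. The following is a minimal sketch rather than a definitive script: the schedule name daily-backup, the 02:00 UTC start time, and the 14-day retention are hypothetical values, and the GCLOUD variable is a helper I introduce so that the script only prints the command unless you set GCLOUD=gcloud.

```shell
# Sketch: create a daily snapshot schedule (all values are placeholders).
# Dry run by default: prints the command. Set GCLOUD=gcloud to execute it.
GCLOUD="${GCLOUD:-echo gcloud}"

create_schedule() {
  # $1 = schedule name, $2 = region for the schedule and the stored snapshots
  $GCLOUD compute resource-policies create snapshot-schedule "$1" \
    --region="$2" \
    --storage-location="$2" \
    --daily-schedule \
    --start-time=02:00 \
    --max-retention-days=14
}

create_schedule daily-backup us-central1
```

Keeping the schedule region and the snapshot location the same follows the earlier advice about avoiding network traffic charges.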

Attach VM Instance Disk to Snapshot Schedule

Now that you have the schedule ready, it is time to associate the schedule with a VM. When you attach a VM’s disk to a snapshot schedule, the disk will be backed up in accordance with the schedule.

You attach a schedule to a VM by opening the Disks section located on the left-hand navigation menu of the Compute Engine console, as shown in Figure 9-6. In the Disks section, select the disk that belongs to the VM to which you need to attach a schedule. In my case, the name of the disk is dockerhost.
Figure 9-6. Snapshot schedule

On the Disk Details page, scroll down and select the schedule from the Snapshot Schedule drop-down menu, as shown in Figure 9-6.

If you go back to the Snapshot Schedules page, you will see that the schedule is being used by my dockerhost disk, as shown in Figure 9-7.
Figure 9-7. Used by
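Attaching a disk to a schedule can also be sketched with gcloud. The disk name dockerhost and the zone us-central1-a come from my earlier snapshot example; the schedule name daily-backup is a hypothetical placeholder, and the GCLOUD variable is a helper I introduce so that nothing runs unless you set GCLOUD=gcloud.

```shell
# Sketch: attach an existing snapshot schedule to a disk.
# Dry run by default: prints the command. Set GCLOUD=gcloud to execute it.
GCLOUD="${GCLOUD:-echo gcloud}"

attach_schedule() {
  # $1 = disk name, $2 = schedule name, $3 = zone of the disk
  $GCLOUD compute disks add-resource-policies "$1" \
    --resource-policies="$2" \
    --zone="$3"
}

attach_schedule dockerhost daily-backup us-central1-a
```

The schedule has to live in the same region as the disk, which is why the zone here sits inside the region used when the schedule was created.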

Restore Compute Engine VM Instance

When it comes to restoring your snapshots, GCP gives you a few options, as follows:
  • Create a new VM from an existing snapshot: This option is good if you would like to have a copy of a running instance and compare or test the configuration of the instance in a sandbox environment.

  • Replace existing disk with a snapshot: This is probably what most people will use if they need to restore the VM. In this case, we first create a disk from a snapshot and attach it as a boot disk.

Let’s go ahead and start with the first option.

Create a New Instance from a Snapshot

To create a new VM instance from a disk snapshot, open the Snapshots console and click on the snapshot you would like to spin a VM from. Figure 9-8 shows my snapshots.
Figure 9-8. Snapshots

On the Snapshot Details page, click on the Create Instance button located on the top menu, as shown in Figure 9-9. Follow the Create Instance wizard to deploy a VM.
Figure 9-9. Create instance

In the wizard under Boot Disk, you will notice that the disk will be created from a snapshot, as shown in Figure 9-10.
Figure 9-10. Boot disk
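If you prefer the command line, creating an instance from a snapshot might look like the sketch below. The snapshot name dockerhost matches the snapshot I took earlier; the new instance name dockerhost-restore is hypothetical, and the GCLOUD variable is a helper I introduce so that the script prints the command instead of running it unless you set GCLOUD=gcloud.

```shell
# Sketch: create a new VM whose boot disk is built from a snapshot.
# Dry run by default: prints the command. Set GCLOUD=gcloud to execute it.
GCLOUD="${GCLOUD:-echo gcloud}"

create_from_snapshot() {
  # $1 = new instance name, $2 = source snapshot, $3 = zone
  $GCLOUD compute instances create "$1" \
    --source-snapshot="$2" \
    --zone="$3"
}

create_from_snapshot dockerhost-restore dockerhost us-central1-a
```

Machine type, network, and other settings fall back to the gcloud defaults here, so add the relevant flags if the copy needs to match the original instance.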

Replace Boot Disk

The second restore option is handier if you need to restore a running VM instance, because it will replace an existing boot disk or a secondary disk.

Create a New Disk from Snapshot

To use this option, you need to create a new disk first using the following process:
  • Open the Disks section from the Compute Engine console and click on Create Disk.

  • Name the disk.

  • In the Create Disk wizard, in the Source Type section, click on Snapshot and select the snapshot you would like to use.

Figure 9-11 shows the source type options.
Figure 9-11. Source type

Replace Disk of an Existing VM

Now it is time to attach the disk to an existing VM. Do so as follows:
  • First, stop the VM and click on it.

  • From the VM Details page, click on Edit and scroll down to the Boot Disk section, as shown in Figure 9-12.

  • Remove the existing disk using the X sign and click on the Add Item button.

Figure 9-12. Add disk

From the Add Item menu, click on the drop-down box under Name and select the disk you created in the previous subsection. In my case, the disk is called disk-1, and it appears in the list, as shown in Figure 9-12.
  • When you finish, save the settings and start the VM.
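The whole restore flow, from creating the disk to starting the VM again, can be sketched with gcloud as follows. I am assuming the instance, its current boot disk, and the snapshot are all named dockerhost, as in my examples, and that the new disk is disk-1. The GCLOUD variable is a helper I introduce so that the script only prints the commands unless you set GCLOUD=gcloud.

```shell
# Sketch: replace a VM's boot disk with a disk created from a snapshot.
# Dry run by default: prints each command. Set GCLOUD=gcloud to execute them.
GCLOUD="${GCLOUD:-echo gcloud}"

restore_boot_disk() {
  # $1 = instance, $2 = current boot disk, $3 = new disk, $4 = snapshot, $5 = zone
  $GCLOUD compute disks create "$3" --source-snapshot="$4" --zone="$5" &&
  $GCLOUD compute instances stop "$1" --zone="$5" &&
  $GCLOUD compute instances detach-disk "$1" --disk="$2" --zone="$5" &&
  $GCLOUD compute instances attach-disk "$1" --disk="$3" --boot --zone="$5" &&
  $GCLOUD compute instances start "$1" --zone="$5"
}

restore_boot_disk dockerhost dockerhost disk-1 dockerhost us-central1-a
```

The steps are chained with && so that a failure, for example while stopping the VM, prevents the later steps from running. The old disk is only detached, not deleted, so you can still go back to it.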

Back Up Persistent Storage Disks (GKE)

When it comes to Google Kubernetes Engine (GKE) backups and recovery, the process to manage GKE data is very similar to that for Compute Engine VM instances. Since GKE uses persistent storage (same as VM), we actually use the same interface and process to back up and restore data as we used in the previous section.

Let’s go ahead and deploy a stateful GKE application using the process we learned in Chapter 4. To deploy a stateful application, connect to your GKE cluster from the Cloud Shell terminal using the following gcloud command:

Note

You can find the connect command on the GKE cluster page by clicking on the Connect button next to the cluster name.

$ gcloud container clusters get-credentials cluster-1 --zone us-central1-c --project web-project-269903

After connecting to the GKE cluster, deploy a stateful application using the following line:

Note

This is the same random app we deployed in Chapter 4.

$ kubectl apply -f deploy_storage.yaml
After deploying the stateful application, from the GKE cluster console, click on Storage. On the Storage page, you will see the newly created volume, as shown in Figure 9-13.
Figure 9-13. GKE storage

If you click on the Disks section under Compute Engine, you will see the volume that belongs to the stateful application. Figure 9-14 shows the GKE volumes under Disks.
Figure 9-14. GKE disks

If you click on the disk, you will see to which application it belongs in the cluster. This part is very important when you need to back up and restore applications that are running on GKE.

In Figure 9-15, you can see that the volume belongs to the random-web-0 application in my example.
Figure 9-15. Disk details
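The mapping between a GKE volume and its Compute Engine disk can also be looked up from the terminal, and the disk can then be snapshotted like any other. This is a rough sketch under a few assumptions: the disk name appears in the pdName field for volumes created by the in-tree provisioner and in the CSI volume handle for volumes created by the CSI driver, so I print both columns; the disk and snapshot names passed at the end are hypothetical placeholders; and the KUBECTL and GCLOUD variables are helpers I introduce so that the script only prints the commands unless you set them to kubectl and gcloud.

```shell
# Sketch: find the disk behind a GKE volume, then snapshot it.
# Dry run by default: prints the commands. Set KUBECTL=kubectl and
# GCLOUD=gcloud to execute them.
KUBECTL="${KUBECTL:-echo kubectl}"
GCLOUD="${GCLOUD:-echo gcloud}"

list_volume_disks() {
  # One row per persistent volume: the claim it serves and its backing disk.
  $KUBECTL get pv -o "custom-columns=PV:.metadata.name,CLAIM:.spec.claimRef.name,DISK:.spec.gcePersistentDisk.pdName,CSI:.spec.csi.volumeHandle"
}

snapshot_gke_disk() {
  # $1 = persistent disk name, $2 = snapshot name, $3 = zone of the disk
  $GCLOUD compute disks snapshot "$1" --snapshot-names="$2" --zone="$3"
}

list_volume_disks
# Placeholder disk name: copy the real one from the DISK or CSI column.
snapshot_gke_disk my-gke-disk random-web-0-backup us-central1-c
```

Naming the snapshot after the application, here random-web-0, makes it easier to find the right backup when you need to restore.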

Manage Cloud Storage

Google Cloud Platform’s Cloud Storage feature allows you to create storage buckets and store unstructured data in them. Storage buckets are good for storing simple data and for applications that need to dump data or retrieve files from any location. You access Cloud Storage from the GCP console’s left-hand navigation menu, as shown in Figure 9-16.
Figure 9-16. Cloud Storage

About Cloud Storage

Create Bucket

To create a storage bucket, you need to use the Create Bucket button on the top menu, as shown in Figure 9-17.
Figure 9-17. Create storage

On the Create Storage Bucket page, start by giving the bucket a name. You also need to set the bucket location. As previously advised, always place the storage next to your applications for quick access. Figure 9-18 shows the name and region selection.
Figure 9-18. Location

Set Retention Policy

When it comes to Cloud Storage, backup and restore are completely different, because there is no backup option. Instead, Cloud Storage uses retention policies to protect your data from being lost.

It is very important you understand this point: without a retention policy, deleted data is gone forever and cannot be restored. A retention policy prevents objects in the bucket from being deleted or overwritten until they reach the age set in the policy.

For example, if I set a retention policy of two years, every object in the bucket would be protected from deletion for two years after it was uploaded.

Set a retention policy in the Retention Policy section on the Create Bucket setup page. By default, the retention policy is disabled. In my case, I will enable the retention policy with two days of retention just for testing purposes.

Figure 9-19 shows the Retention Policy section.
Figure 9-19. Retention policy
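A bucket with a retention policy can also be created with gsutil. In this sketch the bucket name my-example-bucket is a hypothetical placeholder (bucket names must be globally unique), the two-day retention matches my test setting, and the GSUTIL variable is a helper I introduce so that the command is only printed unless you set GSUTIL=gsutil.

```shell
# Sketch: create a bucket with a retention policy in one step.
# Dry run by default: prints the command. Set GSUTIL=gsutil to execute it.
GSUTIL="${GSUTIL:-echo gsutil}"

create_bucket() {
  # $1 = bucket name, $2 = location, $3 = retention period (for example 2d)
  $GSUTIL mb -l "$2" --retention "$3" "gs://$1"
}

create_bucket my-example-bucket us-central1 2d
```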

Add and Delete Files from Cloud Storage

To add files to Cloud Storage, you can use the GUI or the GCP API tools, which are described at the following link:

https://cloud.google.com/storage/docs/uploading-objects#rest-upload-objects

For the purpose of this demo, we will use the GUI.

Click on the bucket name on the Cloud Storage console and then click on the Upload Files button. You can also upload a folder using the Upload Folder button. In Figure 9-20 you can see that I have uploaded a file.
Figure 9-20. Upload files and folders

I will go ahead and delete the file by selecting it and clicking the Delete button that is located on the top menu. Figure 9-21 shows the Delete File menu.
Figure 9-21. Delete file

After the delete attempt, the file stays visible in the console, because the retention policy blocks deletion until the retention period has passed. The Retention Expiry Date column shows the date after which the file can be deleted. Figure 9-22 shows the retention expiry date.
Figure 9-22. Retention expiry date

Configure or Add Retention Policy

To review the retention policy or add a retention policy to an existing bucket, click on the bucket name and then click on the Bucket Lock tab, as shown in Figure 9-23. On the Bucket Lock page, you can add, modify, or delete a retention policy.
Figure 9-23. Bucket lock
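The same review and update can be done with gsutil. This is a small sketch, assuming a hypothetical bucket called my-example-bucket; the GSUTIL variable is a helper I introduce so that the commands are only printed unless you set GSUTIL=gsutil.

```shell
# Sketch: review and set the retention policy of an existing bucket.
# Dry run by default: prints the commands. Set GSUTIL=gsutil to execute them.
GSUTIL="${GSUTIL:-echo gsutil}"

show_retention() {
  # $1 = bucket name
  $GSUTIL retention get "gs://$1"
}

set_retention() {
  # $1 = bucket name, $2 = retention period (for example 2d)
  $GSUTIL retention set "$2" "gs://$1"
}

show_retention my-example-bucket
set_retention my-example-bucket 2d
```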

Create Lifecycle Rules

Using lifecycle rules, you can automate tasks like deleting data from storage buckets. To create a lifecycle rule, on the Bucket page, click on the Lifecycle option and select the Age condition.

In my case, I will use 100 days, as shown in Figure 9-24.
Figure 9-24. Set bucket age

In the Select Action section, there are four options; the first three can move old data to a different tier of storage that costs less. This is very useful if you use Cloud Storage to store data that you do not need very often.

The fourth option is to delete, and in my case, any data that is older than 100 days will get deleted from the bucket.

Figure 9-25 shows the Select Action menu.
Figure 9-25. Select action
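The 100-day delete rule can also be expressed as a small JSON file and applied with gsutil. In this sketch, the file name lifecycle.json and the bucket name my-example-bucket are hypothetical, and the GSUTIL variable is a helper I introduce so that the final command is only printed unless you set GSUTIL=gsutil.

```shell
# Sketch: write a lifecycle rule to a file and apply it to a bucket.
# Dry run by default: prints the gsutil command. Set GSUTIL=gsutil to execute it.
GSUTIL="${GSUTIL:-echo gsutil}"
RULES_FILE="${RULES_FILE:-lifecycle.json}"

# Rule: delete any object that is older than 100 days.
cat > "$RULES_FILE" <<'EOF'
{
  "rule": [
    {
      "action": {"type": "Delete"},
      "condition": {"age": 100}
    }
  ]
}
EOF

$GSUTIL lifecycle set "$RULES_FILE" gs://my-example-bucket
```

To move old data to a cheaper tier instead of deleting it, the action would use the SetStorageClass type together with a storageClass value such as NEARLINE.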

Back Up Compute Engine Resources

In this last section of the chapter, I will show you how to configure a static public IP address to be used by Compute Engine resources. By default, any public IP address that a VM instance is using is ephemeral, not static, and when the instance is stopped and started again, it can get a new IP address.

This process is not good at all if we are using the instance to host public sites, as public DNS entries use the public IP address to route traffic to websites and applications hosted on the host. Over the last few years, I have seen so many applications hosted on instances and various clouds that become inaccessible after reboot because a static IP address was not configured on the instance.

I hope that this section will prevent you from making this very common mistake.

To set a static IP address, open the VPC network console from the left-hand navigation menu, as shown in Figure 9-26.
Figure 9-26. VPC network

Click on the External IP Addresses section, and you will see all the external IP addresses used by your application. Under Type, you will see if the IP is ephemeral or static. An ephemeral IP can be replaced when the instance is stopped and started, while a static IP will stay. Figure 9-27 shows the External IP page.
Figure 9-27. External IPs

You can change the type by clicking on the type of the IP and choosing Static. Then, confirm the change, as shown in Figure 9-28. Please note that a static IP address has extra charges associated with it.
Figure 9-28. Change the IP type

To reserve a new static IP address, click on Reserve Static Address and fill in the details. In the wizard, you have the option to select the service tier type, region, and, most important, the VM instance that will use it. Figure 9-29 shows the configuration page.
Figure 9-29. Reserve IP
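Promoting an ephemeral address to a static one can also be sketched with gcloud. The address name dockerhost-ip is hypothetical, and 203.0.113.10 is a documentation-only address standing in for the external IP your VM currently uses. The GCLOUD variable is a helper I introduce so that the commands are only printed unless you set GCLOUD=gcloud.

```shell
# Sketch: promote the ephemeral external IP of a VM to a static address.
# Dry run by default: prints the commands. Set GCLOUD=gcloud to execute them.
GCLOUD="${GCLOUD:-echo gcloud}"

promote_ip() {
  # $1 = name for the reserved address, $2 = IP currently used by the VM, $3 = region
  $GCLOUD compute addresses create "$1" --addresses="$2" --region="$3"
}

promote_ip dockerhost-ip 203.0.113.10 us-central1

# List reserved addresses to confirm the result.
$GCLOUD compute addresses list
```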

Summary

In this chapter, we have learned how to back up and restore GCP resources that are using disk volumes for storage. In our case, we covered the process of backing up and restoring VM instance volumes and GKE storage volumes that are used by stateful applications.

In the last section, we learned how to assign a static IP address to a VM instance, which we need when public DNS records point at the instance and the address must not change on restart.

In the next and last chapter, we will cover troubleshooting.
