An introduction to S3 storage

The Simple Storage Service (usually called S3) is a storage service provided by Amazon as part of its Amazon Web Services (AWS) cloud offering. S3 was one of the first AWS services: it was made publicly available in March 2006 and presented as Amazon's in-house storage solution, opened up to everybody. S3 is very different from a relational database, and also different from a NoSQL database (a category that grew in popularity around the same time). It is based on the very simple principle of storing a value under an associated key and retrieving that value via the key. In its simplest definition, it is just a key/value store. There is no notion of searches, transactions, or joins in S3. It looks more like a filesystem than a database. However, it benefits from the usual cloud promises: scalability, availability, redundancy, and easy access via APIs.

S3 relies on only a few concepts:

Buckets
Objects
Keys

A bucket is a flat container that can hold an unlimited number of objects. Several buckets can be created per AWS account. An object is a binary blob, so it can be a file of any type, text or binary. The key is a UTF-8 string that identifies an object. Even though, in principle, any UTF-8 character can be used in a key name, it is best to use only characters that are valid in URLs. This is because objects are accessed via REST APIs, where the object key is part of the URL. Being a flat container, an S3 bucket has no notion of hierarchy: all objects are stored at a single root level in the bucket. However, it is possible to simulate a hierarchical structure by using the slash (/) character in the keys. In this case, the AWS S3 console displays such keys in a folder-like layout. The following figure shows how S3 is organized:

Figure 7.1: The Simple Storage Service, S3
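To make the flat-namespace idea concrete, the following is a minimal sketch in plain Python, with no AWS calls. It groups a flat list of keys into folder-like entries by splitting on the slash character, which is roughly how a folder view can be derived from keys alone. The function names list_folder and encode_key and the sample keys are hypothetical and chosen only for illustration. The second helper shows why URL-safe characters are preferable: any other character has to be percent-encoded before the key can appear in a URL.

```python
from urllib.parse import quote


def list_folder(keys, prefix="", delimiter="/"):
    """Split flat keys under `prefix` into direct 'files' and 'folders'."""
    files, folders = [], set()
    for key in sorted(keys):
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        head, sep, _ = rest.partition(delimiter)
        if sep:
            # A delimiter remains after the prefix: show it as a folder.
            folders.add(prefix + head + delimiter)
        else:
            files.append(key)
    return files, sorted(folders)


def encode_key(key):
    """Percent-encode a key for use in a URL path, keeping slashes."""
    return quote(key, safe="/")


# Hypothetical keys, all stored at the single root level of one bucket.
bucket_keys = [
    "readme.txt",
    "photos/2023/beach.jpg",
    "photos/2023/city.jpg",
    "photos/2024/snow.jpg",
    "logs/app.log",
]

print(list_folder(bucket_keys))
# (['readme.txt'], ['logs/', 'photos/'])
print(list_folder(bucket_keys, prefix="photos/"))
# ([], ['photos/2023/', 'photos/2024/'])
print(encode_key("reports/2024 summary.pdf"))
# reports/2024%20summary.pdf
```

Nothing in the sketch stores a folder anywhere: the hierarchy exists only in how the keys are read.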

Buckets can be created either in the AWS console or via HTTP APIs. Each object is a binary blob, identified by a key that is unique within its bucket. An object also has some metadata associated with it, such as the creation date. Custom metadata can be added when an object is created. Once the object is created, it is immutable: neither its content nor its metadata can be changed. If any part has to be changed, then another object with the same key must be put in the bucket, which replaces the existing object and its associated metadata.
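The replace-on-write behavior can be illustrated with a small in-memory model. This is only a sketch of the semantics described above, not the S3 API: the names ToyBucket and StoredObject, and the metadata keys, are hypothetical. A stored object is frozen once created, and putting a new object under an existing key replaces both the content and the metadata as a whole.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from types import MappingProxyType


@dataclass(frozen=True)
class StoredObject:
    """An immutable blob with read-only metadata."""
    body: bytes
    metadata: MappingProxyType
    created: datetime


class ToyBucket:
    """In-memory stand-in for a bucket (hypothetical, for illustration)."""

    def __init__(self):
        self._objects = {}

    def put(self, key, body, metadata=None):
        # A put never edits in place: it builds a brand-new object
        # and swaps it in under the key.
        self._objects[key] = StoredObject(
            body=bytes(body),
            metadata=MappingProxyType(dict(metadata or {})),
            created=datetime.now(timezone.utc),
        )

    def get(self, key):
        return self._objects[key]


bucket = ToyBucket()
bucket.put("notes/todo.txt", b"buy milk", {"author": "ann"})
first = bucket.get("notes/todo.txt")

# "Changing" the object means putting a new one under the same key.
bucket.put("notes/todo.txt", b"buy milk and eggs")
second = bucket.get("notes/todo.txt")

print(first.body, dict(first.metadata))    # the old object is unchanged
print(second.body, dict(second.metadata))  # old metadata is not carried over
```

Note that the second put did not pass any metadata, so the replacement object has none: the previous custom metadata is gone along with the previous content.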

Amazon provides command-line tools for working with S3, as well as an SDK that is available for many programming languages. S3 is one of the most popular AWS services, for two main reasons: it is easy to use and it is quite cost-efficient. However, the simplicity of this service comes with restrictions. The two main ones are that objects are immutable and that there is no search. S3 is therefore well suited to use cases where a key/value store is needed and the values are large objects. To retrieve an object, however, one has to know its key in advance: there is no way to search for an object based on its content or metadata.
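The consequence of having no search can be sketched in a few lines of plain Python. The store below answers only exact-key lookups, so any query by attribute has to go through an index that the application keeps itself. This side index is an assumption made for illustration and is not something the text above prescribes. All names (KeyValueStore, put_indexed, index) and the sample data are hypothetical.

```python
class KeyValueStore:
    """Toy stand-in for a bucket: lookup is by exact key only."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = value

    def get(self, key):
        return self._data[key]  # raises KeyError for an unknown key


store = KeyValueStore()
index = {}  # application-side index (assumption): author -> list of keys


def put_indexed(key, value, author):
    """Store the value and record its key under the author."""
    store.put(key, value)
    index.setdefault(author, []).append(key)


put_indexed("reports/q1.pdf", b"first quarter", "ann")
put_indexed("reports/q2.pdf", b"second quarter", "bob")
put_indexed("notes/todo.txt", b"buy milk", "ann")

# The store cannot answer "which objects did ann write?".
# The application's own index supplies the keys, then each is fetched.
keys_by_ann = index.get("ann", [])
print(keys_by_ann)
# ['reports/q1.pdf', 'notes/todo.txt']
print([store.get(k) for k in keys_by_ann])
# [b'first quarter', b'buy milk']
```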
