This chapter will be most useful to developers and architects considering building a system that includes Swift, or those writing code against an existing Swift installation. In the first section, we’ll offer a conceptual overview of Swift’s architecture as background for understanding what uses Swift is best suited for. In the second section, we’ll see how these theoretical concepts and practical realities influence Swift’s implementation, and thus which applications do best on a Swift platform. In the third and final section, we offer an applied overview of how to work with the Swift API, covering basics such as authentication, data retrieval and storage, and how to update and best use metadata.
API stands for “application programming interface,” which isn’t much more illuminating if you’re not clear on the term. An API is the set of rules for interacting with a software system. An API promises, “If you pass in this type of data in this manner, and if it meets these criteria, then you’ll get back this result; otherwise, you’ll get this type of error.” More loosely, an API is the set of instructions for how to use a software system.
The Swift API is a set of rules that specify what types of HTTP requests you are allowed to send to a Swift cluster, what types of success and failure responses you’ll get back under what circumstances, and what data those responses will contain. We’ll discuss the details of Swift’s API in this chapter.
But before we discuss how to use Swift, we’ll provide a little theory and background that will help explain how Swift works, the design decisions that shaped its implementation, and how those design decisions affect how you use Swift.
No One-Size-Fits-All Storage System briefly introduced the CAP theorem, which we’ll explore here in more detail since it can help us better understand Swift’s architecture.
The CAP theorem, proposed in 2000 by Eric Brewer of UC Berkeley and formally proven in 2002 by Seth Gilbert and Nancy Lynch of MIT, states that it is impossible for a distributed service to simultaneously guarantee consistency (all servers and clients see operations occur in the same order), availability (all client requests receive a response at all times), and partition tolerance (the system is usable even when arbitrary network links are unavailable).
Partition tolerance, or the P in CAP—formally, “the network will be allowed to lose arbitrarily many messages sent from one node to another”—defines our concept of a distributed system. Systems that aren’t partition tolerant are either single-node systems or not useful. The availability guarantee might seem easy—surely a server can simply try to communicate with its peers to update, and return an error code if it doesn’t get a response from them?—but a subtlety makes it more complicated: we don’t know whether the message was lost going from the server to the peer (in which case the peer has the old value), or from the peer back to the server (in which case the peer successfully updated). So when there is no acknowledgment, our original server doesn’t know whether to respond with success or failure!
More recently, other researchers have refined the concepts behind the CAP theorem. At Yale, Daniel Abadi points out that unlike consistency and availability, partition tolerance is not something that a design can choose to trade off: either the system is partition tolerant (i.e., it is a correctly functioning distributed system of some kind), or it is not. What is more precisely of interest is how the system behaves under a network partition, and how it behaves when the network isn’t partitioned. Abadi calls it “PACELC”: during a partition (P), does the system prioritize availability (A) or consistency (C); else (E), if there is no partition, does it prioritize latency (L) or consistency (C)?
Others warn about the use of the term “consistency,” which means something different in the CAP sense—a valid total ordering of operations—than it does in the ACID (atomicity, consistency, isolation, and durability) sense in databases, where it refers to database constraints not being violated.
A discussion of the proof of the CAP theorem (and its interesting edge cases) is beyond the scope of this book, but the theorem provides a simple rationale for why many distributed systems are designed to be either consistent (i.e., transactional, like a distributed database) or available (like Swift) but not both. It should be noted that this is an oversimplification of the result proven by the CAP theorem, and Brewer and others have written about how systems (including Swift) choose to be either consistent or available in some circumstances when they could have both properties. Nevertheless, it is certainly true that maintaining consistency and availability in the face of network partitions is difficult at best, introducing complexity and potential brittleness into the system, and at worst it’s been proven impossible.
Swift is a classic AP system in the CAP theorem’s sense: it provides high availability in the face of network partitions, but different clients might see operations in a different order and thus see inconsistent data. In fact, in certain unusual circumstances, it is possible for a single client to write a new value, see the write complete successfully, issue a read for that value, and retrieve the old data. Similarly, you can successfully create an object, then immediately list the container and not see your object. Given time and sufficiently low system load, Swift is guaranteed to converge to a consistent state. However, applications must be designed to accommodate Swift being in an inconsistent state at times until consistency is restored. This might be a surprise for developers accustomed to ACID database guarantees.
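One way an application can accommodate a listing that lags behind a successful write is to poll with a backoff. The following is a minimal sketch, not a prescribed pattern: `list_container` is a hypothetical callable standing in for a GET on the container, and the attempt count and delays are arbitrary illustrative values.

```python
import time


def wait_until_listed(list_container, name, attempts=5, delay=0.5, sleep=time.sleep):
    """Poll a container listing until `name` appears or attempts run out.

    `list_container` is a hypothetical callable returning the current list
    of object names; it stands in for a GET request on the container.
    """
    for attempt in range(attempts):
        if name in list_container():
            return True
        if attempt < attempts - 1:
            # Exponential backoff: delay, 2*delay, 4*delay, ...
            sleep(delay * (2 ** attempt))
    return False
```

A caller that gets `False` back should treat the listing as "not yet consistent" rather than as proof that the write failed.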
Swift belongs to a class of systems described by the term BASE (basically available, soft state, eventual consistency), an ever so slightly contrived acronym intended to distinguish them from ACID systems. Of those descriptors, eventual consistency is the one that seems to be most challenging to developers accustomed to ACID.
What kinds of applications are good candidates for a Swift-based architecture? What applications won’t work well with Swift?
Due to Swift’s eventual consistency, any application that has a transactional nature or needs ACID guarantees (such as a travel booking system that must ensure that two people never buy the same ticket, or a banking system that must ensure that two transactions don’t withdraw the same money) generally will not be a good fit for Swift. Similarly, if your application requires that clients nearly always see the same data (and if that fails due to timing delays, that clients always agree afterward on the ordering of mixed read and write operations), you should not choose an AP system for your data storage. Although Swift can help guard against conflicts with headers like ETag and If-Unmodified-Since, locking or transactional operations are not the core strength of Swift or any AP system.
On the other hand, Swift excels at high availability, redundancy, throughput—and, of course, capacity. To focus on availability over consistency, Swift has no transaction or locking delays. Large numbers of simultaneous reads are fast. Even large numbers of simultaneous writes complete quickly: each write notes its timestamp, and eventual consistency ensures that even under extreme load, conflicts are resolved after some delay.
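The idea of resolving conflicting writes by timestamp can be illustrated with a toy model. This sketch assumes that the newest timestamp wins; the `Version` class and `reconcile` function are invented for illustration and are not Swift's replication code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Version:
    timestamp: float  # when the write was accepted
    value: bytes      # the data written


def reconcile(replicas):
    """Return the version all replicas should converge to: the newest write."""
    return max(replicas, key=lambda version: version.timestamp)
```

No locks are taken at write time in this model; each replica accepts its write immediately and the disagreement is settled later, which is the trade-off the paragraph above describes.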
Swift is excellent for high availability: if the network is unreliable, a strongly consistent distributed system must grind to a halt, but Swift can continue to store and retrieve data. Having a configurable number of redundant copies of your data means that you can arbitrarily increase Swift’s throughput for heavy loads, or increase Swift’s durability in the face of multiple simultaneous drive failures.
Swift also supports multi-region clusters, where redundant servers are located not on different racks in the same data center, but rather anywhere in the world. Because this requires traffic over high-latency routes, the situation resembles a partitioned network, and the performance of any consistent system necessarily becomes horrible. Swift, on the other hand, not only continues to function, but provides distributed points of presence for low-latency Internet access. Although this is likely to have a negative impact on the resolution time required to regain consistency after each write, many systems are willing to accept that trade-off, which would render a strongly consistent system largely unusable.
In short, if your system is willing to accept the limits of Swift’s eventual consistency guarantees in exchange for the benefits of high availability, redundancy, and throughput (and huge capacity), Swift is likely to be a good choice for your architecture. Examples of excellent uses for Swift include:
Fundamentally, Swift is accessed via a RESTful HTTP API. Many developers are already familiar with HTTP and some are familiar with the tricks and traps associated with it. REST is a term that has deviated in common usage from its original and more useful definition. It is often informally used simply to refer to a stateless communication protocol, while the original definition of REST is a much more expansive and powerful term that includes service discoverability and network optimization for servicing requests. Before discussing the Swift API, let’s quickly review the basics of HTTP and REST to ensure consistent terminology.
Because Swift relies on HTTP verbs and headers for both user metadata and system parameters, it is useful to review HTTP methods. Although many software systems send data over HTTP, they often do so in ad hoc ways. Swift uses the semantics of HTTP (mostly) as the protocol’s authors intended.
When you access a simple static web page in your browser, the browser performs a series of actions on your behalf:
The server’s reply begins with a status (such as 200 OK or 404 File Not Found).
Here is a representative HTTP request and response:
    GET /path/to/page.html HTTP/1.1
    Host: www.example.com
    User-Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:26.0) Gecko/20100101 Firefox/26.0
    Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
    Accept-Language: en-US,en;q=0.5
    Accept-Encoding: gzip, deflate
    Cookie: cookie1=aaa; cookie2=bbb
    Connection: keep-alive
    X-User-Full-Name: Alice Adams

    HTTP/1.1 200 OK
    Content-Encoding: gzip
    Content-Type: text/html; charset=UTF-8
    Date: Fri, 27 Dec 2013 23:24:03 GMT
    Server: Apache/2.2.22 (Ubuntu)
    Last-Modified: Thu, 23 Apr 2009 21:43:46 GMT
    ETag: "b221a-13d-4683fc64ce880"
    Accept-Ranges: bytes
    Set-Cookie: cookie1=yyy; cookie2=zzz
    Content-Length: 8317
    X-Frame-Options: SAMEORIGIN
    X-XSS-Protection: 1; mode=block

    [content of the web page goes here]
In both the request and response, arguably the most important content is the first line. The first line of the request, GET /path/to/page.html HTTP/1.1, tells the server that the client is issuing a GET request against the URL /path/to/page.html using the HTTP 1.1 protocol. The first line of the response informs the client that the server replied in the HTTP 1.1 protocol and indicated that the request was successful.
After the first line of the request, the client uses request headers to communicate additional information that affects the request, such as what formats, languages, and encodings it is willing to accept, what browser is being used, and some custom information in a header starting with X-. Take note of this convention of reserving the X- header namespace for application-specific purposes; Swift makes heavy use of that feature of HTTP.
The first line of the server response gives its status code and a standard text description, in this case “200” and “OK,” respectively. There are many other status codes for the various success and error conditions, summarized in Getting a Response. Statuses in the 200-299 range are various types of success, while the 300-399 range is used for redirections, the 400-499 range for client errors, and the 500-599 range for server errors. (The informational 100-199 range is less commonly seen by developers and users, though it is actively used.)
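The status ranges just described map directly onto a small lookup. This is a minimal sketch; the function name and the category labels are chosen here for illustration.

```python
def status_class(code):
    """Classify an HTTP status code by its hundreds digit."""
    if not 100 <= code <= 599:
        raise ValueError(f"not an HTTP status code: {code}")
    return {
        1: "informational",
        2: "success",
        3: "redirection",
        4: "client error",
        5: "server error",
    }[code // 100]
```

Client code often branches on the class first (retry on server errors, fix the request on client errors) and only then on the specific code.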
After the first line of the response, the server provides some metadata in its headers, such as the length of the content, the date it was last modified, and the fact that the content is compressed with gzip. It also uses headers starting with X- for its own application-specific purposes.
This example showed a GET request, the most common type of request on the Web. GET requests are used to retrieve data. Servers should not interpret a GET request in a way that modifies data, but keep in mind that not all servers are good citizens in all circumstances. (For example, a website might offer users the ability to delete data and, in a poorly implemented system, that might be transmitted to the server as a GET request. While most modern sites would not do this, it has definitely happened in the past and it illustrates that you shouldn’t depend on external systems to follow best practices.)
HTTP supports several other methods (also called verbs) besides GET. The HEAD method retrieves only the headers, not the actual data; it functions like a GET request that omits the data, which can be useful if the data is very large and if we might want to take action based on just the headers. The PUT method is used to store data on servers that support it, under appropriate circumstances, such as when the user is authorized to do so. The DELETE method requests that the server delete the information at the given URL, again depending on authorization. The OPTIONS method asks the server what methods are allowed for the given URL. Web browsers such as Firefox, Internet Explorer, and Chrome generally don’t expose HTTP methods other than GET and POST to users, but the other HTTP methods can be invoked with command-line tools such as cURL or with client-side JavaScript (AJAX). Modern web applications make extensive use of these HTTP methods.
Although the GET, HEAD, PUT, DELETE, and OPTIONS methods perform very different actions, they have one important common feature: all of them are idempotent. This means that repeated identical requests yield the same result as a single request. For example, if you PUT the same data to the server several times in a row, the result is the same as if you had PUT the data once. If you DELETE a URL several times in a row, again the result is the same as if you had performed a single DELETE. Issuing a GET or HEAD or OPTIONS request doesn’t change the data on the server, so of course issuing several of them doesn’t change the data, either.
The POST method, however, is different. POST is the HTTP method reserved for requests that might not be idempotent. For example, a POST might be used to create a new object, such as a new credit card transaction. If you issue the POST twice, you’ve created two transactions, not one: a very different result indeed!
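The difference between idempotent and non-idempotent methods can be seen in a toy in-memory server. This sketch is purely illustrative: `ToyStore` and its methods are invented here and say nothing about how Swift or any real server stores data.

```python
import itertools


class ToyStore:
    """In-memory stand-in for a server, used only to illustrate idempotency."""

    def __init__(self):
        self.resources = {}
        self._ids = itertools.count(1)

    def put(self, url, data):
        # Idempotent: repeating the call leaves the same final state.
        self.resources[url] = data

    def delete(self, url):
        # Idempotent: deleting twice still leaves the resource deleted.
        self.resources.pop(url, None)

    def post(self, collection_url, data):
        # Not idempotent: every call creates a new resource.
        url = f"{collection_url}/{next(self._ids)}"
        self.resources[url] = data
        return url
```

Calling `put` twice with the same arguments leaves one resource, while calling `post` twice leaves two, which is why a client can safely retry a timed-out PUT but must be careful about retrying a POST.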
This carefully structured use of the HTTP verbs was laid out in the HTTP 1.1 specification in the late 1990s. However, due to incomplete and revised specifications, early browsers based their protocols on HTTP 1.0 and only implemented GET and POST, leading to websites that only used GET and POST. But starting early in the new millennium, the usefulness of the other verbs became obvious, and newer web development frameworks such as Rails depend on them.
In today’s web development, nearly all frameworks use HTTP methods beyond GET and POST. Rails and Sinatra are the leading examples in the Ruby world; in Python, Tastypie and Piston are two representative examples among many ways to modernize Django in that regard. (Django itself, unfortunately, predates the common use of extended HTTP methods.) It will come as no surprise that Swift also uses these extended HTTP methods.
Roy Fielding, one of the authors of the HTTP specification, wrote a PhD dissertation in 2000 introducing his concept of REST. Fielding’s formulation included many interrelated concepts, including service discoverability, an idea of URIs referencing resources, a noun-based protocol in contrast with RPC-style verb-based protocols, caching and proxy behavior, a model of a network optimized for servicing requests, and other insightful observations. Sadly, the term “REST” is frequently misused to refer merely to stateless communication over HTTP, often with a create-read-update-delete (CRUD) API.
Although HTTP is a fine protocol for a service-oriented architecture, and stateless communication between entities provides many benefits (such as better isolation and testability of architectural components), Fielding’s use of the term “REST” encompasses a much wider vision. In particular, the concept of service discoverability is a valuable idea that makes a RESTful service easy to develop against and easy to support across multiple versions with differing capabilities. The interested reader is urged to read Fielding’s very accessible dissertation.
With that background, explaining the Swift API becomes quite simple.
Swift uses a RESTful HTTP API. Data is retrieved with an HTTP GET, stored with a PUT, and deleted with (you guessed it!) a DELETE request. Metadata is stored in headers, so a HEAD request is often useful.
Swift does a better-than-average job of using HTTP status codes correctly, so that a 404 actually means “File Not Found” rather than “something isn’t right somewhere.” Clients may issue any of several extra HTTP client headers, specifying more advanced Swift features to apply to their requests.
Swift uses HTTP server headers to provide clients with many different types of metadata related to the request. As a developer, if you have a way to issue HTTP requests and receive responses (including headers), whether in the form of a Swift client library or a command-line tool such as cURL, you have all you need to start developing against Swift. Unlike many other systems, no special SDK (software development kit) is necessary—plain HTTP is all you need to access all of Swift’s features.
Of course, not every client is authorized for every operation; Alice shouldn’t be allowed to delete Bob’s data, for example. The general problem is that of authentication (verifying who a user is) and authorization (determining what a user is allowed to do), collectively referred to as “auth.” Swift allows system administrators to configure any of several auth systems, depending on their needs. The auth API is versioned, with auth versions 1 through 3 in active use; we will mostly discuss version 1 auth in this book because it is the simplest and suitable for many needs. In version 1 auth, a request is made to the auth system, with the X-Auth-User and X-Auth-Key headers set; if the user gives the correct key (password), then Swift issues a response containing an X-Auth-Token header. That token is used on subsequent requests to prove that the client has successfully authenticated. The response to a successful authentication request also contains an X-Storage-Url header, specifying the root URI of the user’s primary account. Auth requests will be discussed in more detail later in this chapter in Using the Swift API.
Swift’s RESTful HTTP API produces a system that is easy for developers to understand and characterize. Each API call is a single HTTP round trip. With the exception that a valid auth token must be retrieved for use in later operations, each HTTP round trip requires no other knowledge about the system. This ability to control and observe its state makes Swift easier to test and qualify as part of your architecture than other platforms. You can replicate HTTP content with command-line tools such as cURL in order to confirm system behavior, but when building applications, most developers will want to use a Swift library for their preferred language. In the next chapter, we will provide examples from the python-swiftclient library, with some discussion of other language bindings.
Although we’ll discuss the Swift API in more detail in later chapters, here we present an overview to help readers understand what the RESTful HTTP API looks like, and to help you see how integrating Swift into an existing architecture can be relatively simple. This section assumes a working Swift installation, a valid username and password, and a storage account for that username available in the cluster.
Before you even try to access any data, you might want to get some information about your Swift cluster. What version of Swift is it running? What values are configured for its parameters, such as maximum object size? What features are enabled by middleware in its pipeline? This information is all available via the Swift cluster info API, which can be accessed with a simple GET request to the path /info under the Swift cluster’s base URL. For example, if your Swift cluster’s base URL is http://swift.example.com, you can get the cluster info with a GET request to the URL http://swift.example.com/info. The cluster info API is public; no authentication is required.
The response to an info request will be a JSON dictionary such as the following:
    {
        "tempauth": {
            "account_acls": true
        },
        "slo": {
            "max_manifest_segments": 1000,
            "min_segment_size": 1048576,
            "max_manifest_size": 2097152
        },
        "swift": {
            "max_file_size": 5368709122,
            "account_listing_limit": 10000,
            "max_meta_count": 90,
            "max_meta_value_length": 256,
            "container_listing_limit": 10000,
            "version": "1.12.0.37.g45feab5",
            "max_meta_name_length": 128,
            "max_object_name_length": 1024,
            "max_account_name_length": 256,
            "max_container_name_length": 256
        }
    }
A Swift client application can retrieve and parse the cluster info, and use the results to determine how to proceed based on the cluster’s advertised capabilities. This is one way in which Swift implements the RESTful principle of discoverability.
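A client might consume the cluster info roughly as follows. This is a sketch under stated assumptions: the function names are invented, and treating every top-level key other than "swift" as an enabled feature is an inference from the sample response above, not a documented rule.

```python
import json


def summarize_info(info_json):
    """Pull a few commonly needed values out of a /info response body."""
    info = json.loads(info_json)
    core = info.get("swift", {})
    return {
        "version": core.get("version"),
        "max_file_size": core.get("max_file_size"),
        # Assumption: top-level keys besides "swift" name enabled features.
        "features": sorted(key for key in info if key != "swift"),
    }


def fits_in_one_object(info_json, size_bytes):
    """Check a payload size against the advertised max_file_size, if any."""
    limit = json.loads(info_json).get("swift", {}).get("max_file_size")
    return limit is None or size_bytes <= limit
```

An uploader could call `fits_in_one_object` before a PUT and fall back to a different strategy when the payload exceeds the cluster's limit.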
With a few exceptions (such as the info request described in the previous section), an authentication token (“auth token”) is a prerequisite for nearly all Swift operations. From a cold start, the first step in a Swift operation is to authenticate and receive an auth token. Because Swift allows administrators to plug in their auth system of choice, the auth URL (that is, the URL to which requests for authentication are sent) is separate from the storage URL (which specifies where data is stored) of the user’s primary account. In order to authenticate to Swift, you need to have the cluster’s auth URL—and, of course, your username and password.
Your authentication request then looks like this:
curl -i -X GET -H 'X-Auth-User: myusername' -H 'X-Auth-Key: mysecretpassword' https://swift.example.com/auth/v1.0
In this example, https://swift.example.com/auth/v1.0 is the auth URL. Generally, Swift traffic should happen over the encrypted HTTPS connection, in order to protect all data on the wire (credentials and tokens for an auth request, or the stored content itself).
The auth system then produces a response. If there is no problem with the Swift installation, and your username and password are correct, then the authentication response code is 200 OK and the response headers will return with the auth token and the storage URL of your username’s primary account, which will look like this:
    X-Auth-Token: AUTH_tkdc764d39fd1c40c9a293cbea142b90d7
    X-Storage-Url: https://swift.example.com/v1/AUTH_myusername
    [other HTTP headers]
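The same version 1 auth exchange can be sketched with Python's standard library. The helper names here are hypothetical; only `urllib.request.Request` is a real library call, and nothing is sent over the network until the request is passed to `urllib.request.urlopen`.

```python
import urllib.request


def build_auth_request(auth_url, user, key):
    """Build (but do not send) a version 1 auth request."""
    return urllib.request.Request(
        auth_url,
        method="GET",
        headers={"X-Auth-User": user, "X-Auth-Key": key},
    )


def parse_auth_response(headers):
    """Extract (token, storage_url) from a mapping of response headers."""
    lowered = {name.lower(): value for name, value in headers.items()}
    return lowered["x-auth-token"], lowered["x-storage-url"]
```

In real use, `response = urllib.request.urlopen(build_auth_request(...))` followed by `parse_auth_response(response.headers)` would complete the exchange; a `KeyError` from the parser indicates the expected headers were missing.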
Auth tokens have a configurable expiration time, with a default of 24 hours. During that period, this auth token will prove your identity to the auth system, which can then check to see whether you are authorized to perform the operation you are requesting. If you present an auth token that was previously working but you receive an HTTP 401 response, then your auth token has likely expired and you should re-authenticate to receive a new token.
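The re-authentication rule can be wrapped around any request. In this sketch both callables are hypothetical: `do_request(token)` is assumed to return a `(status, body)` pair and `authenticate()` to return a fresh token string.

```python
def with_reauth(do_request, authenticate, token):
    """Run do_request(token); on a 401, fetch a new token once and retry.

    Returns (status, body, token) so the caller can keep the token that
    was actually used for the final attempt.
    """
    status, body = do_request(token)
    if status == 401:
        token = authenticate()
        status, body = do_request(token)
    return status, body, token
```

Retrying only once is a deliberate choice: a second 401 with a brand-new token suggests bad credentials rather than an expired token, and looping would not help.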
The storage URL returned by the auth response is the root of the storage area of the user’s primary Swift account. Account is a bit of a confusing term. In the context of Swift, think of an account as an area where data is stored. Like a bank account, a Swift account may be owned by one person or co-owned by multiple people. A given user, such as Alice or Bob, might have access to multiple Swift accounts, because the cluster might be set up with one account (storage area) per project team, or per service, or any of several other strategies. So the storage URL returned by the auth request is not necessarily the only storage URL that the user may access.
It is even more important to think of Swift accounts as storage areas when considering Swift’s data hierarchy. At Swift’s basic level, an account has containers and a container has objects. Unlike a traditional filesystem, however, containers can’t be placed in other containers, and objects can’t be placed directly in an account without a container. The URL of every object in a Swift cluster looks like https://swift.example.com/v1/myaccount/mycontainer/myobject, with exactly one account (storage area) name, one container name, and one object name. So it makes sense to emphasize this:
Swift users are identities. Swift accounts are storage areas.
Keep that in mind as we discuss storing and retrieving data, because we will be discussing accounts constantly!
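The fixed account/container/object shape of Swift URLs makes them easy to construct. This is a minimal sketch with an invented helper name; it assumes the storage URL already ends with the account, as the auth response's storage URL does, and it rejects slashes in container names because containers cannot be nested.

```python
from urllib.parse import quote


def object_url(storage_url, container, obj):
    """Join a storage URL with a container name and an object name."""
    if "/" in container:
        raise ValueError("container names cannot contain '/'")
    # quote() percent-encodes spaces and other unsafe characters.
    return f"{storage_url.rstrip('/')}/{quote(container)}/{quote(obj)}"
```

For example, a storage URL of `https://swift.example.com/v1/AUTH_myusername`, a container `my_photos`, and an object `happy face.png` yield a URL ending in `/my_photos/happy%20face.png`.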
Now that you have an auth token, you can issue requests to the storage system. You might have a storage URL from an external source (such as a configuration file, your colleague, or the Swift cluster administrator), but for now let’s assume that you’re interested in the data in your primary account, that is, the storage URL returned by the auth system’s response. If you’ve successfully authenticated as described in the previous section, you might store your token and storage URL in environment variables, which can be used to simplify requests to the storage system. For example, suppose you store your token in the environment variable TOKEN and the storage URL in the environment variable STORAGE_URL. If you’ve already created two containers named my_photos and my_videos in this account, then this might be a request and response:
    curl -i -X GET -H "X-Auth-Token: $TOKEN" "$STORAGE_URL"

    HTTP/1.1 200 OK
    X-Account-Bytes-Used: 458
    X-Account-Container-Count: 2
    X-Account-Object-Count: 1
    [...other headers...]

    my_photos
    my_videos
The 200 response indicates a successful request. If you had tried to access an account for which you did not have authorization, you would receive a 403 Forbidden response. (401 Unauthorized indicates an authentication failure; 403 Forbidden indicates an authorization failure.) Swift provides metadata about the resource in the headers of its response: in this case, you learn that this account has two containers in it, with only one object in those containers, using a total of 458 bytes. After the metadata, the content of the response appears. This request was a GET on an account (storage area) resource; the response is a list of containers in that account.
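A client can turn the account response into structured data with a few lines. The helper names below are hypothetical; the header names are the ones shown in the response above, and the body is assumed to carry one container name per line.

```python
def account_stats(headers):
    """Convert the X-Account-* response headers into integers."""
    lowered = {name.lower(): value for name, value in headers.items()}
    return {
        "containers": int(lowered["x-account-container-count"]),
        "objects": int(lowered["x-account-object-count"]),
        "bytes": int(lowered["x-account-bytes-used"]),
    }


def container_names(body):
    """Split a plain-text account listing into container names."""
    return [line for line in body.splitlines() if line]
```

Because these counts are subject to eventual consistency, it is safer to treat them as recent estimates than as exact figures.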
Swift makes the choice to be case-insensitive with header names, though not with header values. You can pass in x-auth-user: alice or X-Auth-User: alice (or x-aUtH-uSER: alice), and they will all work equivalently. Note that these are all different from X-Auth-User: ALICE, however! The header name is case-insensitive, but the header value must use upper- and lowercase letters properly.
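Client code should apply the same rule when reading headers back. One way to do so in Python, shown here as a sketch, is the standard library's `email.message.Message`, which looks up header names case-insensitively while leaving values untouched; `make_headers` is a hypothetical helper.

```python
from email.message import Message


def make_headers(pairs):
    """Build a header collection whose name lookups ignore case."""
    message = Message()
    for name, value in pairs:
        message[name] = value
    return message
```

With `headers = make_headers([("x-aUtH-uSER", "alice")])`, both `headers["X-Auth-User"]` and `headers["x-auth-user"]` return `"alice"`, and neither equals `"ALICE"`. A plain `dict` would not behave this way, which is a common source of bugs in hand-written HTTP clients.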
Similarly, if you do a GET on a container (such as $STORAGE_URL/my_photos) rather than on an account as in the previous example, the content of the response would be a list of objects in that container, and the headers of the response would contain metadata about the container such as the number of objects in it. And if you do a GET on an object… well, you retrieve the object, of course! (And yes, you also get metadata in the headers.)
If you are interested only in the metadata, you can use a HEAD request instead of a GET. This avoids the network load of retrieving and sending potentially large objects (or even the potentially large container listing or account listing), and just gives you the information you want.
When you retrieve the listings of an account or a container, you might want them in a format that is easy to parse. You can append ?format=json to your resource’s URL (such as an account’s storage URL) to have Swift return your response in JSON, or ?format=xml for an XML response:
    % curl -X GET -H "X-Auth-Token: $TOKEN" "$STORAGE_URL?format=json"
    [
        {"count": 1, "bytes": 458, "name": "my_photos"},
        {"count": 0, "bytes": 0, "name": "my_videos"}
    ]

    % curl -X GET -H "x-auth-token: $TOKEN" "$STORAGE_URL/my_photos?format=json"
    [
        {
            "hash": "0450d6d21f1aa2aa1fe4a354c8e62c8f",
            "last_modified": "2014-01-16T20:01:04.329970",
            "bytes": 458,
            "name": "happy.png",
            "content_type": "image/png"
        }
    ]

    % curl -X GET -H "x-auth-token: $TOKEN" "$STORAGE_URL/my_photos?format=xml"
    <?xml version="1.0" encoding="UTF-8"?>
    <container name="my_photos">
        <object>
            <name>happy.png</name>
            <hash>0450d6d21f1aa2aa1fe4a354c8e62c8f</hash>
            <bytes>458</bytes>
            <content_type>image/png</content_type>
            <last_modified>2014-01-16T20:01:04.329970</last_modified>
        </object>
    </container>
In these examples, observe that with the ?format=… query parameter, a GET request on an account still returns a list of containers, but with some useful metadata included in the data structure. Similarly, a GET request on a container still returns a list of objects in the container, but with useful metadata included in the data structure. In both the JSON and the XML example, the returned data and the implementation of lists and attributes are structured in a manner appropriate to the format. Given a JSON parser or XML parser, it would be easy to parse this data and make use of it.
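As an illustration of how little code that parsing takes, here is a sketch that summarizes a JSON container listing. The function name is invented; the field names (`name`, `bytes`, `content_type`) are the ones in the sample response above.

```python
import json


def container_summary(listing_json):
    """Summarize a ?format=json container listing."""
    objects = json.loads(listing_json)
    return {
        "names": [entry["name"] for entry in objects],
        "total_bytes": sum(entry["bytes"] for entry in objects),
        "content_types": sorted({entry["content_type"] for entry in objects}),
    }
```

Applied to the my_photos listing shown earlier, this yields one name (`happy.png`), a total of 458 bytes, and a single content type (`image/png`).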
You save data to a Swift cluster with an HTTP PUT method. This applies not only to the process of adding an object to a container, but also when adding a container to an account. If a Swift cluster administrator wants to add a new account to the Swift cluster, that is also done as a PUT.
In all cases, you execute a PUT to the URL of the resource you want to create, as opposed to the URL of the parent resource or any other scheme. If you want to create a new container called my_music to go along with my_photos and my_videos, issue this command:
curl -i -X PUT -H "X-Auth-Token: $TOKEN" $STORAGE_URL/my_music
A successful PUT creates the container, which didn’t exist before. The -i option lets you see the headers returned by Swift, which look like this:
    HTTP/1.1 201 Created
    Last-Modified: Thu, 16 Jan 2014 19:53:21 GMT
    Content-Length: 0
    Date: Thu, 16 Jan 2014 19:53:20 GMT
    [...]
The 201 response code indicates a successful PUT. You will now see your new container along with the other two, if you issue a GET to the account.
When you PUT an object, you also need to provide the data that will be stored in the object. When developing an application, you will use the methods provided by your Swift library. But we can also do this on the command line with cURL, by uploading a local file as shown in Example 5-1:
curl -i -X PUT -H "X-Auth-Token: $TOKEN" $STORAGE_URL/my_music/my_song.mp3 -T /tmp/song.mp3
cURL uses the -T flag to provide the object data. This use of cURL is suitable for testing, or small scripts. In a more complex software development effort, Swift libraries such as python-swiftclient would provide equivalent functionality with more efficiency by allowing streaming data rather than requiring a file on the filesystem, and not spawning a subshell to run cURL.
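For comparison with the cURL command, the same PUT can be sketched with Python's standard library alone. `build_put_request` is a hypothetical helper, and this is not how python-swiftclient does it; the sketch only constructs the request, leaving the caller to send it with `urllib.request.urlopen`.

```python
import urllib.request


def build_put_request(storage_url, token, container, name, data, content_type=None):
    """Build (but do not send) a PUT that stores `data` as an object."""
    headers = {"X-Auth-Token": token}
    if content_type:
        headers["Content-Type"] = content_type
    url = f"{storage_url.rstrip('/')}/{container}/{name}"
    return urllib.request.Request(url, data=data, headers=headers, method="PUT")
```

Passing the result to `urllib.request.urlopen` sends the bytes in `data` as the request body; note that `urlopen` raises `urllib.error.HTTPError` for 4xx and 5xx responses instead of returning them.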
Deleting an object is as easy as you would expect:
curl -i -X DELETE -H "x-auth-token: $TOKEN" $STORAGE_URL/junk_container/junk_object
Swift prevents accidental mass deletion of data by requiring that a container be empty before you can delete it. If you try to delete a non-empty container, you will get an HTTP response of 409 Conflict as shown here:
    % curl -i -X DELETE -H "x-auth-token: $TOKEN" $STORAGE_URL/junk_container
    HTTP/1.1 409 Conflict
    Content-Length: 95
    Content-Type: text/html; charset=UTF-8
    X-Trans-Id: txe0fa9bced54448ecb3753-0052d84131
    Date: Thu, 16 Jan 2014 20:29:37 GMT

    <html>
    <h1>Conflict</h1>
    <p>There was a conflict when trying to complete your request.</p>
    </html>
However, once you delete all the objects in the container, the previous request will succeed. (See Bulk Operations Middleware for an efficient way to delete multiple objects, such as to empty a container.)
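The delete-then-empty-then-delete sequence can be expressed as a small routine. All three callables in this sketch are hypothetical stand-ins for HTTP requests: `list_objects()` returns object names, `delete_object(name)` removes one object, and `delete_container()` returns the HTTP status code of the container DELETE.

```python
def empty_and_delete(list_objects, delete_object, delete_container):
    """Delete a container, emptying it first if the server answers 409."""
    status = delete_container()
    if status == 409:  # container was not empty
        for name in list_objects():
            delete_object(name)
        # May still be 409 if the listing lagged behind recent writes.
        status = delete_container()
    return status
```

Because container listings are eventually consistent, a caller should be prepared for the second attempt to return 409 as well and to retry after a short delay.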
The HTTP POST method is used to update the metadata of an existing object, container, or account. (For historical reasons, a PUT to a container also updates metadata rather than overwriting it, but this use isn’t recommended.) A POST to an object updates its metadata without the overhead of resending the object’s content. For instance, in Example 5-1, we stored an MP3 file. When someone retrieves it, we might want its Content-Type header to be set correctly. We could have done this in the original PUT, but we can easily update it now:
curl -i -X POST -H "X-Auth-Token: $TOKEN" -H "Content-Type: audio/mpeg" $STORAGE_URL/my_music/my_song.mp3
This request results in a few hundred bytes rather than several megabytes of traffic through the system.
In addition to Internet-standard metadata such as Content-Type, Swift also allows and encourages you to set custom metadata on your objects, containers, and accounts. When developing a Swift-based system, consider what metadata might improve your system’s performance, help your users understand what they’re retrieving, or indicate which of several resources is most appropriate. There are so many possible uses for metadata that we can’t begin to discuss them all, but here are a few:
Swift enforces some limits on the size of metadata. (These limits are given in the Swift cluster info available at the /info URL.) Metadata is usually much smaller in size than the content of the object, so don’t expect to store megabytes of information in the metadata. However, storing some carefully chosen metadata can provide your system with surprising flexibility to optimize for performance, minimize traffic, or provide a better user experience.
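A client can check metadata against the advertised limits before sending it. This sketch makes several assumptions that the text here does not spell out: custom object metadata is sent in headers prefixed with X-Object-Meta-, the limits come from the "swift" section of the /info response, and lengths are measured in UTF-8 bytes. The function name is invented.

```python
def metadata_headers(meta, limits, kind="Object"):
    """Turn {"artist": "Alice"} into {"X-Object-Meta-artist": "Alice"}.

    `limits` is a mapping with max_meta_count, max_meta_name_length and
    max_meta_value_length, as found under "swift" in the cluster info.
    """
    if len(meta) > limits["max_meta_count"]:
        raise ValueError("too many metadata items")
    headers = {}
    for name, value in meta.items():
        if len(name.encode("utf-8")) > limits["max_meta_name_length"]:
            raise ValueError(f"metadata name too long: {name!r}")
        if len(value.encode("utf-8")) > limits["max_meta_value_length"]:
            raise ValueError(f"metadata value too long for {name!r}")
        headers[f"X-{kind}-Meta-{name}"] = value
    return headers
```

Validating on the client turns a rejected request into an immediate, descriptive error, and reading the limits from the cluster rather than hardcoding them keeps the check correct across differently configured clusters.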
We hope this chapter has given you both some conceptual background on Swift’s architecture and a useful overview of the Swift API. Our goal was to help you understand more fully how Swift is designed and whether it’s a good fit for your needs. If it is a good fit, we hope you’ve been able to get started issuing basic commands. The next chapter focuses on building applications using Swift client libraries, which are more efficient at carrying out extended operations with Swift than the basic commands you’ve learned so far.