© The Author(s), under exclusive license to APress Media, LLC, part of Springer Nature 2022
M. Baker, Secure Web Application Development, https://doi.org/10.1007/978-1-4842-8596-1_6

6. APIs and Endpoints

Matthew Baker
Kaisten, Aargau, Switzerland

In this chapter, we will begin looking at coding web applications, starting with designing our endpoints: URLs and APIs. These are the building blocks of a web application. HTTP leaves a number of choices to us: what request method to use, what response code to return, what format to use for the request and response body.

We will begin by looking at the anatomy of a URL before exploring REST APIs. These are a specific type of API that leverages the HTTP protocol to enable stateless requests to server-side functionality.

A key method for restricting access to server-side functionality is username and password-based authentication. We will begin looking at this topic in this chapter (it is covered more fully in Chapter 10) as well as how to use unit testing to ensure permissions are set up correctly.

We finish the chapter by looking at some specific attacks that exploit vulnerabilities in APIs: deserialization attacks.

6.1 URLs

The general form of a URL is
scheme://username:password@host:port/path?query#fragment

All parts are optional except path. If path begins with a slash / (and has to if host is present in the URL), then it is an absolute path; otherwise, it is relative to the URL it is loaded from.
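
To see these parts concretely, the following sketch splits a URL with Python's standard urllib.parse module. The URL and the credentials in it are made up for illustration, chosen only so that every optional component is present.

```python
from urllib.parse import urlsplit

# Hypothetical URL containing every optional component.
url = "https://bob:secret@example.com:8443/api/addresses/100?format=json#top"
parts = urlsplit(url)

components = {
    "scheme": parts.scheme,
    "username": parts.username,
    "password": parts.password,
    "host": parts.hostname,
    "port": parts.port,
    "path": parts.path,
    "query": parts.query,
    "fragment": parts.fragment,
}

# Print one component per line.
for name, value in components.items():
    print(f"{name:9} {value}")
```

Because the path begins with a slash, it is an absolute path in the sense described above.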

This is the URL in a link or the address bar of a browser. We saw in Chapter 4 that the actual request, sent once the connection is established, is
GET path?query HTTP/1.1
Host: host:port

The fragment (the part after the #) is not sent but applied by the browser. The username and password are placed in a header, which we discuss in Chapter 10.

If HTTPS is used, the entire HTTP request, including the path, query parameters, username, and password, is encrypted. Only the host and port are visible on the network, because they are needed to establish the connection.

Despite this, we should not send sensitive data in a GET request. The reasons are as follows:
  • The URL is stored in the browser history.

  • The URL may be logged on the server and remain in backups, including on other servers such as proxies.

  • GET requests should not change state, as we will see later.

We therefore use the POST method when sensitive data is sent to the server. Sensitive data includes usernames, passwords, authorization tokens, session IDs, as well as any other secret data such as credit card details.
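
As a minimal sketch of the difference, the following builds (but does not send) a POST request with Python's standard urllib. The endpoint URL and the credentials are hypothetical. The point is only that the sensitive fields end up in the request body rather than in the URL, so they do not appear in browser history or in URL-based server logs.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical endpoint and credentials, for illustration only.
LOGIN_URL = "https://example.com/account/login/"
form = {"username": "bob", "password": "not-a-real-password"}

# The form fields are encoded into the request body, not the URL.
body = urlencode(form).encode("utf-8")
request = Request(LOGIN_URL, data=body, method="POST")

print(request.get_method())  # the HTTP method that would be used
print(request.full_url)      # the URL carries no credentials
print(request.data)          # the credentials travel in the body
```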


6.2 REST APIs

REST APIs (REST stands for Representational State Transfer) provide users and applications with programmatic access to functionality. Rather than returning HTML pages, they return data, serialized for transport in some way, for example, as JSON or XML.

REST APIs are no different from regular HTTP requests other than the data type of the body. The same HTTP methods are used (GET, POST, PUT, etc.) as well as the same response codes (200 OK, 404 Not Found, etc.).

A misconception is that the term “REST” applies to any API that allows programmatic access to the application over HTTP. However, REST APIs conform to specific principles and build on the existing meanings of the HTTP methods and response codes.

By keeping to the correct REST principles, we avoid some common vulnerabilities. Also, we can use frameworks that take care of the mechanics of REST, allowing us to focus on business logic. We write less code, and it is clearer. As we have already seen, less code means fewer vulnerabilities, and clearer code means we are more likely to spot them.

REST APIs operate on items and collections. An item is a single entity, such as an address. A collection is a group of items, such as an address book.

GET Requests

REST GET requests can be called on an item, for example:


or on a collection, for example:


In the former, a single address, with an ID of 100, is returned. In the latter, a set of addresses is returned.

A GET request is idempotent: it is safe to call it more than once, and it should return the same result each time. It is cacheable: a browser can store the result rather than making a fresh request to the server each time it is requested by the user.

GET requests should not change state, for both idempotency and security reasons (we will look at the security issues in the next chapter).

The return status should be 200 OK if the requested item exists, with the item or collection in the body, and 404 Not Found otherwise.

POST Requests

POST is for creating an item. It is therefore only called on collections. It is not idempotent: if you call it more than once, a new item will be created each time. It should not be cached.

Returning a response body is optional. You may return the ID of the newly created item, in which case the response code should be 201 Created or 200 OK. If you do not return the ID or item, return a response code 204 No Content.

PUT Requests

PUT is for updating an existing item. It is therefore only called on an item. PUT requests are not cached. To see why, imagine you update a record, say, an address, using PUT. Then, another user updates the same record. If you want to change it back, you would issue the same PUT request as before. If it were cached, it would not execute on the server.

PUT is regarded as idempotent, as calling it multiple times has the same effect. It should return 200 OK if the response contains the ID or item, 204 No Content if it does not, or 404 Not Found if the record does not exist.

PATCH Requests

PATCH, like PUT, updates a record, so it is called on an item. However, PUT is a full update, whereas PATCH is a partial update. You only supply PATCH with the data that are changing, not the whole record.

Unlike PUT, PATCH is not idempotent. To see why, imagine we have the following address in our database:

1060 W Addison St

Say we use PATCH to set the postcode to 60600. If we use JSON, the request might look like
{
    "Postcode": "60600"
}
The address in the database would become

1060 W Addison St
Postcode: 60600

Now imagine another user changes the state to OR and we make our PATCH call again to set the postcode to 60600. Now the record is

1060 W Addison St
State: OR
Postcode: 60600

Our two PATCH requests do not result in the same state in the database, so PATCH is not idempotent.

This is not the case with PUT, as we supply every field of the record in each request.
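
The argument can be replayed with plain dictionaries. This is only a sketch of the reasoning, not of any framework: the street is taken from the example, while the other field names and starting values are made up for illustration.

```python
# Hypothetical record: only the street comes from the example in the text;
# the other field names and starting values are assumptions.
original = {
    "address1": "1060 W Addison St",
    "city": "Chicago",
    "state": "IL",
    "postcode": "60613",
}


def patch(record, changes):
    """Partial update: only the supplied fields are overwritten."""
    return {**record, **changes}


def put(record, replacement):
    """Full update: the stored record is replaced wholesale."""
    return dict(replacement)


# Our PATCH, then another user's change, then our PATCH again.
after_first_patch = patch(original, {"postcode": "60600"})
other_users_change = patch(after_first_patch, {"state": "OR"})
after_second_patch = patch(other_users_change, {"postcode": "60600"})

# The same sequence with PUT, where we always send the whole record.
full_record = {**original, "postcode": "60600"}
after_first_put = put(original, full_record)
other_users_put = put(after_first_put, {**after_first_put, "state": "OR"})
after_second_put = put(other_users_put, full_record)

print(after_first_patch == after_second_patch)  # False: the state differs
print(after_first_put == after_second_put)      # True: same record each time
```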

PATCH should return 200 OK if the response contains the ID or item, 204 No Content if it does not, or 404 Not Found if the record does not exist.

DELETE Requests

DELETE removes an item, so it is called on items, not collections. It is normally regarded as idempotent as deleting a record a number of times makes the database look the same. However, deleting a nonexistent record results in a 404 Not Found message, so arguably it is not idempotent at all.

DELETE is not cacheable as we might wish to call DELETE, then a POST to create the item again, then another DELETE.

DELETE should return 200 OK if the response contains a body (e.g., a status message or the ID of the deleted object). It should return 204 No Content if the response does not contain a body and 404 Not Found if the item does not exist.
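
The conventions described in this section can be collected into a toy in-memory collection. This is a sketch under simplifying assumptions (no HTTP, no persistence, hypothetical class and method names) and is not how the Django REST Framework implements them. It only shows which status code each method would return.

```python
from http import HTTPStatus


class AddressBook:
    """Toy in-memory collection illustrating REST status codes."""

    def __init__(self):
        self.items = {}
        self.next_id = 1

    def get(self, item_id=None):
        if item_id is None:  # called on the collection
            return HTTPStatus.OK, list(self.items.values())
        if item_id not in self.items:
            return HTTPStatus.NOT_FOUND, None
        return HTTPStatus.OK, self.items[item_id]

    def post(self, data):
        # Creates a new item every time it is called: not idempotent.
        item_id = self.next_id
        self.next_id += 1
        self.items[item_id] = dict(data)
        return HTTPStatus.CREATED, {"id": item_id}

    def put(self, item_id, data):
        if item_id not in self.items:
            return HTTPStatus.NOT_FOUND, None
        self.items[item_id] = dict(data)  # full replacement
        return HTTPStatus.NO_CONTENT, None

    def patch(self, item_id, changes):
        if item_id not in self.items:
            return HTTPStatus.NOT_FOUND, None
        self.items[item_id].update(changes)  # partial update
        return HTTPStatus.OK, self.items[item_id]

    def delete(self, item_id):
        if item_id not in self.items:
            return HTTPStatus.NOT_FOUND, None
        del self.items[item_id]
        return HTTPStatus.NO_CONTENT, None


book = AddressBook()
print(book.post({"address1": "1060 W Addison St"}))
print(book.get(1))
print(book.delete(1))
print(book.delete(1))  # the second DELETE finds nothing to remove
```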

A summary of REST methods is given in Table 6-1.
Table 6-1. Summary of REST methods

Method  Use                          Can Write  Called On           Return Codes
GET     Retrieve an item/collection  No         Item or collection  200 OK; 404 Not Found
POST    Create an item               Yes        Collection          With ID: 201 Created or 200 OK; without ID: 204 No Content
PUT     Update an item               Yes        Item                With ID: 200 OK; without ID: 204 No Content; 404 Not Found
PATCH   Partial item update          Yes        Item                With ID: 200 OK; without ID: 204 No Content; 404 Not Found
DELETE  Delete an item               Yes        Item                With data: 200 OK; without data: 204 No Content; 404 Not Found


REST APIs in Django

The Django REST Framework makes it easier to write REST APIs in Django, especially when you conform to the REST principles mentioned previously. You don’t have to use it, but it simplifies development by integrating serialization, content negotiation (different content types can be returned), and authentication. It also provides base classes for creating multiple endpoints simultaneously (GET, POST, DELETE, etc.).

The Django REST Framework is installed with
pip3 install djangorestframework

See the web page for details.

To see how using this framework simplifies developing a REST API and reduces vulnerabilities, let’s look at an example. Our Coffeeshop application has an Address table to store customers’ billing and delivery addresses. We will write a REST API to manipulate it. You don’t have to enter the code as it is already in our application.

First, we need a serializer for the Address model. We have the following in serializers.py.
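from rest_framework import serializers
from .models import Address  # assumes Address is defined in this app's models.py

class AddressSerializer(serializers.ModelSerializer):
    class Meta:
        model = Address
        # Only these fields are exchanged with the client; user is omitted.
        fields = ['pk', 'address1', 'address2', 'city', 'postcode', 'country']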
Listing 6-1

REST API serializer for the Address model, in serializers.py

We only include the fields we want to send between the client and server. We do not include user as we take that from the logged-in user.

We have inherited from the Django REST Framework’s ModelSerializer. The framework also provides a lower-level Serializer class for serializing more arbitrary data.

Next, we define our views. We need one for GET so that we can retrieve the address book and individual addresses, POST so we can add an address, PUT so we can update one, etc. The framework’s ModelViewSet lets us create all these views in one class. We have the following code in views.py:
class AddressViewSet(viewsets.ModelViewSet):
    def get_queryset(self):
        return Address.objects.filter(user=self.request.user)

    serializer_class = AddressSerializer
    permission_classes = [permissions.IsAuthenticated, OwnsAddress]

    def perform_create(self, serializer):
        return serializer.save(user=self.request.user)
Listing 6-2

REST API view set for the Address model, in views.py

The serializer_class definition is the only one required by the framework. It defines the class to deserialize data in requests and serialize data for responses.

The permission_classes defines the permissions that must be present when the request is made. If they are not, the framework will return a 403 Forbidden response. We will look at permissions a little later in this section.

In lines 2–3, we override the default GET method for collections (for individual items, the get_object function is called). We override it because a user should only be able to see their own addresses. The permission classes from line 6 have already been applied, so we can assume a user is logged in.

Lines 8–9 override the default item creation function. We do this to set the user attribute to the person who is logged in, because we are not taking it from the serialized data stream.

We have two permissions classes defined (line 6). The permissions.IsAuthenticated permission comes with the REST framework. We defined the other ourselves, in permissions.py. It is shown in the next listing.
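from rest_framework.permissions import BasePermission

class OwnsAddress(BasePermission):
    def has_object_permission(self, request, view, obj):
        # Grant access only to the user who owns this address.
        return obj.user == request.user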
Listing 6-3

Custom REST API permissions classes in permissions.py

The OwnsAddress class defines an object permission that returns True only if the logged-in user matches the user field in that object.

Finally, we include the URLs for our API views with the following code in urls.py.
router = routers.DefaultRouter()
router.register(r'addresses', views.AddressViewSet, basename="addresses")

urlpatterns = [
    # other URL patterns omitted
    path('api/', include(router.urls)),
    path('api-auth/', include('rest_framework.urls', namespace='rest_framework')),
]
Listing 6-4

Activating REST API endpoints in urls.py

All view sets registered with the router.register() function are included by line 6. Line 7 includes authentication endpoints. We will look at REST authentication in Chapter 10. For this example, we are using Django’s default session authentication.

When reading the view set in Listing 6-2, the permissions are clear. Even without looking at permissions.py, it is easy to see who has access to what because the permissions are intuitively named.

We have not written much code, leaving most of it to the well-established Django REST Framework. REST principles are well defined, and we can leverage their implementation in the framework. Where permissions deviate from the default, our code makes it clear. Common mistakes, such as using the primary key from the request body rather than the URL, are avoided by relying on the framework. We don’t have to manage permissions for each view, which could introduce inconsistencies.

Exploring The Address Book REST API

The Address REST API example shown previously is implemented in our Coffeeshop VM. Practice using it by visiting the URL

By default, the Django REST Framework supports username/password and session ID authentication. If you do not have a session ID cookie (i.e., you have not logged into Coffeeshop) and you do not provide credentials, you will receive a 403 Forbidden response code. You have three choices:
  1. Visit the Coffeeshop site, click Log In, and sign in.

  2. Pass the username and password in the URL: http://username:password:

  3. If using a programmatic or command-line client, pass the credentials in the Authorization: Basic header.


For now, log in as user bob using option 1. The password is in the coffeeshop/secrets/config.env file. Look for DBUSER1PWD.

Once you have logged in, visit the URL in your browser again. You should see a screen as shown in Figure 6-1.


Figure 6-1

HTML view of a Django REST Framework endpoint

Our REST API is ordinarily expected to return JSON. When we visit URL endpoints in a browser, the Django REST Framework sends the response as HTML because of content negotiation between the client and server. The client gives its acceptable content types, in order of preference, in the request. The framework responds with HTML instead of JSON if that is higher in the preference list.
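
The idea behind content negotiation can be sketched in a few lines. This is a deliberately simplified illustration with hypothetical function names, not the Django REST Framework's actual algorithm. It honours q-values and the */* wildcard but ignores partial wildcards such as text/*.

```python
def parse_accept(header):
    """Return (media_type, q) pairs sorted from most to least preferred."""
    entries = []
    for position, item in enumerate(header.split(",")):
        media_type, *params = [part.strip() for part in item.split(";")]
        quality = 1.0  # default when no q parameter is given
        for param in params:
            name, _, value = param.partition("=")
            if name.strip() == "q":
                quality = float(value)
        entries.append((media_type, quality, position))
    # Higher q first; ties keep the order the client listed them in.
    entries.sort(key=lambda entry: (-entry[1], entry[2]))
    return [(media_type, quality) for media_type, quality, _ in entries]


def choose_renderer(accept_header, offered):
    """Pick the offered type that matches the client's best preference."""
    for media_type, quality in parse_accept(accept_header):
        if quality == 0:
            continue  # q=0 means "not acceptable"
        for candidate in offered:
            if media_type in ("*/*", candidate):
                return candidate
    return None


OFFERED = ["application/json", "text/html"]
browser = "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8"
print(choose_renderer(browser, OFFERED))  # a typical browser header
print(choose_renderer("*/*", OFFERED))    # a client with no preference
```

With a browser-style Accept header the sketch picks text/html. A client that sends only */*, as Curl does by default, gets the first type the server offers, here JSON.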

Create a new address by filling out the form below the address book and clicking on POST.

Now visit an individual address by entering

taking pk from the address you want to view from the list. For example, you can view the first (and only) address in bob’s address book by visiting

You can update the address by entering data in the form below it and clicking the PUT button (thereby issuing a PUT request). You can delete the address by clicking the DELETE button (a DELETE request).

Since the Django REST Framework is configured to accept username and password as well as session ID cookies, we can call the endpoints using Curl by supplying credentials on the command line. We will look at authentication in Chapter 10, but one simple way to do this with Curl is with the -u command-line option. Try the following from within a Coffeeshop VM SSH session:
curl -u $DBUSER1:$DBUSER1PWD http://localhost/api/addresses/
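
Curl's -u option sends the credentials in an Authorization: Basic header. The sketch below builds the same header with Python's standard library, without sending the request. The username and password are placeholders. In the VM they would come from config.env.

```python
import base64
from urllib.request import Request


def basic_auth_header(username, password):
    """Build the value of an Authorization: Basic header."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8"))
    return "Basic " + token.decode("ascii")


# Placeholder credentials, for illustration only.
request = Request("http://localhost/api/addresses/")
request.add_header("Authorization", basic_auth_header("bob", "placeholder"))
request.add_header("Accept", "application/json")

print(request.get_header("Authorization"))
```

Base64 is an encoding, not encryption, so anyone who can read the header can recover the password. This is one reason such requests should only be made over HTTPS.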

6.3 Unit Testing Permissions

We saw in the previous section that we can reduce vulnerabilities by having clear, intuitive permission code. The Django REST Framework takes care of responding with 403 Forbidden if the permissions are not satisfied.

For function-based views, Django makes use of decorators to add clarity to permissions. When a page should only be viewable by a logged-in user, we can precede the function with @login_required. We do this for our basket view in the Coffeeshop application:
from django.contrib.auth.decorators import login_required
@login_required
def basket(request):
    cart = None

If a user is logged in, Django executes the function. If not, the user is redirected to the login page and sent back upon successful login. Django also has a @permission_required decorator that returns an error if the logged-in user doesn’t have the stated permission.

Using decorators (or permission_classes in REST view classes) adds clarity to your code, making it easier to spot incorrect authorization settings.

One common mistake is to not apply permission checks on URLs called internally. For example, imagine a user has entered form data. The Submit button sends them to the URL /submitform. Once the form has been processed, the user is redirected to a URL /viewdata that displays the data they submitted. The /viewdata function needs the same permission checks as /submitform; otherwise, a malicious user could visit it directly.
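
The /submitform and /viewdata scenario can be sketched without Django. The decorator below is a simplified stand-in for @login_required, and requests and responses are plain dictionaries. All of this is hypothetical scaffolding to show why the redirect target needs its own check.

```python
from functools import wraps
from urllib.parse import quote

LOGIN_URL = "/account/login/"


def login_required(view):
    """Simplified stand-in for Django's decorator of the same name."""
    @wraps(view)
    def wrapper(request):
        if not request.get("user"):
            # Redirect to the login page, remembering where to return to.
            target = f"{LOGIN_URL}?next={quote(request['path'])}"
            return {"status": 302, "location": target}
        return view(request)
    return wrapper


@login_required
def submitform(request):
    # After processing the form, send the user on to /viewdata.
    return {"status": 302, "location": "/viewdata"}


@login_required  # the redirect target needs the same check
def viewdata(request):
    return {"status": 200, "body": "submitted data"}


anonymous = {"path": "/viewdata", "user": None}
print(viewdata(anonymous))  # redirected to login, not shown the data
```

Without the second decorator, the anonymous request would receive the data directly.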

In case errors do slip through, we can unit test permissions. Good developers unit test the functionality of their code but can forget to test that the code is callable by valid users and not callable by invalid ones.

Imagine a site that has a URL /protected/. This URL should only be available to logged-in users. As well as testing that the code that serves /protected/ actually performs the correct processing, we should also check that it can be called when a user is logged in and cannot be called when there is no logged-in user. We can put the following in the tests.py file:
from django.contrib.auth.models import AnonymousUser, User
from django.test import RequestFactory, TestCase
from django.urls import resolve
from .views import *
class SimpleTest(TestCase):
    def setUp(self):
        self.factory = RequestFactory()
        self.testuser = User.objects.create_user(
            username='testuser',  # placeholder values; the originals were lost
            email='testuser@example.com',
            password='placeholder-password',
        )
    def test_authorized(self):
        url = '/protected/'
        request = self.factory.get(url)
        request.user = self.testuser
        myview, myargs, mykwargs = resolve(url)
        response = myview(request)
        self.assertEqual(response.status_code, 200)
    def test_unauthorized(self):
        url = '/protected/'
        request = self.factory.get(url)
        request.user = AnonymousUser()
        myview, myargs, mykwargs = resolve(url)
        response = myview(request)
        self.assertEqual(response.status_code, 302)
        self.assertEqual(response['location'], '/account/login/?next=/protected/')
Listing 6-5

Testing permissions in tests.py

The tests are run with
python3 manage.py test

Django creates a fresh, empty database containing the schema but nothing else. It runs each class extending TestCase and, within those, every method whose name begins with test_. The setUp() method is called before each test; we use it to create a test user.

The test_authorized() function first creates a request object for the URL we are testing, at line 15. We set the user to our test user as we want to confirm that the page is accessible when a user is logged in. We next resolve the URL into the function that handles it (line 17) so that we can call it (line 18). If all went well, Django should return the HTML for the protected page and a response code of 200 OK. We are not checking the HTML of the response in this case, only that the response code is 200. We do this with an assert at line 19.

The test_unauthorized() function tests that the view is inaccessible when a user is not logged in. We put the request’s user in the unauthenticated state by setting it to AnonymousUser() at line 23.

Line 26 tests that the response code is 302 Found. We do not test that it equals 403 Forbidden. The reason is that the @login_required decorator redirects the user to the login page rather than returning 403 Forbidden. When testing a REST API call, assert that the response code is 403, not 302, because the framework does not perform a redirect.

At line 27, we also check the redirect location does in fact go to the login page.

One final point to note before we try this in an exercise is how Django creates the test database. It uses the same database engine configured in the DATABASES variable in settings.py. It creates a new database with the same name but prefixed with test_. This means the database user Django is configured with has to have database creation permissions. This is undesirable in production settings, so we would need to create a separate configuration for development. Our alternative is to change the database engine to SQLite just for tests. We do this with the following lines in settings.py:
import sys

if 'test' in sys.argv or 'test_coverage' in sys.argv:
    DATABASES['default']['ENGINE'] = 'django.db.backends.sqlite3'

When the database engine is SQLite, Django runs the tests against an in-memory database by default.

Unit Testing Django Permissions

In this exercise, we will create a test case for the /basket/ URL like the one in Listing 6-5. If you like, you can try writing it yourself based on this listing. As in that case, you will want to test that the response code in the second test is 302 and that the location has an appropriate value.

Alternatively, a working tests.py is in

within the Coffeeshop VM. Copy this into the coffeeshopsite/coffeeshop directory.

Run your test with
python3 manage.py test

6.4 Deserialization Attacks

In Section 6.2, we used JSON for our request and response bodies. JSON has become a popular serialization format for a number of reasons:
  1. It is easy for humans to read and edit with standard editors.

  2. It works for many languages.

  3. It is fairly compact.

  4. Its functionality is limited, minimizing risks.


XML was a popular format before JSON gained popularity and still is, especially in older frameworks.

Data formats should be chosen carefully as they can lead to vulnerabilities. A malicious user can craft requests that exploit vulnerabilities when deserialized at the server. An attacker can stage a man-in-the-middle attack to intercept and alter request and response bodies between the server and another user. If the deserialization is not secure, undesirable actions can be performed on the server.

XML Attacks

XML is a flexible data format with features that developers are sometimes unaware of and that can be exploited.

A famous XML vulnerability is known as the Billion Laughs attack. The name comes from its original example. XML supports a directive called <!ENTITY>. It allows the author of the XML to define a shortcut. For example, the author could define
<!ENTITY ms "Microsoft Corp">

Then, instead of writing Microsoft Corp in the XML body, the author just has to write &ms;.

Unfortunately, entity definitions can be nested: an entity can refer to another entity. This feature is exploited by the Billion Laughs attack. Consider the following XML document.
<?xml version="1.0"?>
<!DOCTYPE hahas [
<!ENTITY haha "haha">
<!ENTITY haha2 "&haha;&haha;&haha;&haha;&haha;&haha;&haha;&haha;&haha;&haha;">
<!ENTITY haha3 "&haha2;&haha2;&haha2;&haha2;&haha2;&haha2;&haha2;&haha2;&haha2;&haha2;">
<!ENTITY haha4 "&haha3;&haha3;&haha3;&haha3;&haha3;&haha3;&haha3;&haha3;&haha3;&haha3;">
<!ENTITY haha5 "&haha4;&haha4;&haha4;&haha4;&haha4;&haha4;&haha4;&haha4;&haha4;&haha4;">
<!ENTITY haha6 "&haha5;&haha5;&haha5;&haha5;&haha5;&haha5;&haha5;&haha5;&haha5;&haha5;">
<!ENTITY haha7 "&haha6;&haha6;&haha6;&haha6;&haha6;&haha6;&haha6;&haha6;&haha6;&haha6;">
<!ENTITY haha8 "&haha7;&haha7;&haha7;&haha7;&haha7;&haha7;&haha7;&haha7;&haha7;&haha7;">
<!ENTITY haha9 "&haha8;&haha8;&haha8;&haha8;&haha8;&haha8;&haha8;&haha8;&haha8;&haha8;">
]>
<hahas>&haha9;</hahas>
Listing 6-6

The Billion Laughs attack

Line 3 defines a macro &haha; that just expands to the text haha. Line 4 defines a macro &haha2; that expands to ten &haha;’s, in other words, ten haha’s. Line 5 defines &haha3;, which expands to ten &haha2;’s, in other words 100 haha’s, and so on, up to &haha9;. Each level multiplies the count by ten.

As a result, placing a single &haha9; in the last line of the document inserts 10^8 (one hundred million) haha’s; the classic form of the attack, with one more level of nesting, reaches one billion. A file that took the attacker less than 1 KB of text consumes hundreds of megabytes of memory on the server, or gigabytes with the extra level.

Of course, no developer would write such code in their own application. However, if server-side code to parse XML exists and entity expansion is allowed, as is the case in many XML parsers, then an attacker can craft a request that results in a Denial of Service.
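
The growth is easy to quantify. The sketch below assumes each nesting level multiplies the text by ten, as in the listing, and counts only the raw expanded text. The memory a real parser uses depends on its implementation and may be larger.

```python
def expansion(levels, factor=10, text="haha"):
    """Copies and bytes produced by `levels` rounds of `factor`-fold nesting."""
    copies = factor ** levels
    return copies, copies * len(text.encode("utf-8"))


# Compare eight and nine tenfold levels of nesting.
for levels in (8, 9):
    copies, size = expansion(levels)
    print(f"{levels} levels: {copies:,} copies, about {size / 1e9:.1f} GB of text")
```

Each additional entity line costs the attacker well under a hundred bytes but multiplies the server's work by ten.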

The entity tag also supports a parameter called SYSTEM. This expands to the contents of a URL. It can be exploited by the XML External Entity (XXE) attack, resulting in file disclosure if the expanded XML is visible to the attacker. Listing 6-7 displays the Unix password file.
<?xml version="1.0"?>
<!DOCTYPE foo [
<!ENTITY xxe SYSTEM "file:///etc/passwd" >
]>
<foo>&xxe;</foo>
Listing 6-7

The XML External Entity attack

The Billion Laughs Attack

Python XML parsers depend on libexpat. Ubuntu 20.04 LTS, on which our Coffeeshop VMs are based, has a version that is vulnerable to the Billion Laughs attack (they have version 2.2.9; the vulnerability was fixed in version 2.4.1).

Visit the Coffeeshop URL and click on a product. To illustrate the Billion Laughs vulnerability, this page makes a badly written API call to fetch the stock availability. Take a look at the code. The API call is in JavaScript at the end:
function getStockLevel() {
    const productId = "{{ product.id }}";
    fetch("{% url 'stocklevel' %}", {
        method: 'POST',
        headers: {
            'Content-Type': 'application/xml'
        },
        body: "<product>" + productId + "</product>"
    }).then(
        function(response) {
            if (response.status == 200) {
                response.json().then(function(data) {
                    if (data.quantity == 0)
                        $('#stocklevel').html('<p class="text-danger">Out of Stock</p>');
                    else
                        $('#stocklevel').html('<p class="text-success">' + data.quantity
                            + ' in stock</p>');
                });
            }
        }
    );
}

A hacker reading this code, wanting to launch a Denial-of-Service attack, would see the XML POST request and wonder if it is susceptible to Billion Laughs. It is easy to test. Let’s replicate the call using Curl.

Start an SSH session in each of our two VMs with vagrant ssh. In the Coffeeshop VM, run the command
top

This gives an interactive list of running processes. Press Shift-M to sort by memory.

In the CSThirdparty SSH session, run the command
curl -X POST -d "<product>1</product>"
The API endpoint /stocklevel/ takes the product ID in the XML body. You should see a JSON return string of
{"quantity": 10}
In the /vagrant/snippets directory, you will find a file called hundred_million_laughs.xml. This is a slightly smaller attack than Billion Laughs (eight nested haha’s instead of nine) so that we don’t crash our machine. We could paste this into a Curl command like the one before; however, Curl also lets us read the request body from a file. Enter the commands
cd /vagrant/snippets
cat hundred_million_laughs.xml |
    curl -X POST --data-binary @-

Now switch back to your Coffeeshop VM and take a look at top. At the beginning of the list, you will see a Python process now taking close to 40% of the memory (in the VirtualBox version; the percentage is smaller in the Docker version, but only because more memory is allocated to the container).

The API call will fail with a 500 Server Error response code, but that is not the object of the attack. Even after the call ends, the Python process is still consuming over a third of the memory. Adding the extra haha line would increase this by a factor of ten.

In order to free the memory, we must restart Apache with
sudo apachectl restart

Function Calls and Creation

Some serialization protocols allow functions or classes to be created from a serialized stream. One example is the BinaryFormatter from Microsoft’s .NET framework. While not allowing arbitrary code to be created, it does (or at least did; some improvements have been made) allow dangerous .NET functionality to be called. Microsoft now recommends against its use.

Defending Against Deserialization Attacks

The best defense against these attacks is more restricted deserialization:
  1. Don’t allow function creation.

  2. Don’t allow function calls, or at least severely restrict them.

  3. Understand your deserialization tools and capabilities, especially XML parsers.

  4. Sanitize the body and ensure it complies with the format specifications.

  5. Avoid descriptive error messages.


JSON is regarded as a safe format if sensibly used. It only supports simple data types (strings, numbers, booleans, and null), arrays, and objects (dictionaries). It does not support function creation or calls.

However, the way in which the deserialized data are used may introduce vulnerabilities, even with JSON.

Before deserializing, check if the size is within bounds to avoid memory issues. The Django REST Framework’s Serializer class and its derivatives are helpful because they validate the string against the serializer’s expected schema.

When a deserialization error does take place, avoid telling the user what went wrong. This can enable the attacker to discover vulnerable parts of your code.
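
Several of these defenses can be combined in a small parser for the product body used in the exercise. This is a sketch using Python's standard xml.parsers.expat module, not the Coffeeshop code. The function name, the size limit, and the exception class are assumptions made for illustration.

```python
import xml.parsers.expat

MAX_BODY_BYTES = 1024  # assumed limit; choose one that suits your payloads


class RejectedBody(ValueError):
    """Raised for any body we refuse to deserialize."""


def parse_product_id(body: bytes) -> int:
    # Check the size before doing any parsing work.
    if len(body) > MAX_BODY_BYTES:
        raise RejectedBody("body too large")

    text = []
    parser = xml.parsers.expat.ParserCreate()

    def forbid_doctype(name, system_id, public_id, has_internal_subset):
        # No DOCTYPE means no entity definitions, internal or external.
        raise RejectedBody("DOCTYPE not allowed")

    def check_element(name, attrs):
        # Only a bare <product> element matches the expected format.
        if name != "product" or attrs:
            raise RejectedBody("unexpected element")

    parser.StartDoctypeDeclHandler = forbid_doctype
    parser.StartElementHandler = check_element
    parser.CharacterDataHandler = text.append
    try:
        parser.Parse(body, True)
        return int("".join(text).strip())
    except (xml.parsers.expat.ExpatError, ValueError) as exc:
        # Log the detail server-side; the caller sees a generic message.
        raise RejectedBody("invalid request") from exc


print(parse_product_id(b"<product>1</product>"))
```

Rejecting the DOCTYPE outright is a blunt choice, but a body this simple has no legitimate need for entity definitions. Every failure surfaces as the same generic error, so the response reveals little about what the parser objected to.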

6.5 Summary

In this chapter, we looked at how to safely design our application’s endpoints: when to use POST vs. GET and how to build a safe REST API. We saw how adhering to the established REST standards and using existing frameworks can make your code safer. We also looked at some common deserialization vulnerabilities that enable an attacker to exploit poor communication formats between a client and server, in particular unsafe use of XML.

We focussed on URLs. In the next chapter, we will look at vulnerabilities that can be introduced when our application accepts input from a user, in any of its forms. We will look at techniques to remove these vulnerabilities.
