Chapter 5. JSON Security Concerns

JSON alone is not much of a threat. After all, it’s only a data interchange format. By itself, it is just a document, or a stream, of data. The real security concerns with JSON arise in the way that it is used. In this chapter, we will take a look at two of the most common security concerns for JSON on the Web: cross-site request forgery and cross-site scripting.

Before we move forward in discussing security concerns, and enter into the remaining chapters of this book, we need an understanding of client-side and server-side relationships. Let’s take a quick look at these relationships, for those that do not yet understand this concept.

A Quick Look at Client- and Server-Side Relationships

Upon arriving at Pierre’s Fine Dining for dinner, I sit down at a lovely table where the napkins are folded into swans. A tall man in slacks and a nice shirt approaches the table and says, “My name is Thomas, and I am here to serve you this evening.” After he recognizes me, he lowers his voice and says, “By the way, you are one of my favorite clients.” He wags his eyebrows and says, “Here is our special menu.”

After perusing the special menu, I tell Thomas what I would like for dinner, and after some time he brings it to the table. I eat a lovely dinner, and then later pay $200 for the artistically arranged plates of tiny food. It was another fine evening at Pierre’s Fine Dining, where Thomas was the server, and I was the client. 

Your Internet browser has a relationship with websites, just as I have a relationship with Pierre’s Fine Dining. This relationship is a bustling world of requests and responses. I request a plate of food, and the kitchen responds by creating the specific food I requested and sending it out to be delivered to me.

When you go to your favorite website to look at cute kitten pictures, the Internet browser on your computer is the client and the computer that hosts the cute kitten pictures is the server. Your Internet browser makes a request that travels around the Internet until the computer behind the cute kitten picture website receives it. The cute kitten website then serves up the page in a response that gets sent back across the Internet. Your browser then renders the page for you on the screen.

In this relationship, we draw a line in the sand and say that everything sent in response by the cute kitten pictures website that will be dealt with by the browser is called client-side code. In the restaurant example, the response would be the plate of food and the table where I sit would be the Internet browser. The table receives the plate of food and now I can see it and consume it.

We then say that everything that happened before that response page was sent, essentially the creation of it, is the server-side code. In the restaurant example, the server-side code would be everything that happens in the kitchen. I never go into the kitchen, and I don’t see what they are doing in there to make my plate of food. The subtle difference between the restaurant example and the real world is that the server is not the guy running back and forth between the kitchen and the table. The kitchen is the server, and that guy is the Internet. 

The public doesn’t see what the cute kitten pictures website is doing in its metaphorical kitchen. The site could be using PHP, or ASP.NET, or any number of programming languages. Whatever it’s using in its kitchen doesn’t matter to my browser, so long as the site hands over a response with client-side code.

The response that I get from the site is a combination of HTML, CSS, and JavaScript. Just like I can inspect everything at the table in the restaurant, I can hit F12 in my browser, and using developer tools I can see all the HTML, CSS, and JavaScript for the page. I can even see the JavaScript code for that annoying flashing pop up with the dancing kittens.

Client side is everything happening in a user’s Internet browser, and server side is everything that is happening on the server where the website is hosted. When client-side code is referred to, it usually means JavaScript, HTML, or CSS. When server-side code is referred to, it usually means server-side languages such as ASP.NET, Ruby on Rails, or Java.

Now let’s take a look at the important subject of security concerns. 

Cross-Site Request Forgery (CSRF)

Cross-site request forgery, or CSRF (pronounced sea-surf), is an exploit that takes advantage of a site’s trust in a user’s Internet browser. CSRF vulnerabilities have been around for a long time, since well before JSON came into existence.

An example of a CSRF exploit with JSON would be something like this.

You sign into a banking website. This website has a JSON URL that includes some sensitive information about you (Example 5-1).

Example 5-1. Your sensitive information in JSON format
[
    {
        "user": "bobbarker"
    },
    {
        "phone": "555-555-5555"
    }
]

You might think, “Hey, this JSON is missing its curly brackets!” It is still valid JSON. A document whose outermost value is an array is called a top-level JSON array. It is dangerous valid JSON, though, because it is also a valid JavaScript script.

The example banking website uses authentication with session cookies to make sure that this information is only given to you—the logged in and registered user.

The bad guy in this example finds the URL to the sensitive JSON data on the banking website and puts it in a <script> tag on his own website. If you aren’t sure what a <script> tag is, you can find them behind the scenes on most modern web pages, in the HTML document that makes up the page’s source. If you go to your favorite website, right-click, and select view source, you should find one or more tags that look like the one shown in Example 5-2.

Example 5-2. Example of a <script> tag (the “src” attribute in this tag specifies where the script is located)
<script src="https://code.jquery.com/jquery-2.1.4.min.js"></script>

Internet browsers have rules about sharing between websites with different domains (http://domainone.com and http://domaintwo.com are different domains). The bad guy uses the <script> tag because it is exempt from those rules about sharing (Example 5-3). It is often necessary and normal to use a script hosted by another website, so the <script> tag gets an exception. The JSON is using a top-level array that makes it valid JavaScript, so he gets away with it.

Example 5-3. Example of what the <script> tag on the bad guy’s site might look like
<script src="https://www.yourspecialbank.com/user.json"></script>
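The “different domains” rule can be sketched in a few lines. This is only an illustration, assuming the browser compares the scheme, host, and port of two URLs (together called the origin). The function name isSameOrigin is hypothetical, and the URLs are the made-up ones from the text.

```javascript
// Sketch: two URLs share an origin only when scheme, host, and port all match.
// The URLs below are the illustrative ones from the text, not real endpoints.
function isSameOrigin(urlA, urlB) {
    return new URL(urlA).origin === new URL(urlB).origin;
}

var sameSite = isSameOrigin("http://domainone.com/page", "http://domainone.com/data.json");
var crossSite = isSameOrigin("http://domainone.com/page", "http://domaintwo.com/data.json");

console.log(sameSite);  // true
console.log(crossSite); // false
```

A request that fails this kind of check is normally subject to the sharing rules. A <script> tag is the exception, which is exactly what the bad guy relies on.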

One key thing for this to work is your relationship with the bank’s website. Without your relationship, the link in the <script> tag won’t return sensitive data. That link doesn’t just host your sensitive data. It is a dynamic link, which returns the sensitive data for the member that is logged in. When you log in with the bank, you have initiated a relationship where the bank trusts that you are who you say you are.

This exploit depends on that trust. In order for the bad guy to exploit the trust, he needs you to come to his website that has the stolen <script> tag while you are logged in to your bank. To achieve this, the bad guy might send out a million emails to people that say, “There is an important message for you on your banking website.” These emails will often be formatted exactly like emails that people are familiar with from their bank (or the website they are attempting to exploit). If they don’t check the email headers to see who it’s from, or hover over the link to question if it’s going to their trusted website’s real domain, then they will likely click it.

For example’s sake, let’s pretend you were ill and not thinking clearly, so you clicked on the link. Additionally, you didn’t log out of your banking site last time you were there, so your session still exists. You are still in a trust relationship with the bank. Once the page on the bad guy’s site loads, you may realize that you’ve landed somewhere strange and leave. By then it’s too late. The bad guy’s site was able to retrieve the sensitive JSON data, send it back to its own servers, and keep it. Ouch.

What could the bank and its web developers have done differently to prevent a CSRF exploit?  

To start, the bank could have turned the array into a value in a JSON object. This would make it invalid JavaScript. See Example 5-4.

Example 5-4. By wrapping the array in an object, it is no longer valid JavaScript that can be loaded in a <script> tag
{
    "info": [
        {
            "user": "bobbarker"
        },
        {
            "phone": "555-555-5555"
        }
    ]
}
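One rough way to see the difference between the two shapes is to ask the JavaScript engine to compile each one as source code without running it. The sketch below uses new Function() for that. It compiles the text as a function body, which is close to (but not identical to) how a <script> tag treats it. The helper name parsesAsScript is hypothetical.

```javascript
// Sketch: new Function() compiles its argument as JavaScript source without
// running it, and throws a SyntaxError if the source does not parse.
// parsesAsScript is a hypothetical helper name used only for this illustration.
function parsesAsScript(source) {
    try {
        new Function(source); // compile only; the function is never called
        return true;
    } catch (e) {
        if (e instanceof SyntaxError) {
            return false;
        }
        throw e;
    }
}

var topLevelArray = '[{"user": "bobbarker"}, {"phone": "555-555-5555"}]';
var wrappedObject = '{"info": [{"user": "bobbarker"}, {"phone": "555-555-5555"}]}';

console.log(parsesAsScript(topLevelArray)); // true
console.log(parsesAsScript(wrappedObject)); // false
```

The wrapped version fails because, at the start of a statement, JavaScript reads the opening curly bracket as the start of a code block, and "info": is not a legal statement inside one.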

Also, if the bank had only allowed POST requests instead of GET requests to retrieve the sensitive JSON, then the bad guy couldn’t have used the URL in his <script> tag. GET and POST are two HTTP methods that are used to communicate with the server. GET is a request for data, and can return a response. POST is a submission of data, and can also return a response. If a server allows a GET request for a link, it can be linked to directly in a browser or a <script> tag. POST, however, cannot be linked to directly. Without that handy <script> tag, the bad guy would have his hands tied by resource sharing policies, which would prevent him from doing anything else client side that could spoof the bank into trusting him.

This does not mean that HTTP data interchange with JSON should be entirely limited to the POST method. A good rule of thumb for deciding whether or not to allow GET requests for a page or resource is to ask: will this page ever need to be accessed directly by URL, or used in a <script> tag? If the answer is “no,” then the GET method should be disallowed to prevent just anyone from accessing it via a URL or <script> tag.
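On the server, disallowing GET can be as simple as checking the method before doing anything else. The following is a minimal sketch written in JavaScript, not the bank’s actual code. The function handleUserJsonRequest, its parameters, and the shape of the returned object are all assumptions made for illustration. A real server framework would hand you its own request and response objects.

```javascript
// Sketch of a method check for a sensitive JSON resource.
// handleUserJsonRequest, its parameters, and the response shape are all
// hypothetical; a real server framework would supply its own request object.
function handleUserJsonRequest(method, isLoggedIn) {
    if (method !== "POST") {
        // 405 Method Not Allowed: GET (and anything else) is refused.
        return { status: 405, headers: { "Allow": "POST" }, body: null };
    }
    if (!isLoggedIn) {
        // 401 Unauthorized: there is no valid session for this request.
        return { status: 401, headers: {}, body: null };
    }
    var data = { info: [{ user: "bobbarker" }, { phone: "555-555-5555" }] };
    return {
        status: 200,
        headers: { "Content-Type": "application/json" },
        body: JSON.stringify(data)
    };
}

console.log(handleUserJsonRequest("GET", true).status);  // 405
console.log(handleUserJsonRequest("POST", true).status); // 200
```

Because a <script> tag can only issue a GET request, the first check turns it away before any sensitive data is read.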

Sensitive data is also key to this exploit. If the JSON only contained a list of species of birds, the bad guy probably wouldn’t set up a site to attempt to steal the data. However, in being prepared for security threats, habits are important. If you stay out of the habit of using top-level arrays in your JSON, and out of the habit of conveniently using GET instead of POST, then you are far less likely to be the person who wrote the code that caused a huge exploit and a bunch of angry customers.

Injection Attacks

Injection attacks take many shapes and forms. Ultimately, they rely on finding exploitable holes in security. The CSRF attack that was just covered was an attack that relied on trust. An injection attack is an attack that relies on the ability to inject malicious code into an otherwise innocent website.

Cross-Site Scripting (XSS)

Cross-site scripting (XSS) attacks are a type of injection attack. With JSON, a security hole often opens at the point where the JavaScript fetches a string of JSON and turns it into a JavaScript object.

Remember that JSON by itself is just text. In programming, if we want to do anything useful with that textual representation of an object, it needs to be loaded into memory as an object. It can then be manipulated, inspected, and used in programming logic.

In JavaScript, one way to do this is by using a function called eval(), which takes a string, compiles it, and then executes it.

In Example 5-5, the JavaScript code uses the eval() function to load the animal/cat object into memory. The properties of the object can then be accessed in the code. The alert on the third line will pop up a browser alert that says “cat.”

Example 5-5. Accessing object properties
    var jsonString = '{"animal":"cat"}';    
    var myObject = eval("(" + jsonString + ")"); 
    alert(myObject.animal);

Example 5-5 is relatively harmless because the JSON is directly in the code. Typically, the JSON would be coming from another server. This server is often a third-party server, which you have no control over. For example, if I were asking Facebook’s server for some JSON, I have no control over what JSON it gives me. If the server were exploited, or the JSON intercepted somehow, I could be served malicious code.

The issue with the eval() function is it takes a string, and compiles and executes it without discrimination. If my JSON is coming from a third-party server and is replaced with a malicious script, then my perfectly innocent website will be compiling and executing this malicious code in the Internet browsers of those who visit my site.

In the JavaScript code in Example 5-6, I’ve replaced the JSON string with some JavaScript. When this code runs, the eval() function will execute and display an alert that says “this is bad.”

Example 5-6. An alert from the eval() function
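    var jsonString = "alert('this is bad')";
    // eval() runs the string as code, so the alert fires on the next line.
    var myObject = eval("(" + jsonString + ")");
    // alert() returns undefined, so myObject is undefined and this line throws a TypeError.
    alert(myObject.animal);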

As JSON has grown up over the years, this vulnerability has been recognized. The JSON.parse() function deals with it by being discriminating: it will only parse JSON, and will not execute scripts. Example 5-7 replaces eval() with JSON.parse().

Example 5-7. Replacing eval() with JSON.parse()
    var jsonString = '{"animal":"cat"}';    
    var myObject = JSON.parse(jsonString); 
    alert(myObject.animal);
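To see what “will not execute scripts” means in practice, here is a small sketch that feeds JSON.parse() both the innocent string and the malicious string from Example 5-6. The wrapper function tryParseJson is a hypothetical helper, not part of JavaScript itself.

```javascript
// Sketch: JSON.parse() throws a SyntaxError for anything that is not JSON,
// so the string from Example 5-6 is rejected instead of executed.
// tryParseJson is a hypothetical helper name.
function tryParseJson(text) {
    try {
        return { ok: true, value: JSON.parse(text) };
    } catch (e) {
        return { ok: false, value: null };
    }
}

var good = tryParseJson('{"animal":"cat"}');
var bad = tryParseJson("alert('this is bad')");

console.log(good.ok, good.value.animal); // true cat
console.log(bad.ok);                     // false
```

No alert ever appears for the second string. JSON.parse() gives up with an error as soon as it sees text that is not JSON.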

In web development, another concern that often weighs just as heavily as security concerns is cross-browser support. As the Web evolves, those who host websites must decide which browsers and which versions they should support. Essentially, they have to decide who to leave behind. There will always be people that don’t update their Internet browsers, or who use a less popular Internet browser that isn’t keeping up with the evolving standards.

The much safer JSON.parse() function is currently supported in all major Internet browsers and their most recent versions. However, some earlier Internet browser versions that a small percentage of users are still on do not support this function. If JSON.parse() is used, that small percentage of users will not be able to use the functionality of the website that relies on JSON. Oftentimes, this is dealt with by finding ways to gracefully fail. For example, instead of a web page full of rampant errors, consider one that catches those errors and displays a message such as “Please update your browser to the latest version.”
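Failing gracefully might look something like the following sketch. It checks that JSON.parse() exists before relying on it, and it catches parsing errors instead of letting them reach the user. The function name loadMessage, the "message" property, and the wording of the second message are assumptions for illustration.

```javascript
// Sketch: check that JSON.parse() exists before relying on it, and return a
// friendly message instead of letting an error reach the user.
// loadMessage and the fallback wording are assumptions for illustration.
function loadMessage(jsonText) {
    var hasJsonParse = typeof JSON === "object" && typeof JSON.parse === "function";
    if (!hasJsonParse) {
        return "Please update your browser to the latest version.";
    }
    try {
        return JSON.parse(jsonText).message;
    } catch (e) {
        return "Sorry, this message could not be loaded.";
    }
}

console.log(loadMessage('{"message": "hello, world!"}')); // hello, world!
console.log(loadMessage("not json"));                     // Sorry, this message could not be loaded.
```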

Holes in Security: Architectural Decisions

At the beginning of the chapter, I said that JSON alone is not much of a threat. This still holds true, but there are some scenarios in which a threat can be included directly within the JSON data, while still remaining valid JSON.

Let’s take a look at some perfectly innocent JSON (Example 5-8).

Example 5-8. Perfectly innocent JSON
{
    "message": "hello, world!"
}

Now let’s pretend that I host a website that stores messages in a database and then displays them on a web page for a user to read. I’ve never heard of this exploit, so I have a page where one user can send another user a message that contains anything. On my messages page, I request the JSON string from the server, and client side I use the eval() JavaScript function to convert the JSON string response to a JavaScript object. I use that JavaScript object in my client-side code to display the message value directly in the HTML.

Let’s take a look at some less innocent JSON in Example 5-9.

Example 5-9. Not-so-innocent JSON
{
    "message": "<div onmouseover=\"alert('gotcha!')\">hover here.</div>"
}

The not-so-innocent JSON in the example includes JavaScript. While this JavaScript is within the bounds of the JSON name-value pair, it is simply a string of text. This is perfectly valid JSON, and there are many places where it could be used without posing a threat.

However, the JavaScript in the example, when outputted within the HTML on the messaging website, is a threat. The “not-so-innocent JSON” shown here will cause an alert to pop up with the message “gotcha!” every time the user hovers their mouse cursor over the message on the screen. The problem is that far worse could be done than popping up an alert. A bad guy could include a script to access all of your private messages on the page and send them to his own server to read.

What could I have done to prevent this? For one, I could have taken measures to disallow HTML in my messages. This could involve both client-side and server-side validation. Additionally, I could ensure that any HTML characters included in the message were escaped, so that a tag such as <div> would be written into the page as &lt;div&gt; (the browser would show it as the literal text <div> instead of treating it as HTML). All of these measures would be specific to the client-side and server-side code that I am using in my architecture.
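Escaping can be sketched as a small function that swaps each character HTML treats specially for its entity. This is a minimal illustration, not a complete defense. The name escapeHtml is hypothetical, and many frameworks and template engines ship their own equivalent that should be preferred where available.

```javascript
// Sketch: replace the characters that HTML treats specially with entities.
// escapeHtml is a hypothetical helper; many frameworks provide an equivalent.
function escapeHtml(text) {
    return String(text)
        .replace(/&/g, "&amp;")   // must run first so later entities are not re-escaped
        .replace(/</g, "&lt;")
        .replace(/>/g, "&gt;")
        .replace(/"/g, "&quot;")
        .replace(/'/g, "&#39;");
}

var message = "<div onmouseover=\"alert('gotcha!')\">hover here.</div>";
console.log(escapeHtml(message));
// &lt;div onmouseover=&quot;alert(&#39;gotcha!&#39;)&quot;&gt;hover here.&lt;/div&gt;
```

Once the message has been escaped, the browser displays the tag as text and the onmouseover handler never exists.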

Most injection attacks involve architecture that has not been built with one important question in mind: how can a bad guy exploit this? The decision to allow HTML in the JSON and display the values directly on the page was a seemingly innocent architectural decision. In avoiding injection attacks, the key is to think through possible exploits, and take the extra (and sometimes arduous) steps to prevent them.

Key Terms and Concepts

This chapter covered the following key terms:

Server side (in web development)
The operations that take place behind the scenes on the server where a page or resource is being requested. The server provides the response that the Internet browser processes and/or loads.
Client side (in web development)
The operations that take place in the Internet browser from the point that a requested page is loaded. This is typically HTML, CSS, and JavaScript.
Cross-site request forgery (CSRF)
An exploit that takes advantage of a site’s trust in a user’s browser.
Top-level JSON array
A JSON array that exists outside of a name-value pair and at the top level of the document.
Injection attack
An attack that relies on injecting data into a web application to facilitate the execution or interpretation of malicious data.
JSON cross-site scripting (XSS) attack
A type of injection attack that takes advantage of an innocent website by intercepting or replacing JSON being served to the site by a third party with a malicious script.

We also discussed these key concepts:

  • JSON by itself is not a threat. It is just text.
  • Three things to remember that will address security concerns with JSON:
    • Do not use top-level arrays. Top-level arrays are valid JavaScript that can be linked to in a <script> tag and used.
    • Use HTTP POST instead of GET for JSON that is not intended for the public. The HTTP GET request can be linked to in a URL and placed in a <script> tag.
    • Use JSON.parse() instead of eval(). The eval() function will compile and execute the string that is passed in, which opens your code up for attacks. JSON.parse() only parses JSON.
  • Holes in security are often introduced through architectural decisions that do not ask the basic question of “How can a bad guy exploit this?”