Origin

First, keep in mind that programming languages are traditionally divided into two types: compiled and interpreted. Compiled languages are translated ahead of time into binary machine code, the strings of 1s and 0s that a computer executes directly. Interpreted languages, on the other hand, rely on a program called an interpreter, which reads the source code at run time and carries out its instructions. Other languages fall between the two: their source code is compiled into bytecode, an intermediate code that a virtual machine or runtime then executes. Java and .NET are examples of technologies that use bytecode, but for the purposes of this book, it is not necessary to explore it in more depth.

Most web technologies rely on interpreters or virtual machines. Popular languages such as PHP, Python, Ruby, and JavaScript are interpreted, while .NET and Java use a virtual machine. Applications built with these technologies create statements to interact with data stores.

A data store is usually a database, but it could also be a file, an LDAP repository, a set of XML files, and so on. In this chapter, we will focus on databases. Although NoSQL databases are increasingly used, SQL-based databases are still more widespread. To interact with these databases, the application sends statements that query the database and manage its information.

For example, if we want to get a list of users, an application can use a statement such as the following:

    SELECT * FROM users;
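
To see a statement like this run from application code, here is a minimal sketch in Python using the standard-library `sqlite3` module. The in-memory database, the layout of the `users` table, and the two sample rows are assumptions made purely for illustration; the text does not specify a particular application or database.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for the real data store.
conn = sqlite3.connect(":memory:")

# Hypothetical table layout and sample rows, for illustration only.
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO users (name) VALUES ('alice'), ('bob')")

# The static statement from the text: it is the same on every run.
users = conn.execute("SELECT * FROM users;").fetchall()
print(users)
```

Because the statement contains no user-supplied data, it always asks the database the same question.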

As we deal with more and more information, statements become more complex. To be useful, these statements cannot be static; they need to be dynamic. This means that the application needs to create statements using information provided by the user. For example, imagine you are using a university system and want to look for a specific student in the database. The statement would be something like this:

    SELECT * FROM students WHERE name = 'Diana';

But how can the application put the name Diana into the statement? By using a variable, filled in, for example, from a form:

    "SELECT * FROM students WHERE name = '" . $name . "';"

In this line, instead of using the name Diana directly, we are using a variable, $name. Its value changes each time a user submits the form to the backend.

Now, what is wrong with this simple statement? The basic rule about user input is that all data entered by a user is untrusted and needs to be validated by the application. In this case, the application passes the data directly into the statement. What would happen if a user entered an unexpected value, such as a single quote followed by two hyphens?

    SELECT * FROM students WHERE name = ''--';

These special characters modify the whole statement. In most database management systems, two hyphens start a comment, so the statement does not generate an error; it simply returns a different result. The statement that is actually executed looks like this:

    SELECT * FROM students WHERE name = ''

Why? Because, as the input is not validated, the special characters are included in the statement and interpreted as valid SQL syntax, modifying how the database management system interprets the statement.
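
This behavior can be reproduced with a short script. The following is only a sketch: it assumes an in-memory SQLite database, a hypothetical `students` table with a single `name` column, and a hypothetical helper, `build_query`, that concatenates the input into a `WHERE` clause the way a vulnerable application might. SQLite is one of the systems that treat two hyphens as the start of a comment.

```python
import sqlite3


def build_query(name: str) -> str:
    # Deliberately unsafe (hypothetical helper): the input is pasted
    # into the statement without any validation.
    return "SELECT name FROM students WHERE name = '" + name + "';"


# Assumption: an in-memory SQLite database with illustrative sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (name TEXT)")
conn.execute("INSERT INTO students (name) VALUES ('Diana'), ('Marco')")

# Expected input: the statement looks up one student.
normal_query = build_query("Diana")
normal_rows = conn.execute(normal_query).fetchall()

# Unexpected input: a single quote followed by two hyphens.
# The quote closes the string early and the hyphens comment out the rest.
injected_query = build_query("'--")
injected_rows = conn.execute(injected_query).fetchall()

print(normal_query, normal_rows)
print(injected_query, injected_rows)
```

With the expected input, the query returns the row for Diana. With the unexpected input, no exception is raised; the database compares `name` against an empty string and returns no rows, so the statement has silently changed meaning.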

This simple concept is the basis of how SQL injection works.
