Understanding best practices when querying data

To build fast queries, you need to follow some best practices when querying data. First, it's important to understand that you have to write the clauses in a specific order or your query will fail, as shown in the following list:

  • SELECT statement best practices:
  • Use SELECT fields instead of SELECT *. This is taxing on database resources, especially if a table has a lot of columns. It's best to write a query just to select the columns that you know you will need instead of selecting them all with SELECT *
  • Use column aliases, especially when using functions. When you use a function in the WHERE clause, it will put the function as the name of the column instead of a more useful name. While giving the column an alias doesn't speed up your query, it will make it easier to know what you see in the results—for example, if you execute SELECT UPPER('ab'), the column in the results will be displayed as shown in the following screenshot: 

Whereas if you give the function an alias such as SELECT UPPER('ab') in uppercase, then the results will have a better column name, as shown in the following screenshot: 

  • Avoid using SELECT DISTINCT. You can use DISTINCT to remove duplicates from query results, but this can be an intensive process that uses a lot of database resources depending on the size of the table. It may be better just to add more columns to your SELECT statement to ensure the distinctiveness of the results. One exception to this can be if you are trying to get a list of distinct values in one column only. 
  • WHERE clause best practices:
  • Only use wildcards at the end of a string. Wildcards, such as the percent sign (%) and underscore (_), at the beginning of a string are slower to process because the user won't be able to use any indexes on the table. 
  • Use an underscore (_) wildcard instead of a percent sign (%) when possible. This is because the underscore will perform faster than the percent sign since the underscore matches on one character and the percent sign matches on one or more characters. 
  • Avoid WHERE clauses with functions. Functions in the WHERE clause can cause the query to be a lot slower because the query won't be able to use any indexes on the table. For example, the following query will be slow:
SELECT column WHERE UPPER(column) = 'ab';

However, the following query will be faster: 

SELECT column WHERE column = 'ab';
  • Use WHERE instead of HAVING to filter, when possible. HAVING is used to filter query results in a GROUP BY. WHERE can filter those results before it even gets to the GROUP BY, which makes your query less expensive to run. For example, if you are trying to filter by a count in a grouping, then you will need to use a HAVING clause, but if you are also trying to filter by a date range, then you should use a WHERE clause first. 
  • Use lookup tables instead of IN clauses.
  • JOIN clause best practices:
  • Use table aliases. It's more human-readable to use table aliases when joining tables together. Look at the following query: 
SELECT tablea.col, tableb.col 
FROM tablea
INNER JOIN tableb
ON tablea.id = tableb.id;

Instead, you should use this query: 

SELECT a.col, b.col 
FROM tablea a
INNER JOIN tableb b
ON a.id = b.id;
  • ORDER BY clause best practices:
  • Avoid using column numbers in the ORDER BY clause. It can be confusing if you use column numbers instead of the column names in the ORDER BY clause. In addition, if you change the columns in the SELECT statement, you will have no idea what the column numbers in the ORDER BY clause corresponded to. 
    For example, you should execute the following query: 
SELECT col1, col2 FROM tablename ORDER BY col1, col2; 

You should not execute the following query : 

SELECT col1, col2 FROM tablename ORDER BY 1, 2; 
  • Don't order by too many columns. If you order by all the columns in the table, then there will be a performance impact on your query, so you need to be careful when choosing which and how many columns to use in your ORDER BY clause. 
  • LIMIT clause best practices:
  • Use LIMIT when running a query for the first time. This can help you to see a subset of your query results to check whether you are on the right track without returning a lot of results that may be too taxing on the database system. 
  • Other best practices:
  • Use transactions and error handling in stored procedures and functions. You want to be able to group functionality and have it roll back as one if something fails. It's essential to understand what error happened so that you can troubleshoot how to fix any issues that come up. 
  • Avoid subqueries when you can. They are more intensive than a JOIN—for example, the following query will be more intensive: 
SELECT a.col,
(SELECT b.col
FROM tableb b
WHERE b.id = a.id) AS colname
FROM tablea a;

The following query will be less intensive: 

SELECT a.col, b.col 
FROM tablea a
LEFT JOIN tableb b
ON b.id=a.id;
  • When running intensive analytical queries, run them during nonpeak hours, or if there is a secondary readable database, execute them there. Intensive queries include queries that will return a lot of results or query large tables. You may be able to schedule them to execute at nonpeak hours so that you don't have to work at strange hours to execute them. 
  • Don't take user input to build a query. Instead, use a stored procedure or function to take values via variables, but with the caveat that if an application is passing through values into the stored procedure or function, then it should check values before they are passed in to avoid attacks on your database. 
  • Use stored procedures, functions, or views for queries that are used repeatedly. This ensures that the queries are more easily maintained. It can also ensure that similar functionality is stored together. It's also possible to limit access at a more granular level by granting user access to just one view or stored procedure. 
..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
3.21.39.142