Chapter 8. How It Works: Refactoring in Practice

Riguarda bene omai sì com’io vado Per questo loco al vero che disiri, Sì che poi sappi sol tener lo guado.

Observe me well, how through this place I come Unto the truth thou wishest, that hereafter Thou mayst alone know how to keep the ford.

—Dante Alighieri (1265-1321)

Paradiso, II, 124-126

(trans. Henry W. Longfellow)

In this book, I’ve tried to explain the various paths you can explore to refactor underperforming SQL applications. In this short chapter, I will briefly describe how I usually proceed when I am asked to look at a database application that doesn’t live up to expectations. I’ll try to be both concrete and abstract: concrete because I want to show refactoring in practice and abstract because I want to describe mental processes, not give a mock demonstration. Needless to say, you will find justifications and detailed explanations in the previous chapters, but not in this one.

Can You Look at the Database?

Very often, there is a vague rumor that something is wrong with an application, but you are asked to fix whatever you can without knowing exactly what the problem is.

  • If you are lucky, the most critical performance issues have already been narrowed to one process or task, and the behavior is the same on a test database as on the main production database. If you are very lucky, someone who knows the functional side of the application well has volunteered to help you.

    In such a case, the best course to take is probably to turn tracing on, and let someone else run the task that is too slow. Secure the benevolence of the database administrator; you will need privileges to query the application tables, but also the data dictionary tables, dynamic performance views, and possibly some system stored procedures. Check where trace files are generated and make sure you can access them easily. Try to gather as much data as you can: SQL statements that are issued, elapsed time, CPU time, number of executions, number of rows processed, and, if available, information about wait times. Try to get hold of the application code as well. Even if it’s written in a language you have never used, the program logic will tell you how the various SQL statements that you are going to trace are combined.

  • If the information that something is wrong has rippled to you through several people up and down the management ladder, and if the problem cannot easily be reproduced on a test database, you must query the production database. Once again, the first thing to negotiate is a number of privileges that are the preserve of a happy few. Don’t ask for full DBA privileges; that may be politically difficult to negotiate. But do come with a prepared list of privileges:

    Mandatory privileges

    The right to query dynamic performance views and the subset of the data dictionary that describes the objects used by the application

    Highly desirable privileges

    The right to query the application tables

    You don’t need the right to update application tables or to delete rows from them, even if it’s update or delete statements that are slow. You can study these statements by replacing them with select statements. Very often, you can diagnose a lot without even querying the application tables; you may even be able to suggest some possible solutions for improving performance. Of course, if you want to provide a solution, that is a different matter. But even if data is confidential, it is likely that statistical information regarding distributions will be willingly communicated to you (all the more willingly if the problem is acute). It should enable you to generate data in a test database (see Chapter 4) and test the various solutions you have thought of.

    When you don’t know exactly where the problem is, you must get as much information as you can from the database server cache. If monitoring facilities are available, use them. If there is no dedicated product, or if it doesn’t provide the information you’d like, build your own. Read about snapmon.sql in Appendix A, under the "Chapter 1" heading. If you expect to be involved in performance analysis more than a few times, don’t hesitate to create a tool based on this type of query. There is no need for a fancy user interface; it can be raw and ugly. And it doesn’t really matter whether you output plain-text files, comma-separated files, HTML files (which is what I usually do), or sophisticated graphics. What you need is the list, by unit of time, of the SQL statements that used the most resources during the period you’re investigating. Sort by (total) elapsed time in descending order, or if timing information is unavailable, by number of logical I/O operations. CPU time, if available, can be interesting to compare to the elapsed time. When I monitor a database, I usually take a snapshot of the database server cache every 10 minutes, and aggregate by the hour. MySQL requires more sustained polling. When monitoring tools show polling queries among the most resource-intensive queries, you know you have monitored too hard.

    Don’t forget to collect (and display) global statistics; by comparing what you have caught to global numbers, you will be able to assess the validity of your monitoring. If you miss more than 20% or 30% of logical I/O operations, you risk chasing the wrong target. (A sketch of the kind of snapshot query involved follows.)
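    Here is a minimal sketch of the kind of snapshot query such a home-grown tool can be built around, assuming Oracle and its V$SQL dynamic performance view (other products expose similar counters under other names). The figures in V$SQL are cumulative since the statement entered the cache, so a real tool takes snapshots at regular intervals and works on the differences between them:

        select sql_id,
               executions,
               elapsed_time / 1000000 as elapsed_seconds,  -- V$SQL times are in microseconds
               cpu_time / 1000000     as cpu_seconds,
               buffer_gets,                                 -- logical I/O
               rows_processed,
               substr(sql_text, 1, 80) as sql_text_start
        from   v$sql
        order by elapsed_time desc;
        -- keep only the first few dozen rows (rownum, fetch first, or simply in your script)

    Statements that were aged out of the cache between two snapshots are missed, which is precisely why the comparison to global numbers matters.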

While monitoring is going on, you needn’t stay idle: get better acquainted with the application. Check schema indexing, as explained in Chapter 2. Check statistics. Get information about locking and issues linked to concurrency. It may also be useful to collect statistical information regarding the relative weight, in terms of number of executions and resource consumption, of select statements versus statements that change the data.
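As an illustration of the statistics check, here is a minimal sketch, assuming Oracle and a connection as the application schema owner; the 30-day threshold is arbitrary, and the point is simply to spot tables whose optimizer statistics are missing or old:

    select table_name,
           num_rows,
           last_analyzed
    from   user_tables
    where  last_analyzed is null
       or  last_analyzed < sysdate - 30    -- arbitrary threshold
    order by last_analyzed nulls first;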

Queries of Death

Sometimes “problem queries” have already been identified, or you discover that some very costly queries loom large among the top queries. In this case, Chapter 5 will be your chapter of choice.

Problem queries are complicated and ugly. This is often why they are problem queries in the first place. There is no need to panic.

  • Check the true nature of the tables that appear in the query. If you have views, reinject them as subqueries into the statement (this may be a recursive process), and remove from these subqueries everything that isn’t relevant to the current query (Chapter 3). Beware of distributed queries, too: nested loops between distant tables would kill any query.

  • If you cannot modify the program, you can sometimes improve performance simply by rewriting views that reference other views so as to hit only base tables.

  • For update, insert, and delete statements, check whether they are slow because of locks acquired by other statements that are much slower.

  • For update, insert, and delete statements again, check whether triggers fire. That may be where the problem really is.

  • Remove all optimizer directives, and reduce the query to its core. What is the input to the query? Where are the constants in the where clause? Which are the truly selective criteria that brutally cut down the number of rows to process? If you don’t know, run a few group by statements (or check statistics if the tables are too big). Can there be data distribution problems? Does the optimizer have all the information required? Turn the optimizer into your ally.

  • Among the columns that are returned, is the information available from a single table only, or is the column a join column that is common to several tables? In the latter case, would it change anything to return the information from another table?

  • Identify tables that you need for data, tables that you need for conditions, and tables that you need for joins only. Move tables that are required only for conditions to subqueries. If they are very big and the expected result set is small, make these subqueries correlated subqueries. Otherwise, consider in () or a join with the subquery in the from clause. Beware that in (subquery) adds an implicit distinct that must become explicit if the subquery is in the from clause.

  • Beware, with subqueries, of null values that are neither equal to nor different from anything else, not even other null values. Be generous with is not null conditions for columns that are not mandatory.

  • Beware that correlated and uncorrelated subqueries rest on different indexing assumptions.

  • Which criterion should drive (be the starting point of) the query? If several criteria applied to different tables are possible candidates, would it be possible to isolate candidate rows for each criterion, and then combine them through joins or set operations such as intersect?

  • Are there any user-defined functions? Are they called often? Are they called again and again with the same parameters? If they are likely to be called often, can their result be cached (Chapter 3)? What is the source code for these functions? Can their logic be coded into a single query (Chapter 6)? Can you use this single query in a join?

  • Are there any functions that would prevent you from using an index? Beware also of implicit type conversions. Check that columns and values on both sides of conditions are of the same type.

  • Are the columns of composite indexes suitably ordered (Chapter 2)? Get some assurances from people who know the application about the types of queries that may need an existing index that looks badly ordered. You don’t want to break the quarterly report.

  • Remember that accesses are more efficient when a composite index column to which inequality conditions are applied appears after columns to which equality conditions are applied.

  • Be extremely careful with indexes, as they add much overhead to insert and delete operations. If a new index is required, try to identify another index that can be dispensed with because it is redundant or not selective enough. Try to keep the number of indexes constant.

  • Are any patterns repeated throughout the query? Are certain tables visited several times? Can you visit tables only once?

  • Do subqueries stamp a particular order on query execution through dependencies? For instance, does a correlated subquery depend on a column value that requires the execution of another subquery or join to be known? Is it possible to relax this order of execution and to give more freedom to the optimizer?

  • If set operators appear in the query, is it possible to factorize?

  • Could some self-joins be replaced with ranking functions?
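To illustrate the last point, here is a sketch against a hypothetical transactions table, where the goal is to return the most recent transaction of each account. The first version joins the table to an aggregate of itself; the second reads the table only once and lets a ranking function do the work (use rank() instead of row_number() if ties on the date must all be returned):

    -- Self-join version: two passes over transactions.
    select t.account_id, t.txn_date, t.amount
    from   transactions t
           inner join (select account_id,
                              max(txn_date) as max_date
                       from   transactions
                       group by account_id) latest
              on latest.account_id = t.account_id
             and latest.max_date   = t.txn_date;

    -- Ranking-function version: a single pass.
    select account_id, txn_date, amount
    from   (select account_id, txn_date, amount,
                   row_number() over (partition by account_id
                                      order by txn_date desc) as rn
            from   transactions) ranked
    where  rn = 1;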

All These Fast Queries

Big, ugly queries are still encountered on a routine basis, but more often high costs are associated with a high number of executions, and unitary costs are low. What matters is the overall cost (the unitary cost times the number of executions), much more than the unitary cost alone. When unitary costs are low, the question isn’t so much one of SQL query tuning as one of algorithms, as you saw in Chapter 6.
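As a purely hypothetical illustration, a query that takes 2 milliseconds but is executed a million times a day accounts for roughly 2,000 seconds of work, more than thirty times the cost of a one-minute report that runs once.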

  • What is the ratio between the number of “business units” that are processed (whether they are the generation of invoices, trades, registrations, or whatever) and the number of queries? Is it reasonable? Can it be lowered?

  • What is the reason behind loops?

  • Can you replace any trace of paranoid preventive checks with forensic analysis? Try to order SQL statements by number of executions; similar execution counts often give very good clues about the program logic, even without seeing the code. In an SQL program, one should act first and then check why things have failed, if they have; there is no need to check beforehand when most cases pass the check anyway.

  • What is the ratio (the number of queries against the table times the number of active sessions, divided by the total number of rows in the table)? Is it justified? Do as much as you can in a single statement.

  • Is there anything that looks like a query from a lookup function that is executed much too often?

  • When input criteria are surrogate keys (system-generated sequential numbers), it is likely that the issue is in the algorithms more than the query itself, even if the query is slow. There must be a query somewhere that returns these surrogate keys from real, user-provided criteria. Try to join both queries; don’t use a cursor loop.

  • Among fast queries, you sometimes find recursive queries against the data dictionary. Try to find out what kind of internal operations they relate to, and whether these operations cannot be preemptively avoided by better table or index sizing, or anything of the kind. Work in tandem with a DBA if queries that refer to internal operations show up.

  • Chase down count(*) queries. You can probably dispense with most of them.

  • Chase down queries that merely execute simple operations or return values available through other means. A query that doesn’t return data from a table is useless. For instance, if the application needs the time on the server, a single query should not be dedicated to that purpose; the time can be returned with “useful” data.

  • Low unitary cost doesn’t necessarily mean minimum unitary cost. If you can modify a frequently run query so that it performs the same work accessing three blocks instead of five, that will result in a 40% improvement that will noticeably lighten the burden of the server and free resources needed elsewhere. You can frequently obtain this type of improvement by avoiding indirections after an index search, through one of the following methods:

  • Using a clustered index or index-organized table, in which walking the index tree takes you to the row data itself, not merely to the address of the row

  • Avoiding the need to visit the table by stuffing all necessary data inside the index
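The second method is easy to picture on a hypothetical example: if a frequently executed query only ever asks for two columns of rows identified by a key, an index that also carries those two columns can answer it without any visit to the table.

    -- Hypothetical frequently executed query:
    select status, order_date
    from   orders
    where  customer_id = :customer_id;

    -- An index that carries the selected columns lets the DBMS resolve the
    -- query from the index alone, saving the extra block accesses needed to
    -- fetch the rows.
    create index orders_cust_status_date
        on orders (customer_id, status, order_date);

The price, as noted earlier, is extra maintenance work on inserts, deletes, and updates of the indexed columns; whether it is worth paying depends on how often the query runs.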

No Obvious Very Wrong Query

The case when you can concentrate on a few SQL “heavy hitters” is a simple one, whether they are costly queries executed a few times or low-cost queries executed millions of times. The situation is more challenging when no particular query weighs heavily, but you have many queries that each take some moderate amount of resources.

Sometimes such a situation just proves that there isn’t much you can do (one cannot always win). But you may also encounter such a situation when statements are dynamically built by programs, with constants hardcoded into the statement text instead of being passed as parameters.
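For example (table and column names are purely illustrative), a program that concatenates the customer identifier into the statement text sends the server a brand-new statement every time, whereas passing the identifier as a parameter lets a single shared statement be parsed and optimized once:

    -- Built by concatenation: each value produces a distinct statement.
    select order_date, amount from orders where customer_id = 1234;
    select order_date, amount from orders where customer_id = 1235;

    -- Parameterized: one statement text, executed with different values.
    select order_date, amount from orders where customer_id = :customer_id;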

  • Check the commit rate, and its relation to the number of business transactions—that is, logically independent business units of work.

  • Check whether making the DBMS substitute parameters for constants can help; this doesn’t discharge you from the duty of pointing out bad practices to developers, though.

  • One of the worst cases is the generation of in () clauses by the program. Is it possible to replace the list with the query that returns the various elements, if such a query exists? If not, is it possible to use one of the methods presented in Chapter 2 to bind lists of values? (A sketch of the first approach follows this list.)

  • Does focusing on execution plans rather than statements tell a different story? Get, for each different plan, the costliest query as a sample.

  • Could there be some contention issues that partitioning could help to solve?

  • A system, especially a database system, is a series of queues that provide resources to concurrent clients. If clients don’t require the same resources at the same time, increasing parallelism will increase throughput. When one queue lengthens somewhere in the system, performance takes a dive.

  • Don’t fix the wrong problem. Alleviating contention at the wrong place will do nothing for performance.

  • You can attempt to do only two things at the physical storage level: cluster rows or scatter them. Both can be useful, but for opposite classes of problem; you have to decide what the most important issue is.

  • Physically reorganizing your tables may help make some queries run faster by minimizing the number of pages to inspect, but confirm that what improves a critical process will not degrade another process that is as critical as, or even more critical than, the process you are improving.

    Note

    Performance improvement is an iterative process. Very often, another performance issue is hiding behind the issue you are fixing.
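Going back to the generated in () lists mentioned above, the following sketch (with purely illustrative table names) shows the kind of rewrite to aim for when the listed values actually come out of another query:

    -- Generated list of literals: the statement text changes with every run
    -- and can grow arbitrarily long.
    select product_name, unit_price
    from   products
    where  product_id in (1017, 2210, 3345, 4081 /* ... generated values ... */);

    -- If the identifiers come from a query, hand that query to the DBMS
    -- instead of its result.
    select p.product_name, p.unit_price
    from   products p
    where  p.product_id in (select oi.product_id
                            from   order_items oi
                            where  oi.order_id = :order_id);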

Time to Conclude

At times, I have a vague feeling that trying to improve SQL code is a Sisyphean challenge. There is much to do, and ever-expanding databases don’t take long to reveal the weaknesses of code that has been in production for some time. As I have tried to show, there are many options and many routes that you can take. But reviewing and questioning the code will get you the best returns. Don’t rely on “tuning,” even if tuning has its place, and rely even less on “wizards” and “assistants” that are supposed to automagically make your queries faster. If it were that simple, there would never be any performance issues. This is not a dark art; it’s through understanding the original business requirements, understanding the developers’ intents, and applying some simple principles and some SQL skills that you can really transform applications to a level that will often impress. And you may even have fun doing it.

We could’ve been anything That we wanted to be And it’s not too late to change I’d be delighted to give it some thought Maybe you’ll agree that we really ought.

—Paul Williams (b. 1940)

Lyrics from “Bugsy Malone”
