Using Cursors

In this chapter, I’ll explain the types of problems for which cursors are a reasonable solution, even though such cases are not common. The goal of the chapter is to show you how to use them wisely.

I’ll assume that you have sufficient technical knowledge of the various cursor types and know the syntax for declaring and using them. If you don’t, you can find a lot of information about cursors in Books Online. My focus is to explain why cursors are typically not the right choice and to present the cases in which cursors do make sense.

So why should you avoid using cursors for the most part?

For one, cursors conflict with the main premise of the relational model. Using cursors, you apply procedural logic rather than set-based logic. That is, you write a lot of code with iterations, where you mainly focus on "how" to deal with data. When you apply set-based logic, you typically write substantially less code, as you focus on "what" you want and not how to get it. You need to be able to recognize the cases where a problem is procedural/iterative in nature–where you truly need to process one row at a time. In these cases, you should consider using a cursor. For example, you have a table that contains user information along with e-mail addresses, and you need to send e-mail to all users. Or you need to invoke a stored procedure per each row in some table and provide the stored procedure with column values from each row as arguments.

Cursors also have a lot of overhead involved with the row-by-row manipulation and are typically substantially slower than set-based code (queries). I demonstrate the use of set-based solutions throughout the book. You need to be able to measure and estimate the cursor overhead and identify the few scenarios where cursors will yield better performance than set-based code. In some cases, data distribution will determine whether a cursor or a set-based solution will yield better performance.

There’s another very important aspect of cursors–they can request and assume ordered data as input, whereas queries can accept only a relational input, which by definition cannot assume a particular order. This difference is important in identifying scenarios in which cursors might actually be faster–such as problems that are tightly based on ordered access to the data. Examples of such problems are running aggregations and ranking calculations, resolving some temporal problems, and so on. The I/O cost involved with the cursor activity plus the cursor overhead might end up being lower than a set-based solution that performs substantially more I/O.

ANSI recognizes the practical need for manipulation of ordered data and provides some standards for addressing this need. In extensions to the ANSI SQL:1999 standard and in the ANSI SQL:2003 standard, you can find several query constructs that inherently rely on ordering–for example, the ANSI OVER(ORDER BY ...) clause, which determines the calculation order for ranking and aggregate calculations, or the SEARCH clause defined with recursive CTEs, which determines the order of traversal of trees.

SQL Server 2005 implements the OVER clause with support for ORDER BY only for ranking functions, and SQL Server 2005’s engine was, of course, enhanced to support the rapid performance of such calculations. As a result, ranking calculations using queries are now substantially faster than cursor-based solutions. With aggregate functions in SQL Server 2005, however, the OVER clause does not support ORDER BY. Therefore, set-based solutions to compute running aggregations with large groups of data are slower than cursor-based solutions. I’ll demonstrate this in the section Running Aggregations later in this chapter. The SEARCH clause for recursive common table expressions (CTEs) has not been implemented in SQL Server 2005.

Another kind of problem where cursor solutions are faster than query solutions is matching problems, which I’ll also demonstrate. With those, I haven’t found set-based solutions that perform nearly as well as cursor solutions. But I haven’t given up. One of my goals is to find set-based solutions for problems that are not procedural. Some of those problems could have set-based solutions if newer ANSI constructs had been supported in SQL Server. I hope that SQL Server will implement those in future versions. And when the ANSI standard doesn’t have answers, I believe there will be vendor-specific product extensions, followed by motions to the ANSI committee to add them to the standard.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
18.188.200.46