Views can be used to develop solutions in a modular way. You solve each step of the problem with a query, and define a view based on that query. This process simplifies the solution by allowing you to focus on a single step at a time.
I’ll demonstrate a modular approach through an example. First, run the code in Example 5-1 to create and populate the Sales table:
Example 5-1. Creating and populating the Sales table
SET NOCOUNT ON; USE tempdb; GO IF OBJECT_ID('dbo.Sales') IS NOT NULL DROP TABLE dbo.Sales; GO CREATE TABLE dbo.Sales ( mnth DATETIME NOT NULL PRIMARY KEY, qty INT NOT NULL ); INSERT INTO dbo.Sales(mnth, qty) VALUES('20041201', 100); INSERT INTO dbo.Sales(mnth, qty) VALUES('20050101', 110); INSERT INTO dbo.Sales(mnth, qty) VALUES('20050201', 120); INSERT INTO dbo.Sales(mnth, qty) VALUES('20050301', 130); INSERT INTO dbo.Sales(mnth, qty) VALUES('20050401', 140); INSERT INTO dbo.Sales(mnth, qty) VALUES('20050501', 140); INSERT INTO dbo.Sales(mnth, qty) VALUES('20050601', 130); INSERT INTO dbo.Sales(mnth, qty) VALUES('20050701', 120); INSERT INTO dbo.Sales(mnth, qty) VALUES('20050801', 110); INSERT INTO dbo.Sales(mnth, qty) VALUES('20050901', 100); INSERT INTO dbo.Sales(mnth, qty) VALUES('20051001', 110); INSERT INTO dbo.Sales(mnth, qty) VALUES('20051101', 100); INSERT INTO dbo.Sales(mnth, qty) VALUES('20051201', 120); INSERT INTO dbo.Sales(mnth, qty) VALUES('20060101', 130); INSERT INTO dbo.Sales(mnth, qty) VALUES('20060201', 140); INSERT INTO dbo.Sales(mnth, qty) VALUES('20060301', 100); INSERT INTO dbo.Sales(mnth, qty) VALUES('20060401', 100); INSERT INTO dbo.Sales(mnth, qty) VALUES('20060501', 100); INSERT INTO dbo.Sales(mnth, qty) VALUES('20060601', 110); INSERT INTO dbo.Sales(mnth, qty) VALUES('20060701', 120); INSERT INTO dbo.Sales(mnth, qty) VALUES('20060801', 110); INSERT INTO dbo.Sales(mnth, qty) VALUES('20060901', 120); INSERT INTO dbo.Sales(mnth, qty) VALUES('20061001', 130); INSERT INTO dbo.Sales(mnth, qty) VALUES('20061101', 140); INSERT INTO dbo.Sales(mnth, qty) VALUES('20061201', 100);
The table contains one row per month with the sales quantity (column qty) and month (column mnth). Notice that I used the DATETIME datatype to store a month to support datetime-related calculations. Even though I care only about the year and month elements of the value, I had to specify something in the day portion. So I specified the first day of each month as the day and midnight is assumed by default as the time. When you need to present the data, you can always extract the relevant elements from the full datetime value.
The task at hand is to return groups of consecutive months that have the same sales trend. That is, identify ranges of months with the same trend (up, same, down, or unknown). The trend for a given month is based on its qty value minus the qty value of the previous month. If the difference is positive, the trend is ′up′; if it’s negative, the trend is ′down′; if it’s zero, the trend is ′same′; otherwise, the trend is ′unknown′. Table 5-5 shows the desired result.
Table 5-5. Ranges of Months with the Same Sales Trend
start_range | end_range | trend |
---|---|---|
200412 | 200412 | unknown |
200501 | 200504 | up |
200505 | 200505 | same |
200506 | 200509 | down |
200510 | 200510 | up |
200511 | 200511 | down |
200512 | 200602 | up |
200603 | 200603 | down |
200604 | 200605 | same |
200606 | 200607 | up |
200608 | 200608 | down |
200609 | 200611 | up |
200612 | 200612 | down |
Trying to develop a single query to solve the problem can be too complex. Instead, break the solution into steps. I’ll first show a solution that will work in SQL Server 2000 and then optimize it by using features available in SQL Server 2005.
First calculate the sign of the difference between the current month’s qty and the previous month’s. This can be achieved by creating the VSgn view as follows:
IF OBJECT_ID('dbo.VSgn') IS NOT NULL DROP VIEW dbo.VSgn; GO CREATE VIEW dbo.VSgn AS SELECT mnth, qty, SIGN((S1.qty - (SELECT TOP 1 qty FROM dbo.Sales AS S2 WHERE S2.mnth < S1.mnth ORDER BY S2.mnth DESC))) AS sgn FROM dbo.Sales AS S1 GO
Remember that in SQL Server 2000 a semicolon terminating the CREATE VIEW statement is not allowed, nor are parentheses for TOP’s input.
The contents of the VSgn view are shown in Table 5-6.
Table 5-6. Contents of VSgn
mnth | qty | sgn |
---|---|---|
2004-12-01 00:00:00.000 | 100 | NULL |
2005-01-01 00:00:00.000 | 110 | 1 |
2005-02-01 00:00:00.000 | 120 | 1 |
2005-03-01 00:00:00.000 | 130 | 1 |
2005-04-01 00:00:00.000 | 140 | 1 |
2005-05-01 00:00:00.000 | 140 | 0 |
2005-06-01 00:00:00.000 | 130 | –1 |
2005-07-01 00:00:00.000 | 120 | –1 |
2005-08-01 00:00:00.000 | 110 | –1 |
2005-09-01 00:00:00.000 | 100 | –1 |
2005-10-01 00:00:00.000 | 110 | 1 |
2005-11-01 00:00:00.000 | 100 | –1 |
2005-12-01 00:00:00.000 | 120 | 1 |
2006-01-01 00:00:00.000 | 130 | 1 |
2006-02-01 00:00:00.000 | 140 | 1 |
2006-03-01 00:00:00.000 | 100 | –1 |
2006-04-01 00:00:00.000 | 100 | 0 |
2006-05-01 00:00:00.000 | 100 | 0 |
2006-06-01 00:00:00.000 | 110 | 1 |
2006-07-01 00:00:00.000 | 120 | 1 |
2006-08-01 00:00:00.000 | 110 | –1 |
2006-09-01 00:00:00.000 | 120 | 1 |
2006-10-01 00:00:00.000 | 130 | 1 |
2006-11-01 00:00:00.000 | 140 | 1 |
2006-12-01 00:00:00.000 | 100 | –1 |
The SIGN function returns 1 for a positive input, 0 for when zero is input,–1 for a negative input, and NULL for a NULL input. The sgn column actually represents the sales trend of the current month. At this point, you want to group all consecutive months that have the same sales trend. To do so, you first need to calculate a grouping factor–a value that will identify the group. One option for the grouping factor is the earliest future month in which the trend is different from the current month’s trend. If you think about it, you’ll see that such a value will be the same for all consecutive months that have the same trend.
Run the following code to create the VGrp view, which calculates the grouping factor:
IF OBJECT_ID('dbo.VGrp') IS NOT NULL DROP VIEW dbo.VGrp; GO CREATE VIEW dbo.VGrp AS SELECT mnth, sgn, (SELECT MIN(mnth) FROM dbo.VSgn AS V2 WHERE V2.sgn <> V1.sgn AND V2.mnth > V1.mnth) AS grp FROM dbo.VSgn AS V1 GO
The contents of the VGrp view is shown in Table 5-7.
Table 5-7. Contents of VGrp
mnth | sgn | Grp |
---|---|---|
2004-12-01 00:00:00.000 | NULL | NULL |
2005-01-01 00:00:00.000 | 1 | 2005-05-01 00:00:00.000 |
2005-02-01 00:00:00.000 | 1 | 2005-05-01 00:00:00.000 |
2005-03-01 00:00:00.000 | 1 | 2005-05-01 00:00:00.000 |
2005-04-01 00:00:00.000 | 1 | 2005-05-01 00:00:00.000 |
2005-05-01 00:00:00.000 | 0 | 2005-06-01 00:00:00.000 |
2005-06-01 00:00:00.000 | –1 | 2005-10-01 00:00:00.000 |
2005-07-01 00:00:00.000 | –1 | 2005-10-01 00:00:00.000 |
2005-08-01 00:00:00.000 | –1 | 2005-10-01 00:00:00.000 |
2005-09-01 00:00:00.000 | –1 | 2005-10-01 00:00:00.000 |
2005-10-01 00:00:00.000 | 1 | 2005-11-01 00:00:00.000 |
2005-11-01 00:00:00.000 | –1 | 2005-12-01 00:00:00.000 |
2005-12-01 00:00:00.000 | 1 | 2006-03-01 00:00:00.000 |
2006-01-01 00:00:00.000 | 1 | 2006-03-01 00:00:00.000 |
2006-02-01 00:00:00.000 | 1 | 2006-03-01 00:00:00.000 |
2006-03-01 00:00:00.000 | –1 | 2006-04-01 00:00:00.000 |
2006-04-01 00:00:00.000 | 0 | 2006-06-01 00:00:00.000 |
2006-05-01 00:00:00.000 | 0 | 2006-06-01 00:00:00.000 |
2006-06-01 00:00:00.000 | 1 | 2006-08-01 00:00:00.000 |
2006-07-01 00:00:00.000 | 1 | 2006-08-01 00:00:00.000 |
2006-08-01 00:00:00.000 | –1 | 2006-09-01 00:00:00.000 |
2006-09-01 00:00:00.000 | 1 | 2006-12-01 00:00:00.000 |
2006-10-01 00:00:00.000 | 1 | 2006-12-01 00:00:00.000 |
2006-11-01 00:00:00.000 | 1 | 2006-12-01 00:00:00.000 |
2006-12-01 00:00:00.000 | –1 | NULL |
You can observe that the grp column values are unique for each consecutive group of months that have the same trend. The only exception is the two NULLs. You received a NULL for December 2004 because that month showed an unknown trend. You received a NULL for December 2006 because there is no data after that date. The two NULLs belong to two different consecutive trend groups, but you can easily solve the problem by using both sgn (representing the trend) and grp to define the group.
The last part is straightforward–group the data by sgn and grp, return MIN(mnth) as the start of the range, and return MAX(mnth) as the end of the range. Also, use a CASE expression to convert the sgn value to a more descriptive representation of the trend.
Run the following code to create the VTrends view implementing this step:
IF OBJECT_ID('dbo.VTrends') IS NOT NULL DROP VIEW dbo.VTrends; GO CREATE VIEW dbo.VTrends AS SELECT CONVERT(VARCHAR(6), MIN(mnth), 112) AS start_range, CONVERT(VARCHAR(6), MAX(mnth), 112) AS end_range, CASE sgn WHEN -1 THEN 'down' WHEN 0 THEN 'same' WHEN 1 THEN 'up' ELSE 'unknown' END AS trend FROM dbo.VGrp GROUP BY sgn, grp GO
If you query VTrends as shown in the following code, you will get the desired results shown earlier in Table 5-5:
SELECT start_range, end_range, trend FROM dbo.VTrends ORDER BY start_range;
In SQL Server 2005, you have new features that allow you to optimize the solution. First create a view called VSalesRN with row numbers assigned to the rows from Sales based on the order of mnth:
IF OBJECT_ID('dbo.VSalesRN') IS NOT NULL DROP VIEW dbo.VSalesRN; GO CREATE VIEW dbo.VSalesRN AS SELECT mnth, qty, ROW_NUMBER() OVER(ORDER BY mnth) AS rn FROM dbo.Sales; GO
The contents of the view are shown in Table 5-8.
Table 5-8. Contents of VSalesRN
mnth | qty | rn |
---|---|---|
2004-12-01 00:00:00.000 | 100 | 1 |
2005-01-01 00:00:00.000 | 110 | 2 |
2005-02-01 00:00:00.000 | 120 | 3 |
2005-03-01 00:00:00.000 | 130 | 4 |
2005-04-01 00:00:00.000 | 140 | 5 |
2005-05-01 00:00:00.000 | 140 | 6 |
2005-06-01 00:00:00.000 | 130 | 7 |
2005-07-01 00:00:00.000 | 120 | 8 |
2005-08-01 00:00:00.000 | 110 | 9 |
2005-09-01 00:00:00.000 | 100 | 10 |
2005-10-01 00:00:00.000 | 110 | 11 |
2005-11-01 00:00:00.000 | 100 | 12 |
2005-12-01 00:00:00.000 | 120 | 13 |
2006-01-01 00:00:00.000 | 130 | 14 |
2006-02-01 00:00:00.000 | 140 | 15 |
2006-03-01 00:00:00.000 | 100 | 16 |
2006-04-01 00:00:00.000 | 100 | 17 |
2006-05-01 00:00:00.000 | 100 | 18 |
2006-06-01 00:00:00.000 | 110 | 19 |
2006-07-01 00:00:00.000 | 120 | 20 |
2006-08-01 00:00:00.000 | 110 | 21 |
2006-09-01 00:00:00.000 | 120 | 22 |
2006-10-01 00:00:00.000 | 130 | 23 |
2006-11-01 00:00:00.000 | 140 | 24 |
2006-12-01 00:00:00.000 | 100 | 25 |
In the VSgn view, you join two instances of VSalesRN to match each current row with the row for the previous month. You then have access to both the current month’s and previous month’s qty values, and you calculate the sign of their difference. Here’s the code for the SQL Server 2005 version of the VSgn view:
IF OBJECT_ID('dbo.VSgn') IS NOT NULL DROP VIEW dbo.VSgn; GO CREATE VIEW dbo.VSgn AS SELECT Cur.mnth, Cur.qty, SIGN(Cur.qty - Prv.qty) AS sgn FROM dbo.VSalesRN AS Cur JOIN dbo.VSalesRN AS Prv ON Cur.rn = Prv.rn + 1; GO
You can further optimize the solution by revising the VGrp view, which calculates the grouping factor as follows:
IF OBJECT_ID('dbo.VGrp') IS NOT NULL DROP VIEW dbo.VGrp; GO CREATE VIEW dbo.VGrp AS SELECT mnth, sgn, DATEADD(month, -ROW_NUMBER() OVER(PARTITION BY sgn ORDER BY mnth), mnth) AS grp FROM dbo.VSgn; GO
The logic behind the calculation of the grouping factor here is a bit tricky. You calculate a row number (rn) based on the order of mnth, partitioned by sgn (trend). This means that, for each trend, you can have multiple consecutive groups, naturally with gaps between them. Now try to think of the way the mnth value increments within a particular trend versus how rn increments. Both continue to increment by one unit as long as you’re still in the same consecutive group. Once you have a gap, mnth increments by more than one unit, while rn keeps incrementing by one. You can conclude that if you subtract rn months from mnth, the result for each consecutive group will be constant and unique. As I mentioned, the logic here is tricky and can be hard to grasp. To better understand it, I suggest that you pull VGrp’s query aside and play with it. For example, return the row number itself (as opposed to using it in a calculation), and so on.
Finally, create the VTrends view to group the data by sgn and grp, returning the ranges of consecutive months with the same trend.
IF OBJECT_ID('dbo.VTrends') IS NOT NULL DROP VIEW dbo.VTrends; GO CREATE VIEW dbo.VTrends AS SELECT CONVERT(VARCHAR(6), MIN(mnth), 112) AS start_range, CONVERT(VARCHAR(6), MAX(mnth), 112) AS end_range, CASE sgn WHEN -1 THEN 'down' WHEN 0 THEN 'same' WHEN 1 THEN 'up' ELSE 'unknown' END AS trend FROM dbo.VGrp GROUP BY sgn, grp; GO
Query the VTrends view and you will get the desired result:
SELECT start_range, end_range, trend FROM dbo.VTrends ORDER BY start_range;
Remember that SQL Server 2005 supports CTEs, which also allow you to develop solutions applying a modular approach. In fact, you can think of CTEs as inline views that exist only in the scope of the outer query. If you think about it, other than allowing a modular development approach, there’s no real reason to create the intermediate views in the solution. Instead, you should just create the final one (VTrends). So why create these as objects in the database? In SQL Server 2000, you didn’t have better options, but in SQL Server 2005 you can create just one view (VTrends), which will be defined by a CTE developed using a modular approach. Run the code in Example 5-2 to alter the VTrends view, implementing it with multiple CTEs defined in the same WITH statement instead of defining multiple views.
Example 5-2. Implementing VTrends with a CTE
ALTER VIEW dbo.VTrends AS WITH CSalesRN AS ( SELECT mnth, qty, ROW_NUMBER() OVER(ORDER BY mnth) AS rn FROM dbo.Sales ), CSgn AS ( SELECT Cur.mnth, Cur.qty, SIGN(Cur.qty - Prv.qty) AS sgn FROM CSalesRN AS Cur JOIN CSalesRN AS Prv ON Cur.rn = Prv.rn + 1 ), CGrp AS ( SELECT mnth, sgn, DATEADD(month, -ROW_NUMBER() OVER(PARTITION BY sgn ORDER BY mnth), mnth) AS grp FROM CSgn ) SELECT CONVERT(VARCHAR(6), MIN(mnth), 112) AS start_range, CONVERT(VARCHAR(6), MAX(mnth), 112) AS end_range, CASE sgn WHEN -1 THEN 'down' WHEN 0 THEN 'same' WHEN 1 THEN 'up' ELSE 'unknown' END AS trend FROM CGrp GROUP BY sgn, grp; GO
If you query VTrends using the following code, you will get the desired output shown earlier in Table 5-5:
SELECT start_range, end_range, trend FROM dbo.VTrends ORDER BY start_range;
In short, developing solutions with the modular approach simplifies the process and reduces the chances of bugs and errors.
When you’re done, run the following cleanup code:
IF OBJECT_ID('dbo.VTrends') IS NOT NULL DROP VIEW dbo.VTrends; GO IF OBJECT_ID('dbo.VGrp') IS NOT NULL DROP VIEW dbo.VGrp; GO IF OBJECT_ID('dbo.VSgn') IS NOT NULL DROP VIEW dbo.VSgn; GO IF OBJECT_ID('dbo.VSalesRN') IS NOT NULL DROP VIEW dbo.VSalesRN; GO IF OBJECT_ID('dbo.Sales') IS NOT NULL DROP TABLE dbo.Sales; GO
18.221.241.116