Over the course of this book, we have covered many different topics. Most of these chapters involve best practices. These best practices involve naming conventions, formatting T-SQL code, or designing database objects. Despite your best efforts, you may find yourself in a situation that is outside the scope of best practices. These times can correlate to periods of high stress or tight deadlines. The goal of this chapter is to help you prepare for the situations when you must make quick decisions about complex topics.
Designing database code for applications that are created to let users insert or update data can be challenging. You may find yourself dealing with applications where you need to disable permissions or functionality. When dealing with legacy application code, there may be additional challenges particularly if those applications are using ORM software. As organizations grow, there may be a need for additional reporting, but there may not be the time required to develop a data warehouse. Within SQL Server, there is the ability to make your T-SQL code more flexible, but you will want to ensure that you do not sacrifice functionality at the cost of flexibility.
Inserting and Updating Data
There are a variety of scenarios where you would want the ability to pass in a set of values. If the records do not exist, you may want to insert the information, but if the record does exist, then you may want to update that record. Often when we are under pressure, we look for the fastest, easiest way to do things. Other times, we may be focused more on T-SQL code readability than what is best for the database engine. While it is important to ensure that your database code can be understood by others, it is equally as important to consider the performance of your database code when writing T-SQL.
This can be equally true when using T-SQL to insert or update database records. It may be easier and more straightforward for an application to execute the same stored procedure regardless of whether the user is inserting a new record or updating an existing record. This type of action can also be referred to as an upsert. It does not matter how you find yourself in the situation; the important thing is to design T-SQL code that works reliably and effectively.
Stored Procedure with a Merge Statement
The preceding MERGE statement updates a recipe if the recipe already exists in the dbo.Recipe table. The logic for this MERGE statement is easy to follow, and the format can be easy to write. There are several known issues with using MERGE statements. You will want to be aware of these issues before implementing MERGE statements in your environment.
Store Procedure to Insert or Update
Replacing the logic from the MERGE statement, you can use an IF… ELSE statement to control the flow of actions inside the SQL Server query. In Listing 13-2, if the recipe exists in the dbo.Recipe table, the record will update all fields except DateCreated. Otherwise, a new record in dbo.Recipe will be inserted.
MERGE statements are not limited to inserting or updating data in a target table. Using MERGE statements can be expanded to handle other scenarios such as inserting or deleting data. MERGE statements can also be created so that missing records are inserted, changed records are updated, and orphaned records are deleted as part of a single statement. While the data modifications can be combined as part of single statement, you should use caution when using MERGE statements as MERGE statements can be prone to performance issues and data inconsistencies. Due to the ease of writing MERGE statements, it is possible to compare data between two tables in a way that is not efficient for SQL Server. While the tables are only being compared once, if the comparison is not efficient, this can have significant impact on the performance tables being compared. There have also been instances where the unique key violations can occur when using filtered indexes. MERGE statements can also create foreign key constraint violations in specific circumstances. This includes using two tables with foreign key constraints where the foreign key is set to NOCHECK and then rolled back. I suggest researching known issues with MERGE statements and performing adequate testing to confirm the T-SQL code will work as expected. The code in Listing 13-2 shows that it is not overly complex to write code that can handle either an insert or an update. Prior to SQL Server 2019, either the code in Listing 13-1 or Listing 13-2 may have been subject to parameter sniffing. Now that SQL Server 2019 has adaptive joins, this should be less of an issue.
Disable Functionality
When designing applications often the first goal is to focus on meeting the specified requirements. In many cases, the product owners or business units are asking for functionality that they need right now. However, the same individuals do not have your knowledge of application and database development. Therefore, the same people may not request the ability to enable or disable specified functionality upon their request. In this case, functionality can either be considered the ability for a user or a role to perform an action or for an application to behave in a certain way. It may be that they are counting on you to know how they may want to use the same applications in the future.
Create dbo.Recipe Table
Recipe Table Data
RecipeID | RecipeName | DateCreated | DateModified |
---|---|---|---|
1 | Spaghetti | 01/02/2018 | 02/05/2019 |
2 | Lee’s Burgers | 01/03/2018 | 02/05/2019 |
3 | Spinach Frittata | 01/03/2018 | 03/16/2019 |
4 | Dinner Rolls | 02/01/2018 | 03/17/2019 |
5 | Brandin’s Fried Rice | 04/16/2019 | 06/23/2019 |
For instance, if a recipe was displayed in an application because it was in the recipe table, this would be a case where you could not control how an application interacted with recipes. The only option you would have to keep a recipe from being used by an application would be to delete the recipe record. This action is referred to as a hard delete.
Drop Foreign Key
Delete Recipe Record
Recipe Table After Deleting Dinner Rolls
RecipeID | RecipeName | DateCreated | DateModified |
---|---|---|---|
1 | Spaghetti | 01/02/2018 | 02/05/2019 |
2 | Lee’s Burgers | 01/03/2018 | 02/05/2019 |
3 | Spinach Frittata | 01/03/2018 | 03/16/2019 |
5 | Brandin’s Fried Rice | 04/16/2019 | 06/23/2019 |
There is a significant issue with this method. You should take caution when you are faced with the decision to drop foreign key relationships between tables. This can not only negatively affect query performance but can also affect your data quality. Foreign keys are one of the final pieces that can help ensure that data between your tables remains consistent.
Insert Dummy Recipe Record
Add IsActive Column to dbo.Recipe Table
This may not prevent your application from showing the specific value unless your application has been configured in such a way to always ignore that value. If neither of these options can be used, there is another option available. With this table design involved, another option is to remove the ingredients associated with the recipe Dinner Rolls.
Add IsActive Column to dbo.Recipe Table
This would usually be a value that would indicate a true or false. For SQL Server, that is often represented with a one for true and a zero for false. If you set the IsActive to true and ran a query that filtered results where the IsActive value was true, then you would only see a listing of active recipes. Once you have added the IsActive column to the dbo.Recipe table, you can modify your T-SQL code to display either active or inactive recipes based on how they are written.
You have the option to only show results based on the value assigned to IsActive. When the IsActive value is set to 1, the value is true. If you wanted to only show active recipes, you would modify your T-SQL code to use IsActive = 1 in the WHERE clause. This also gives you the added ability to have some screens that would only show active recipes and other screens that would only show inactive or disabled recipes.
In addition to disabling or enabling functionalities for users or roles, there are also occasions where you would want to be able to control how an application works. Companies often want to change how their business operates over time. This can usually be done by adding, changing, or removing functionality in the applications that the business uses.
For instance, if a business wants to expand outside of their current market, they may choose to add new functionality to their existing applications. The easiest way to add this new functionality is to add new application code. However, the business may find that this new market or new line of business is not working the way that they intended, and the company decides to roll back this new functionality. Depending on how the code was written, this may be a complex change requiring another code rewrite or a simple change allowing the application to use a different part of the code.
Depending on your business environment, you may find that it is easier to manage the changes in functionality through the database instead of the application code. This is even more likely when you may want functionality to change based on seasonality or by region. This allows you the option to exert finely tuned control that can be changed rapidly. You will want to carefully consider the best place to manage this functionality. The best advice I have received is to keep business logic in the application when higher-level management will need to sign off on any changes. For business logic that can change rapidly and with little oversight, it may be best to leave the business logic in the database.
Create Application Rule Table
If you are creating rules for a single, centralized application, this table design will work for your needs. If you are managing multiple applications, you will want to consider how these rules are used by applications. If most or all rules are unique per application, you may want to add a column to this table to indicate the application affected.
Application Rule Table
Application RuleID | ApplicationRule Description | IsActive | DateCreated | DateModifed |
---|---|---|---|---|
1 | Show only active recipes | 1 | 04/12/2019 | 04/12/2019 |
The entry in this table indicates that only active recipes will be shown when this application rule is enabled. If you want the same application to show all recipes, you would need to update the IsActive value to false. While you could update all your database code and stored procedures to use IF... THEN statements depending on when this application rule is enabled, I would suggest handling this functionality within the application.
This is where it becomes important to develop your code to be flexible. While the topic of this book is not an application code development, you will need a database that works with your application code to handle these scenarios. Conceptually this can work similarly to feature flags as discussed in Chapter 12. This is where you could make a table to store a reference to the specified functionality. In some environments this may be referred to as application rules.
Support Legacy Code
One of the key issues in software development involves dealing with technical debt. Applications are developed over time, and depending on how the applications are developed will determine how easy it can be to maintain them over time. For many organizations, applications are initially developed with a timeline in order to meet the current business needs. This can cause applications to be developed rapidly without having the time to plan for how this application will be supported in the future. This often becomes the single source of tension between development and the database administrators.
The real challenge is that when an application is developed, it usually is not well documented. If the application is not well documented, this can make supporting legacy code even more difficult if the T-SQL queries are saved directly in the application code. If the database code is in the application, it will require more effort not only to locate the specific database code but also to get the database code updated. This is because it is not merely as simple as altering a table or stored procedure to increase the flexibility and maintainability of the T-SQL code.
As businesses grow and change, the same application may need to be modified to handle new functionality. This new functionality may require storing new information. To store this information, you may want to add a column to a table or create a new table entirely. The challenge becomes trying to understand all the items that depend on this table. While you can verify dependencies between database objects, the issue is that you do not know how the application is currently accessing this table. Your applications may be accessing the data table through prepared statements and ad hoc queries. If your application is using prepared statements or ad hoc queries, you may have to change your application code when adding a new column to a table especially if your coding standards do not specify that column names must always be specified with your T-SQL queries.
The concern is how to modify this table without breaking the entire application. There are strategies that can be used to allow you to add new columns to an existing table without breaking existing code. You are looking for a way to both allow existing code to reference the table as is and store this new information in a manner that works with your existing standards. The options that will be discussed in this section are intended to support new functionality on a short-term basis. It is recommended that you update your application code and database objects to remain compliant with best practices.
The simplest option that you can use to allow for additional columns in a table is to create another table with the same primary key as the table you want to modify. In this scenario, you will also need to rename the existing table. In order to prevent existing applications from breaking, you would create a view with the same name as the current table. All queries that would access the original unmodified table will now access this view. When you are ready to update all the application code, the view can be dropped and the table can be renamed back to its original name.
Create Copy of dbo.Recipe Table with IsActive Column
Create View to Match Original Schema of dbo.Recipe
Existing application code can reference the view to select or modify data. This is not an ideal long-term solution and is intended to allow you to move toward the goal of re-factoring your applications to use the new table over time. If you have database code written directly in the application, the transition period needed to drop the view and rename the table back to dbo.Recipe may require additional effort to manage. When determining how to modify your database objects while allowing existing code to function, you want to focus on implementing a solution that will allow you to easily continue to develop T-SQL code going forward.
Create Trigger when Records Are Inserted into dbo.Recipe
This will allow you to continue using the original database object for all your application code. You can leave this additional table and trigger in place until you are ready to start using the new table in your database code.
In addition to adding new columns, you may want to use some other strategies to re-factor tables. This can include increasing the normalization of your databases. It is quite common to come across legacy tables with many more columns than would be expected for a database with a high level of normalization. By creating a space where you can modify the table design and create an abstraction layer, you can modify your database objects without affecting application performance.
Original Denormalized Table
Normalized Table
View using Normalized Table
Like the logic in Listing 13-11, you can create a view with the name dbo.Recipe so that the applications can interact with the new table going forward. Depending on the purpose of the original and new tables, you may not be able to update all code to use these new tables. If so, you may have to rely on triggers to update these tables.
Reporting on Transactional Data
Most databases used by applications are designed to handle a large volume of transactions. The overall design of these databases is referred to as online transaction processing or OLTP. These databases are usually designed to store and retrieve data quickly. This type of behavior involves many writes to the database. However, your business may ultimately decide that it needs information from this same database that is running one of your applications. Ideally, you would only have your applications accessing databases on your transactional servers. You will want to have a good understanding of your business and the impact your queries will have on application performance. You may be able to run very simple SELECT statements that are only accessing a small set of data without negatively impacting your applications. You should do everything in your power to limit the additional load put on SQL Server for data requests outside of your applications. When large quantities of data are accessed for reporting purposes, this can cause performance degradation of your applications. This includes additional CPU resources to generate execution plans that have been flushed from the cache to make way for running queries for reporting workloads. You may also have issues with application data that exists in memory getting cleared from the buffer pool after large quantities of data have been moved to memory for reporting. These issues can cascade into a scenario where your application performance is affected by reporting workloads. If you find yourself in the situation where you need to report directly from a transactional database, communicate with your management team what kind of impact reporting may have on your applications.
When you need to access data for reporting, you need to have a high number of reads from this same database. This type of behavior happens when you need to report on the data in this database. Aside from the fact that a transactional database is not designed for handling reading large quantities of data as well as it can handle reading very small sets of data, there are other challenges that can happen when using your transactional data for reporting.
In some cases, gathering this reporting data can be simple. There may be a small number of joins or the underlying logic may not be very complex. I have found that you can end up writing a single report, and due to the quality of the report, you may be asked to create additional reports. These additional reports may involve many tables or the overall logic for these queries may be more complex. This can often be because these tables were not designed with reporting in mind. In some cases, you may find that you are being asked to report on data that does not exist.
To develop reports quickly, it is often easiest to write specific logic for each individual report. Over time this can cause you to end up with a significant number of reports that should return the same or similar results but may not. The best-case scenario is that the results are similar but that the underlying code is different. This can get even more complex as the business may change the functionality of some applications. It may be easy to identify some reports that will need to be updated while other reports are missed and start returning inaccurate results.
Spaghetti Ingredients
While the specifics of report development is outside of the scope of this book, I have seldom worked in an environment where some type of reporting data was not regularly requested off the transactional databases. The preceding query does not show a stored procedure, but you could just as easily use a stored procedure to select this data. The issue with the query in Listing 13-16 is that it is hard-coded to only return results for the recipe Spaghetti.
Spinach Frittata Ingredients
While we have met the requirements for creating one report for Spaghetti and new report for Spinach Frittata, we have also added additional overhead to maintaining our reports. The queries from Listings 13-16 and 13-17 are maintained separately. If there are changes to how data is pulled from one report, you would need to remember that the other report may also need to be changed. Our next step would be to analyze these two reports and see if we can create one data set for both reports.
Recipe Ingredients from Several Joins
In the preceding query, we are using a parameter @RecipeName. This parameter allows us to set a value for the @RecipeName variable. This value can be changed depending on which report is being run. When it comes to using Report Server, the value for @RecipeName can be set for each report. This type of query can allow you to use a single set of code for multiple types of report. This allows you to use the same query for Spaghetti and Spinach Frittata or a report request for a specific recipe.
Recipe Information
Recipe Ingredients from a View
The main difference between these two queries is that the overall logic required for Listing 13-20 is much simpler and more straightforward. This can allow less technical users the ability to create reports off this same data set. While this can help keep data consistent and easy to access, this method may not have the best performance. When using this method to access your data for reporting, be sure to monitor performance and confirm that the reports are returning data without timing out.
This base data set can then be used throughout all your reports. Once you have updated the reports to use this base data set, you are now confirming that this data set will be able to handle your reporting needs going forward. This gives you an idea of the structure of the data that you will need if you would like to move toward a data warehouse.
Dynamic SQL
Depending on what T-SQL code you are trying to write, you may find it difficult to write your database code to use set-based design. In those situations, it may be tempting to write your code in a way that is more iterative. You may want to write T-SQL code that is highly variable and can be modified depending on the parameters and values passed into it. Using dynamic SQL may seem like the solution you have been wanting. In most cases, the drawbacks of dynamic SQL can far outweigh the benefits, but there are times where dynamic SQL can be one of the right solutions.
Previously in Chapter 7, I covered how you can read execution plans. When using dynamic SQL, a new execution plan will need to be calculated each time the query is executed. This causes additional strain on the database engine.
Most of the cases where you would want to use dynamic SQL involve times when you need additional flexibility. Many of the times where dynamic SQL is desirable relate to database administration activities. Most of these activities involve performing the same actions on more than one database on the SQL Server instance. There is some additional functionality available with dynamic SQL that can make using dynamic SQL especially appealing.
Dynamically Retrieve Data from Tables
Execute Store Procedure with Different Values
Application Rule Table
Recipe ID | Recipe Name | Recipe Description | Serving Quantity | Meal Type ID | Preparation Type ID |
---|---|---|---|---|---|
1 | Spaghetti | Pasta dinner | 8 | 2 | 1 |
Ingredient ID | Ingredient Name |
---|---|
2 | Tomato Sauce |
The first set of data returned are the columns from the Recipe table for the RecipeName of Spaghetti. The stored procedure in Listing 13-21 excludes the columns IsActive, DateCreated, and DateModified. All other columns in the table are returned. The values in the second data set is the record from the Ingredient table for the IngredientName of Tomato Sauce. From a software development perspective, this may seem like the ideal method to use when writing database code.
However, this method of using T-SQL does not take how the database engine executes a query into considerations. This will cause SQL Server to generate a new execution plan each time the stored procedure is executed as the data being requested can change dramatically between each execution. While it may seem like dynamic SQL can help make your application code more flexible, there is a cost to this method of writing T-SQL. In general, I would avoid using dynamic SQL for most scenarios when I would need to write T-SQL code. Trying to write your code without dynamic SQL will encourage you to practice writing database code in a manner that uses the strengths of the SQL Server database engine.
Using dynamic SQL can also increase the risk that additional T-SQL code is executed beyond the purpose you originally intended. This type of behavior is referred to as SQL injection. The concept of SQL injection is that additional T-SQL code is inserted, or injected, into the original statement. The additional T-SQL code that has been inserted allows for functionality that was not originally intended for the query execution. This can include viewing data that should not be accessible to this user. That same user may also be able to modify data or database objects through SQL injection. If you want to use dynamic SQL, you can minimize the risk of SQL injection by using parameters to pass values to the dynamic SQL. The method of parameterizing the dynamic SQL will make it more difficult for users to view or modified data in ways that you did not intend.
When writing T-SQL code, you want to write code that is functional and effective. This means writing code that is readable and maintainable. Writing code that can grow with your business is also a critical skill. If you are writing code to handle inserts and updates or variety of data modifications, you want to make sure you are writing code that is easily understood and supported going forward. During application development, consider how you want to handle disabling permissions or application functionality. If not, you will want to consider strategies that will allow you to deactivate data values in a way that does not break relationships between the tables in your database. You may also have to support legacy code. One of the challenges with supporting legacy code is making changes without breaking existing functionality. There are times you may need to pull data for reporting from your transactional database. Try to design your queries to allow for flexibility and reusability. While flexibility is good, also use caution as to how much flexibility you implement in your T-SQL code. After improving the functionality of your database code, you may find that you want to determine how to track changes to your data.