Chapter 1. Design and implement database objects

Developing and implementing a database for SQL Server starts with understanding both the process of designing a database and the basic structures that make up a database. A firm grasp of those fundamentals is a must for a SQL Server developer, and is even more important for taking this exam.


Important Have you read page xv?

It contains valuable information regarding the skills you need to pass the exam.


We begin with the fundamentals of a typical database meant to store information about a business. This is generally referred to as online transaction processing (OLTP), where the goal is to store data that accurately reflects what happens in the business in a manner that works well for the applications. For this pattern, we review the relational database design pattern, which is covered in Skill 1.1. OLTP databases can be used to store more than business transactions; they can hold any data about your business, such as customer details, appointments, and so on.

Skills 1.2 and 1.3 cover some of the basic constructs, including indexes and views, that go into forming the physical database structures (created with Transact-SQL code) that serve as the foundational objects your applications use to do business.

In Skill 1.4 we explore columnstore indexes that focus strictly on analytics. While discussing analytics, we look at the de facto standard for building reporting structures called dimensional design. In dimensional design, the goal is to format the data in a form that makes it easier to extract results from large sets of data without touching a lot of different structures.

Skills in this chapter:

Image Design and implement a relational database schema

Image Design and implement indexes

Image Design and implement views

Image Implement columnstore indexes

Skill 1.1: Design and implement a relational database schema

In this section, we review some of the factors that go into creating the base tables that make up a relational database. The process of creating a relational database is not tremendously difficult. People build similar structures using Microsoft Excel every day. In this section, we are going to look at the basic steps that are needed to get started creating a database in a professional manner.

Designing tables and schemas based on business requirements

A very difficult part of any project is taking the time to gather business requirements. Not because it is particularly difficult in terms of technical skills, but because it takes lots of time and attention to detail. This exam that you are studying for is about developing the database, and the vast majority of topics center on the mechanical processes around the creation of objects to store and manipulate data via Transact-SQL code. However, the first few sections of this skill focus on required skills prior to actually writing Transact-SQL.

Most of the examples in this book, and likely on the exam, are abstract, contrived, and targeted to a single example; either using a sample database from Microsoft, or using examples that include only the minimal details for the particular concept being reviewed. There are, however, a few topics that require a more detailed narrative. To review the topic of designing a database, we need to start out with some basic requirements, using them to design a database that demonstrates database design concepts and normalization.

We have a scenario that defines a database need, including some very basic requirements. Questions on the exam can easily follow this pattern of giving you a small set of requirements and table structures that you need to match to the requirements. This scenario will be used as the basis for the first two sections of this chapter.

Imagine that you are trying to write a system to manage an inventory of computers and computer peripherals for a large organization. Someone has created a document similar in scope to the following scenario (realistic requirements are often hundreds or even thousands of pages long, but you can learn a lot from a single paragraph):

We have 1,000 computers, comprised of laptops, workstations, and tablets. Each computer has items associated with it, which we will list as mouse, keyboard, etc. Each computer has a tag number associated with it, and is tagged on each device with a tag graphic that can be read by tag readers manufactured by “Trey Research” (http://www.treyresearch.net/) or “Litware, Inc” (http://www.litwareinc.com/). Of course tag numbers are unique across tag readers. We don’t know which employees are assigned which computers, but all computers that cost more than $300 are inventoried for the first three years after purchase using a different software system. Finally, employees need to have their names recorded, along with their employee number in this system.

Let’s look for the tables and columns that match the needs of the requirements. We won’t actually create any tables yet, because this is just the first step in the process of database design. In the next section, we spend time looking at specific tests that we apply to our design, followed by two sections on creating the table structures of a database.

The process of database design involves scanning requirements, looking for key types of words and phrases. For tables, you look for the nouns such as “computers” or “employee.” These can be tables in your final database. Some of these nouns you discover in the requirements are simply subsets of one another: “computer” and “laptop.” For example, laptop is not necessarily its own table at all, but instead may be just a type of computer. Whether or not you need a specific table for laptops, workstations, or tablets isn’t likely to be important. The point is to match a possible solution with a set of requirements.

After scanning for nouns, you have your list of likely objects on which to save data. These will typically become tables after we complete our design, but still need to be refined by the normalization process that we will cover in the next section:

1. Computer

2. Employee

The next step is to look for attributes of each object. You do this by scanning the text looking for bits of information that might be stored about each object. For the Computer object, you see that there is a Type of Computer (laptop, workstation, or tablet), an Associated Item List, a Tag, a Tag Company, and a Tag Company URL, along with the Cost of the computer and employee that the computer is assigned to. Additionally, in the requirements, we also have the fact that they keep the computer inventoried for the first three years after purchase if it is > $300, so we need to record the Purchase Date. For the Employee object we are required to capture their Name and Employee Number.

Now we have the basic table structures to extract from the requirements, (though we still require some refinement in the following section on normalization) and we also define schemas, which are security/organizational groupings of tables and code for our implemented database. In our case, we define two schemas: Equipment and HumanResources.

Our design consists of the following possible tables and columns:

1. Equipment.Computer: (ComputerType, AssociatedItemList, Tag, TagCompany, TagCompanyURL, ComputerCost, PurchaseDate, AssignedEmployee)

2. HumanResources.Employee: (Name, EmployeeNumber)

The next step in the process is to look for how you would uniquely identify a row in your potential database. For example, how do you tell one computer from another? In the requirements, we are told that, “Each computer has a tag number,” so we will identify that the Tag attribute must be unique for each Computer.

This process of designing the database requires you to work through the requirements until you have a set of tables that match the requirements you’ve been given.

In the real world, you don’t alter the design from the provided requirements unless you discuss it with the customer. And in an exam question, you do whatever is written, regardless of whether it makes perfect sense. Do you need the URL of the TagCompany, for instance? If so, why? For the purposes of this exam, we will focus on the process of translating words into tables.


Note Logical Database Model

Our progress so far in designing this sample database is similar to what is referred to as a logical database model. For brevity, we have skipped some of the steps in a realistic design process. We continue to refine this example in upcoming sections.


Improving the design of tables by using normalization

Normalization is a set of “rules” that cover some of the most fundamental structural issues with relational database designs (there are other issues beyond normalization—for example, naming—that we do not talk about.) All of the rules are very simple at their core and each will deal with eliminating some issue that is problematic to the users of a database when trying to store data with the least redundancy and highest potential for performance using SQL Server 2016’s relational engine.

The typical approach in database design is to work instinctively and then use the principles of normalization as a test of your design. You can expect questions on normalization to follow this approach, asking, for example, whether a given table is well designed to meet some requirement, and which of the normal forms might apply.

However, in this section, we review the normal forms individually, just to make the review process more straightforward. The rules are stated in terms of forms, some of which are numbered, and some of which are named for the creators of the rule. The rules form a progression, with each rule becoming more and more strict. To be in a stricter normal form, you need to also conform to the lesser forms, though none of these rules are ever followed one hundred percent of the time.

The most important thing to understand will be the concepts of normalization, and particularly how to verify that a design is normalized. In the following sections, we will review two families of normalization concepts:

Image Rules covering the shape of a table

Image Rules covering the relationship of non-key attributes to key attributes

Rules covering the shape of a table

A table’s structure, based on what SQL Server (and most relational database management systems, or RDBMSs) allows, is very loose. Tables consist of rows and columns. You can put anything you want in the table, and you can have millions, even billions of rows. However, just because you can do something, doesn’t mean it is correct.

The first part of these rules is defined by the mathematical definition of a relation (which is more or less synonymous with the proper structure of a table). Relations require that you have no duplicated rows. In database terminology, a column or set of columns that is used to uniquely identify one row from another is called a key. There are several types of keys discussed in the following sections, and they are all columns used to identify a row (other than what is called a foreign key, which is a column or columns in a table that reference another table’s key columns). Continuing with the example we started in the previous section, we have one such example in our design so far with: HumanResources.Employee: (Name, EmployeeNumber).

Using the Employee table definition that we started with back in the first section of this chapter, it would be allowable to have the following two rows of data represented:

Name                              EmployeeNumber
--------------------------------- ---------------
Harmetz, Adam                     000010012
Harmetz, Adam                     000010012

This would not be a proper table, since you cannot tell one row from another. Many people try to fix this by adding some random bit of data (commonly called an artificial key value), like some auto generated number. This then provides a structure with data like the following, with some more data that is even more messed up, but still legal as the structure allows:

EmployeeId Name                            EmployeeNumber
----------- ------------------------------ ------------------------
          1 Harmetz, Adam                  000010012
          2 Harmetz, Adam                  000010012
          3 Popkova, Darya                 000000012
          4 Popkova, Darya                 000000013

In the next section on creating tables, we begin the review of ways we can enforce the uniqueness on data in column(s), but for now, let’s keep it strictly in design mode. While this seems to make the table better, unless the EmployeeId column actually has some meaning to the user, all that has been done is to make the problem worse because someone looking for Adam’s information can get one row or the other. What we really want is some sort of data in the table that makes the data unique based on data the user chooses. Name is not the correct choice, because two people can have the same name, but EmployeeNumber is data that the user knows, and is used in an organization to identify an employee. A key like this is commonly known as a natural key. When your table is created, the artificial key is referred to as a surrogate key, which means it is a stand-in for the natural key for performance reasons. We talk more about these concepts in the “Determining the most efficient data types to use” section and again in Chapter 2, Skill 2.1 when choosing UNIQUE and PRIMARY KEY constraints.

After defining that EmployeeNumber must be unique, our table of data looks like the following:

EmployeeId Name                              EmployeeNumber
---------- --------------------------------- ----------------------
         1 Harmetz, Adam                     000010012
         2 Popkova, Darya                    000000013
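
As a preview of the implementation work later in this chapter (and the UNIQUE and PRIMARY KEY constraints covered in Chapter 2, Skill 2.1), the following is a minimal sketch of how a surrogate key and a natural key might both be enforced. The schema, data types, and IDENTITY property shown here are illustrative assumptions, not something taken from the requirements:

CREATE TABLE HumanResources.Employee
(
    EmployeeId     int IDENTITY(1,1) NOT NULL
        CONSTRAINT PKEmployee PRIMARY KEY,  --surrogate key, meaningless to the user
    Name           nvarchar(100) NOT NULL,
    EmployeeNumber char(9) NOT NULL
        CONSTRAINT AKEmployee UNIQUE        --natural key, the value users actually know
);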

The next two criteria concerning row shape are defined in the First Normal Form. It has two primary requirements that your table design must adhere to:

1. All columns must be atomic—that is, each column should represent one value

2. All rows of a table must contain the same number of values—no arrays

Starting with atomic column values, consider that we have a column in the Employee table we are working on that probably has a non-atomic value (probably, because it is based on the requirements). Be sure to read the questions carefully to make sure you are not assuming things. The Name column has values that contain a delimiter between what turns out to be the last name and first name of the person. If this is always the case, then you need to record the first and last name of the person separately. So in our table design, we will break ‘Harmetz, Adam’ into first name: ‘Adam’ and last name: ‘Harmetz’. This is represented here:

EmployeeId LastName        FirstName         EmployeeNumber
---------- --------------- ----------------- ---------------
         1 Harmetz         Adam              000010012
         2 Popkova         Darya             000000013

For our design, let’s leave off the EmployeeId column for clarity. So the structure looks like:

HumanResources.Employee (EmployeeNumber [key], LastName, FirstName)

Obviously the value here is that when you need to search for someone named ‘Adam,’ you don’t need to search on a partial value. Queries on partial values, particularly when the partial value does not include the leftmost character of a string, are not ideal for SQL Server’s indexing strategies. So, the desire is that every column represents just a single value. In reality, names are always more complex than just last name and first name, because people have suffixes and titles that they really want to see beside their name (for example, if it was Dr. Darya Popkova, feelings could be hurt if the Dr. was dropped in correspondence with them.)
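
To see why this matters for searching, here is a hypothetical pair of queries against the refined design (assuming the Employee table has been created as designed; index behavior is covered in Skill 1.2). Only the names from our design are used; the queries themselves are purely illustrative:

--With atomic columns, a search on a first name is a simple, index-friendly predicate:
SELECT EmployeeNumber, LastName, FirstName
FROM   HumanResources.Employee
WHERE  FirstName = N'Adam';

--With the original combined column, the same question requires a partial-value search,
--which cannot use an index seek because the match does not start at the left of the string:
--WHERE Name LIKE N'%, Adam';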

The second criterion for the first normal form is the rule about no repeating groups/arrays. A lot of times, the data that doesn’t fit the atomic criteria is not different items, such as parts of a name, but rather it’s a list of items that are the same types of things. For example, in our requirements, there is a column in the Computer table that is a list of items named AssociatedItemList, with the example: ‘mouse, keyboard.’ Looking at this data, a row might look like the following:

Tag    AssociatedItemList
------ ------------------------------------
 s344  mouse, keyboard

From here, there are a few choices. If there are always two items associated to a computer, you might add a column for the first item, and again for a second item to the structure. But that is not what we are told in the requirements. They state: “Each computer has items associated with it.” This can be any number of items. Since the goal is to make sure that column values are atomic, we definitely want to get rid of the column containing the delimited list. So the next inclination is to make a repeating group of column values, like:

Tag    AssociatedItem1 AssociatedItem2 ... AssociatedItemN
------ --------------- --------------- ... -----------------
 s344  mouse           keyboard        ... not applicable

This, however, is not the desired outcome, because now you have created a fixed array of associated items with an index in the column name. It is very inflexible, and is limited to the number of columns you want to add. Even worse is that if you need to add something like a tag to the associated items, you end up with a structure that is very complex to work with:

Tag    AssociatedItem1 AssociatedItem1Tag AssociatedItem2 AssociatedItem2Tag
------ --------------- ------------------ --------------- ---------------------
  s344 mouse           r232               keyboard        q472

Instead of this structure, create a new table that has a reference back to the original table, and the attributes that are desired:

Tag    AssociatedItem
------ -----------------
  s344 mouse
  s344 keyboard

So our object is: Equipment.ComputerAssociatedItem (Tag [Reference to Computer], AssociatedItem, [key: Tag, AssociatedItem]).

Now, if you need to search for computers that have keyboards associated, you don’t need to pick it out of a comma-delimited list, nor do you need to look in multiple columns. Assuming you are reviewing for this exam, and already know a good deal about how indexes and queries work, you should see that everything we have done in this first section on normalization is going to be great for performance. The entire desire is to make scalar values that index well and can be searched for. It is never wrong to do a partial value search (if you can’t remember how keyboard is spelled, for example, looking for associated items LIKE ‘%k%’ isn’t a violation of any moral laws); it just isn’t a design goal that you are trying to attain.
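
For example, once the table is implemented, the search for computers with keyboards becomes a straightforward, index-friendly query. This is only a sketch against the design (the table is not created until the implementation sections later in this chapter):

SELECT Tag
FROM   Equipment.ComputerAssociatedItem
WHERE  AssociatedItem = 'keyboard';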

Rules covering the relationship of non-key attributes to key attributes

Once your data is shaped in a form that works best for the engine, you need to look at the relationship between attributes, looking for redundant data being stored that can get out of sync. In the first normalization section covering the shape of attributes, the tables were formed to ensure that each row in the structure was unique by choosing keys. For our two primary objects so far, we have:

HumanResources.Employee (EmployeeNumber)

Equipment.Computer (Tag)

In this section, we are going to look at how the other columns in the table relate to the key attributes. There are three normal forms related to this discussion; the first two are:

Image Second Normal Form All attributes must be a fact about the entire primary key and not a subset of the primary key.

Image Third Normal Form All attributes must be a fact about the entire primary key, and not any non-primary key attributes

For the second normal form to be a concern, you must have a table with multiple columns in the primary key. For example, say you have a table that defines a car parked in a parking space. This table can have the following columns:

Image CarLicenseTag (Key Column1)

Image SpaceNumber (Key Column2)

Image ParkedTime

Image CarColor

Image CarModel

Image CarManufacturer

Image CarManufacturerHeadquarters

Each of the nonkey attributes should say something about the combination of the two key attributes. The ParkedTime column is the time when the car was parked. This attribute makes sense. The others are all specifically about the car itself, so you need another table to move those columns to (the CarLicenseTag column stays behind in the original table as a reference to this new table). Now you have a table that represents the details about a car, with the following columns:

Image CarLicenseTag (Key Column)

Image CarColor

Image CarModel

Image CarManufacturer

Image CarManufacturerHeadquarters

Since there is a single key column, this must be in second normal form (as is the table we left behind with CarLicenseTag, SpaceNumber, and ParkedTime, since ParkedTime references the entire key). Now we turn our attention to the third normal form. Here we make sure that each attribute is solely focused on the primary key column. A car has a color, a model, and a manufacturer. But does it have a CarManufacturerHeadquarters? No, the manufacturer does. So you would create another table for that attribute, keyed on CarManufacturer. Progress through the design, making more tables, until you have eliminated redundancy, as the following sketch illustrates.
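
The following sketch shows where the parking example ends up. All names and data types here are illustrative assumptions (and it uses the Examples schema that is created later in this chapter), not something taken from actual requirements:

CREATE TABLE Examples.CarManufacturer
(
    CarManufacturer             varchar(30) NOT NULL
        CONSTRAINT PKCarManufacturer PRIMARY KEY,
    CarManufacturerHeadquarters varchar(50) NOT NULL
);
CREATE TABLE Examples.Car
(
    CarLicenseTag   varchar(10) NOT NULL
        CONSTRAINT PKCar PRIMARY KEY,
    CarColor        varchar(20) NOT NULL,
    CarModel        varchar(20) NOT NULL,
    CarManufacturer varchar(30) NOT NULL   --reference to CarManufacturer
);
CREATE TABLE Examples.ParkedCar
(
    CarLicenseTag varchar(10) NOT NULL,    --reference to Car
    SpaceNumber   int NOT NULL,
    ParkedTime    datetime2(0) NOT NULL,
    CONSTRAINT PKParkedCar PRIMARY KEY (CarLicenseTag, SpaceNumber)
);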

The redundancy is troublesome because if you were to change the headquarters location for a manufacturer, you might need to do so for more than one row or end up with mismatched data. Raymond Boyce and Edgar Codd (the latter being the original author of the normalization forms) refined these two normal forms into the following normal form, named after them:

Image Boyce-Codd Normal Form Every candidate key is identified, all attributes are fully dependent on a key, and all columns must identify a fact about a key and nothing but a key.

All of these forms are stating that once you have set what columns uniquely define a row in a table, the rest of the columns should refer to what the key value represents. Continuing with the design based on the scenario/requirement we have used so far in the chapter, consider the Equipment.Computer table. We have the following columns defined (Note that AssociatedItemList was removed from the table in the previous section):

Tag (key attribute), ComputerType, TagCompany, TagCompanyURL, ComputerCost,
PurchaseDate, AssignedEmployee

In this list of columns for the Computer table, your job is to decide which of these columns describes what the Tag attribute is identifying, which is a computer. The Tag column value itself does not seem to describe the computer, and that’s fine. It is a number that has been associated with a computer by the business in order to be able to tell two physical devices apart. However, for each of the other attributes, it’s important to decide if the attribute describes something about the computer, or something else entirely. It is a good idea to take each column independently and think about what it means.

Image ComputerType Describes the type of computer that is being inventoried.

Image TagCompany The tag has a tag company, and since we defined that the tag number was unique across companies, this attribute is violating the Boyce-Codd Normal Form and must be moved to a different table.

Image TagCompanyURL Much like TagCompany, the URL for the company is definitely not describing the computer.

Image ComputerCost Describes how much the computer cost when purchased.

Image PurchaseDate Indicates when the computer was purchased.

Image AssignedEmployee This is a reference to the Employee structure. So while a computer doesn’t really have an assigned employee in the real world, it does make sense in the overall design as it describes an attribute of the computer as it stands in the business.

Now, our design for these two tables looks like the following:

Equipment.Computer (Tag [key, ref to Tag], ComputerType, ComputerCost, PurchaseDate,
AssignedEmployee [Reference to Employee])

Equipment.Tag (Tag [key], TagCompany, TagCompanyURL)

If the tables have the same key columns, do we need two tables? This depends on your requirements, but it is not out of the ordinary to have two tables that are related to one another with a cardinality of one-to-one. In this case, you might have a pool of tags that get created and then assigned to a device, or tags could have more than one use. Make sure to always take your time and understand the requirements that you are given with your question.

So we now have:

Equipment.Computer (Tag [key, Ref to Tag], ComputerType, ComputerCost, PurchaseDate,
AssignedEmployee [Reference to Employee])
Equipment.TagCompany (TagCompany [key], TagCompanyURL)
Equipment.Tag (Tag [key], TagCompany [Reference to TagCompany])

And we have this, in addition to the objects we previously specified:

Equipment.ComputerAssociatedItem (Tag [Reference to Computer], AssociatedItem, [key:
Tag, AssociatedItem])

HumanResources.Employee (EmployeeNumber [key], LastName, FirstName)

Generally speaking, the third normal form is referred to as the most important normal form, and for the exam it is important to understand that each table has one meaning, and each scalar attribute refers to the entire natural key of the final objects. Good practice can be had by working through tables in your own databases, or in the Microsoft sample databases, such as WideWorldImporters (the newest sample database Microsoft has created), AdventureWorks, Northwind, or even Pubs. None of these databases are perfect, because doing an excellent job designing a database sometimes makes for really complex examples. Note that we don’t have the detailed requirements for these sample databases. Don’t be tricked by thinking you know what a system should look like by experience. The only thing worse than having no knowledge of your customer’s business is having too much knowledge of their business.


Need More Review? Database Design and Normalization

What has been covered in this book is a very small sample of the patterns and techniques for database design that exist in the real world, and it does not represent all of the normal forms that have been defined. Boyce-Codd/Third normal form is generally the limit of most writers. For more information on the complete process of database design, check out “Pro SQL Server Relational Database Design and Implementation,” written by Louis Davidson for Apress in 2016. Or, for a more academic look at the process, get the latest edition of “An Introduction to Database Systems” by Chris Date with Pearson Press.


One last term needs to be defined: denormalization. After you have normalized your database, and have tested it out, there can be reasons to undo some of the things you have done for performance. For example, later in the chapter, we add a formatted version of an employee’s name, which duplicates the data in the LastName and FirstName columns of the table (in order to show a few concepts in implementation). A poor design for this would be another column that the user can edit, because they might not keep the name in sync. Better implementations, such as computed columns, are available when implementing a database.

Writing table create statements

The hard work in creating a database is done at this point of the process, and the process now is to simply translate a design into a physical database. In this section, we’ll review the basic syntax of creating tables. In Chapter 2 we delve a bit deeper into the discussion about how to choose proper uniqueness constraints but we cover the mechanics of including such objects here.

Before we move on to CREATE TABLE statements, a brief discussion on object naming is useful. You sometimes see names like the following used to name a table that contains rows of purchase orders:

Image PurchaseOrder

Image PURCHASEORDER

Image PO

Image purchase_orders

Image tbl_PurchaseOrder

Image A12

Image [Purchase Order] or “Purchase Order”

Of these naming styles, there are a few that are typically considered sub-optimal:

Image PO Using abbreviations, unless universally accepted, tends to make a design more complex for newcomers and long-term users alike.

Image PURCHASEORDER All capitals tends to make your design look like it is from 1970, which can hide some of your great work in making a modern computer system.

Image tbl_PurchaseOrder Using a clunky prefix to say that this is a table reduces the documentation value of the name by making users ask what tbl means (admittedly this could show up in exam questions as it is not universally disliked).

Image A12 This indicates that this is a database where the designer is trying to hide the details of the database from the user.

Image [Purchase Order] or “Purchase Order” Names that require delimiters, [brackets], or “double-quotes” are terribly hard to work with. Of the delimiter types, double-quotes are more standards-oriented, while the brackets are more typical SQL Server coding. Between the delimiters you can use any Unicode characters.

The more typical, programmer-friendly naming standards are Pascal casing (leading character of each word capitalized, words concatenated: PurchaseOrder), camel casing (leading character lowercase: purchaseOrder), or underscores as delimiters (purchase_order).


Need More Review? Database Naming Rules

This is a very brief review of naming objects. Object names must fall in the guidelines of a database identifier, which has a few additional rules. You can read more about database identifiers here in this MSDN article: https://msdn.microsoft.com/en-us/library/ms175874.aspx.


Sometimes names are plural, and sometimes singular, and consistency is the general key. For the exam, there are likely to be names of any format, plural, singular, or both. Other than interpreting the meaning of the name, naming is not listed as a skill.

To start with, create a schema to put objects in. Schemas allow you to group together objects for security and logical ordering. By default, there is a schema in every database called dbo, which is there for the database owner. For most example code in this chapter, we use a schema named Examples located in a database named ExamBook762Ch1, which you see referenced in some error messages.

CREATE SCHEMA Examples;
GO --CREATE SCHEMA must be the only statement in the batch

The CREATE SCHEMA statement is terminated with a semicolon at the end of the statement. All statements in Transact-SQL can be terminated with a semicolon. While not all statements must end with a semicolon in SQL Server 2016, not terminating statements with a semicolon is a deprecated feature, so it is a good habit to get into. GO is not a statement in Transact-SQL; it is a batch separator that splits your queries into multiple server communications, so it does not need (or allow) termination.

To create our first table, start with a simple structure that’s defined to hold the name of a widget, with attributes for name and a code:

CREATE TABLE Examples.Widget
(
    WidgetCode  varchar(10) NOT NULL
           CONSTRAINT PKWidget PRIMARY KEY,
    WidgetName  varchar(100) NULL
);

Let’s break down this statement into parts:

CREATE TABLE Examples.Widget

Here we are naming the table to be created. The name of the table must be unique from all other object names, including tables, views, constraints, procedures, etc. Note that it is a best practice to reference all objects explicitly by at least their two-part names, which includes the name of the object prefixed with a schema name, so most of the code in this book will use two-part names. In addition, object names that a user may reference directly such as tables, views, stored procedures, etc. have a total of four possible parts. For example, Server.Database.Schema.Object has the following parts:

Image Server The local server, or a linked server name that has been configured. By default, the local server from which you are executing the query.

Image Database The database where the object you are addressing resides. By default, this is the database to which you have set your context.

Image Schema The name of the schema where the object you are accessing resides within the database. Every database user has a default schema, which defaults to dbo. If the schema is not specified, the default schema is searched for a matching name.

Image Object The name of the object you are accessing, which is not optional.

In the CREATE TABLE statement, if you omit the schema, the table is created in the default schema. So CREATE TABLE Widget would, by default, create the table dbo.Widget in the database of context. You can create the table in a different database by specifying the database name: CREATE TABLE Tempdb..Widget or Tempdb.dbo.Widget. There is an article (https://technet.microsoft.com/en-us/library/ms187879.aspx) from an older version of Books Online that shows you the many different forms of addressing an object.
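
For instance, the Widget table created earlier can be referenced in any of the following ways from the ExamBook762Ch1 database (the server name in the last form is a placeholder for a configured linked server, shown only for illustration):

SELECT WidgetCode FROM Examples.Widget;                        --two-part name (preferred)
SELECT WidgetCode FROM ExamBook762Ch1.Examples.Widget;         --three-part name
--SELECT WidgetCode FROM LinkedServerName.ExamBook762Ch1.Examples.Widget; --four-part name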

The next line:

    WidgetCode  varchar(10) NOT NULL

This specifies the name of the column, then the data type of that column. There are many different data types, and we examine their use and how to make the best choice in the next section. For now, just know that the data type determines the format of the data that is stored in this column. NOT NULL indicates that you must have a known value for the column. If it simply said NULL, then the value of the column is allowed to be NULL.

NULL is a special value that mathematically means UNKNOWN. A few simple equations can help clarify NULL: UNKNOWN + any value = UNKNOWN, and NOT(UNKNOWN) = UNKNOWN. If you don’t know a value, adding any other value to it is still unknown. And if you don’t know whether a value is TRUE or FALSE, the opposite of that is still not known. In comparisons, a NULL expression is never equivalent to a NULL expression. So if you have the conditional IF (NULL = NULL), the expression does not evaluate to TRUE, so the conditional block does not execute.
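
The following sketch demonstrates this behavior (under the default ANSI_NULLS ON setting); the variable is just for illustration:

DECLARE @Value int = NULL;

IF @Value = NULL
    PRINT 'Equal'
ELSE
    PRINT 'Not proven equal'; --this branch runs, because the comparison evaluates to UNKNOWN

IF @Value IS NULL
    PRINT 'Use IS NULL to test for NULL'; --this branch runs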

If you leave off the NULL specification, whether or not the column allows NULL values is based on a couple of things. If the column is part of a PRIMARY KEY constraint that is being added in the CREATE TABLE statement (like in the next line of code), the column does not allow NULL values. Otherwise, the behavior is governed by the setting SET ANSI_NULL_DFLT_ON; when it is ON, NULL values are allowed.
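
A quick way to verify the behavior is to create a throwaway table with no NULL specification and check the column metadata. The table name here is hypothetical, and the example assumes the Examples schema created earlier:

SET ANSI_NULL_DFLT_ON ON;
CREATE TABLE Examples.NullDefaultDemo (Value int); --no NULL specification given
SELECT COLUMNPROPERTY(OBJECT_ID('Examples.NullDefaultDemo'),'Value','AllowsNull') AS AllowsNull;
--Returns 1 (the column allows NULL); the best practice is still to state NULL or NOT NULL explicitly
DROP TABLE Examples.NullDefaultDemo;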


Note NULL Specification

For details on the SET ANSI_NULL_DFLT_ON setting, go to https://msdn.microsoft.com/en-us/library/ms187375.aspx. It is considered a best practice to always specify a NULL specification for columns in your CREATE TABLE and ALTER TABLE statements.


The following line of code is a continuation of the previous line of code, since it was not terminated with a comma (broken out to make it easier to explain):

                  CONSTRAINT PKWidget PRIMARY KEY,

This is how you add a constraint to a single column. In this case, we are defining that the WidgetCode column is the only column that makes up the primary key of the table. The CONSTRAINT PKWidget names the constraint. The constraint name must be unique within the schema, just like the table name. If you leave the name off and just code it as PRIMARY KEY, SQL Server provides a name that is guaranteed unique, such as PK__Widget__1E5F7A7F7A139099. Such a name changes every time you create the constraint, so it’s really only suited to temporary tables (named either with # or ## as a prefix for local or global temporary objects, respectively).

Alternatively, this PRIMARY KEY constraint could have been defined independently of the column definition as (with the leading comma there for emphasis):

                     ,CONSTRAINT PKWidget PRIMARY KEY (WidgetCode),

This form is needed when you have more than one column in the PRIMARY KEY constraint, like if both the WidgetCode and WidgetName made up the primary key value:

                     ,CONSTRAINT PKWidget PRIMARY KEY (WidgetCode, WidgetName),

This covers the simple version of the CREATE TABLE statement, but there are a few additional settings to be aware of. First, if you want to put your table on a file group other than the default one, you use the ON clause:

CREATE TABLE Examples.Widget
(
    WidgetCode  varchar(10) NOT NULL
           CONSTRAINT PKWidget PRIMARY KEY,
    WidgetName  varchar(100) NULL
) ON FileGroupName;

There are also table options for using temporal extensions, as well as partitioning. These are not a part of this exam, so we do not cover them in any detail, other than to note their existence.

In addition to being able to use the CREATE TABLE statement to create a table, it is not uncommon to encounter the ALTER TABLE statement on the exam to add or remove a constraint. The ALTER TABLE statement allows you to add columns to a table and make changes to some settings.

For example, you can add a column using:

ALTER TABLE Examples.Widget
    ADD NullableColumn int NULL;

If there is data in the table, you either have to create the column to allow NULL values, or create a DEFAULT constraint along with the column (which is covered in greater detail in Chapter 2, Skill 2.1).

ALTER TABLE Examples.Widget
    ADD NotNullableColumn int NOT NULL
        CONSTRAINT DFLTWidget_NotNullableColumn DEFAULT ('Some Value');

To drop the column, you first need to drop any referencing constraints, which you also do with the ALTER TABLE statement:

ALTER TABLE Examples.Widget
    DROP DFLTWidget_NotNullableColumn;

Finally, we will drop this column (because it would be against the normalization rules we have discussed to have this duplicated data) using:

ALTER TABLE Examples.Widget
    DROP COLUMN NotNullableColumn;


Need More Review? Creating and Altering Tables

We don’t touch on everything about the CREATE TABLE or ALTER TABLE statement, but you can read more about the various additional settings you can see in Books Online in the CREATE TABLE (https://msdn.microsoft.com/en-us/library/ms174979.aspx) and ALTER TABLE (https://msdn.microsoft.com/en-us/library/ms190273.aspx) topics.


Determining the most efficient data types to use

Every column in a database has a data type, which is the first in a series of choices to limit what data can be stored. There are data types for storing numbers, characters, dates, times, etc., and it’s your job to make sure you have picked the very best data type for the need. Choosing the best type has immense value for the systems implemented using the database.

Image It serves as the first limitation on the domain of data values that the column can store. If the range of data desired is the names of the days of the week, having a column that allows only integers is completely useless. If you need the values in a column to be between 0 and 350, a tinyint won’t work because it has a maximum of 255, so a better choice is smallint, which goes between –32,768 and 32,767. In Chapter 2, we look at several techniques using CONSTRAINT and TRIGGER objects to limit a column’s value even further.

Image It is important for performance Take a value that represents the 12th of July, 1999. You could store it in a char(30) as ‘12th of July, 1999’, or in a char(8) as ‘19990712’. Searching for one value in either case requires knowledge of the format, and doing ranges of date values is complex, and even very costly, performance-wise. Using a date data type makes the coding natural for the developer and the query processor.
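
As a sketch of the performance point (the table and column names here are hypothetical, purely for illustration), a date column allows a natural, index-friendly range predicate, where a character column holding ‘12th of July, 1999’ would have to be parsed row by row before any range comparison could be made:

SELECT PurchaseId
FROM   Examples.Purchase
WHERE  PurchaseDate >= '19990701'
  AND  PurchaseDate <  '19990801';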

When handled improperly, data types are frequently a source of interesting issues for users. Don’t limit data enough, and you end up with incorrect, wildly formatted data. Limit too much, like only allowing 35 letters for a last name, and Janice “Lokelani” Keihanaikukauakahihuliheekahaunaele has to have her name truncated on her driver’s license (true story, as you can see in the following article on USA Today http://www.usatoday.com/story/news/nation/2013/12/30/hawaii-long-name/4256063/).

SQL Server has an extensive set of data types that you can choose from to match almost any need. The following list contains the data types along with notes about storage and purpose where needed.

Image Precise Numeric Stores number-based data with no loss of precision in how it is stored.

Image bit Has a domain of 1, 0, or NULL; Usually used as a pseudo-Boolean by using 1 = True, 0 = False, NULL = Unknown. Note that some typical integer operations, like basic math, cannot be performed. (1 byte for up to 8 values)

Image tinyint Integers between 0 and 255 (1 byte).

Image smallint Integers between –32,768 and 32,767 (2 bytes).

Image int Integers between –2,147,483,648 and 2,147,483,647 (–2^31 to 2^31 – 1) (4 bytes).

Image bigint Integers between –9,223,372,036,854,775,808 and 9,223,372,036,854,775,807 (–2^63 to 2^63 – 1) (8 bytes).

Image decimal (or numeric, which is functionally the same, with decimal the more standard type) All numbers between –10^38 + 1 and 10^38 – 1, with a fixed number of digits up to 38. decimal(3,2) would be a number between –9.99 and 9.99, and decimal(38,37) would be a number with one digit before the decimal point and 37 places after it. Uses between 5 and 17 bytes, depending on precision.

Image money Monetary values from –922,337,203,685,477.5808 through 922,337,203,685,477.5807 (8 bytes).

Image smallmoney Money values from –214,748.3648 through 214,748.3647 (4 bytes).

Image Approximate numeric data Stores approximations of numbers based on IEEE 754 standard, typically for scientific usage. Allows a large range of values with a high amount of precision but you lose precision of very large or very small numbers.

Image float(N) Values in the range from –1.79E + 308 through 1.79E + 308 (storage varies from 4 bytes for N between 1 and 24, and 8 bytes for N between 25 and 53).

Image real Values in the range from –3.40E + 38 through 3.40E + 38. real is an ISO synonym for a float(24) data type, and hence equivalent (4 bytes).

Image Date and time values Stores values that represent a point in time.

Image date Date-only values from January 1, 0001, to December 31, 9999 (3 bytes).

Image time(N) Time-of-day-only values with N representing the fractional parts of a second that can be stored. time(7) is down to HH:MM:SS.0000001 (3 to 5 bytes).

Image datetime2(N) This type stores a point in time from January 1, 0001, to December 31, 9999, with accuracy just like the time type for seconds (6 to 8 bytes).

Image datetimeoffset Same as datetime2, plus an offset from UTC for the time zone (does not deal with daylight saving time) (8 to 10 bytes).

Image smalldatetime A point in time from January 1, 1900, through June 6, 2079, with accuracy to 1 minute (4 bytes).

Image datetime Points in time from January 1, 1753, to December 31, 9999, with accuracy to 3.33 milliseconds (so the series of fractional seconds starts as: .003, .007, .010, .013, .017 and so on) (8 bytes).

Image Binary data Strings of bits used for storing things like files, encrypted values, etc. Storage for these data types is based on the size of the data stored in bytes, plus any overhead for variable length data.

Image binary(N) Fixed-length binary data with a maximum value of N of 8000, for an 8,000 byte long binary value.

Image varbinary(N) Variable-length binary data with maximum value of N of 8,000.

Image varbinary(max) Variable-length binary data up to (2^31) – 1 bytes (2GB) long. Values are often stored using filestream filegroups, which allow you to access files directly via the Windows API, and directly from the Windows File Explorer using filetables.

Image Character (or string) data String values, used to store text values. Storage is specified in number of characters in the string.

Image char(N) Fixed-length character data up to 8,000 characters long. When using fixed length data types, it is best if most of the values in the column are the same, or at least use most of the column.

Image varchar(N) Variable-length character data up to 8,000 characters long.

Image varchar(max) Variable-length character data up to (2^31) – 1 bytes (2GB) long. This is a very long string of characters, and should be used with caution as returning rows with 2GB per row can be hard on your network connection.

Image nchar, nvarchar, nvarchar(max) Unicode equivalents of char, varchar, and varchar(max). Unicode is a double (and in some cases triple) byte character set that allows for more than the 256 characters at a time that the ASCII characters do. Support for Unicode is covered in detail in this article: https://msdn.microsoft.com/en-us/library/ms143726.aspx. It is generally accepted that it is best to use Unicode when storing any data where you have no control over the data that is entered. For example, object names in SQL Server allow Unicode names, to support most any characters that a person might want to use for names. It is very common that columns for people’s names are stored in Unicode to allow for a full range of characters to be stored.

Image Other data types Here are a few more data types:

Image sql_variant Stores nearly any data type, other than CLR based ones like hierarchyId, spatial types, and types with a maximum length of over 8016 bytes. Infrequently used for patterns where the data type of a value is unknown before design time.

Image rowversion (timestamp is a synonym) Used for optimistic locking, to version-stamp a row. The value in the rowversion data type-based column changes on every modification of the row. The name of this type was timestamp in all SQL Server versions before 2000, but in the ANSI SQL standards, the timestamp type is equivalent to the datetime data type. Stored as an 8-byte binary value.

Image uniqueidentifier Stores a globally unique identifier (GUID) value. A GUID is a commonly used data type for an artificial key, because a GUID can be generated by many different clients and be almost 100 percent assuredly unique. It has downsides of being somewhat random when being sorted in generated order, which can make it more difficult to index. We discuss indexing in Skill 1.2. Represented as a 36-character string, but is stored as a 16-byte binary value.

Image XML Allows you to store an XML document in a column value. The XML type gives you a rich set of functionality when dealing with structured data that cannot be easily managed using typical relational tables.

Image Spatial types (geometry and geography, which support instance types such as circularString, compoundCurve, and curvePolygon) Used for storing spatial data, like for shapes, maps, lines, etc.

Image hierarchyid Used to store data about a hierarchy, along with providing methods for manipulating the hierarchy.


Need More Review? Data Type Overview

This is just an overview of the data types. For more reading on the types in the SQL Server Language Reference, visit the following URL: https://msdn.microsoft.com/en-us/library/ms187752.aspx.


The difficulty in choosing the data type is that you often need to consider not just the requirements given, but real-life needs. For example, say we had a table that represents a company and all we had was the company name. You might logically think that the following makes sense:

CREATE TABLE Examples.Company
(
         CompanyName    varchar(50) NOT NULL
                      CONSTRAINT PKCompany PRIMARY KEY
);

There are a few concerns with this choice of data type. First let’s consider the length of a company name. Almost every company name will be shorter than 50 characters. But there are definitely companies that exist with much larger names than this, even if they are rare. In choosing data types, it is important to understand that you have to design your objects to allow the maximum size of data possible. If you could ever come across a company name that is greater than 50 characters and need to store it completely, this will not do. The second concern is character set. Using ASCII characters is great when all characters will be from A-Z (upper or lower case), and numbers. As you use more special characters, it becomes very difficult because there are only 256 ASCII characters per code page.

In an exam question, if the question was along the lines of “the 99.9 percent of the data that goes into the CompanyName column is 20 ASCII characters or less, but there is one row that has 2000 characters with Russian and Japanese characters, what data type would you use?” the answer would be nvarchar(2000). varchar(2000) would not have the right character set, nchar(2000) would be wasteful, and integer would be just plain silly.


Note Column Details

For the exam, expect more questions along the lines of whether a column should be one version of a type or another, like varchar or nvarchar. Most any column where you are not completely in control of the values for the data (like a person’s name, or external company names) should use Unicode to give the most flexibility regarding what data can go into the column.


There are several groups of data types to learn in order to achieve a deep understanding. For example, consider a column named Amount in a table of payments that holds the amount of a payment:

CREATE TABLE Examples.Payment
(
         PaymentNumber char(10) NOT NULL
                      CONSTRAINT PKPayment PRIMARY KEY,
         Amount int NOT NULL
);

Does the integer hold an amount? Definitely. But in most countries, monetary units are stored with a fractional part, and while you could shift the decimal point in the client, that is not the best design. What about a real data type? Real types are meant for scientific values where you need an extremely wide range of possible values, not for money, where fractional parts (or even more) could be lost in precision. Would decimal(30,20) be better? Clearly. But it isn’t likely that most organizations are dealing with 20 decimal places for monetary values. There is also a money data type that has 4 decimal places, and something like decimal(10,2) also works for most monetary cases. Actually, it works for any decimal or numeric type with a scale of 2 (in decimal(10,2), the 10 is the precision, or total number of digits in the number; and 2 is the scale, or number of places after the decimal point).
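
A small sketch of precision and scale (the variable is purely illustrative):

DECLARE @Amount decimal(10,2) = 12345678.99; --10 total digits, 2 of them after the decimal point
SELECT @Amount AS Amount;

--This would fail with an arithmetic overflow, because it needs 9 digits before the decimal point:
--DECLARE @TooBig decimal(10,2) = 123456789.00;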

The biggest difficulty with choosing a data type goes back to the requirements. If there are given requirements that say to store a company name in 10 characters, you use 10 characters. The obvious realization is that a string like ‘Blue Yonder Airlines’ takes more than 10 characters (even if it is fictitious, you know real company names that won’t fit in 10 characters). You should default to what the requirements state (and in the non-exam world verify it with the customer.) All of the topics in this Skill 1.1 section, and on the exam should be taken from the requirements/question text. If the client gives you specific specifications to follow, you follow them. If the client says “store a company name” and gives you no specific limits, then you use the best data type. The exam is multiple choice, so unlike a job interview where you might be asked to give your reasoning, you just choose a best answer.

In Chapter 2, the first of the skills covered largely focuses on refining the choices in this section. For example, say the specification was to store a whole number between –20 and 2,000,000,000. The int data type stores all of those values, but also stores far more values than that. The goal is to make sure that 100 percent of the values that are stored meet the required range. Often we need to limit a value to a set of values in the same or a different table. Data type alone doesn’t do it (the sketch that follows previews the constraints covered in Chapter 2), but it gets you started on the right path, something you could be asked.
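
A minimal sketch of how the –20 to 2,000,000,000 requirement might eventually be enforced looks like the following (the table and constraint names are hypothetical; CHECK constraints are covered in Chapter 2, Skill 2.1):

CREATE TABLE Examples.BoundedValue
(
    Amount int NOT NULL
        CONSTRAINT CHKBoundedValue_Amount CHECK (Amount BETWEEN -20 AND 2000000000)
);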

Beyond the basic data type, there are a couple of additional constructs that extend the concept of a data type. They are:

Image Computed Columns These are columns that are based on an expression. This allows you to use any columns in the table to form a new value that combines/reformats one or more columns.

Image Dynamic Data Masking Allows you to mask the data in a column from users, so that private data can be stored while showing users only part of the data (or a generic mask).

Computed columns

Computed columns let you manifest an expression as a column for usage (particularly so that the engine maintains values for you that do not meet the normalization rules we discussed earlier). For example, say you have a table with columns FirstName and LastName, and want to include a column named FullName. If FullName was a normal column, it would be duplicated data that we would need to manage and maintain, and the values could get out of sync. But adding it as a computed column means that the data is either calculated at query time or, if you specify it and the expression is deterministic, persisted. (A deterministic calculation is one that returns the same value for every execution. For example, the COALESCE() function, which returns the first non-NULL value in the parameter list, is deterministic, but the GETDATE() function is not, as every time you perform it, you could get a different value.)

So we can create the following:

CREATE TABLE Examples.ComputedColumn
(
     FirstName  nvarchar(50) NULL,
     LastName   nvarchar(50) NOT NULL,
     FullName AS CONCAT(LastName,',' + FirstName)
);

Now, in the FullName column, we see either the LastName or LastName, FirstName for each person in our table. If you added PERSISTED to the end of the declaration, as in:

ALTER TABLE Examples.ComputedColumn DROP COLUMN FullName;

ALTER TABLE Examples.ComputedColumn
   ADD FullName AS CONCAT(LastName,', ' + FirstName) PERSISTED;

Now the expression is not evaluated during access in a statement; instead, the value is saved in the physical table storage structure along with the rest of the data. It is read-only to the programmer’s touch, and it is maintained by the engine. Throughout this book, one of the most important tasks for you as an exam taker is to be able to predict the output of a query, based on structures and code. Hence, when we create an object, we provide a small example explaining it. This does not replace having actually attempted everything in the book on your own (much of which you will have done professionally, but certainly not all). These examples should give you reproducible starting points. In this case, consider that you insert the following two rows:

INSERT INTO Examples.ComputedColumn
VALUES (NULL,'Harris'),('Waleed','Heloo');

Then query the data to see what it looks like with the following SELECT statement.

SELECT *
FROM   Examples.ComputedColumn;

You should be able to determine that the output of the statement has just the last name for Harris, but the comma-delimited name for Waleed Heloo.

FirstName    LastName      FullName
------------ ------------- ---------------------
NULL         Harris        Harris
Waleed       Heloo         Heloo, Waleed
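
To see why determinism matters for PERSISTED, consider adding a column based on GETDATE(), which is nondeterministic. The column name here is hypothetical, added only to illustrate the rule:

--This fails, because a non-deterministic expression cannot be persisted:
--ALTER TABLE Examples.ComputedColumn
--    ADD RowFetchTime AS GETDATE() PERSISTED;

--Without PERSISTED, the same expression is allowed and is evaluated every time it is queried:
ALTER TABLE Examples.ComputedColumn
    ADD RowFetchTime AS GETDATE();

ALTER TABLE Examples.ComputedColumn DROP COLUMN RowFetchTime;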

Dynamic data masking

Dynamic data masking lets you mask data in a column from the view of the user. So while the user may have all rights to a column (INSERT, UPDATE, DELETE, SELECT), when they use the column in a SELECT statement, instead of showing them the actual data, it masks it from their view. For example, if you have a table that has email addresses, you might want to mask the data so most users can’t see the actual data when they are querying the data. In Books Online, the topic of Dynamic Data Masking falls under security (https://msdn.microsoft.com/en-us/library/mt130841.aspx), but as we will see, it doesn’t behave like classic security features, as you will be adding some code to the DDL of the table, and there isn’t much fine-tuning of who can access the unmasked value.

As an example, consider the following table structure, with three rows to use to show the feature in action:

CREATE TABLE Examples.DataMasking
(
    FirstName    nvarchar(50) NULL,
    LastName    nvarchar(50) NOT NULL,
    PersonNumber char(10) NOT NULL,
    Status    varchar(10), --domain of values ('Active','Inactive','New')
    EmailAddress nvarchar(50) NULL, --(real email address ought to be longer)
    BirthDate date NOT NULL, --Time we first saw this person.
    CarCount   tinyint NOT NULL --just a count we can mask
);

INSERT INTO Examples.DataMasking(FirstName,LastName,PersonNumber, Status,
                                 EmailAddress, BirthDate, CarCount)
VALUES('Jay','Hamlin','0000000014','Active','[email protected]','1979-01-12',0),
    ('Darya','Popkova','0000000032','Active','[email protected]','1980-05-22', 1),
    ('Tomasz','Bochenek','0000000102','Active',NULL, '1959-03-30', 1);

There are four types of data mask functions that we can apply:

Image Default Takes the default mask of the data type (not of the DEFAULT constraint of the column, but the data type).

Image Email Masks the email so you only see a few meaningful characters.

Image Random Masks any of the numeric data types (int, smallint, decimal, etc) with a random value within a range.

Image Partial Allows you to take values from the front and back of a value, replacing the center with a fixed string value.

Once applied, the masking function emits a masked value unless the column value is NULL, in which case the output is NULL.

Who can see the data masked or unmasked is controlled by a database level permission called UNMASK. The dbo user always has this right, so to test this, we create a different user to use after applying the masking. The user must have rights to SELECT data from the table:

CREATE USER MaskedView WITHOUT LOGIN;
GRANT SELECT ON Examples.DataMasking TO MaskedView;
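
If you later need this user to see real values, the database-level UNMASK permission can be granted and revoked; a minimal sketch using the user just created (if you try it, revoke the permission again before running the masked-output examples later in this section):

GRANT UNMASK TO MaskedView;
--return the user to seeing only masked values
REVOKE UNMASK FROM MaskedView;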

The first masking type we apply is default. This masks the data with the default for the particular data type (not the default of the column itself from any DEFAULT constraint if one exists). It is applied using the ALTER TABLE...ALTER COLUMN statement, using the following syntax:

ALTER TABLE Examples.DataMasking ALTER COLUMN FirstName
    ADD MASKED WITH (FUNCTION = 'default()');
ALTER TABLE Examples.DataMasking ALTER COLUMN BirthDate
    ADD MASKED WITH (FUNCTION = 'default()');

Now, when someone without the UNMASK database right views this data, the FirstName column value will look like the default for string types, which is ‘xxxx’, and every BirthDate value will appear to be ‘1900-01-01’. Take care that when you use the default mask, the masked value isn’t used in calculations. Otherwise you could send a birthday card to every customer on January 1, congratulating them on being over 116 years old.


Note The MASKED WITH Clause

To add masking to a column in the CREATE TABLE statement, the MASKED WITH clause goes between the data type and NULL specification. For example: LastName nvarchar(50) MASKED WITH (FUNCTION = ‘default()’) NOT NULL
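
For instance, a minimal sketch (the table name here is ours, not part of the scenario) declaring a mask as part of CREATE TABLE:

CREATE TABLE Examples.DataMaskingInline
(
    LastName nvarchar(50) MASKED WITH (FUNCTION = 'default()') NOT NULL
);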


Next, we add masking to the EmailAddress column. The email filter has no configuration, just like default(). The email() function uses fixed formatting to show the first letter of an email address, always ending in the extension .com:

ALTER TABLE Examples.DataMasking ALTER COLUMN EmailAddress
    ADD MASKED WITH (FUNCTION = 'email()');

Now the email address: [email protected] will appear as [email protected]. If you wanted to mask the email address in a different format, you could instead use the partial() masking function, covered next.

The partial() function is by far the most powerful. It lets you specify how many characters to expose from the front and the back of the string, with a fixed mask string in between. For example, in the following data mask, we make the PersonNumber show the first two characters and the last character. This column is of a fixed width, so the values will show up as the same size as before.

--Note that it uses double quotes in the function call
ALTER TABLE Examples.DataMasking ALTER COLUMN PersonNumber
    ADD MASKED WITH (FUNCTION = 'partial(2,"*******",1)');

The size of the mask string is up to you; if you put fourteen asterisks in the middle, the masked portion appears fourteen characters wide. Now, PersonNumber ‘0000000102’ looks like ‘00*******2’, as does ‘0000000032’. If you apply the same sort of mask to a variable-length column, the output is still a fixed width as long as there is enough data for it to be:

ALTER TABLE Examples.DataMasking ALTER COLUMN LastName
    ADD MASKED WITH (FUNCTION = 'partial(3,"_____",2)');

Now ‘Hamlin’ shows up as ‘Ham_____in’. The partial() function can also be used to mask the entire value, as when you want a value to appear as unknown. In our example, we default the Status value to ‘Unknown’:

ALTER TABLE Examples.DataMasking ALTER COLUMN Status
    ADD MASKED WITH (Function = 'partial(0,"Unknown",0)');

Finally, to the CarCount column, we add the random() masking function. It replaces the value with a random number of the column’s data type, between the start and end parameter values:

ALTER TABLE Examples.DataMasking ALTER COLUMN CarCount
    ADD MASKED WITH (FUNCTION = 'random(1,3)');

Viewing the data as the dbo user (the context you typically have when designing and building a database):

SELECT *
FROM   Examples.DataMasking;

There is no apparent change:

FirstName LastName  PersonNumber Status     EmailAddress           BirthDate  CarCount
--------- --------- ------------ ---------- ---------------------- ---------- --------
Jay       Hamlin    0000000014   Active     [email protected]     1979-01-12 0
Darya     Popkova   0000000032   Active     [email protected]  1980-05-22 1
Tomasz    Bochenek  0000000102   Active     NULL                   1959-03-30 1

Now, using the EXECUTE AS statement to impersonate this MaskedView user, run the following statement:

EXECUTE AS USER = 'MaskedView';
SELECT *
FROM   Examples.DataMasking;

FirstName LastName     PersonNumber Status  EmailAddress             BirthDate  CarCount
--------- ------------ ------------ ------- ------------------------ ---------- --------
xxxx      Hamlin       00****14     Unknown [email protected]            1900-01-01 2
xxxx      Popkova      00****32     Unknown [email protected]            1900-01-01 1
xxxx      Bochenek     00****02     Unknown NULL                     1900-01-01 1

Run the statement multiple times and you will see the CarCount value change each time. Use the REVERT statement to go back to your normal user context, and check the output of USER_NAME() to make sure you are in the correct context, which should be dbo for these examples:

REVERT; SELECT USER_NAME();
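
If a mask is no longer wanted, it can be removed from the column definition without otherwise changing the column. A minimal sketch, removing the mask from the FirstName column:

ALTER TABLE Examples.DataMasking ALTER COLUMN FirstName DROP MASKED;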

Skill 1.2: Design and implement indexes

In this section, we examine SQL Server’s B-Tree indexes on on-disk tables. In SQL Server 2016, we have two additional indexing topics, covered later in the book, those being columnstore indexes (Skill 1.4) and indexes on memory optimized tables (Skill 3.4). A term that will be used for the B-Tree based indexes is rowstore, in that their structures are designed to contain related data for a row together. Indexes are used to speed access to rows using a scan of the values in a table or index, or a seek for specific row(s) in an index.

Indexing is a very complex topic, and a decent understanding of the internal structures makes understanding when to and when not to use an index easier. Rowstore indexes on the on-disk tables are based on the concept of a B-Tree structure, consisting of index nodes that sort the data to speed finding one value. Figure 1-1 shows the basic structure of all of these types of indexes.

An illustration shows a structure of a B-Tree index, showing how the index pages are connected to one another, including the Root Node to the Intermediate Nodes.

FIGURE 1-1 The base structure of a B-Tree Index

In the index shown in Figure 1-1, when you search for an item, if it is between A and Q, you follow the pointer to the first intermediate node of the tree. This structure is repeated for as many levels as there are in the index. When you reach the last intermediate node (which may be the root node for smaller indexes), you go to the leaf node.

There are two types of indexes in SQL Server: clustered and non-clustered. A clustered index is an index whose leaf nodes contain the actual data of the table. (A table without a clustered index is a heap, which is made up of non-sequential, 8K pages of data.) A non-clustered index is a separate structure whose leaf nodes contain a copy of the key column data, along with a row locator that points to the heap or the clustered index.

The structure of the non-clustered leaf pages depends on whether the table is a heap or a clustered table. For a heap, it contains a pointer to the physical structure where the data resides. For a clustered table, it contains the value of the clustered index keys (referred to as the clustering key.) Last, for a clustered columnstore index, it is the position in the columnstore index (covered in Skill 1.4).

When the index key is a single column, it is referred to as a simple index, and when there are multiple columns, it is called a composite index. The index nodes (and leaf pages) are sorted by the leading column first, then the second column, and so on. For a composite index, it is usually best to choose the most selective column as the lead column, which is to say, the one with the most unique values amongst the rows of the table.

The limit on the size of the index key (the data in all of the columns declared for the index) depends on the type of the index. The maximum key size for a non-clustered index is 1700 bytes, and 900 bytes for a clustered index. Note that the smaller the index key, the more entries fit on each index level, and the fewer index levels there are, the fewer reads per operation. A page contains a maximum of 8060 bytes, and there is some overhead when storing variable length column values. If your index key values are 1700 bytes, you could only have 4 rows per page. In a million row table, you can imagine this becomes quite a large index structure.


Need More Review? Indexing

For more details on the indexes that we use in this skill, and some that we cover later in the book, MSDN has a set of articles on indexes linked to from this page: https://msdn.microsoft.com/en-us/library/ms175049.aspx.


Design new indexes based on provided tables, queries, or plans

There are two phases of a project where you typically add indexes during the implementation process:

Image During the database design phase

Image During the coding phase, continuing throughout the lifecycle of your implementation

The primary difference between the two phases is need. During the design phase, there are constraints that create indexes as part of their creation, and a few situations where it is essential to create an index without even executing a query. After you have configured your tables, the goal of indexes is almost completely aligned to how well your queries work, and you must add indexes where the need arises, and not just because it seems like a good idea.


Note Concurrency Concepts

Chapter 3 reviews the concepts of concurrency, and the first step to building highly concurrent database systems is to get the design right, and to match the indexing of the database to the users’ queries, so queries only access the minimum amount of data needed.


Indexing during the database design phase

Indexing during the design phase of a database project generally fits a very small range of needs. There is only so much guesswork about user behavior that you can make. There are specifically two situations where it is essential to define indexes in your design:

Image Uniqueness Constraints PRIMARY KEY and UNIQUE constraints automatically create an index.

Image Foreign Key Columns Columns that are used in a FOREIGN KEY constraint are often, but not always, a likely target for an index.

Let’s explore these two situations.

Uniqueness Constraints

In Skill 1.1, we created PRIMARY KEY constraints on all of the tables in our design. PRIMARY KEY constraints are enforced by creating a unique index on the constraint columns; the index speeds the search for duplicate values, and a unique index does not allow duplicated data. By default, they create a unique clustered index (where the leaf pages of the B-Tree structure are the actual data pages as opposed to just pointers to the data pages), but there are situations where the clustered index is best served on a different column (this is covered later in “Implement clustered index columns by using best practices”).

As an example, consider the following table structure:

CREATE TABLE Examples.UniquenessConstraint
(
    PrimaryUniqueValue int NOT NULL,
    AlternateUniqueValue1 int NULL,
    AlternateUniqueValue2 int NULL
);

When you have a value that you need to be the primary key value, you can use a PRIMARY KEY constraint. So, using ALTER TABLE (or inline as part of the initial CREATE TABLE statement) you can add:

ALTER TABLE Examples.UniquenessConstraint
    ADD CONSTRAINT PKUniquenessContraint PRIMARY KEY (PrimaryUniqueValue);

A PRIMARY KEY constraint cannot be placed on a column that allows NULL values, and you get an error if you try (or, in a CREATE TABLE statement, the column is silently set to not allow NULL values). In cases where you have alternate columns that are used to identify a row (typical when you use an artificial surrogate value, like a meaningless integer, for the primary key, which is covered in more detail in Chapter 2), you can add a UNIQUE constraint. A UNIQUE constraint can be placed on columns that allow NULL values, something demonstrated later in this section:

ALTER TABLE Examples.UniquenessConstraint
    ADD CONSTRAINT AKUniquenessContraint UNIQUE
          (AlternateUniqueValue1, AlternateUniqueValue2);

Behind the scenes, the uniqueness constraints created indexes with the same names as the constraints, which you can see in sys.indexes:

SELECT name, type_desc, is_primary_key, is_unique, is_unique_constraint
FROM   sys.indexes
WHERE  OBJECT_ID('Examples.UniquenessConstraint') = object_id;

This shows you that the primary key index is clustered and unique, and the alternate key index is non-clustered and unique:

name                 type_desc       is_primary_key is_unique is_unique_constraint
--------------------- --------------- -------------- --------- -------------------
PKUniquenessContraint CLUSTERED       1              1         0
AKUniquenessContraint NONCLUSTERED    0              1         1

When you have constraints on all of the data that needs to be unique for an OLTP database, you often have a large percentage of the indexes you need. OLTP databases are generally characterized by short transactions and simple queries, usually looking for one row (even if the query sometimes looks for a range of data because the user doesn’t know how to spell a given value.)

In Chapter 4, Skill 4.1, we discuss optimizing indexes, including how to determine if indexes are being used. However, indexes that are created by uniqueness constraints should not be considered for removal. Even if the index is never used to improve the performance of a query, it is essential to your data integrity that a constraint ensures any value that is supposed to be unique actually is unique. NULL values behave differently in UNIQUE indexes than in almost any other place in SQL Server. A PRIMARY KEY constraint does not allow any NULL values in its columns, but a UNIQUE constraint and a unique index do. So, using the table we created, if we try creating the following rows:

INSERT INTO Examples.UniquenessConstraint
            (PrimaryUniqueValue, AlternateUniqueValue1, AlternateUniqueValue2)
VALUES (1, NULL, NULL), (2, NULL, NULL);

We then receive the following error message:

Msg 2627, Level 14, State 1, Line 95
Violation of UNIQUE KEY constraint 'AKUniquenessContraint'. Cannot insert duplicate key
in object 'Examples.UniquenessConstraint'. The duplicate key value is (<NULL>, <NULL>)

What is initially confusing about this is that we said earlier that NULL is never equal to NULL. This is still true, but in the index keys, two NULL values are treated as duplicate missing values.
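
If you need uniqueness enforced only for rows that actually have values, a common workaround (shown here as a sketch; the new index name is ours) is to replace the UNIQUE constraint with a filtered unique index, a feature we use again in the next section:

ALTER TABLE Examples.UniquenessConstraint DROP CONSTRAINT AKUniquenessContraint;

CREATE UNIQUE INDEX AKUniquenessConstraintFiltered
    ON Examples.UniquenessConstraint (AlternateUniqueValue1, AlternateUniqueValue2)
    WHERE AlternateUniqueValue1 IS NOT NULL
      AND AlternateUniqueValue2 IS NOT NULL;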

Foreign Key Columns

When implementing a FOREIGN KEY constraint, it is generally a good idea to index the key columns in the referencing tables. For example, consider the following three tables:

--Represents an order a person makes, there are 10,000,000 + rows in this table
CREATE TABLE Examples.Invoice
(
    InvoiceId   int NOT NULL CONSTRAINT PKInvoice PRIMARY KEY,
    --Other Columns Omitted
);
--Represents a type of discount the office gives a customer,
--there are 200 rows in this table
CREATE TABLE Examples.DiscountType
(
    DiscountTypeId   int NOT NULL CONSTRAINT PKDiscountType PRIMARY KEY,
    --Other Columns Omitted
)
--Represents the individual items that a customer has ordered, There is an average of
--3 items ordered per invoice, so there are over 30,000,000 rows in this table
CREATE TABLE Examples.InvoiceLineItem
(
   InvoiceLineItemId int NOT NULL CONSTRAINT PKInvoiceLineItem PRIMARY KEY,
   InvoiceId int NOT NULL
          CONSTRAINT FKInvoiceLineItem$Ref$Invoice
                REFERENCES Examples.Invoice (InvoiceId),
   DiscountTypeId int NOT NULL
          CONSTRAINT FKInvoiceLineItem$Ref$DiscountType
                REFERENCES Examples.DiscountType (DiscountTypeId)
    --Other Columns Omitted
);

There are two foreign key columns in the InvoiceLineItem table to cover. The InvoiceId column has mostly unique values, with an average of 3 rows per invoice. Fetching all of the line items for an invoice is also a typical thing a user does. Hence, that is a column that almost certainly benefits from an index (and as we discuss later in the section “Implement clustered index columns by using best practices,” perhaps even a clustered index if the reference is used frequently enough). Create that index as a non-clustered index for now as:

CREATE INDEX InvoiceId ON Examples.InvoiceLineItem (InvoiceId);

Now consider creating an index on a not-very-selective column, like the DiscountTypeId, where out of 30 million rows, only 100,000 have a non-NULL value, spread across just 20 distinct values. This column could benefit from a filtered index, which is an index that has a WHERE clause. Because almost all rows are NULL, searching the index for a NULL value would be useless; however, a search for any other value actually could use the index. So you could create a filtered index as:

CREATE INDEX DiscountTypeId ON Examples.InvoiceLineItem(DiscountTypeId)
                                           WHERE DiscountTypeId IS NOT NULL;

Filtered indexes can have any columns in the WHERE clause, even if not represented in the index keys or included columns (which we use later in this chapter in the section: “Distinguish between indexed columns and included columns”).

When creating an index, if the data in the key columns is always unique (such as when the columns of the index are a superset of a UNIQUE and/or PRIMARY KEY constraint’s columns), declare the index as UNIQUE, as in:

CREATE UNIQUE INDEX InvoiceColumns ON Examples.InvoiceLineItem(InvoiceId,
                                                               InvoiceLineItemId);

It is typically desirable for indexes that enforce uniqueness to be based on a constraint, but this is not a requirement. Any index, even a filtered one, can be declared UNIQUE, which disallows duplicated index key values.


Need More Review? The CREATE INDEX Statement

There are many other settings in the CREATE INDEX statement that are useful to understand; they are covered in more detail on the MSDN site: https://msdn.microsoft.com/en-us/library/ms188783.aspx.


In the WideWorldImporters database, there are indexes on all of the foreign keys that were generated when creating that database. Beyond guessing, to decide whether an index would be useful it is essential to understand the query plan. Most of the figures in this chapter are query plans that demonstrate what is going on in the query optimizer and query processor. As an example, use one of the relationships in the WideWorldImporters database, between the Sales.CustomerTransactions and Application.PaymentMethods tables.


Note Accessing the Sample Database

To follow along with the examples in this chapter, and later ones that use the Microsoft sample database: WideWorldImporters, you can get this database at the following address: https://github.com/Microsoft/sql-server-samples/releases/tag/wide-world-importers-v1.0. It will be used throughout the book when we need a database that is pre-loaded with data.


In the Sales.CustomerTransactions table, there are 97147 rows. The index on the foreign key column is non-clustered, so every use of the non-clustered index requires a probe of the clustered index to fetch the data (referred to as a bookmark lookup), which makes it very unlikely that the index is used for this predicate. Take a look at the data in the PaymentMethodId column:

SELECT PaymentMethodId, COUNT(*) AS NumRows
FROM   Sales.CustomerTransactions
GROUP  BY PaymentMethodID;

You can see that there are just two values in use:

PaymentMethodId NumRows
--------------- -----------
4               26637
NULL            70510

Take a look at the plan of the following query that the system might perform, searching for CustomerTransactions rows where the PaymentMethodId = 4:

SELECT *
FROM   Sales.CustomerTransactions
WHERE PaymentMethodID = 4;

This returns the expected 26637 rows, and has the actual plan shown in Figure 1-2. The Compute Scalar operator is there because we returned all columns, and there is a computed column in the table named IsFinalized.

A screen shot of an index plan shows a Clustered Index Scan operator, with a Compute Scalar operator.

FIGURE 1-2 The plan from the query for PaymentMethodId = 4

There are three ways to get the query plan using the GUI in SQL Server Management Studio. From the Query menu, select one of the following:

1. Display Estimated Execution Plan This shows you the plan that is likely to be used to perform the query. The plan can change when the query is performed, due to many factors such as query load, memory available, etc. All row counts and costs are guesses based on the statistics of the index, and it does not require the query to be performed. Whether or not parallelism can be used is determined during execution based on system settings (such as the sp_configure settings ‘cost threshold for parallelism’ and ‘max degree of parallelism’) and the load on the system at execution time.

2. Include Actual Execution Plan This represents the plan that is used, including actual row counts, use of parallelism, etc. You get the actual plan after the query has completed in its entirety.

3. Include Live Query Statistics When you are working with a complex, long-running query, you can see data moving through the actual query plan operators live. It can help you diagnose issues with a large query by letting you see the problem spots in real time.


Note Textual Plan Options

Additionally, there are several ways to get a textual plan when you need it. Two examples are SET SHOWPLAN_TEXT to get the estimated plan, and SET STATISTICS PROFILE to get the actual query plan.
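
For example, a minimal sketch of using the first of these (SET SHOWPLAN_TEXT must be the only statement in its batch, hence the GO separators):

SET SHOWPLAN_TEXT ON;
GO
SELECT *
FROM   Sales.CustomerTransactions
WHERE  PaymentMethodID = 4;
GO
SET SHOWPLAN_TEXT OFF;
GO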


For now, we ignore the missing index suggestion listed in the plan (and for future cases edit such suggestions out until we get to the section on included columns), but the point here is that the index was not used. However, while this index is not generally useful, there are scenarios where it actually turns out to be useful:

Image If the only column returned from the query is the PaymentMethodId, the index is useful, since all of the data needed by the query is in the index.

Image An index is also useful when you are searching for a value that does not exist in the table. The statistics of the index do not tell the optimizer that no rows are returned from the query, only that very few are returned, so using the index should be fast enough. We review managing statistics in more detail in Chapter 4, but they are basically structures that help the optimizer to guess how many rows are returned by a given query based on a sampling of the data at a given point in time.

These scenarios are why foreign key indexes are often applied to all foreign key columns, even if the indexes applied are not generally useful.
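
Both scenarios can be seen with queries along these lines (a sketch only; 10 is simply a value that does not appear in the sample data shown earlier):

--covered by the index, because only the index key column is returned
SELECT PaymentMethodID
FROM   Sales.CustomerTransactions
WHERE  PaymentMethodID = 4;

--searches for a value that does not exist, so the seek should touch very few pages
SELECT *
FROM   Sales.CustomerTransactions
WHERE  PaymentMethodID = 10;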


Need More Review? Deeper dive into indexing

Indexing is a complex topic that we only review in some of the primary scenarios. For more information, a great resource is “Expert Performance Indexing in SQL Server” from Apress by Grant Fritchey and Jason Strate (http://www.apress.com/9781484211199).


Indexing once data is in your tables

The indexing you do to your tables before adding data is essentially part of the structure; the rest of the indexes you add are strictly intended to improve performance. In this section, we cover several scenarios to consider when adding indexes to tables. Some of these scenarios crop up during development, even when you have very little data in your tables. Some do not show up until the data grows during performance testing or production loads. All of Chapter 4 delves more into the ongoing tuning of your system, but for now we look at some common query types to tune, no matter how you discover the need.

Image Common search paths

Image Joins

Image Sorting data

Unless you have very simplistic needs, it is hard to know exactly how queries behave in a real scenario, so in most cases it is better to test out your expectations rather than guess about performance.


Image Exam Tip

While tuning a real database should generally be done with real data, seeing real needs, this is not the case for the exam. The situations more likely follow a very deliberate pattern similar to the ones we discuss in the next sections. The upcoming examples are not exhaustive as there are many different scenarios that can use an index to improve performance.


Common search paths discovered during development

In order to optimize common search criteria, the process of adding indexes starts during the development phase of the project. Many indexes will correspond to the uniqueness constraints created during table design. However, as the application is developed, there are typically search arguments used that are not handled by the uniqueness constraints. For example, consider the CustomerPurchaseOrderNumber column from the WideWorldImporters database’s Sales.Orders table. It is not a key value, and the sample data has duplicated values, even for a single customer. Once the application is created, the following query is executed from the application code:

SELECT CustomerID, OrderID, OrderDate, ExpectedDeliveryDate
FROM Sales.Orders
WHERE CustomerPurchaseOrderNumber = '16374';

The query runs quickly in the default set of rows in the Sales.Orders table, returning only 6 rows. The amount of time the query takes, using the small-sized dataset, seems adequate. However, to dig deeper and find out how well the query actually performs, you can use two commands in Transact-SQL to see some very important statistics. The statistics, along with the query plan will give you deeper insight into how the query is operating.

SET STATISTICS TIME ON;
SET STATISTICS IO ON;

SELECT CustomerID, OrderId, OrderDate, ExpectedDeliveryDate
FROM  Sales.Orders
WHERE CustomerPurchaseOrderNumber = '16374';

SET STATISTICS TIME OFF;
SET STATISTICS IO OFF;

The plan returns what is shown in Figure 1-3.

A screen shot shows a query plan example of a Clustered Index Scan operator.

FIGURE 1-3 Query plan that does not use an index

Along with the query results, there are a few additional messages. We are reducing to the pertinent ones here in our output, but you can see the compile and parse times, and the overall execution time in addition to the following:

Table 'Orders'. Scan count 1, logical reads 692, physical reads 0, read-ahead reads 0,
lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

 SQL Server Execution Times:
   CPU time = 15 ms,  elapsed time = 20 ms.

The query only takes around 20 milliseconds (ms) on my VM (Surface Pro 4 with 16 GB of RAM, and 8GB for the VM), scanning the entire table, and touching all 692 pages containing 73595 rows. The pages are all in RAM, so there are no physical reads. This is very common when testing individual queries, so there is no memory pressure. (You can clear the cache using DBCC DROPCLEANBUFFERS, but the most important number for indexing is logical reads. Consistent readings of a large number of physical reads are more indicative of not enough RAM to cache data). However, if this is a table to which data is being actively written, scanning those 692 pages means that every single row is touched, and therefore locked for some amount of time, causing concurrency issues that are covered in more detail in Chapter 3, “Manage Database Concurrency.”

Next, add an index to the Sales.Orders table on just the CustomerPurchaseOrderNumber column, in an attempt to make the query perform better.

CREATE INDEX CustomerPurchaseOrderNumber ON Sales.Orders(CustomerPurchaseOrderNumber);


Note Our sample database

The examples use tables from the WideWorldImporters database to review different types of indexing utilization. If you desire to try the queries yourself to make the same changes, make sure that you are working on your own copy of this database before making changes that affect other users.


Now, perform the same query on CustomerPurchaseOrderNumber = ‘16374’, and the following query plan is used, as shown in Figure 1-4.

An illustration shows an Index Seek Operator on the CustomerPurchaseOrderNumber index, joined to a Key Lookup Operator by a Nested Loops operator.

FIGURE 1-4 Query plan after adding an index on the CustomerPurchaseOrderNumber

The query plan looks more complex. There is a join, even though the query uses a single table. SQL Server now uses the index-seek operation to find the six matching rows, but all it has are the CustomerPurchaseOrderNumber and the OrderID from the index keys. So it needs to use a Nested Loops join operator with a Key Lookup against the clustered index to get the rest of the data. While the plan is more complex, the results are a lot better statistically, as you can see:

Table 'Orders'. Scan count 1, logical reads 20, physical reads 0, read-ahead reads 0,
lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
 SQL Server Execution Times:
   CPU time = 0 ms,  elapsed time = 0 ms.

It took only 20 logical reads and less than 1 millisecond to perform. The reduction of 672 reads means 672 fewer pages touched, and by default locked. As a result, it is very useful to check all of the queries that are used by your applications, whether they come (ideally) from stored procedures, or are ad-hoc queries performed from your external interfaces.

Note that you can index a computed column as long as it is deterministic. You can tell whether a column can be indexed, even if it is computed, by using the COLUMNPROPERTY() function:

SELECT CONCAT(OBJECT_SCHEMA_NAME(object_id), '.', OBJECT_NAME(object_id)) AS TableName,
       name AS ColumnName, COLUMNPROPERTY(object_id, name, 'IsIndexable') AS Indexable
FROM   sys.columns
WHERE is_computed = 1;
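
If a column is indexable, creating the index is no different than for any other column. A minimal sketch using the Examples.ComputedColumn table from earlier in this chapter (the index name is ours):

CREATE INDEX FullName ON Examples.ComputedColumn (FullName);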

Search conditions are typically the most obvious things to index because they affect people directly. When a user searches on an unindexed column in a large table (relative to hardware capabilities), you may see locking and blocking, or, with some settings (such as the READ COMMITTED SNAPSHOT database setting), high tempdb utilization. These needs are more random than the join situation we cover next.

Joins

While simple index needs often manifest themselves as table scans, when joining data in two tables, the need for an index may instead show up as a different join operator than a Nested Loops join. Nested Loops joins work best when one input is very small, or the cost of seeking a row in that input is inexpensive. It works by going row by row in one of the inputs, and seeking for a matching value in the other. When the cost of seeking in both sets is too high, a Hash Match operator is used. This operator builds a pseudo hash index by segmenting values into buckets using a hash function, making matching rows easier to find. It does not need any ordering of the inputs, so it can work to join two really large sets together.

As an example, drop the foreign key index from the Sales.Orders table named FK_Sales_Orders_ContactPersonID using the following command:

DROP INDEX FK_Sales_Orders_ContactPersonID ON Sales.Orders;

Now, search for the Sales.Orders rows for any person with a preferred name of ‘Aakriti’:

SELECT OrderId, OrderDate, ExpectedDeliveryDate, People.FullName
FROM  Sales.Orders
        JOIN Application.People
            ON People.PersonID = Orders.ContactPersonID
WHERE  People.PreferredName = 'Aakriti';

The PreferredName column is not indexed. Figure 1-5 shows the actual query plan, along with the typical query stats output.

An illustration shows two Clustered Index Scan operators joined together with a Hash Match Operator, and 692 reads on the Orders table with 80 from the People Table.

FIGURE 1-5 Query plan and statistic output for unindexed foreign key index in join

Figure 1-5 has the following output:

Table 'Workfile'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0,
lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0,
lob logical reads 0, lob physical reads 0,

lob read-ahead reads 0.

Table 'Orders'. Scan count 1, logical reads 692, physical reads 0, read-ahead reads 0,
lob logical reads 0, lob physical reads 0,

lob read-ahead reads 0.

Table 'People'. Scan count 1, logical reads 80, physical reads 0, read-ahead reads 0,
lob logical reads 0, lob physical reads 0,

lob read-ahead reads 0.

CPU time = 15 ms,  elapsed time = 53 ms.

Hovering your mouse over the Clustered Index Scan operator for the PK_Application_People index (the clustered index on the table), you see (as depicted in Figure 1-6) the costs, but also that the predicate of PreferredName = ‘Aakriti’ is handled as part of this scan.

A screen shot shows the tool tip you get from hovering over the Clustered Index Scan operator in Figure 1-5. Shows statistics of query execution, and important for the current discussion, that the operator also performs the predicate.

FIGURE 1-6 Operator costs for the Clustered Index Scan operator for the PK_Application_People index

As you can see, the query optimizer scans the two indexes, and the Hash Match operator builds a hash index structure, and then matches the rows together. Adding back the index on the foreign key columns:

CREATE INDEX FK_Sales_Orders_ContactPersonID ON Sales.Orders
--Note that USERDATA is a filegroup where the index was originally
              (ContactPersonID ASC ) ON USERDATA;

Executing the query again shows a better result, though not tremendously, as shown in Figure 1-7.

An illustration shows the changes to the plan from the previous iteration. Now it has a Clustered Index Scan joined to an Index Seek by a Nested Loops operator, which is in turn joined by Nested Loops operator to a Key Lookup operator. The Key Lookup operator is 87 percent of the cost of execution, and the Clustered Index Scan is 12 percent. Also shows that there are 695 logical reads from Orders and 80 from People tables

FIGURE 1-7 Query plan after adding back the foreign key index

Figure 1-7 has the following output:

Table 'Orders'. Scan count 2, logical reads 695, physical reads 0, read-ahead reads 0,
lob logical reads 0, lob physical reads 0,

lob read-ahead reads 0.

Table 'People'. Scan count 1, logical reads 80, physical reads 0, read-ahead reads 0,
lob logical reads 0, lob physical reads 0,

lob read-ahead reads 0.

CPU time = 0 ms,  elapsed time = 17 ms.

The big cost here is the Key Lookup operator used to fetch the rest of the Sales.Orders columns in our query. This is the cost that the missing index hint has been suggesting we eliminate for nearly every query, and it is the topic of the next section of this chapter. The query can be improved upon one more time by indexing the PreferredName column, so the query processor doesn’t have to test every single row in the Application.People table to see if it matches PreferredName = ‘Aakriti’.

CREATE INDEX PreferredName ON Application.People (PreferredName) ON USERDATA;

Finally, perform the query again to see the plan and statistics shown in Figure 1-8.

An illustration shows changes to the execution plan. Changes from previous plan are that the Clustered Index Scan has been replaced by an Index Seek operator joined by Nested Loops operator to a KeyLookup from the People table’s clustered index, and the People table has just 6 logical reads instead of 80.

FIGURE 1-8 Query plan after adding index on Application.People.PreferredName

Figure 1-8 has the following output:

Table 'Orders'. Scan count 2, logical reads 695, physical reads 0, read-ahead reads 0,
lob logical reads 0, lob physical reads 0,

lob read-ahead reads 0.

Table 'People'. Scan count 1, logical reads 6, physical reads 0, read-ahead reads 0, lob
logical reads 0, lob physical reads 0, lob read-ahead reads 0.

CPU time = 0 ms,  elapsed time = 19 ms.

This is not a tremendous improvement, just 74 fewer pages accessed, and the execution times are typically about the same. Generally speaking though, the fewer pages read in the process of executing the query, the better, particularly as the number of queries increases in an active system.

Note the Key Lookup operator that is 97 percent of the cost of this query. In a following section on included columns, we review how to erase that cost, and lower the logical reads to very few.

Sorts

The final query situation we look at is sorts. You need to sort data either for an ORDER BY clause, or for some operation in the query where sorted data makes the operation quicker (the last join operator that we haven’t mentioned yet, the Merge Join operator, requires sorted inputs so it can match rows from one large input set to another large set more quickly than the Hash Match algorithm previously mentioned).


Note Indexing and sorting

The examples in this section use only columns that show up in the operation in order to show how indexing and sorting work together, and it eliminates some of the costs of the bookmark lookup. The next section examines this phenomenon in more detail.


A part of the CREATE INDEX statement we have not yet looked at is sorting of the index keys, particularly useful with composite index keys. By default, the data in the index is sorted in ascending order, so the indexes created so far have been ascending by default. The query processor can scan the index in either direction, so for a simple index (one with a single key column), this is generally not a problem. For composite indexes (those with greater than a single key column) it can be an issue.

As an example, consider the following query of the entire Sales.Orders table, sorted in SalespersonPersonID and OrderDate order. Both are explicitly spelled out as ASC, meaning ascending, which is the default. Note too that we only return the columns that are being sorted on to make the example simpler.

SELECT SalespersonPersonId, OrderDate
FROM Sales.Orders
ORDER BY SalespersonPersonId ASC, OrderDate ASC;

Figure 1-9 shows the plan, which includes a scan through the data, a sort, and it even shows that the query used parallelism, since we’re running on a VM with 2 CPUs allocated. In other words, this was not a trivial query.

An illustration shows a Clustered Index Scan operator that is 15 percent of the query cost, feeding into a Sort Operator that is 72 percent, and a Parallellism operator that is 13 percent. There are 758 logical reads from the Orders table.

FIGURE 1-9 Sorting results prior to adding index

Figure 1-9 has the following output:

Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0,
lob logical reads 0, lob physical reads 0,

lob read-ahead reads 0.

Table 'Orders'. Scan count 3, logical reads 758, physical reads 0, read-ahead reads 0,
lob logical reads 0, lob physical reads 0,

lob read-ahead reads 0.

CPU time = 94 ms,  elapsed time = 367 ms.

Now, add an index to support this query, as we know that this query is performed very often in our enterprise. Add a composite index, and explicitly show that we are sorting the keys in ascending order, for the query:

CREATE INDEX SalespersonPersonID_OrderDate ON Sales.Orders
                              (SalespersonPersonID ASC, OrderDate ASC);

Perform the query just as we did in the first attempt. Figure 1-10 shows that the plan has changed: the query processor can now scan the new index and read the data in an already sorted order, so the Sort operator is no longer needed.

An illustration shows an Index Scan operator that is 100 percent of the cost of the query, and there are 157 logical reads from the Orders Table.

FIGURE 1-10 Query plan after adding the index to the table

Figure 1-10 has the following output:

Table 'Orders'. Scan count 1, logical reads 157, physical reads 0, read-ahead reads 0,
lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

CPU time = 47 ms,  elapsed time = 217 ms.

If the order you request is completely the opposite of how the index is sorted, you will find that nothing in the plan changes:

SELECT SalespersonPersonId, OrderDate
FROM Sales.Orders
ORDER BY SalespersonPersonId DESC, OrderDate DESC;

If your sorting needs don’t match the index exactly, the index is still useful to the query, but only up to the point where there is a mismatch. For example, change the ORDER BY to either of the following (DESC means descending):

Image ORDER BY SalespersonPersonId DESC, OrderDate ASC;

Image ORDER BY SalespersonPersonId ASC, OrderDate DESC;

And you see the plan changes to what is shown in Figure 1-11.

An illustration shows a query plan with an Index Scan operator for 5 percent of the cost, a Sort operator that is 81 percent, and a Parallelism operator that is 14 percent.

FIGURE 1-11 Query plan when the sort order does not match

The query processor was able to skip sorting the data on the first column by using the index, but then it had to sort the second column using a separate operator, rather than just scanning the data in order. As such, pay close attention to the order and direction of the columns in the ORDER BY clause if you are given a question matching an index to an ORDER BY clause.
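
If a mixed-direction sort like one of these were a frequent, critical query, you could create an index whose key directions match the ORDER BY clause exactly, letting the query processor again skip the Sort operator. A sketch (the index name is ours, and it duplicates the data of the previous index, so you would not keep both):

CREATE INDEX SalespersonPersonID_OrderDateDesc ON Sales.Orders
                              (SalespersonPersonID ASC, OrderDate DESC);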

One place where sorting often is helped by indexes is when joining two large sets. The query plan can use a Merge Join operator to join two sorted sets together, by matching item after item, since they are in sorted order. As an example, take a join of two tables, the Sales.Orders and the Application.People, returning all of the rows in the tables, but just their key values:

SELECT Orders.ContactPersonID, People.PersonID
FROM   Sales.Orders
         INNER JOIN Application.People
            ON Orders.ContactPersonID = People.PersonID;

Executing this, you see that since there is an index on the foreign key column in Sales.Orders, and the PRIMARY KEY constraint on the Application.People table, the data is sorted, so it can use a Merge Join operator, as seen in Figure 1-12.

An illustration shows a query plan that shows two larger sets joined together on sorted inputs using a Merge Join operator.

FIGURE 1-12 Merge Join operator due to large output and sorted inputs

To be entirely fair, the output of this query is nonsensical because it returns two columns that are equivalent in all 73,595 rows. However, when you are doing joins between multiple tables, you often see a Merge Join operator appear in plans when nothing but indexed columns are accessed from the tables that are being joined.

Carefully consider how you use non-clustered indexes to support sorts, as the cost of the bookmark lookup often tips the plan towards using a scan of the base structure.

Distinguish between indexed columns and included columns

When fetching only a few rows (as you generally do when you are querying an OLTP database), the overhead of the bookmark lookup described in the previous sections is not terribly costly. It requires reading two or three extra pages in the clustered index, but this cost is extremely minimal compared to reading every physical page of data for an entire table.

However, as the number of rows you return grows, the bookmark lookup operations become more and more of a drag on performance. When you need to run a query that returns a lot of rows, but doesn’t need all of the data in the table, there is a feature known as included columns that allows you to use an index to cover the entire needs of the query. When an index has all of the data that is needed to return the results of a query, either in the key columns, or included columns, it is referred to as a covering index for a query.

As an example, take a look back at this query we have used previously:

SELECT OrderId, OrderDate, ExpectedDeliveryDate, People.FullName
FROM  Sales.Orders
        JOIN Application.People
            ON People.PersonID = Orders.ContactPersonID
WHERE  People.PreferredName = 'Aakriti';

Remember, back in Figure 1-8, this query was very efficient in terms of finding the rows that needed to be returned in the Sales.Orders table, but it had one operator that was 97 percent of the cost of execution, and it required 695 pages to be read in the Sales.Orders table.

Now perform the query, and see the plan that is output. In the plan shown in Figure 1-13, the Key Lookup operators account for almost all of the overall cost of the query. There are two Key Lookup operators in the plan; the goal is to remove both of them, starting with the simplest case.

An illustration shows a query plan that has two Key Lookup operators for picking up extra columns that are not indexed.

FIGURE 1-13 Query plan with a very high cost (98%) for the Key Lookup operators

Figure 1-13 has the following output:

Table 'Orders'. Scan count 2, logical reads 695, physical reads 0, read-ahead reads 0,
lob logical reads 0, lob physical reads 0,

lob read-ahead reads 0.

Table 'People'. Scan count 1, logical reads 6, physical reads 0, read-ahead reads 0, lob
logical reads 0, lob physical reads 0,

lob read-ahead reads 0.

CPU time = 0 ms,  elapsed time = 141 ms.

As a first step, we simplify our query to use just the columns that are indexed: the Sales.Orders.ContactPersonID column from the foreign key index that was created by the database designer, and the Application.People.PreferredName column (whose index also carries the PersonID, since it is the clustering key). All of the data the query needs (for all clauses: SELECT, FROM, WHERE, etc.) can then be found in the index keys. Execute the query:

SELECT Orders.ContactPersonId, People.PreferredName
FROM  Sales.Orders
        JOIN Application.People
            ON People.PersonID = Orders.ContactPersonID
WHERE  People.PreferredName = 'Aakriti';

Now the query plan looks wonderful, and the number of logical reads is down dramatically, as you can see in Figure 1-14. The indexes that are being sought cover the query processor’s needs. There is only one small problem: the query results are not even vaguely what the customer needs.

An illustration shows the query from Figure 1-13, only returning columns that are referenced in the index, hence there is only two Index Seek operators Nested Loops joined together.

FIGURE 1-14 The Key Lookup operators have been eliminated from the plan

Figure 1-14 has the following output:

Table 'Orders'. Scan count 2, logical reads 4, physical reads 0, read-ahead reads 0, lob
logical reads 0, lob physical reads 0,

lob read-ahead reads 0.

Table 'People'. Scan count 1, logical reads 2, physical reads 0, read-ahead reads 0, lob
logical reads 0, lob physical reads 0,

lob read-ahead reads 0.

CPU time = 0 ms,  elapsed time = 38 ms.

In order to keep this performance with minimal overhead, while providing the results that were requested, you can use what is referred to as a covering index. The leaf nodes of a non-clustered index contain the value being indexed, along with a row locator. Using the INCLUDE keyword on the CREATE INDEX statement, you can store additional column data in the leaf nodes. You can include columns of any data type (even large types like nvarchar(max)), though the larger the data type, the less fits on a page, and a value could even overflow onto multiple pages.

For our two queries, we add another index to the Sales.Orders table (since the foreign key index came as part of the base installation) and replace the PreferredName index that we added to Application.People earlier in this section.

CREATE NONCLUSTERED INDEX ContactPersonID_Include_OrderDate_ExpectedDeliveryDate
ON Sales.Orders ( ContactPersonID )
INCLUDE ( OrderDate,ExpectedDeliveryDate)
ON USERDATA;
GO

And to the PreferredName index we include the column the customer wanted, the FullName column.

DROP INDEX PreferredName ON Application.People;
GO
CREATE NONCLUSTERED INDEX PreferredName_Include_FullName
ON Application.People (      PreferredName )
INCLUDE (FullName)
ON USERDATA;

Now, perform the query:

SELECT OrderId, OrderDate, ExpectedDeliveryDate, People.FullName
FROM  Sales.Orders
        JOIN Application.People
            ON People.PersonID = Orders.ContactPersonID
WHERE  People.PreferredName = 'Aakriti';

And the plan now looks great, and returns what the customer needs. You can see the plan in Figure 1-15.

An illustration shows the plan that uses the covering indexes that have been added, still with two Index Seek operators Nested Loops joined together, but now using included columns to return the results the client wanted.

FIGURE 1-15 The query plan of the query execution after adding the covering index

Figure 1-15 has the following output:

Table 'Orders'. Scan count 2, logical reads 6, physical reads 0, read-ahead reads 0, lob
logical reads 0, lob physical reads 0,

lob read-ahead reads 0.

Table 'People'. Scan count 1, logical reads 2, physical reads 0, read-ahead reads 0, lob
logical reads 0, lob physical reads 0,

lob read-ahead reads 0.

CPU time = 0 ms,  elapsed time = 79 ms.

Covering indexes are fantastic tools for tuning queries where you are dealing with costly Key Lookup operators. However, restraint should be exercised when considering whether or not to apply them. When checking the plan of a query, you are frequently given a missing index hint that encourages you to add an index with a long list of included columns. Figure 1-2 showed the plan of the following query:

SELECT *
FROM   Sales.CustomerTransactions
WHERE PaymentMethodID = 4;

When looking at the plan, there was a missing index hint as shown in Figure 1-16.

An illustration shows a Clustered Index Scan with a Compute Scalar, but now includes a Missing Index hint.

FIGURE 1-16 Showing the Missing Index hint on query plan

Hovering your cursor over the missing index hint shows you details in a tool-tip, or you can right-click the plan and choose “Missing Index Details...” to see the following index:

CREATE NONCLUSTERED INDEX [<Name of Missing Index, sysname,>]
ON [Sales].[CustomerTransactions] ([PaymentMethodID])
INCLUDE ([CustomerTransactionID],[CustomerID],[TransactionTypeID],[InvoiceID],
[TransactionDate],[AmountExcludingTax],[TaxAmount],[TransactionAmount],
[OutstandingBalance],[FinalizationDate],[IsFinalized],[LastEditedBy],[LastEditedWhen])

Adding this index definitely increases the performance of your query. It reduces logical reads from 1126 to 312. This is not a tremendous savings, and likely doesn’t merit adding in a strict OLTP system, as for every change to the Sales.CustomerTransactions table, all of these column values are copied again to the index pages. For a reporting database, missing indexes can be great things to add, but you always need to take caution.

The missing index tip is generated when, as the optimizer works through what it needs to perform the query the fastest, it discovers an index that would have helped. In Chapter 4, “Optimize database objects and SQL infrastructure,” we explore the missing indexes dynamic management view (DMV), where you can see indexes that SQL Server would like to have had for the queries that have been optimized over time. Many of them overlap with other indexes that it has suggested. If you added every index it suggested to a busy system, the system would be brought to its knees maintaining indexes.
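
As a quick preview of what Chapter 4 covers in detail, you can list the accumulated suggestions for the current database (gathered since the last service restart) with a query along these lines:

SELECT statement AS object_referenced, equality_columns,
       inequality_columns, included_columns
FROM   sys.dm_db_missing_index_details
WHERE  database_id = DB_ID();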

One last property of included columns is important to understand. Included columns in an index can never be used to seek for rows or for ordered scans (since they are not ordered at all), but they can be used to cover a query even if the key columns are not involved. For example, consider the following query that uses the columns that we indexed in the index named ContactPersonID_Include_OrderDate_ExpectedDeliveryDate. If we only reference the OrderDate and ExpectedDeliveryDate in a query, even as a predicate, the index can be scanned instead of the (typically) much larger data in the base table. Take the following query:

SELECT OrderDate, ExpectedDeliveryDate
FROM  Sales.Orders
WHERE OrderDate > '2015-01-01';

Figure 1-17 shows that it uses the index with included columns:

An illustration shows the Index Scan of the Included Column Index that we created, along with a Missing Index hint of a more optimum index that could have been used.

FIGURE 1-17 Query plan showing an index scan of the included columns of the index

Of course, this is not the optimum index for the query, so the query plan suggests the following index, which orders the data on the OrderDate, and includes the ExpectedDeliveryDate as an included column:

CREATE NONCLUSTERED INDEX [<Name of Missing Index, sysname,>]
ON [Sales].[Orders] ([OrderDate])
INCLUDE ([ExpectedDeliveryDate]);

Take caution when using the missing index hints (or the missing indexes DMV reviewed in Chapter 4); they are not always the best choice for your system’s overall performance, which certainly could be a topic on the exam. Still, covering queries using the INCLUDE feature is a great way to improve situations where a read-intensive workload uses scans to resolve queries because of a few columns that could be added to the index leaf nodes.

Implement clustered index columns by using best practices

The choice of the clustered index can be a complex one as you consider the possibilities. As we have seen throughout the chapter, for various reasons the clustered index is the most important index on your objects. The following are a few characteristics that we need to consider when choosing the clustered index:

Image The clustered index contains all of the data of the table on the leaf index pages (or at least the base structures, as data can overflow onto multiple pages for very large rows), so when the clustered index optimally satisfies the query’s needs, the performance of the query is going to be better than otherwise.

Image The clustering key (the term used for the key column(s) of the clustered index) affects the other row store indexes in the table. Picking a larger clustering key could be terrible for performance of all of the other indexes since every non-clustered index key carries this value around.

Image If the clustered index is not created as unique, a four-byte uniqueifier is attached to duplicated key values to make each one unique, so that it can serve as a proper row locator.

Image If you change the value of the clustering key, you change the data on every non-clustered rowstore index.

Image The best clustering key is an ever-increasing value, as new rows are then inserted at the end of the sorted structure, minimizing page splits. When a new row is inserted in the middle of the clustering key sequence and the page is full, a page split occurs.

Image There is also a clustered columnstore index that we cover in Skill 1.4.


Need More Review? More information about Clustered Indexes

An excellent, if slightly older, resource to study more about clustered indexes is Kimberly Tripp’s blog series here: http://www.sqlskills.com/blogs/kimberly/category/clustering-key/.


So with all of these limitations, what should you choose for the clustered index? There are a few key scenarios to look for:

Image The column(s) that are used for single row fetches, often for modifications Many OLTP databases operate by fetching a set of rows to a client, and then updating the rows, one at a time. Watching the patterns of usage can help, but almost all of the time the primary key fits this usage, whether the database is using a natural key as primary key, or an artificial surrogate key. An IDENTITY or SEQUENCE based artificial key meets all of the points we started out with, so when implementing a system with artificial keys, it is often the best choice.

Image Range queries Having all the data in a particular order can be essential to performance when you often need to fetch a range of data. Even a range of one distinct value makes sense in a parent-child situation, such as Invoice and InvoiceLineItem, where you are constantly fetching InvoiceLineItem rows by the InvoiceId of the invoice.

Image Queries that return large result sets If you have a situation where a particular query (or set of queries) is run frequently and returns a lot of rows, performing these searches by clustered index can be beneficial.

The typical default that most designers use is to place the clustered index on the columns of the primary key. It is always unique, and it is almost certainly how the largest percentage of rows are fetched; if not, the PRIMARY KEY constraint is likely misused. In a real database, this requires testing to see how it affects overall performance.

Instinctively, it seems that you want to use the value that the user does most searches on, but the reason that the index that backs the PRIMARY KEY constraint is chosen is because beyond searches, you see lots of fetches by the primary key since singleton SELECTs, UPDATEs, and DELETEs all typically use the primary key for access. Add to that how JOIN operations are done using the primary key, and there needs to be a very compelling case to use something other than the primary key.


Image Exam Tip

On the exam, think carefully about the exact index usage being described so you can determine what particular need is being emphasized. Questions won’t be tricky, but they don’t telegraph the answer, so you need to understand the structure of indexes and the usage the question writer has in mind.


What data type you choose for the clustering key is a matter of opinion and tooling. For the exam, it is good to understand the different possibilities and some of the characteristics of each, particularly when creating an artificial value for a key. Two choices stand out as very common approaches:

Image Integer Data types

Image GUIDs

Other approaches are possible, but these two appear in almost any database. In the WideWorldImporters database, all of the primary keys are based on integers. Integers are generally the favored method because they are very small, and it is easy to generate them as a monotonically increasing sequence using the IDENTITY property on a column, or using a SEQUENCE object applied with a DEFAULT constraint.
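As a minimal sketch of these two integer-based approaches (the table and sequence names here are hypothetical, not from the sample database), one table uses the IDENTITY property, and the other uses a SEQUENCE object applied through a DEFAULT constraint:

--IDENTITY generates the next value automatically on INSERT
CREATE TABLE Examples.Widget
(
    WidgetId   int IDENTITY(1,1) NOT NULL
        CONSTRAINT PKWidget PRIMARY KEY,
    WidgetName nvarchar(50) NOT NULL
);

--A SEQUENCE object can be shared across tables and fetched independently of any INSERT
CREATE SEQUENCE Examples.GadgetId_Sequence AS int START WITH 1 INCREMENT BY 1;

CREATE TABLE Examples.Gadget
(
    GadgetId   int NOT NULL
        CONSTRAINT DFLTGadget_GadgetId
            DEFAULT (NEXT VALUE FOR Examples.GadgetId_Sequence)
        CONSTRAINT PKGadget PRIMARY KEY,
    GadgetName nvarchar(50) NOT NULL
);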

While integer-based data types generally fit the pattern of a great clustering key, there is another very common possibility. Using the uniqueidentifier data type, you can store a standard GUID (Globally Unique Identifier). A major advantage of these values is that they can be generated outside of the database server by any client, which an integer value generally cannot be because of concurrency concerns. However, a major downside is indexing them. They are a 16-byte binary value with a 36-character text representation (which can be needed if you have a client that can’t handle a GUID natively), and their values are effectively random in terms of sort order. This leads to data being spread around the index structures, causing fragmentation, which can reduce the system’s ability to scan the data (though this is less of a concern with fast SSD storage). You can generate GUID values in the database using NEWID(), or, if you almost never have new values coming from the client, you can use NEWSEQUENTIALID() to generate GUID values that are always increasing, making it a slightly better clustering key than a normal GUID. (However, even NEWSEQUENTIALID() can’t be trusted completely, because GUIDs generated after a server restart are not guaranteed to sort after those generated before it.)
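The following sketch shows both options; the table names are hypothetical. The first table lets any client supply a GUID (with NEWID() as a server-side fallback), while the second uses NEWSEQUENTIALID(), which can only be used in a DEFAULT constraint:

--Random GUIDs: can be generated by any client, but fragment the clustered index
CREATE TABLE Examples.CustomerRandomKey
(
    CustomerId   uniqueidentifier NOT NULL
        CONSTRAINT DFLTCustomerRandomKey_CustomerId DEFAULT (NEWID())
        CONSTRAINT PKCustomerRandomKey PRIMARY KEY,
    CustomerName nvarchar(50) NOT NULL
);

--NEWSEQUENTIALID() produces ever-increasing GUIDs (until a restart) and is only
--allowed in a DEFAULT constraint
CREATE TABLE Examples.CustomerSequentialKey
(
    CustomerId   uniqueidentifier NOT NULL
        CONSTRAINT DFLTCustomerSequentialKey_CustomerId DEFAULT (NEWSEQUENTIALID())
        CONSTRAINT PKCustomerSequentialKey PRIMARY KEY,
    CustomerName nvarchar(50) NOT NULL
);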

In the end, the choice of clustering key comes down to the performance of your queries. Using a natural key can be difficult due to the size of many natural keys, but much also depends on how the application works and how the data turns out to be used.


Need More Review? The CREATE INDEX Statement

Indexes are a complex topic, and there are a lot of settings that we do not touch on or even mention. It would be very good to review the many settings of the CREATE INDEX statement here in the MSDN library: https://msdn.microsoft.com/en-us/library/ms188783.aspx.


Recommend new indexes based on query plans

In the preceding sections on indexing, we used query plans to show that an index made a difference in the performance of one or more queries. The process of reviewing a query plan to determine what the optimizer is doing, or planning to do to optimize a query, is an important one. In this section, we review some of the factors you need to look for in a query plan.

In the code shown in Listing 1-1, we make a copy of a couple of tables from the WideWorldImporters database, with limited indexes to serve as an example.

LISTING 1-1 Setting up a scenario for demonstrating query plans and indexes


--2074 Rows
SELECT *
INTO   Examples.PurchaseOrders
FROM   WideWorldImporters.Purchasing.PurchaseOrders;

ALTER  TABLE Examples.PurchaseOrders
    ADD CONSTRAINT PKPurchaseOrders PRIMARY KEY (PurchaseOrderId);

--8367 Rows
SELECT *
INTO   Examples.PurchaseOrderLines
FROM   WideWorldImporters.Purchasing.PurchaseOrderLines;

ALTER  TABLE Examples.PurchaseOrderLines
    ADD CONSTRAINT PKPurchaseOrderLines PRIMARY KEY (PurchaseOrderLineID);

ALTER TABLE Examples.PurchaseOrderLines
    ADD CONSTRAINT FKPurchaseOrderLines_Ref_Examples_PurchaseOrders
        FOREIGN KEY (PurchaseOrderId) REFERENCES
                                      Examples.PurchaseOrders(PurchaseOrderId);


Then we execute the following two queries:

SELECT *
FROM   Examples.PurchaseOrders
WHERE  PurchaseOrders.OrderDate BETWEEN '2016-03-10' AND '2016-03-14';

SELECT PurchaseOrderId, ExpectedDeliveryDate
FROM   Examples.PurchaseOrders
WHERE  EXISTS (SELECT *
                FROM  Examples.PurchaseOrderLines
                WHERE PurchaseOrderLines.PurchaseOrderId =
                                               PurchaseOrders.PurchaseOrderID)
  AND  PurchaseOrders.OrderDate BETWEEN '2016-03-10' AND '2016-03-14' ;

Executing these queries returns two sets of 5 rows each, and will probably take much less than a second on any computer, as there are not very many rows in these tables. Because the queries execute so quickly, a developer might conclude that their performance is already optimal, even though there will be many users and much more data in the production version. Using the Query > Display Estimated Execution Plan menu item in SQL Server Management Studio, we view the estimated plan shown in Figure 1-18 for these queries, to help determine whether they really are optimal.

The query plan shows two queries: the first, which scans and returns all of the columns, accounts for about 8% of the estimated cost of the batch, while the second, which performs a join but returns only two columns, accounts for about 92%.

FIGURE 1-18 Query plan for the untuned query on the pair of tables


Note An estimated plan

Earlier in the “Indexing once data is in your tables” section, we covered the multiple ways to see the query plan for a query. For this section we simply look at the estimated plan for the queries.
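If you would rather capture the estimated plan with Transact-SQL than through the menu, SET SHOWPLAN_XML is one option; this minimal sketch returns the plan for the first query as XML without executing the statement (SET SHOWPLAN_XML must be the only statement in its batch, hence the GO separators):

SET SHOWPLAN_XML ON;
GO
SELECT *
FROM   Examples.PurchaseOrders
WHERE  PurchaseOrders.OrderDate BETWEEN '2016-03-10' AND '2016-03-14';
GO
SET SHOWPLAN_XML OFF;
GO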


Even though the first query might appear to be the costliest based on its results (it returned all of the columns in the table, while the second returned just two small columns from the same rows), the plan shows that the first query is estimated to be considerably less costly (8% of the batch versus 92%). Note that there can also be hidden costs, such as user-defined functions, that do not show up in a query plan.

For an exam question, you might be asked what you can tell about the indexes on the tables from the given plan. We can tell that both queries scan the entire physical structures they reference, because of the Clustered Index Scan operators. This certainly means that no index is available on the OrderDate column of the Examples.PurchaseOrders table that could help this query execute faster. Knowing the row counts, there must be statistics on the OrderDate column telling the optimizer how many rows are likely to match the predicate, because the line coming from PKPurchaseOrders is much thinner than the one coming from PKPurchaseOrderLines. You can see the row counts by hovering over the lines, as shown in Figure 1-19, which is a composite from both lines. (Statistics, their meaning, and how they can at times be incorrect are covered in Skill 4.1.)

The query plan shows that the relative thickness of the lines between operators indicates the approximate number of rows passed between them.

FIGURE 1-19 Query plan for the untuned pair of tables showing number of rows for each operator
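If you want to confirm which statistics object gave the optimizer its row estimate for OrderDate, you can query the catalog views; this is a minimal sketch against the Examples.PurchaseOrders table from Listing 1-1 (auto-created statistics appear here with system-generated names):

SELECT  s.name AS StatisticsName,
        s.auto_created
FROM    sys.stats AS s
        JOIN sys.stats_columns AS sc
            ON  sc.object_id = s.object_id
            AND sc.stats_id = s.stats_id
        JOIN sys.columns AS c
            ON  c.object_id = sc.object_id
            AND c.column_id = sc.column_id
WHERE   s.object_id = OBJECT_ID('Examples.PurchaseOrders')
  AND   c.name = 'OrderDate';

--Then view the histogram for the statistics object returned above, for example:
--DBCC SHOW_STATISTICS ('Examples.PurchaseOrders', '<statistics name from above>');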

Even though it was estimated that approximately 9 rows met the criteria, the optimizer still chose to scan the Examples.PurchaseOrderLines table and use a Hash Match join operator. This is an indication that there is no index on the PurchaseOrderLines.PurchaseOrderId column. Notice that the Hash Match operator is a Left Semi Join. A semi join means that data is returned from the left input, and not the right. This tells you that the Hash Match operator is most likely implementing a filter via a subquery, and not a JOIN in the FROM clause of the query.

Another question you might be asked is what index or indexes are useful to optimize a query based on the plan. From the evidence presented in our example, we can possibly add two indexes. One of them is simple:

CREATE INDEX PurchaseOrderId ON Examples.PurchaseOrderLines (PurchaseOrderId);

This index is a clear choice, since the only column used from the Examples.PurchaseOrderLines table is PurchaseOrderId. What is more complex is whether the following index on OrderDate in Examples.PurchaseOrders would be valuable:

CREATE INDEX OrderDate ON Examples.PurchaseOrders (OrderDate);

Because of the cost of the bookmark lookups and the small size of the table, this index is not used for this query; and, more importantly for taking the exam, we could not accurately predict that without executing the query. However, a better index, one that will be useful on a table of any size, includes the ExpectedDeliveryDate column along with OrderDate:

CREATE INDEX OrderDate_Incl_ExpectedDeliveryDate
     ON Examples.PurchaseOrders (OrderDate) INCLUDE (ExpectedDeliveryDate);

This is because it covers all of the data needed to satisfy the conditions of the query. On the exam, it is important to watch for conditions like this, where you are looking for the answer that is best in every case.

Reading a query plan is an essential developer skill, and you should expect it on the exam, since it is called out specifically in the indexes skill, and Skill 4.2 is entitled “Analyze and troubleshoot query plans.” Throughout this chapter you will find query plans used to demonstrate multiple scenarios. The query plan is the primary way we can tell how well a query is tuned.


Note Query plans

For deep detail on query plans, one of the best resources is “SQL Server Execution Plans, Second Edition” by Grant Fritchey: https://www.simple-talk.com/books/sql-books/sql-server-execution-plans,-second-edition,-by-grant-fritchey/.

