INTRODUCTION

It has been estimated that more than 80 percent of all computer programming is database-related. This is certainly easy to believe. After all, a database can be a powerful tool for doing exactly what computer programs do best: store, manipulate, and display data.

Even many programs that seem at first glance to have little to do with traditional business-oriented data use databases to make processing easier. In fact, looking back on 40-some years of software development experience, I'm hard-pressed to think of a single nontrivial application that I've worked on that didn't use some kind of database.

Not only do databases play a role in many applications, but they often play a critical role. If the data is not properly stored, it may become corrupted, and the program will be unable to use it meaningfully. If the data is not properly organized, the program may be unable to find what it needs in a reasonable amount of time.

Unless the database stores its data safely and effectively, the application will be useless no matter how well-designed the rest of the system may be. The database is like the foundation of a building; without a strong foundation, even the best crafted building will fail, sometimes spectacularly (the Leaning Tower of Pisa notwithstanding).

With such a large majority of applications relying so heavily on databases, you would expect everyone involved with application development to have a solid, formal foundation in database design and construction. Everyone, including database designers, application architects, programmers, database administrators, and project managers, should ideally understand what makes a good database design. Even an application's key customers and users could benefit from understanding how databases work.

Sadly, that is usually not the case. Many IT professionals have learned what they know about databases through rumor, trial-and-error, tarot cards, and painful experience. Over the years, some develop an intuitive feel for what makes a good database design, but they may still not understand the reasons a design is good or bad, and they may leave behind a trail of rickety, poorly constructed programs built on shaky database foundations.

This book provides the tools you need to design a database. It explains how to determine what should go in a database and how a database should be organized to ensure data integrity and a reasonable level of performance. It explains techniques for designing a database that is strong enough to store data safely and consistently, flexible enough to allow the application to retrieve the data it needs quickly and reliably, and adaptable enough to accommodate a reasonable amount of change.

With the ideas and techniques described in this book, you will be able to build a strong foundation for database applications.

WHO THIS BOOK IS FOR

This book is intended for IT professionals and students who want to learn how to design, analyze, and understand databases. The material will benefit those who want a better high-level understanding of databases, such as proposal managers, architects, project managers, and even customers. The material will also benefit those who will actually design, build, and work with databases, such as database designers, database administrators, and programmers. In many projects, these roles overlap, so the same person may be responsible for working on the proposal, managing part of the project, and designing and creating the database.

This book is aimed at readers of all experience levels. It does not assume that you have any previous experience with databases or programs that use them. It doesn't even assume that you have experience with computers. All you really need is a willingness and desire to learn.

WHAT THIS BOOK COVERS

This book explains database design. It tells how to plan a database's structure so the database will be robust, resistant to errors, and flexible enough to accommodate a reasonable amount of future change. It explains how to discover database requirements, build data models to study data needs, and refine those models to improve the database's effectiveness.

The book solidifies these concepts by working through a detailed example that designs a (sort of) realistic database. Later chapters explain how to actually build databases using a few different database products. The book finishes by describing topics you need to understand to keep a database running effectively, such as database maintenance and security.

WHAT YOU NEED TO USE THIS BOOK

This book explains database design. It tells how to determine what should go in a database and how the database should be structured to give the best results.

This book does not focus on actually creating the database. The details of database construction are different for different database tools, so to remain as generally useful as possible, this book doesn't concentrate on any particular database system. You can apply most of the techniques described here equally to whatever database tool you use, whether it's MariaDB, PostgreSQL, SQL Server, or some other database product.

To remain database-neutral, most of the book does not assume you are using a particular database, so you don't need any particular software or hardware. To work through the exercises, all you need is a pencil and some paper. You are welcome to type solutions into your computer if you like, but you may actually find working with pencil and paper easier than using a graphical design tool to draw pictures, at least until you are comfortable with database design and are ready to pick a computerized design tool.

Chapters 16 through 25 build example databases using particular database offerings, so their material is tied to the databases that they demonstrate. Chapter 15, “Example Overview,” introduces those chapters and lists the databases that they use.

To experiment with the SQL database language described in Chapter 26, “Introduction to SQL,” and Chapter 27, “Building Databases with SQL Scripts,” you need any database product that supports SQL (that includes pretty much all relational databases) running on any operating system.

HOW THIS BOOK IS STRUCTURED

The chapters in this book are divided into five parts plus appendixes. The chapters in each part are described here. If you have previous experience with databases, you can use these descriptions to decide which chapters to skim and which to read in detail.

Part I: Introduction to Databases and Database Design

The chapters in this part of the book provide background that is necessary to understand the chapters that follow. You can skim some of this material if it is familiar to you, but don't take it too lightly. If you understand the fundamental concepts underlying database design, it will be easier to understand the point behind important design concepts presented later.

Chapter 1, “Database Design Goals,” explains the reasons people and organizations use databases. It explains a database's purpose and conditions that it must satisfy to be useful. This chapter also describes the basic ACID (Atomicity, Consistency, Isolation, Durability) and CRUD (Create, Read, Update, Delete) features that any good database should have. It explains in high-level general terms what makes a good database and what makes a bad database.
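As a small taste of what atomicity means in practice, here is a minimal sketch using Python's built-in sqlite3 module. The Accounts table, the names in it, and the transfer function are hypothetical and exist only for this illustration; they are not taken from the book's examples. The transfer either applies both updates or neither of them.

```python
import sqlite3

# Hypothetical in-memory database and table, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Accounts ("
    " Name TEXT PRIMARY KEY,"
    " Balance INTEGER NOT NULL CHECK (Balance >= 0))"
)
conn.execute("INSERT INTO Accounts VALUES ('Alice', 100), ('Bob', 50)")  # Create
conn.commit()


def transfer(conn, source, target, amount):
    """Move money between two accounts as one all-or-nothing transaction."""
    try:
        # The connection context manager commits if the block succeeds
        # and rolls back if an exception escapes from it.
        with conn:
            conn.execute(  # Update
                "UPDATE Accounts SET Balance = Balance + ? WHERE Name = ?",
                (amount, target),
            )
            conn.execute(  # Update
                "UPDATE Accounts SET Balance = Balance - ? WHERE Name = ?",
                (amount, source),
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected a negative balance, so the
        # credit made by the first UPDATE was rolled back as well.
        return False


ok = transfer(conn, "Alice", "Bob", 30)
failed = transfer(conn, "Alice", "Bob", 500)  # more than Alice has

balances = dict(conn.execute("SELECT Name, Balance FROM Accounts"))  # Read
print(ok, failed, balances)
```

The second transfer credits Bob before it discovers that Alice cannot cover the amount. Because both statements belong to the same transaction, the rollback undoes the credit too, so the data never ends up half-updated.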

Chapter 2, “Relational Overview,” explains basic relational database concepts such as tables, rows, and columns. It explains the common usage of relational database terms in addition to the more technical terms that are sometimes used by database theorists. It describes different kinds of constraints that databases use to guarantee that the data is stored safely and consistently.

Chapter 3, “NoSQL Overview,” explains the basics of NoSQL databases, which are growing quickly in popularity. Those databases include document, key-value, column-oriented, and graph databases. Both relational and NoSQL databases can run either locally or in the cloud, but many NoSQL databases are more cloud-oriented, largely because they are newer technology that was designed with the cloud in mind.

Part II: Database Design Process and Techniques

The chapters in this part of the book discuss the main pieces of relational database design. They explain how to understand what should be in the database, develop an initial design, separate important pieces of the database to improve flexibility, and refine and tune the design to provide the most stable and useful design possible.

Chapter 4, “Understanding User Needs,” explains how to learn about the users' needs and gather user requirements. It tells how to study the users' current operations, existing databases (if any), and desired improvements. It describes common questions that you can ask to learn about users' operations, desires, and needs, and how to build the results into requirements documents and specifications. This chapter explains what use cases are and shows how to use them and the requirements to guide database design and to measure success.

Chapter 5, “Translating User Needs into Data Models,” introduces data modeling. It explains how to translate the user's conceptual model and the requirements into other, more precise models that define the database design rigorously. This chapter describes several database modeling techniques, including user-interface models, semantic object models, entity-relationship diagrams, and relational models.

Chapter 6, “Extracting Business Rules,” explains how a database can handle business rules. It explains what business rules are, how they differ from database structure requirements, and how you can identify business rules. This chapter explains the benefits of separating business rules from the database structure and tells how to achieve that separation.

Chapter 7, “Normalizing Data,” explains one of the most important tools in relational database design: normalization. Normalization techniques allow you to restructure a database to increase its flexibility and make it more robust. This chapter explains various forms of normalization, emphasizing the stages that are most common and important: first, second, and third normal forms (1NF, 2NF, and 3NF). It explains how each of these kinds of normalization helps prevent errors and tells why it is sometimes better to leave a database slightly less normalized to improve performance.

Chapter 8, “Designing Databases to Support Software,” explains how databases fit into the larger context of application design and the development life cycle. This chapter explains how later development depends on the underlying database design. It discusses multi-tier architectures that can help decouple the application and database so there can be at least some changes to either without requiring changes to both.

Chapter 9, “Using Common Design Patterns,” explains some common patterns that are useful in many applications. Some of these techniques include implementing various kinds of relationships among objects, storing hierarchical and network data, recording temporal data, and logging and locking.

Chapter 10, “Avoiding Common Design Pitfalls,” explains some common design mistakes that occur in database development. It describes problems that can arise from insufficient planning, incorrect normalization, and obsession with ID fields and performance.

Part III: A Detailed Case Study

If you follow all of the examples and exercises in the earlier chapters, by this point you will have seen all of the major steps for producing a good database design. However, it's often useful to see all the steps in a complicated process put together in a continuous sequence. The chapters in this part of the book walk through a detailed case study following all the phases of database design for the fictitious Pampered Pet database.

Chapter 11, “Defining User Needs and Requirements,” walks through the steps required to analyze the users' problem, define requirements, and create use cases. It describes interviews with fictitious customers that are used to identify the application's needs and translate them into database requirements.

Chapter 12, “Building a Data Model,” translates the requirements gathered in the previous chapter into a series of data models that precisely define the database's structure. This chapter builds user interface models, entity-relationship diagrams, semantic object models, and relational models to refine the database's initial design. The final relational models match the structure of a relational database fairly closely, so they are easy to implement.

Chapter 13, “Extracting Business Rules,” identifies the business rules embedded in the relational model constructed in the previous chapter. It shows how to extract those rules in order to separate them logically from the database's structure. This makes the database more robust in the face of future changes to the business rules.

Chapter 14, “Normalizing and Refining,” refines the relational model developed in the previous chapter by normalizing it. It walks through several versions of the database that are in different normal forms. It then selects the degree of normalization that provides a reasonable trade-off between robust design and acceptable performance.

Part IV: Example Programs

Though this book focuses on abstract database concepts that do not depend on a particular database product, it's also worth spending at least some time on more concrete implementation issues. The chapters in this part of the book describe some of those issues and explain how to build simple example programs that demonstrate a few different database products.

Chapter 15, “Example Overview,” provides a roadmap for the chapters that follow. It tells which chapters use which databases and how to get the most out of those chapters. Chapters 16 through 25 come in pairs, with the first describing an example in Python and the second describing a similar (although not always identical) program in C#.

Chapters 16 and 17 describe examples that use the popular MariaDB relational database running on the local machine.

Chapters 18 and 19 demonstrate the (also popular) PostgreSQL database, also running on the local machine.

Chapters 20 and 21 show how to use the Neo4j AuraDB graph database running in the cloud.

Chapters 22 and 23 describe examples that use the MongoDB Atlas document database, also running in the cloud.

Chapters 24 and 25 demonstrate the Apache Ignite key-value database running locally.

These examples are just intended to get you started. They are relatively simple examples and they do not show all of the possible combinations. For example, you can run an Apache Ignite database in the cloud if you like; there were just too many combinations to cover them all in this book.

Part V: Advanced Topics

Although this book does not assume you have previous database experience, that doesn't mean it cannot cover some more advanced subjects. The chapters in this part of the book explain some more sophisticated topics that are important but not central to database design.

Chapter 26, “Introduction to SQL,” provides an introduction to SQL (Structured Query Language). It explains how to use SQL commands to select, insert, update, and delete data. By using SQL, you can help insulate a program from the idiosyncrasies of the particular database product that it uses to store data.

Chapter 27, “Building Databases with SQL Scripts,” explains how to use SQL scripts to build a database. It explains the advantages of this technique, such as the ability to create scripts to initialize a database before performing tests. It also explains some of the restrictions on this method, such as the fact that the user may need to create and delete tables in a specific order to satisfy table relationships.
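To make the ordering restriction concrete, here is a minimal sketch that runs a tiny build script through Python's built-in sqlite3 module. The Customers and Orders tables are hypothetical, and the exact behavior is an assumption about SQLite only: it checks foreign keys when rows change (and when a table is dropped), while many other products also refuse to create a child table before its parent exists.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign key constraints only when asked to.
conn.execute("PRAGMA foreign_keys = ON")

# A hypothetical build script. The parent table (Customers) is created
# before the child table (Orders) that refers to it.
build_script = """
CREATE TABLE Customers (
    CustomerId INTEGER PRIMARY KEY,
    Name       TEXT NOT NULL
);
CREATE TABLE Orders (
    OrderId    INTEGER PRIMARY KEY,
    CustomerId INTEGER NOT NULL REFERENCES Customers (CustomerId)
);
INSERT INTO Customers VALUES (1, 'Alice');
INSERT INTO Orders VALUES (10, 1);
"""
conn.executescript(build_script)

# Tearing down in the wrong order: dropping the parent first would
# orphan the order row, so SQLite reports a constraint error.
try:
    conn.execute("DROP TABLE Customers")
    dropped_parent_first = True
except sqlite3.Error:
    dropped_parent_first = False

# Tearing down in the right order: child first, then parent.
conn.execute("DROP TABLE Orders")
conn.execute("DROP TABLE Customers")

remaining_tables = conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'"
).fetchall()
print(dropped_parent_first, remaining_tables)
```

A script that builds tables in parent-to-child order and removes them in the reverse order sidesteps the problem, which is why test-initialization scripts usually follow that pattern.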

Chapter 28, “Database Maintenance,” explains some of the database maintenance issues that are part of any database application. Though performing and restoring backups, compressing tables, rebuilding indexes, and populating data warehouses are not strictly database design tasks, they are essential to any working application.

Chapter 29, “Database Security,” explains database security issues. It explains the kinds of security that some database products provide. It also explains some additional techniques that can enhance database security, such as using database views to appropriately restrict the users' access to data.
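The idea of using a view to restrict access can be sketched in a few lines with Python's built-in sqlite3 module. The Employees table and the EmployeeDirectory view are hypothetical names invented for this illustration. SQLite has no user accounts or privileges, so this sketch shows only what the view exposes; in a server database you would additionally give users permission to read the view but not the underlying table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical table holding a sensitive Salary column.
CREATE TABLE Employees (
    EmployeeId INTEGER PRIMARY KEY,
    Name       TEXT NOT NULL,
    Department TEXT NOT NULL,
    Salary     INTEGER NOT NULL
);
INSERT INTO Employees VALUES
    (1, 'Alice', 'Sales', 52000),
    (2, 'Bob', 'Support', 48000);

-- The view exposes everything except Salary.
CREATE VIEW EmployeeDirectory AS
    SELECT EmployeeId, Name, Department
    FROM Employees;
""")

cursor = conn.execute("SELECT * FROM EmployeeDirectory ORDER BY EmployeeId")
columns = [description[0] for description in cursor.description]
rows = cursor.fetchall()
print(columns)
print(rows)
```

Anyone who queries EmployeeDirectory sees names and departments but has no way to reach the Salary column through the view.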

Appendixes

The book's appendixes provide additional reference material to supplement the earlier chapters.

Appendix A, “Exercise Solutions,” gives solutions to the exercises at the end of most of the book's chapters so that you can check your progress as you work through the book.

Appendix B, “Sample Relational Designs,” shows some sample designs for a variety of common database situations. These designs store information about such topics as books, movies, documents, customer orders, employee timekeeping, rentals, students, teams, and vehicle fleets.

The Glossary provides definitions for useful database and software development terms. The Glossary includes terms defined and used in this book in addition to a few other useful terms that you may encounter while reading other database material.

HOW TO USE THIS BOOK

Because this book is aimed at readers of all experience levels, you may find some of the material familiar if you have previous experience with databases. In that case, you may want to skim chapters covering material that you already thoroughly understand.

If you are familiar with relational databases, you may want to skim Chapter 1, “Database Design Goals,” and Chapter 2, “Relational Overview.” Similarly, if you have experience with NoSQL databases, you may want to skip Chapter 3, “NoSQL Overview.”

If you have previously helped write project proposals, you may understand some of the questions you need to ask users to properly understand their needs. In that case, you may want to skim Chapter 4, “Understanding User Needs.”

If you have built databases before, you may understand at least some of the data normalization concepts explained in Chapter 7, “Normalizing Data.” This is a complex topic, however, so I recommend that you not skip this chapter unless you really know what you're doing.

If you have extensive experience with SQL, you may want to skim Chapter 26, “Introduction to SQL.” (Many developers who have used but not designed databases fall into this category.)

In any case, I strongly recommend that you at least skim the material in every chapter to see if there are any new concepts you can pick up along the way. At least look at the Exercises at the end of each chapter before you decide that you can safely skip to the next. If you don't know how to outline the solutions to the Exercises, then you should consider looking at the chapter more closely.

Different people learn best in different ways. Some learn best by listening to lecturers, others by reading, and others by doing. Everyone learns better by combining learning styles. You will get the most from this book if you read the material and then work through the Exercises. It's easy to think to yourself, “Yeah, that makes sense” and believe you understand the material, but working through some of the Exercises will help solidify the material in your mind. Doing so may also help you see new ways that you can apply the concepts covered in the chapter.

After you have learned the ideas in the book, you can use it as a reference. For example, when you start a new project, you may want to refer to Chapter 4, “Understanding User Needs,” to refresh your memory about the kinds of questions you should ask users to discover their true needs.

Visit the book's website to look for updates and addendums. If readers find typographical errors or places where a little additional explanation may help, I'll post updates on the website.

Finally, if you get stuck on a really tricky concept and need a little help, email me at [email protected] and I'll try to help you out.

NOTE TO INSTRUCTORS

Database programming is boring. Maybe not to you and me, who have discovered the ecstatic joy of database design, the thrill of normalization, and the somewhat risqué elation brought by slightly denormalizing a database to achieve optimum performance. But let's face it, to a beginner, database design and development can be a bit dull.

There's little you can do to make the basic concepts more exciting, but you can do practically anything with the data. At some point it's useful to explain how to design a simple inventory system, but that doesn't mean you can't use other examples designed to catch students' attention. Data that relates to the students' personal experiences or that is just plain outrageous keeps them awake and alert (and most of us know that it's easier to teach students who are awake).

The examples in this book are intended to demonstrate the topic at hand but not all of them are strictly business-oriented. I've tried to make them cover a wide variety of topics from serious to silly. To keep your students interested and alert, you should add new examples from your personal experiences and from your students' interests.

I've had great success in my classroom using examples that involve sports teams (particularly local rivalries), music (combining classics such as Bach, Beethoven, and Tone Loc), the students in the class (but be sure not to put anyone on the spot), television shows and stars, comedians, and anything else that interests the students.

For exercises, encourage students to design databases that they will find personally useful. I've had students build databases that track statistics for the players on their favorite football teams, inventory their DVD or CD collections, file and search recipe collections, store data on “Magic: The Gathering” trading cards, track role-playing game characters, record information about classic cars, and schedule athletic tournaments. (The tournament scheduler didn't work out too well—the scheduling algorithms were too tricky.) One student even built a small but complete inventory application for his mother's business that she actually found useful. I think he was as shocked as anyone to discover he'd learned something practical.

When students find an assignment interesting and relevant, they become emotionally invested and will apply the same level of concentration and intensity to building a database that they normally reserve for console gaming, Star Wars, and World of Warcraft. They may spend hours crafting a database to track WoW alliances just to fulfill a 5-minute assignment. They may not catch every nuance of domain/key normal form, but they'll probably learn a lot about building a functional database.

NOTE TO STUDENTS

If you're a student and you peeked at the previous section, “Note to Instructors,” shame on you! If you didn't peek, do so now.

Building a useful database can be a lot of work, but there's no reason it can't be interesting and useful to you when you're finished. Early in your reading, pick some sort of database that you would find useful (see the previous section for a few ideas) and think about it as you read through the text. When the book talks about creating an initial design, sketch out a design for your database. When the book explains how to normalize a database, normalize yours. As you work through the exercises, think about how they would apply to your dream database.

Don't be afraid to ask your instructor if you can use your database instead of one suggested by the book for a particular assignment (unless you have one of those instructors who hand out extra work to anyone who crosses their path; in that case, keep your head down). Usually an instructor's thought process is quite simple: “I don't care what database you use as long as you learn the material.” Your instructor may want your database to contain several related tables so that you can create the complexity needed for a particular exercise, but it's usually not too hard to make a database complicated enough to be interesting.

When you're finished, you will hopefully know a lot more about database design than you do now, and if you're persistent, you might just have a database that's actually good for something. Hopefully you'll also know how to design other useful databases in the future. (And when you're finished, email me at [email protected] and let me know what you built!)

CONVENTIONS

To help you get the most from the text and keep track of what's happening, we've used a number of conventions throughout the book.

SOURCE CODE

As you work through the examples in this book, you may choose either to type in all the code manually or to use the source code files that accompany the book. All of the source code used in this book is available for download at www.wiley.com/go/beginningdbdesign2e.

CONTACTING THE AUTHOR

If you have questions, suggestions, comments, want to swap cookie recipes, or just want to say “Hi,” email me at [email protected]. I can't promise that I'll be able to help you with every problem, but I do promise to try.

DISCLAIMER

Many of the examples in this book were chosen for interest or humorous effect. They are not intended to disparage anyone. I mean no disrespect to police officers (or anyone else who regularly carries a gun), plumbers, politicians, jewelry store owners, street luge racers (or anyone else who wears helmets and Kevlar body armor to work), or college administrators. Or anyone else for that matter.

Well, maybe politicians.
