Chapter opener image: © cherezoff/Shutterstock

PREFACE

Purpose of This Book

Databases have become a central element of the framework of modern computer systems. It follows that the study of database systems, design, and management is an essential part of computer science, data science, information science, and information technology curricula. A course in database technology should provide students with a strong theoretical background, practice in database design, and experience creating and developing a working database. Students should also be exposed to emerging data management issues and new directions for database technology. This book is designed to help students integrate theoretical material with practical knowledge, using an approach that applies theory to database implementation. It will also help students understand the difference between traditional database technology and the new directions created by big data, outlining the motivation, fundamental concepts, technologies, and challenges associated with handling large datasets. Issues of data security and privacy are incorporated throughout the text, as well as in a special chapter devoted to these topics.

Structure

Theoretical foundations are presented early, and theoretical concepts are incorporated throughout the book, including in the chapters that address implementation. Chapters are sequenced to take the student through the process of planning, designing, and implementing databases using the leading models. Conceptual and logical database design are given thorough consideration. The entity-relationship (ER) and enhanced entity-relationship (EER) models are introduced early and then mapped to the relational model. Full examples of implementations of relational models are presented in detail, with Java Database Connectivity (JDBC) illustrated as an application programming interface for accessing relational databases. Reverse engineering from the relational to the ER model is included. Relational normalization is studied in detail, and many examples of the normalization process are discussed. The Unified Modeling Language (UML) is presented as a vehicle for conceptual design for the object-oriented model, and a full example of an object-oriented model database is given. The object-relational model is described, with mappings from both UML and EER models shown. Details of implementing the object-relational model are presented. For the semi-structured model, both JavaScript Object Notation (JSON) and Extensible Markup Language (XML) implementations are presented. NoSQL models are described, with examples of document-based and graph-based systems presented in detail.

A continuing example of a university database is incorporated throughout the text, to illustrate concepts and techniques and to provide both continuity and contrast. Other examples are presented as needed. Purely relational, object-relational, object-oriented, and NoSQL database systems are described and used for implementation of the examples. Details of database management systems are explored so that students can learn the specifics of these real-life systems down to the implementation level. OpenOffice Base is used initially in the relational examples, but Oracle is introduced as the material is developed, and its features are described when they connect to the topic being presented. The examples are suitable, however, for use with any relational or object-relational database management system. InterSystems Iris is used to illustrate object-oriented databases.

The fundamentals of storing and accessing large data sets are presented. Hadoop is presented as the framework that gave birth to the era of big data storage and analytics at companies such as Facebook, Twitter, LinkedIn, Etsy, Netflix, and Disney, to name a few. The Hadoop distributed file system (HDFS) is described as the basis for supporting programming models such as MapReduce for batch-oriented analytics of large data stores and Spark for streaming, real-time data analytics. The Hive data warehouse infrastructure of the Hadoop environment is examined as a means for expressing Structured Query Language (SQL)-like queries over large files. NoSQL systems are also addressed to provide DBMS functionality for big data. Data organization and query capabilities for two popular types of NoSQL systems are described. MongoDB is used as an example of a document database, and Neo4j as a graph-based system.

Individual chapters cover the topics of database security, relational query optimization, transaction management, distributed databases, data warehouses with data mining, and social/ethical issues.

New to This Edition

  • Online virtual labs to accompany the text

  • Expanded coverage of security issues such as SQL injection with mitigation examples

  • New coverage of blockchain technology and applications

  • Forward and reverse engineering between conceptual and relational models

  • Updated discussion of large-scale data models

  • Expanded treatment of data warehouses and data analytics, including SQL analytic enhancements

  • Expanded discussion of big data and NoSQL, including MongoDB and Neo4j, with examples

  • Discussion of latest changes in the SQL Standard

  • Explanation and examples of new Oracle features including SQL for JSON data, SQL analytic functions, and private temporary tables

  • Discussion of new InterSystems Iris features for object-oriented databases

  • Updated discussion of social and ethical issues, including ethical use of social media, realistic examples of ethical dilemmas and a framework for ethical decision making, new professional standards for data professionals, data de-identification and new privacy legislation from an international perspective, and new intellectual property laws

  • Mapping of latest curriculum standards in computer science, data science, information science, and information technology to the text

Learning Features

A unique feature of this book is the accompanying set of online labs, which coordinate with the chapters and provide students an opportunity for hands-on learning through guided instruction in topics including database design and database system development and usage.

The writing style is conversational. Each chapter begins with a statement of learning objectives. Examples and applications are presented throughout the text. Illustrations are used both to clarify the material and to vary the presentation. Exercises appear at the end of each chapter and in the supplementary online material, with solutions provided in the instructor materials. The online sample project is an important part of the text, offering an opportunity for students to see how to apply the material just presented. It includes code for implementing the resulting databases. The online student projects can be introduced after the first chapter; students are expected to choose one, or have one assigned, and to develop that project to parallel the sample as they progress through the corresponding chapters. Student projects may be completed individually or in groups. Solutions for the student projects, since they are intended to be used as assignments, are not included there, but are available in the instructor materials.

Oracle, Iris, MongoDB, and Neo4j code for implementing the example databases used in the text is available on the student website. Chapter summaries are included in the text to provide a rapid review or preview of the material and to help students understand the relative importance of the concepts presented. The instructor materials include slides for each chapter in PowerPoint format, along with full statements of objectives, teaching hints, suggestions for exercises and project steps, solutions for the student projects, and solutions to exercises for each of the chapters. Alternative student projects are also included.

Audience

The material is suitable for college juniors or seniors who are computer science, data science or information science majors with a solid technical background. Students should have completed at least one year of courses in programming, including data structures. The book can also be used as an introductory database text for graduate students or for self-study.

Mapping to Curriculum Guidelines

Although the material in this book is based on the authors’ experiences, it also fits the ACM Computing Curricula 2020 recommendations for undergraduate programs in computer science, data science, information systems, and information technology.

The Computer Science Curricula 2013 guidelines describe 18 Knowledge Areas with accompanying Knowledge Units for inclusion in undergraduate curricula. This text addresses primarily the Information Management (IM) Knowledge Area but includes material in the areas of Social Issues and Professional Practice (SP) and Information Assurance and Security (IAS). The following chart shows how the Knowledge Units map to the chapters or sections of this text.

CS Knowledge Area/Knowledge Unit    Chapters or Sections
IM Information Management Concepts  Ch 1, Ch 2
IM Database Systems                 Ch 2
IM Data Modeling                    Sect 2.7, Ch 3, Ch 4, Ch 9, Ch 13, Ch 14, Sect 15.7
IM Indexing                         Sect 5.3, Sect 10.3, Sect 15.10, App A
IM Relational Databases             Ch 4, Ch 5, Ch 7
IM Query Languages                  Sect 4.5, Ch 5, Ch 7, Sect 9.5, Ch 10, Sect 12.7, Ch 13, Sect 14.4, Sect 14.5, Sect 15.10
IM Transaction Processing           Ch 11
IM Distributed Databases            Ch 12
IM Physical Database Design         Sect 2.6, Ch 10, App A
IM Data Mining                      Sect 15.11
IAS Foundational Concepts           Ch 8, Sect 16.3
IAS Security Policy and Governance  Ch 8
IAS Cryptography                    Sect 8.6
IAS Web Security                    Sect 8.12
SP Professional Ethics              Sect 16.2
SP Intellectual Property            Sect 16.4
SP Privacy and Civil Liberties      Sect 16.3
SP History                          Sect 1.6, Sect 5.1, Sect 13.1, Sect 14.1, Sect 15.1

The ACM Data Science Task Force 2021 report, Computing Competencies for Undergraduate Data Science Curricula, lists several knowledge areas that are addressed in this text.

DS Knowledge Area                                                  Chapters or Sections
Big Data Systems (BDS)                                             Ch 14
Data Acquisition, Management and Governance (DG)                   Ch 2, Sect 16.3
Data Mining (DM)                                                   Sect 15.11
Data Privacy, Security, Integrity, and Analysis for Security (DP)  Ch 8, Sect 16.3
Professionalism (PR)                                               Sect 16.2

The Joint ACM/AIS IS2020 Task Force recommendations contained in IS2020, A Competency Model for Undergraduate Programs in Information Systems, identify competency areas rather than knowledge areas. The competency areas fall into six realms, namely IS foundations, data, technology, development, organizational, and integration competencies. The realms are further subdivided into nineteen competency areas, of which ten are required and nine are elective. The data realm is most relevant to this text, with the competency area of Data/Information Management being the required area.

IS Competency (Required)                               Chapters or Sections
Query the relational model                             Sect 4.5, Sect 5.4, Ch 7
Design relational databases                            Ch 4, Ch 6
Program database systems using functions and triggers  Sect 7.5, Sect 7.7
Secure a database                                      Ch 8
Compare tradeoffs of different concurrency modes       Ch 11, Sect 12.6
Develop non-relational models                          Ch 9, Ch 13, Ch 14

An optional competency area in the data realm is Data/Business Analytics. Competencies in that area include some that are addressed in this text.

IS Competency (Optional)                                    Chapters or Sections
Explain the core principles behind various analytics tasks  Sect 15.10
Articulate the nature and potential of Big Data             Ch 14
Demonstrate the use of big data tools                       Ch 14

The Information Technology Curricula 2017 (IT2017) Curriculum Guidelines for Baccalaureate Degree Programs in Information Technology lists essential and supplemental domains. One of the essential domains is information management. The topics listed for that domain and the corresponding coverage are shown below.

IT Domain (Essential)                      Chapters or Sections
ITE-IMA-01 Perspectives and impact         Ch 1
ITE-IMA-02 Data-information concepts       Ch 2
ITE-IMA-03 Data modeling                   Ch 3, Ch 4, Ch 6, Ch 9, Ch 13, Ch 14
ITE-IMA-04 Database query languages        Ch 4, Ch 5, Ch 7, Ch 10, Sect 12.7, Ch 13, Ch 14
ITE-IMA-05 Data organization architecture  Ch 2
ITE-IMA-06 Special-purpose databases       Ch 9, Ch 12, Ch 13, Ch 14, Ch 15
ITE-IMA-07 Managing the database           Ch 2, Ch 5, Ch 8, Ch 11

In addition to this coverage, the sample project and student projects also provide practice in data modeling, database design, human–computer interaction, relational databases, object-oriented databases, object-relational databases, database query languages, distributed databases, web-based databases, and physical database design. Some aspects of security, privacy, and civil liberties are discussed in the sample project, and similar issues will arise in the student projects and should be addressed.

Acknowledgments

We are grateful to the many people who offered useful comments, advice, and encouragement during the writing of this book. We thank the reviewers, past readers, and colleagues throughout the world who provided comments that have helped shape our revision. The editorial, production, and marketing staff at Jones & Bartlett Learning, especially Ned Hinman, Melissa Duffy, Daniel Stone, Faith Brosnan and James Fortney, have been unfailingly helpful and encouraging. Above all, we appreciate our spouses and families, whose patience and understanding enabled us to focus on this work.
