A concise introduction to database design concepts, methods, and techniques in and out of the cloud
In the newly revised second edition of Beginning Database Design Solutions: Understanding and Implementing Database Design Concepts for the Cloud and Beyond, award-winning programming instructor and mathematician Rod Stephens delivers an easy-to-understand guide to designing and implementing databases both in and out of the cloud. Without assuming any prior database design knowledge, the author walks you through the steps you'll need to take to understand, analyze, design, and build databases. In the book, you'll find clear coverage of foundational database concepts along with hands-on examples that help you practice important techniques so you can apply them to your own database designs, as well as:
* Downloadable source code that illustrates the concepts discussed in the book
* Best practices for reliable, platform-agnostic database design
* Strategies for digital transformation driven by universally accessible database design
An essential resource for database administrators, data management specialists, and database developers seeking expertise in relational, NoSQL, and hybrid database design both in and out of the cloud, Beginning Database Design Solutions is a hands-on guide ideal for students and practicing professionals alike.
Page count: 1124
Publication year: 2023
COVER
TITLE PAGE
INTRODUCTION
WHO THIS BOOK IS FOR
WHAT THIS BOOK COVERS
WHAT YOU NEED TO USE THIS BOOK
HOW THIS BOOK IS STRUCTURED
HOW TO USE THIS BOOK
NOTE TO INSTRUCTORS
NOTE TO STUDENTS
CONVENTIONS
SOURCE CODE
CONTACTING THE AUTHOR
DISCLAIMER
PART 1: Introduction to Databases and Database Design
1 Database Design Goals
THE IMPORTANCE OF DESIGN
INFORMATION CONTAINERS
STRENGTHS AND WEAKNESSES OF INFORMATION CONTAINERS
DESIRABLE DATABASE FEATURES
SUMMARY
2 Relational Overview
PICKING A DATABASE
RELATIONAL POINTS OF VIEW
TABLE, ROWS, AND COLUMNS
RELATIONS, ATTRIBUTES, AND TUPLES
KEYS
INDEXES
CONSTRAINTS
DATABASE OPERATIONS
POPULAR RDBs
SPREADSHEETS
SUMMARY
3 NoSQL Overview
THE CLOUD
PICKING A DATABASE
NoSQL PHILOSOPHY
NoSQL DATABASES
LESS EXOTIC OPTIONS
MORE EXOTIC OPTIONS
DATABASE PROS AND CONS
SUMMARY
PART 2: Database Design Process and Techniques
4 Understanding User Needs
MAKE A PLAN
BRING A LIST OF QUESTIONS
MEET THE CUSTOMERS
LEARN WHO'S WHO
PICK THE CUSTOMERS’ BRAINS
WALK A MILE IN THE USER'S SHOES
STUDY CURRENT OPERATIONS
BRAINSTORM
LOOK TO THE FUTURE
UNDERSTAND THE CUSTOMERS’ REASONING
LEARN WHAT THE CUSTOMERS REALLY NEED
PRIORITIZE
VERIFY YOUR UNDERSTANDING
CREATE THE REQUIREMENTS DOCUMENT
MAKE USE CASES
DECIDE FEASIBILITY
SUMMARY
5 Translating User Needs into Data Models
WHAT ARE DATA MODELS?
USER INTERFACE MODELS
SEMANTIC OBJECT MODELS
ENTITY-RELATIONSHIP MODELS
RELATIONAL MODELS
SUMMARY
6 Extracting Business Rules
WHAT ARE BUSINESS RULES?
IDENTIFYING KEY BUSINESS RULES
EXTRACTING KEY BUSINESS RULES
MULTI-TIER APPLICATIONS
SUMMARY
7 Normalizing Data
WHAT IS NORMALIZATION?
FIRST NORMAL FORM (1NF)
SECOND NORMAL FORM (2NF)
THIRD NORMAL FORM (3NF)
STOPPING AT THIRD NORMAL FORM
BOYCE-CODD NORMAL FORM (BCNF)
FOURTH NORMAL FORM (4NF)
FIFTH NORMAL FORM (5NF)
DOMAIN/KEY NORMAL FORM (DKNF)
ESSENTIAL REDUNDANCY
THE BEST LEVEL OF NORMALIZATION
NOSQL NORMALIZATION
SUMMARY
8 Designing Databases to Support Software
PLAN AHEAD
DOCUMENT EVERYTHING
CONSIDER MULTI-TIER ARCHITECTURE
CONVERT DOMAINS INTO TABLES
KEEP TABLES FOCUSED
USE THREE KINDS OF TABLES
USE NAMING CONVENTIONS
ALLOW SOME REDUNDANT DATA
DON'T SQUEEZE IN EVERYTHING
SUMMARY
9 Using Common Design Patterns
ASSOCIATIONS
TEMPORAL DATA
LOGGING AND LOCKING
SUMMARY
10 Avoiding Common Design Pitfalls
LACK OF PREPARATION
POOR DOCUMENTATION
POOR NAMING STANDARDS
THINKING TOO SMALL
NOT PLANNING FOR CHANGE
TOO MUCH NORMALIZATION
INSUFFICIENT NORMALIZATION
INSUFFICIENT TESTING
PERFORMANCE ANXIETY
MISHMASH TABLES
NOT ENFORCING CONSTRAINTS
OBSESSION WITH IDs
NOT DEFINING NATURAL KEYS
SUMMARY
PART 3: A Detailed Case Study
11 Defining User Needs and Requirements
MEET THE CUSTOMERS
PICK THE CUSTOMERS' BRAINS
WRITE USE CASES
WRITE THE REQUIREMENTS DOCUMENT
DEMAND FEEDBACK
SUMMARY
12 Building a Data Model
SEMANTIC OBJECT MODELING
ENTITY-RELATIONSHIP MODELING
RELATIONAL MODELING
PUTTING IT ALL TOGETHER
SUMMARY
13 Extracting Business Rules
IDENTIFYING BUSINESS RULES
DRAWING A NEW RELATIONAL MODEL
SUMMARY
14 Normalizing and Refining
IMPROVING FLEXIBILITY
VERIFYING FIRST NORMAL FORM
VERIFYING SECOND NORMAL FORM
VERIFYING THIRD NORMAL FORM
SUMMARY
PART 4: Example Programs
15 Example Overview
TOOL CHOICES
JUPYTER NOTEBOOK
VISUAL STUDIO
DATABASE ADAPTERS
PROGRAM PASSWORDS
SUMMARY
16 MariaDB in Python
INSTALL MariaDB
RUN HeidiSQL
CREATE THE PROGRAM
SUMMARY
17 MariaDB in C#
CREATE THE PROGRAM
SUMMARY
18 PostgreSQL in Python
INSTALL PostgreSQL
RUN pgAdmin
CREATE THE PROGRAM
SUMMARY
19 PostgreSQL in C#
CREATE THE PROGRAM
SUMMARY
20 Neo4j AuraDB in Python
INSTALL NEO4J AURADB
NODES AND RELATIONSHIPS
CYPHER
CREATE THE PROGRAM
SUMMARY
21 Neo4j AuraDB in C#
CREATE THE PROGRAM
SUMMARY
22 MongoDB Atlas in Python
NOT NORMAL BUT NOT ABNORMAL
XML, JSON, AND BSON
INSTALL MongoDB ATLAS
FIND THE CONNECTION CODE
CREATE THE PROGRAM
SUMMARY
23 MongoDB Atlas in C#
CREATE THE PROGRAM
SUMMARY
24 Apache Ignite in Python
INSTALL APACHE IGNITE
START A NODE
CREATE THE PROGRAM
SUMMARY
25 Apache Ignite in C#
CREATE THE PROGRAM
SUMMARY
PART 5: Advanced Topics
26 Introduction to SQL
BACKGROUND
FINDING MORE INFORMATION
STANDARDS
MULTISTATEMENT COMMANDS
BASIC SYNTAX
COMMAND OVERVIEW
CREATE TABLE
CREATE INDEX
DROP
INSERT
SELECT
UPDATE
DELETE
SUMMARY
27 Building Databases with SQL Scripts
WHY BOTHER WITH SCRIPTS?
SCRIPT CATEGORIES
ORDERING SQL COMMANDS
SUMMARY
28 Database Maintenance
BACKUPS
DATA WAREHOUSING
REPAIRING THE DATABASE
COMPACTING THE DATABASE
PERFORMANCE TUNING
SUMMARY
29 Database Security
THE RIGHT LEVEL OF SECURITY
PASSWORDS
PRIVILEGES
INITIAL CONFIGURATION AND PRIVILEGES
TOO MUCH SECURITY
PHYSICAL SECURITY
SUMMARY
A: Exercise Solutions
CHAPTER 1: DATABASE DESIGN GOALS
CHAPTER 2: RELATIONAL OVERVIEW
CHAPTER 3: NOSQL OVERVIEW
CHAPTER 4: UNDERSTANDING USER NEEDS
CHAPTER 5: TRANSLATING USER NEEDS INTO DATA MODELS
CHAPTER 6: EXTRACTING BUSINESS RULES
CHAPTER 7: NORMALIZING DATA
CHAPTER 8: DESIGNING DATABASES TO SUPPORT SOFTWARE
CHAPTER 9: USING COMMON DESIGN PATTERNS
CHAPTER 10: AVOIDING COMMON DESIGN PITFALLS
CHAPTER 11: DEFINING USER NEEDS AND REQUIREMENTS
CHAPTER 12: BUILDING A DATA MODEL
CHAPTER 13: EXTRACTING BUSINESS RULES
CHAPTER 14: NORMALIZING AND REFINING
CHAPTER 15: EXAMPLE OVERVIEW
CHAPTER 16: MARIADB IN PYTHON
CHAPTER 17: MARIADB IN C#
CHAPTER 18: POSTGRESQL IN PYTHON
CHAPTER 19: POSTGRESQL IN C#
CHAPTER 20: NEO4J AURADB IN PYTHON
CHAPTER 21: NEO4J AURADB IN C#
CHAPTER 22: MONGODB ATLAS IN PYTHON
CHAPTER 23: MONGODB ATLAS IN C#
CHAPTER 24: APACHE IGNITE IN PYTHON
CHAPTER 25: APACHE IGNITE IN C#
CHAPTER 26: INTRODUCTION TO SQL
CHAPTER 27: BUILDING DATABASES WITH SQL SCRIPTS
CHAPTER 28: DATABASE MAINTENANCE
CHAPTER 29: DATABASE SECURITY
B: Sample Relational Designs
BOOKS
MOVIES
MUSIC
DOCUMENT MANAGEMENT
CUSTOMER ORDERS
EMPLOYEE SHIFTS AND TIMESHEETS
EMPLOYEES, PROJECTS, AND DEPARTMENTS
EMPLOYEE SKILLS AND QUALIFICATIONS
IDENTICAL OBJECT RENTAL
DISTINCT OBJECT RENTAL
STUDENTS, COURSES, AND GRADES
TEAMS
INDIVIDUAL SPORTS
VEHICLE FLEETS
CONTACTS
PASSENGERS
RECIPES
GLOSSARY
INDEX
COPYRIGHT
DEDICATION
ABOUT THE AUTHOR
ABOUT THE TECHNICAL EDITOR
ACKNOWLEDGMENTS
END USER LICENSE AGREEMENT
Chapter 1
TABLE 1.1: Good vs. bad design
Chapter 15
TABLE 15.1: Examples, the database each demonstrates, and the location ...
Chapter 20
TABLE 20.1: Action methods and their purposes
Chapter 21
TABLE 21.1: Action methods and their purposes
Chapter 22
TABLE 22.1: Graduate assignments
TABLE 22.2: Helper methods and their purposes
TABLE 22.3: Data inserted by the create_data method
Chapter 23
TABLE 23.1: Assignment data
TABLE 23.2: Helper methods and their purposes
TABLE 23.3: Data inserted by the CreateData method
Chapter 26
TABLE 26.1: DDL commands
TABLE 26.2: DML commands
TABLE 26.3: DCL commands
TABLE 26.4: TCL commands
TABLE 26.5: Courses records
TABLE 26.6: Enrollments records
TABLE 26.7: Courses records
TABLE 26.8: Courses records
TABLE 26.9: Courses records
TABLE 26.10: Courses records
TABLE 26.11: Courses records
Chapter 27
TABLE 27.1: Initial predecessors
TABLE 27.2: Table predecessors after one round
TABLE 27.3: Table predecessors after two rounds
TABLE 27.4: Table predecessors after three rounds
TABLE 27.5: Table predecessors after four rounds
TABLE 27.6: Table predecessors after five rounds
Chapter 1
FIGURE 1.1
Chapter 2
FIGURE 2.1
FIGURE 2.2
FIGURE 2.3
Chapter 3
FIGURE 3.1
FIGURE 3.2
FIGURE 3.3
FIGURE 3.4
Chapter 4
FIGURE 4.1
FIGURE 4.2
Chapter 5
FIGURE 5.1
FIGURE 5.2
FIGURE 5.3
FIGURE 5.4
FIGURE 5.5
FIGURE 5.6
FIGURE 5.7
FIGURE 5.8
FIGURE 5.9
FIGURE 5.10
FIGURE 5.11
FIGURE 5.12
FIGURE 5.13
FIGURE 5.14
FIGURE 5.15
FIGURE 5.16
FIGURE 5.17
FIGURE 5.18
FIGURE 5.19
FIGURE 5.20
FIGURE 5.21
FIGURE 5.22
FIGURE 5.23
FIGURE 5.24
FIGURE 5.25
FIGURE 5.26
FIGURE 5.27
FIGURE 5.28
FIGURE 5.29
Chapter 6
FIGURE 6.1
FIGURE 6.2
FIGURE 6.3
Chapter 7
FIGURE 7.1
FIGURE 7.2
FIGURE 7.3
FIGURE 7.4
FIGURE 7.5
FIGURE 7.6
FIGURE 7.7
FIGURE 7.8
FIGURE 7.9
FIGURE 7.10
FIGURE 7.11
FIGURE 7.12
FIGURE 7.13
FIGURE 7.14
FIGURE 7.15
FIGURE 7.16
FIGURE 7.17
FIGURE 7.18
FIGURE 7.19
FIGURE 7.20
FIGURE 7.21
FIGURE 7.22
FIGURE 7.23
FIGURE 7.24
FIGURE 7.25
FIGURE 7.26
FIGURE 7.27
FIGURE 7.28
FIGURE 7.29
FIGURE 7.30
Chapter 8
FIGURE 8.1
FIGURE 8.2
FIGURE 8.3
FIGURE 8.4
Chapter 9
FIGURE 9.1
FIGURE 9.2
FIGURE 9.3
FIGURE 9.4
FIGURE 9.5
FIGURE 9.6
FIGURE 9.7
FIGURE 9.8
FIGURE 9.9
FIGURE 9.10
FIGURE 9.11
FIGURE 9.12
FIGURE 9.13
FIGURE 9.14
FIGURE 9.15
FIGURE 9.16
FIGURE 9.17
FIGURE 9.18
FIGURE 9.19
FIGURE 9.20
FIGURE 9.21
FIGURE 9.22
FIGURE 9.23
FIGURE 9.24
FIGURE 9.25
FIGURE 9.26
FIGURE 9.27
FIGURE 9.28
Chapter 10
FIGURE 10.1
FIGURE 10.2
FIGURE 10.3
Chapter 11
FIGURE 11.1
FIGURE 11.2
Chapter 12
FIGURE 12.1
FIGURE 12.2
FIGURE 12.3
FIGURE 12.4
FIGURE 12.5
FIGURE 12.6
FIGURE 12.7
FIGURE 12.8
FIGURE 12.9
FIGURE 12.10
FIGURE 12.11
FIGURE 12.12
FIGURE 12.13
FIGURE 12.14
FIGURE 12.15
FIGURE 12.16
Chapter 13
FIGURE 13.1
FIGURE 13.2
FIGURE 13.3
FIGURE 13.4
Chapter 14
FIGURE 14.1
FIGURE 14.2
FIGURE 14.3
FIGURE 14.4
FIGURE 14.5
Chapter 15
FIGURE 15.1
FIGURE 15.2
FIGURE 15.3
FIGURE 15.4
FIGURE 15.5
FIGURE 15.6
FIGURE 15.7
FIGURE 15.8
FIGURE 15.9
Chapter 16
FIGURE 16.1
FIGURE 16.2
FIGURE 16.3
FIGURE 16.4
FIGURE 16.5
FIGURE 16.6
Chapter 17
FIGURE 17.1
Chapter 18
FIGURE 18.1
FIGURE 18.2
FIGURE 18.3
FIGURE 18.4
FIGURE 18.5
FIGURE 18.6
FIGURE 18.7
Chapter 20
FIGURE 20.1
FIGURE 20.2
FIGURE 20.3
FIGURE 20.4
Chapter 21
FIGURE 21.1
FIGURE 21.2
Chapter 22
FIGURE 22.1
FIGURE 22.2
FIGURE 22.3
FIGURE 22.4
Chapter 24
FIGURE 24.1
Chapter 25
FIGURE 25.1
Chapter 26
FIGURE 26.1
FIGURE 26.2
FIGURE 26.3
Chapter 27
FIGURE 27.1
FIGURE 27.2
Chapter 29
FIGURE 29.1
Appendix A
FIGURE A.1
FIGURE A.2
FIGURE A.3
FIGURE A.4
FIGURE A.5
FIGURE A.6
FIGURE A.7
FIGURE A.8
FIGURE A.9
FIGURE A.10
FIGURE A.11
FIGURE A.12
FIGURE A.13
FIGURE A.14
FIGURE A.15
FIGURE A.16
FIGURE A.17
FIGURE A.18
FIGURE A.19
FIGURE A.20
FIGURE A.21
FIGURE A.22
FIGURE A.23
FIGURE A.24
FIGURE A.25
FIGURE A.26
FIGURE A.27
FIGURE A.28
FIGURE A.29
FIGURE A.30
FIGURE A.31
FIGURE A.32
FIGURE A.33
FIGURE A.34
FIGURE A.35
FIGURE A.36
FIGURE A.37
FIGURE A.38
FIGURE A.39
FIGURE A.40
Appendix B
FIGURE B.1
FIGURE B.2
FIGURE B.3
FIGURE B.4
FIGURE B.5
FIGURE B.6
FIGURE B.7
FIGURE B.8
FIGURE B.9
FIGURE B.10
FIGURE B.11
FIGURE B.12
FIGURE B.13
FIGURE B.14
FIGURE B.15
FIGURE B.16
FIGURE B.17
FIGURE B.18
FIGURE B.19
FIGURE B.20
FIGURE B.21
FIGURE B.22
FIGURE B.23
FIGURE B.24
Second Edition
Rod Stephens
INTRODUCTION
It has been estimated that more than 80 percent of all computer programming is database-related. This is certainly easy to believe. After all, a database can be a powerful tool for doing exactly what computer programs do best: store, manipulate, and display data.
Even many programs that seem at first glance to have little to do with traditional business-oriented data use databases to make processing easier. In fact, looking back on 40-some years of software development experience, I'm hard-pressed to think of a single nontrivial application that I've worked on that didn't use some kind of database.
Not only do databases play a role in many applications, but they often play a critical role. If the data is not properly stored, it may become corrupted, and the program will be unable to use it meaningfully. If the data is not properly organized, the program may be unable to find what it needs in a reasonable amount of time.
Unless the database stores its data safely and effectively, the application will be useless no matter how well designed the rest of the system may be. The database is like the foundation of a building; without a strong foundation, even the best-crafted building will fail, sometimes spectacularly (the Leaning Tower of Pisa notwithstanding).
With such a large majority of applications relying so heavily on databases, you would expect everyone involved with application development to have a solid, formal foundation in database design and construction. Everyone, including database designers, application architects, programmers, database administrators, and project managers, should ideally understand what makes a good database design. Even an application's key customers and users could benefit from understanding how databases work.
Sadly, that is usually not the case. Many IT professionals have learned what they know about databases through rumor, trial-and-error, tarot cards, and painful experience. Over the years, some develop an intuitive feel for what makes a good database design, but they may still not understand the reasons a design is good or bad, and they may leave behind a trail of rickety, poorly constructed programs built on shaky database foundations.
This book provides the tools you need to design a database. It explains how to determine what should go in a database and how a database should be organized to ensure data integrity and a reasonable level of performance. It explains techniques for designing a database that is strong enough to store data safely and consistently, flexible enough to allow the application to retrieve the data it needs quickly and reliably, and adaptable enough to accommodate a reasonable amount of change.
With the ideas and techniques described in this book, you will be able to build a strong foundation for database applications.
WHO THIS BOOK IS FOR
This book is intended for IT professionals and students who want to learn how to design, analyze, and understand databases. The material will benefit those who want a better high-level understanding of databases, such as proposal managers, architects, project managers, and even customers. The material will also benefit those who will actually design, build, and work with databases, such as database designers, database administrators, and programmers. In many projects, these roles overlap, so the same person may be responsible for working on the proposal, managing part of the project, and designing and creating the database.
This book is aimed at readers of all experience levels. It does not assume that you have any previous experience with databases or programs that use them. It doesn't even assume that you have experience with computers. All you really need is a willingness and desire to learn.
WHAT THIS BOOK COVERS
This book explains database design. It tells how to plan a database's structure so the database will be robust, resistant to errors, and flexible enough to accommodate a reasonable amount of future change. It explains how to discover database requirements, build data models to study data needs, and refine those models to improve the database's effectiveness.
The book solidifies these concepts by working through a detailed example that designs a (sort of) realistic database. Later chapters explain how to actually build databases using a few different database products. The book finishes by describing topics you need to understand to keep a database running effectively such as database maintenance and security.
WHAT YOU NEED TO USE THIS BOOK
This book explains database design. It tells how to determine what should go in a database and how the database should be structured to give the best results.
This book does not focus on actually creating the database. The details of database construction are different for different database tools, so to remain as generally useful as possible, this book doesn't concentrate on any particular database system. You can apply most of the techniques described here equally to whatever database tool you use, whether it's MariaDB, PostgreSQL, SQL Server, or some other database product.
NOTE Most database products include free editions that you can use for smaller projects. For example, SQL Server Express Edition, Oracle Express Edition, and MariaDB Community Server are all free.
To remain database-neutral, most of the book does not assume you are using a particular database, so you don't need any particular software or hardware. To work through the exercises, all you need is a pencil and some paper. You are welcome to type solutions into your computer if you like, but you may actually find working with pencil and paper easier than using a graphical design tool to draw pictures, at least until you are comfortable with database design and are ready to pick a computerized design tool.
Chapters 16 through 25 build example databases using particular database offerings, so their material is tied to the databases that they demonstrate. Chapter 15, “Example Overview,” introduces those chapters and lists the databases that they use.
To experiment with the SQL database language described in Chapter 26, “Introduction to SQL,” and Chapter 27, “Building Databases with SQL Scripts,” you need any database product that supports SQL (that includes pretty much all relational databases) running on any operating system.
HOW THIS BOOK IS STRUCTURED
The chapters in this book are divided into five parts plus appendixes. The chapters in each part are described here. If you have previous experience with databases, you can use these descriptions to decide which chapters to skim and which to read in detail.
PART 1: Introduction to Databases and Database Design
The chapters in this part of the book provide background that is necessary to understand the chapters that follow. You can skim some of this material if it is familiar to you, but don't take it too lightly. If you understand the fundamental concepts underlying database design, it will be easier to understand the point behind important design concepts presented later.
Chapter 1, “Database Design Goals,” explains the reasons people and organizations use databases. It explains a database's purpose and conditions that it must satisfy to be useful. This chapter also describes the basic ACID (Atomicity, Consistency, Isolation, Durability) and CRUD (Create, Read, Update, Delete) features that any good database should have. It explains in high-level general terms what makes a good database and what makes a bad database.
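To make the ACID and CRUD ideas concrete, here is a minimal sketch in Python. It assumes the standard library's sqlite3 module purely for convenience (SQLite is not one of the databases demonstrated in this book), and the Accounts table, the names, and the transfer helper are hypothetical. It shows the four CRUD operations plus atomicity: a transfer that would break a rule is undone as a whole.

```python
import sqlite3

# In-memory SQLite database; the Accounts table and its data are made up.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Accounts ("
    " name TEXT PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)

# Create: add two rows inside one transaction.
with conn:
    conn.executemany(
        "INSERT INTO Accounts (name, balance) VALUES (?, ?)",
        [("Alice", 100), ("Bob", 50)],
    )


def transfer(conn, source, target, amount):
    """Update: move money between accounts as one atomic transaction."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE Accounts SET balance = balance + ? WHERE name = ?",
                (amount, target),
            )
            conn.execute(
                "UPDATE Accounts SET balance = balance - ? WHERE name = ?",
                (amount, source),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances(conn):
    """Read: return every account's balance keyed by name."""
    return dict(conn.execute("SELECT name, balance FROM Accounts ORDER BY name"))


ok = transfer(conn, "Alice", "Bob", 30)       # both updates are committed
failed = transfer(conn, "Alice", "Bob", 500)  # overdraws Alice, so both updates are undone
after_transfers = balances(conn)

# Delete: remove one row.
with conn:
    conn.execute("DELETE FROM Accounts WHERE name = ?", ("Bob",))
```

The second transfer credits Bob before the CHECK constraint rejects the debit from Alice, yet Bob's balance afterward reflects only the first transfer. That all-or-nothing behavior is what atomicity means.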
Chapter 2, “Relational Overview,” explains basic relational database concepts such as tables, rows, and columns. It explains the common usage of relational database terms in addition to the more technical terms that are sometimes used by database theorists. It describes different kinds of constraints that databases use to guarantee that the data is stored safely and consistently.
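As a rough illustration of what constraints do, the following sketch again assumes Python's built-in sqlite3 module rather than any database covered later. The Courses and Enrollments columns, the sample rows, and the try_insert helper are all hypothetical. Each attempted insert violates a different kind of constraint, and the database refuses it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript(
    """
    CREATE TABLE Courses (
        course_id INTEGER PRIMARY KEY,
        title     TEXT NOT NULL UNIQUE
    );
    CREATE TABLE Enrollments (
        student   TEXT NOT NULL,
        course_id INTEGER NOT NULL REFERENCES Courses (course_id),
        grade     INTEGER CHECK (grade BETWEEN 0 AND 100),
        PRIMARY KEY (student, course_id)
    );
    INSERT INTO Courses (course_id, title) VALUES (1, 'Database Design');
    INSERT INTO Enrollments (student, course_id, grade) VALUES ('Ann', 1, 95);
    """
)


def try_insert(sql, params):
    """Return None if the row is accepted, else the database's error message."""
    try:
        with conn:
            conn.execute(sql, params)
        return None
    except sqlite3.IntegrityError as error:
        return str(error)


attempts = {
    "duplicate key": ("INSERT INTO Enrollments VALUES (?, ?, ?)", ("Ann", 1, 80)),
    "missing course": ("INSERT INTO Enrollments VALUES (?, ?, ?)", ("Bob", 99, 80)),
    "grade out of range": ("INSERT INTO Enrollments VALUES (?, ?, ?)", ("Cyd", 1, 250)),
    "null title": ("INSERT INTO Courses (course_id, title) VALUES (?, ?)", (2, None)),
}
results = {label: try_insert(sql, params) for label, (sql, params) in attempts.items()}
for label, message in results.items():
    print(f"{label}: {message}")
```

Because the rules live in the table definitions, every program that uses the database gets the same protection without repeating the checks in its own code.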
Chapter 3, “NoSQL Overview,” explains the basics of NoSQL databases, which are growing quickly in popularity. Those databases include document, key-value, column-oriented, and graph databases. Both relational and NoSQL databases can run either locally or in the cloud, but many NoSQL databases are more cloud-oriented, largely because they are newer technologies that were designed with the cloud in mind.
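To give a rough feel for how those four kinds of NoSQL databases shape data, the sketch below expresses some made-up order data using plain Python structures. No real database product is involved, every name and value is hypothetical, and real products add storage, indexing, and query features that this toy ignores.

```python
import json

# Document: a self-contained, nested record (JSON-like).
document = {
    "order_id": 1,
    "customer": {"name": "Alice", "city": "Bugsville"},
    "items": [{"product": "Leash", "quantity": 2}],
}

# Key-value: an opaque value that can be found only by its key.
stored = {"order:1": json.dumps(document)}
restored = json.loads(stored["order:1"])

# Column-oriented: each column's values are kept together,
# which makes whole-column calculations cheap.
columns = {
    "order_id": [1, 2],
    "customer": ["Alice", "Bob"],
    "total_cents": [1998, 500],
}
column_total = sum(columns["total_cents"])

# Graph: nodes plus labeled relationships between them.
nodes = {"alice": {"label": "Customer"}, "order1": {"label": "Order"}}
relationships = [("alice", "PLACED", "order1")]
placed_by_alice = [
    target
    for source, label, target in relationships
    if source == "alice" and label == "PLACED"
]
```

The same facts appear in each form; what changes is which questions are easy to answer, and that trade-off is the main thing to weigh when picking a database.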
PART 2: Database Design Process and Techniques
The chapters in this part of the book discuss the main pieces of relational database design. They explain how to understand what should be in the database, develop an initial design, separate important pieces of the database to improve flexibility, and refine and tune the design to provide the most stable and useful design possible.
Chapter 4, “Understanding User Needs,” explains how to learn about the users' needs and gather user requirements. It tells how to study the users' current operations, existing databases (if any), and desired improvements. It describes common questions that you can ask to learn about users' operations, desires, and needs, and how to build the results into requirements documents and specifications. This chapter explains what use cases are and shows how to use them and the requirements to guide database design and to measure success.
Chapter 5, “Translating User Needs into Data Models,” introduces data modeling. It explains how to translate the user's conceptual model and the requirements into other, more precise models that define the database design rigorously. This chapter describes several database modeling techniques, including user-interface models, semantic object models, entity-relationship diagrams, and relational models.
Chapter 6, “Extracting Business Rules,” explains how a database can handle business rules. It explains what business rules are, how they differ from database structure requirements, and how you can identify business rules. This chapter explains the benefits of separating business rules from the database structure and tells how to achieve that separation.
Chapter 7, “Normalizing Data,” explains one of the most important tools in relational database design: normalization. Normalization techniques allow you to restructure a database to increase its flexibility and make it more robust. This chapter explains various forms of normalization, emphasizing the stages that are most common and important: first, second, and third normal forms (1NF, 2NF, and 3NF). It explains how each of these kinds of normalization helps prevent errors and tells why it is sometimes better to leave a database slightly less normalized to improve performance.
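The kind of restructuring that normalization calls for can be sketched with plain Python data. In the hypothetical rows below, a customer's city depends on the customer rather than on the order, so it is repeated on every order. Splitting it into its own structure removes that transitive dependency, which is the kind of problem third normal form addresses. The data, the normalize function, and the join function are all made up for illustration.

```python
# Flat rows: the customer's city is repeated on every order.
flat_orders = [
    {"order_id": 1, "customer": "Alice", "city": "Bugsville", "item": "Leash"},
    {"order_id": 2, "customer": "Alice", "city": "Bugsville", "item": "Collar"},
    {"order_id": 3, "customer": "Bob", "city": "Abend", "item": "Cat toy"},
]


def normalize(rows):
    """Split rows so each customer's city is stored exactly once."""
    customers = {}
    orders = []
    for row in rows:
        known_city = customers.setdefault(row["customer"], row["city"])
        if known_city != row["city"]:
            # The flat layout allowed contradictory copies of the same fact.
            raise ValueError(f"Conflicting cities for {row['customer']}")
        orders.append(
            {"order_id": row["order_id"], "customer": row["customer"], "item": row["item"]}
        )
    return customers, orders


def join(customers, orders):
    """Rebuild the flat view by looking up each order's customer."""
    return [{**order, "city": customers[order["customer"]]} for order in orders]


customers, orders = normalize(flat_orders)

# Changing a city is now a single update instead of one per order.
customers["Alice"] = "Programmeria"
rebuilt = join(customers, orders)
```

The join function also hints at the cost. Reassembling the flat view takes extra lookups, which is why a design is sometimes left slightly less normalized for performance.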
Chapter 8, “Designing Databases to Support Software,” explains how databases fit into the larger context of application design and the development life cycle. This chapter explains how later development depends on the underlying database design. It discusses multi-tier architectures that can help decouple the application and database so there can be at least some changes to either without requiring changes to both.
Chapter 9, “Using Common Design Patterns,” explains some common patterns that are useful in many applications. Some of these techniques include implementing various kinds of relationships among objects, storing hierarchical and network data, recording temporal data, and logging and locking.
Chapter 10, “Avoiding Common Design Pitfalls,” explains some common design mistakes that occur in database development. It describes problems that can arise from insufficient planning, incorrect normalization, and obsession with ID fields and performance.
PART 3: A Detailed Case Study
If you follow all of the examples and exercises in the earlier chapters, by this point you will have seen all of the major steps for producing a good database design. However, it's often useful to see all the steps in a complicated process put together in a continuous sequence. The chapters in this part of the book walk through a detailed case study following all the phases of database design for the fictitious Pampered Pet database.
Chapter 11, “Defining User Needs and Requirements,” walks through the steps required to analyze the users' problem, define requirements, and create use cases. It describes interviews with fictitious customers that are used to identify the application's needs and translate them into database requirements.
Chapter 12, “Building a Data Model,” translates the requirements gathered in the previous chapter into a series of data models that precisely define the database's structure. This chapter builds user interface models, entity-relationship diagrams, semantic object models, and relational models to refine the database's initial design. The final relational models match the structure of a relational database fairly closely, so they are easy to implement.
Chapter 13, “Extracting Business Rules,” identifies the business rules embedded in the relational model constructed in the previous chapter. It shows how to extract those rules in order to separate them logically from the database's structure. This makes the database more robust in the face of future changes to the business rules.
Chapter 14, “Normalizing and Refining,” refines the relational model developed in the previous chapter by normalizing it. It walks through several versions of the database that are in different normal forms. It then selects the degree of normalization that provides a reasonable trade-off between robust design and acceptable performance.
PART 4: Example Programs
Though this book focuses on abstract database concepts that do not depend on a particular database product, it's also worth spending at least some time on more concrete implementation issues. The chapters in this part of the book describe some of those issues and explain how to build simple example programs that demonstrate a few different database products.
Chapter 15, “Example Overview,” provides a roadmap for the chapters that follow. It tells which chapters use which databases and how to get the most out of those chapters. Chapters 16 through 25 come in pairs, with the first describing an example in Python and the second describing a similar (although not always identical) program in C#.
Chapters 16 and 17 describe examples that use the popular MariaDB relational database running on the local machine.
Chapters 18 and 19 demonstrate the (also popular) PostgreSQL database, also running on the local machine.
Chapters 20 and 21 show how to use the Neo4j AuraDB graph database running in the cloud.
Chapters 22 and 23 describe examples that use the MongoDB Atlas document database, also running in the cloud.
Chapters 24 and 25 demonstrate the Apache Ignite key-value database running locally.
These examples are just intended to get you started. They are relatively simple examples and they do not show all of the possible combinations. For example, you can run an Apache Ignite database in the cloud if you like; there were just too many combinations to cover them all in this book.
PART 5: Advanced Topics
Although this book does not assume you have previous database experience, that doesn't mean it cannot cover some more advanced subjects. The chapters in this part of the book explain some more sophisticated topics that are important but not central to database design.
Chapter 26, “Introduction to SQL,” provides an introduction to SQL (Structured Query Language). It explains how to use SQL commands to create tables and to insert, select, update, and delete data. By using SQL, you can help insulate a program from the idiosyncrasies of the particular database product that it uses to store data.
Chapter 27, “Building Databases with SQL Scripts,” explains how to use SQL scripts to build a database. It explains the advantages of this technique, such as the ability to create scripts to initialize a database before performing tests. It also explains some of the restrictions on this method, such as the fact that the user may need to create and delete tables in a specific order to satisfy table relationships.
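The ordering restriction can be sketched as a dependency-sorting problem. The example below is one possible approach, not necessarily the method Chapter 27 uses. It assumes Python 3.9 or later for the standard graphlib module, and the four-table schema is hypothetical. Each table lists the tables that its foreign keys refer to, and the sort returns an order in which every table is created after the tables it depends on.

```python
from graphlib import TopologicalSorter

# Hypothetical schema: each table maps to the tables its foreign keys refer to.
references = {
    "Customers": set(),
    "Products": set(),
    "Orders": {"Customers"},
    "OrderItems": {"Orders", "Products"},
}

# static_order() yields each table only after all of its predecessors.
create_order = list(TopologicalSorter(references).static_order())

# Dropping tables must run the other way so no table loses a table it refers to.
drop_order = list(reversed(create_order))

print("CREATE order:", create_order)
print("DROP order:  ", drop_order)
```

If two tables refer to each other, static_order() raises graphlib.CycleError, which is a useful signal that the script needs to add one of the constraints after both tables exist.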
Chapter 28, “Database Maintenance,” explains some of the database maintenance issues that are part of any database application. Though performing and restoring backups, compressing tables, rebuilding indexes, and populating data warehouses are not strictly database design tasks, they are essential to any working application.
Chapter 29, “Database Security,” explains database security issues. It explains the kinds of security that some database products provide. It also explains some additional techniques that can enhance database security such as using database views to appropriately restrict the users' access to data.
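Here is a minimal sketch of a view that hides sensitive data. It assumes Python's built-in sqlite3 module for convenience, and the Employees table, its rows, and the EmployeeDirectory view are hypothetical. SQLite has no user accounts or privileges, so the sketch shows only the view itself. In a server database, you would typically let users read the view while withholding access to the underlying table, and the details of doing so vary by product.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Employees (
        employee_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL,
        department  TEXT NOT NULL,
        salary      INTEGER NOT NULL
    );
    INSERT INTO Employees VALUES (1, 'Ann', 'Sales', 70000);
    INSERT INTO Employees VALUES (2, 'Bob', 'Support', 55000);

    -- The view exposes only the non-sensitive columns.
    CREATE VIEW EmployeeDirectory AS
        SELECT employee_id, name, department FROM Employees;
    """
)

cursor = conn.execute("SELECT * FROM EmployeeDirectory ORDER BY employee_id")
visible_columns = [column[0] for column in cursor.description]
directory_rows = cursor.fetchall()

print(visible_columns)
print(directory_rows)
```

Anyone querying EmployeeDirectory sees names and departments but never the salary column, even with SELECT *.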
APPENDIXES
The book's appendixes provide additional reference material to supplement the earlier chapters.
Appendix A, “Exercise Solutions,” gives solutions to the exercises at the end of most of the book's chapters so that you can check your progress as you work through the book.
Appendix B, “Sample Relational Designs,” shows some sample designs for a variety of common database situations. These designs store information about such topics as books, movies, documents, customer orders, employee timekeeping, rentals, students, teams, and vehicle fleets.
The Glossary provides definitions for useful database and software development terms. The Glossary includes terms defined and used in this book in addition to a few other useful terms that you may encounter while reading other database material.
HOW TO USE THIS BOOK
Because this book is aimed at readers of all experience levels, you may find some of the material familiar if you have previous experience with databases. In that case, you may want to skim chapters covering material that you already thoroughly understand.
If you are familiar with relational databases, you may want to skim Chapter 1, “Database Design Goals,” and Chapter 2, “Relational Overview.” Similarly, if you have experience with NoSQL databases, you may want to skip Chapter 3, “NoSQL Overview.”
If you have previously helped write project proposals, you may understand some of the questions you need to ask users to properly understand their needs. In that case, you may want to skim Chapter 4, “Understanding User Needs.”
If you have built databases before, you may understand at least some of the data normalization concepts explained in Chapter 7, “Normalizing Data.” This is a complex topic, however, so I recommend that you not skip this chapter unless you really know what you're doing.
If you have extensive experience with SQL, you may want to skim Chapter 26, “Introduction to SQL.” (Many developers who have used but not designed databases fall into this category.)
In any case, I strongly recommend that you at least skim the material in every chapter to see if there are any new concepts you can pick up along the way. At least look at the Exercises at the end of each chapter before you decide that you can safely skip to the next. If you don't know how to outline the solutions to the Exercises, then you should consider looking at the chapter more closely.
Different people learn best in different ways. Some learn best by listening to lecturers, others by reading, and others by doing. Everyone learns better by combining learning styles. You will get the most from this book if you read the material and then work through the Exercises. It's easy to think to yourself, “Yeah, that makes sense” and believe you understand the material, but working through some of the Exercises will help solidify the material in your mind. Doing so may also help you see new ways that you can apply the concepts covered in the chapter.
NOTE Normally, when I read a new technical book, I work through every example, modifying the problems to see what happens if I try different things not covered by the author. I work through as many questions and exercises as I can until I reach the point where more examples don't teach me anything new (or I'm tired of breaking my system and having to reinstall things). Then I move on. It's one thing to read about a concept in the chapter; it's another to try to apply it to data that is meaningful to you.
After you have learned the ideas in the book, you can use it as a reference. For example, when you start a new project, you may want to refer to Chapter 4, “Understanding User Needs,” to refresh your memory about the kinds of questions you should ask users to discover their true needs.
Visit the book's website to look for updates and addendums. If readers find typographical errors or places where a little additional explanation may help, I'll post updates on the website.
Finally, if you get stuck on a really tricky concept and need a little help, email me at [email protected] and I'll try to help you out.
Database programming is boring. Maybe not to you and me, who have discovered the ecstatic joy of database design, the thrill of normalization, and the somewhat risqué elation brought by slightly denormalizing a database to achieve optimum performance. But let's face it, to a beginner, database design and development can be a bit dull.
There's little you can do to make the basic concepts more exciting, but you can do practically anything with the data. At some point it's useful to explain how to design a simple inventory system, but that doesn't mean you can't use other examples designed to catch students' attention. Data that relates to the students' personal experiences or that is just plain outrageous keeps them awake and alert (and most of us know that it's easier to teach students who are awake).
The examples in this book are intended to demonstrate the topic at hand but not all of them are strictly business-oriented. I've tried to make them cover a wide variety of topics from serious to silly. To keep your students interested and alert, you should add new examples from your personal experiences and from your students' interests.
I've had great success in my classroom using examples that involve sports teams (particularly local rivalries), music (combining classics such as Bach, Beethoven, and Tone Loc), the students in the class (but be sure not to put anyone on the spot), television shows and stars, comedians, and anything else that interests the students.
For exercises, encourage students to design databases that they will find personally useful. I've had students build databases that track statistics for the players on their favorite football teams, inventory their DVD or CD collections, file and search recipe collections, store data on “Magic: The Gathering” trading cards, track role-playing game characters, record information about classic cars, and schedule athletic tournaments. (The tournament scheduler didn't work out too well—the scheduling algorithms were too tricky.) One student even built a small but complete inventory application for his mother's business that she actually found useful. I think he was as shocked as anyone to discover he'd learned something practical.
When students find an assignment interesting and relevant, they become emotionally invested and will apply the same level of concentration and intensity to building a database that they normally reserve for console gaming, Star Wars, and World of Warcraft. They may spend hours crafting a database to track WoW alliances just to fulfill a 5-minute assignment. They may not catch every nuance of domain/key normal form, but they'll probably learn a lot about building a functional database.
If you're a student and you peeked at the previous section, “Note to Instructors,” shame on you! If you didn't peek, do so now.
Building a useful database can be a lot of work, but there's no reason it can't be interesting and useful to you when you're finished. Early in your reading, pick some sort of database that you would find useful (see the previous section for a few ideas) and think about it as you read through the text. When the book talks about creating an initial design, sketch out a design for your database. When the book explains how to normalize a database, normalize yours. As you work through the exercises, think about how they would apply to your dream database.
Don't be afraid to ask your instructor if you can use your database instead of one suggested by the book for a particular assignment (unless you have one of those instructors who hand out extra work to anyone who crosses their path; in that case, keep your head down). Usually an instructor's thought process is quite simple: “I don't care what database you use as long as you learn the material.” Your instructor may want your database to contain several related tables so that you can create the complexity needed for a particular exercise, but it's usually not too hard to make a database complicated enough to be interesting.
When you're finished, you will hopefully know a lot more about database design than you do now, and if you're persistent, you might just have a database that's actually good for something. Hopefully you'll also know how to design other useful databases in the future. (And when you're finished, email me at [email protected] and let me know what you built!)
To help you get the most from the text and keep track of what's happening, we've used a number of conventions throughout the book.
NOTE Tips, hints, tricks, and asides to the current discussion are offset and placed in italics like this.
Activities are exercises that you should work through, following the text in the book.
They usually consist of a set of steps.
Each step has a number.
Follow the steps with your copy of the database.
After most activity instruction sections, the process you've stepped through is explained in detail.
As for styles in the text:
We highlight new terms and important words when we introduce them.
We show keyboard strokes like this: Ctrl+A.
We show filenames, URLs, and code within the text like so: SELECT * FROM Students.
We present blocks of code like this:
We use a monofont type with no highlighting for code examples.
As you work through the examples in this book, you may choose either to type in all the code manually or to use the source code files that accompany the book. All of the source code used in this book is available for download at www.wiley.com/go/beginningdbdesign2e.
If you have questions, suggestions, comments, want to swap cookie recipes, or just want to say “Hi,” email me at [email protected]. I can't promise that I'll be able to help you with every problem, but I do promise to try.
Many of the examples in this book were chosen for interest or humorous effect. They are not intended to disparage anyone. I mean no disrespect to police officers (or anyone else who regularly carries a gun), plumbers, politicians, jewelry store owners, street luge racers (or anyone else who wears helmets and Kevlar body armor to work), or college administrators. Or anyone else for that matter.
Well, maybe politicians.
Chapter 1: Database Design Goals
Chapter 2: Relational Overview
Chapter 3: NoSQL Overview
The chapters in this part of the book provide background that is useful when studying database design.
Chapter 1 explains the reasons why database design is important. It discusses the goals that you should keep in mind while designing databases. If you keep those goals in mind, then you can stay focused on the end result and not become bogged down in the minutiae of technical details. If you understand those goals, then you will also know when it might be useful to bend the rules a bit.
Chapter 2 provides background on relational databases. It explains common relational database terms and concepts that you need to understand for the chapters that follow. You won’t get as much out of the rest of the book if you don’t understand the terminology.
Chapter 3 describes NoSQL databases. While this book (and most other database books) focuses on relational databases, there are other kinds of databases that are better suited to some tasks. NoSQL databases provide some alternatives that may work better for you under certain circumstances. (I once worked on a 40-developer project that failed largely because it used the wrong kind of database. Don’t let that happen to you!)
Even if you’re somewhat familiar with databases, give these chapters at least a quick glance to ensure that you don’t miss anything important. Pay particular attention to the terms described in Chapter 2, because you’ll need to know them later.
Using modern database tools, just about anyone can build a database. The question is, will the resulting database be useful?
A database won't do you much good if you can't get data out of it quickly, reliably, and consistently. It won't be useful if it's full of incorrect or contradictory data, nor will it be useful if it is stolen, lost, or corrupted by data that was only half written when the system crashed.
You can address all of these potential problems by using modern database tools, a good database design, and a pinch of common sense, but only if you understand what those problems are so you can avoid them.
The first step in the quest for a useful database is understanding database goals. What should a database do? What makes a database useful and what problems can it solve? Working with a powerful database tool without goals is like flying a plane through clouds without a compass—you have the tools you need but no sense of direction.
This chapter describes the goals of database design. By studying information containers, such as files that can play the role of a database, the text defines properties that good databases have and problems that they should avoid.
In this chapter, you will learn about the following:
Why a good database design is important
The strengths and weaknesses of various kinds of information containers that can act as databases
How computerized databases can benefit from those strengths and avoid those weaknesses
How good database design helps achieve database goals
What CRUD, ACID, and BASE are, and why they are relevant to database design
Forget for a moment that this book is about designing databases and consider software design in general. Software design plays a critical role in software development. The design lays out the general structure and direction that future development will take. It determines which parts of the system will interact with other parts. It decides which subsystems will provide support for other pieces of the application.
If an application's underlying design is flawed, the system as a whole is at risk. Bad assumptions in the design creep into the code at the application's lowest levels, resulting in flawed subsystems. Higher-level systems built on those subsystems inherit those design flaws, and soon their code is corrupted, too.
Sometimes, a sort of decay pervades the entire system and nobody notices until relatively late in the project. The longer the project continues, the more entrenched the incorrect assumptions become, and the more reluctant developers are to scrap the whole design and start over. The longer problems remain in the system, the harder they are to remove. At some point, it might be easier to throw everything away and start over from scratch, a decision that few managers will want to present to upper management.
An engineer friend of mine was working on a really huge satellite project. After a while, the engineers all realized that the project just wasn't feasible given the current state of technology and the design. Eventually, the project manager was forced to admit this to upper management and he was fired. The new project manager stuck it out for a while and then he, too, was forced to confess to upper management that the project was unfeasible. He, too, was fired.
For a while, this process continued—with a new manager taking over, realizing the hopelessness of the design, and being fired. That is, until eventually even upper management had to admit the project wasn't going to work out and the whole thing collapsed.
They could have saved time, money, and several careers if they had spent more time up-front on the design and either fixed the problems or realized right away that the project wasn't going to work and scrapped it at the start.
Building an application is often compared to building a house or skyscraper. You probably wouldn't start building a multibillion-dollar skyscraper without a comprehensive design that is based on well-established architectural principles. Unfortunately, software developers often rush off to start coding as soon as they possibly can because coding is more fun and interesting than design is. Coding also lets developers tell management and customers how many lines of code they have written, so it seems like they are making progress even if the lines of code are corrupted by false assumptions. Only later do they realize that the underlying design is flawed, the code they wrote is worthless, and the project is in serious trouble.
Now, let's get back to database design. Few parts of an application's design are as critical as the database's design. The database is the repository of the information that the rest of the application manages and displays to the users. If the database doesn't store the right data, doesn't keep the data safe, or doesn't let the application find the data it needs, then the application has little chance for success. Here, the garbage-in, garbage-out (GIGO) principle is in full effect. If the underlying data is unsound, it doesn't matter what the application does with it; the results will be suspect at best.
For example, imagine that you've built an order-tracking system that can quickly fetch information about a customer's past orders. Unfortunately, every time you ask the program to fetch a certain customer's records, it returns a slightly different result. Although the program can find data quickly, the results are not trustworthy enough to be usable.
For another example, imagine that you have built an amazing program that can track the thousands of tasks that make up a single complex job, such as building a cruise liner or passenger jet. It can track each task's state of completion, determine when you need to order new parts so that they are ready for future construction phases, and even determine the present value of future purchases so you can decide whether it is better to buy parts now or wait until they are needed. Unfortunately, the program takes hours to recalculate the complex task schedule and pricing details. Although the calculations are correct, they are so slow that users cannot reasonably make any changes. Changing the color of the fabric of a plane's seats or the tile used in a cruise liner's hallways could delay the whole project. (I once worked on a project with a similar issue. It worked, but it was so slow that it became a serious problem.)
For a final example, suppose you have built an efficient subscription application that lets customers subscribe to your company's quarterly newsletters, data services, and sarcastic demotivational quote of the day. It lets you quickly find and update any customer's subscriptions, and it always consistently shows the same values for a particular customer. Unfortunately, when you change the price of one of your publications, you find that not all of the customers' records show the updated price. Some customers' subscriptions are at the new rate, some are at the old rate, and some seem to be at a rate you've never seen before. (This example isn't as far-fetched as it may seem. Some systems allow you to offer sale prices or special incentives to groups of customers, or they allow sales reps to offer special prices to particular customers. That kind of system requires careful design if you want to be able to do things like change standard prices without messing up customized pricing.)
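One way to picture the pricing problem is with a small sketch. The following is not a design from this book; it is a minimal illustration, assuming hypothetical Publications and Subscriptions tables and made-up customers, run against an in-memory SQLite database through Python's standard sqlite3 module. The standard price is stored in exactly one place, and a customer's special price is stored separately, so changing the standard price cannot leave some records behind or disturb customized pricing.

```python
import sqlite3

# Hypothetical schema and data, invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Publications (
    PublicationId INTEGER PRIMARY KEY,
    Title         TEXT NOT NULL,
    StandardPrice REAL NOT NULL
);
CREATE TABLE Subscriptions (
    CustomerName  TEXT NOT NULL,
    PublicationId INTEGER NOT NULL REFERENCES Publications (PublicationId),
    CustomPrice   REAL  -- NULL means "use the standard price"
);
INSERT INTO Publications VALUES (1, 'Quarterly Newsletter', 20.0);
INSERT INTO Subscriptions VALUES ('Alice', 1, NULL);
INSERT INTO Subscriptions VALUES ('Bob',   1, NULL);
INSERT INTO Subscriptions VALUES ('Cindy', 1, 15.0);
""")

# The standard price lives in one row, so one UPDATE changes it for everyone.
conn.execute(
    "UPDATE Publications SET StandardPrice = 25.0 WHERE PublicationId = 1"
)

# Each customer pays the custom price if there is one, else the standard price.
prices = dict(conn.execute("""
    SELECT s.CustomerName, COALESCE(s.CustomPrice, p.StandardPrice)
    FROM Subscriptions AS s
    JOIN Publications AS p ON p.PublicationId = s.PublicationId
"""))
conn.close()

print(prices)
```

In this sketch, Alice and Bob pick up the new standard price automatically, while Cindy keeps her special rate. A real system would need more than this (price histories and sale dates, for example), but the idea of storing each fact in one place is the same.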
Poor database design can lead to these and other annoying and potentially expensive scenarios. A good design creates a solid foundation on which you can build the rest of the application.
Experienced developers know that the longer a bug remains in a system, the harder it is to find and fix. From that it logically follows that it is extremely important to get the design right before you start building on it.
Database design is no exception. A flawed database design can doom a project to failure before it has begun as surely as ill-conceived software architecture, poor implementation, or incompetent programming can.
What is a database? This may seem like a trivial question, but if you take it seriously the result can be pretty enlightening. By studying the strengths and weaknesses of some physical objects that meet the definition of a database, you can learn about the features that you might like a computerized database to have.
DEFINITION A database is a tool that stores data and lets you create, read, update, and delete the data in some manner.
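To make the four operations in the definition concrete, here is a minimal sketch that uses Python's standard sqlite3 module with an in-memory database. The Students table, its columns, and the sample names are assumptions made up for this illustration; any tool that lets you do these four things in some manner fits the definition.

```python
import sqlite3

# Hypothetical table and data, invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Students (StudentId INTEGER PRIMARY KEY, Name TEXT NOT NULL)"
)

# Create: add a new record.
conn.execute("INSERT INTO Students (StudentId, Name) VALUES (?, ?)", (1, "Ann"))

# Read: fetch the record back.
name_after_create = conn.execute(
    "SELECT Name FROM Students WHERE StudentId = ?", (1,)
).fetchone()[0]

# Update: change the record in place.
conn.execute("UPDATE Students SET Name = ? WHERE StudentId = ?", ("Anne", 1))
name_after_update = conn.execute(
    "SELECT Name FROM Students WHERE StudentId = ?", (1,)
).fetchone()[0]

# Delete: remove the record.
conn.execute("DELETE FROM Students WHERE StudentId = ?", (1,))
remaining = conn.execute("SELECT COUNT(*) FROM Students").fetchone()[0]
conn.close()

print(name_after_create, name_after_update, remaining)
```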
This is a pretty broad definition and includes a lot of physical objects that most people don't think of as modern databases. For example, Figure 1.1 shows a box full of business cards, a notebook, a filing cabinet full of customer records, and your brain, all of which fit this definition. Each of these physical databases has advantages and disadvantages that can give insight into the features that you might like in a computer database.
FIGURE 1.1
A box of business cards is useful as long as it doesn't contain too many cards. You can find a particular piece of data (for example, the phone number for your favorite Canadian restaurant) by looking through all the cards, but if you have more than a dozen or so, finding a particular card can be time-consuming. You can easily expand the database by shoving more cards into the box, at least up to a point. You can even rearrange the cards a bit to improve performance for cards you use often: each time you use a card, move it to the front of the box. Over time, the cards that you use most often will migrate to the front.
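The move-it-to-the-front trick can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the box is modeled as a plain list, and the find_card function, the card names, and the phone numbers are all hypothetical. The search still looks through the cards one at a time, but a card that was used recently is found almost immediately.

```python
def find_card(box, name):
    """Search the box front to back and move a found card to the front.

    Returns (card, cards_examined), or (None, cards_examined) if the
    card is not in the box.
    """
    for position, card in enumerate(box):
        if card["name"] == name:
            # Move the card we just used to the front of the box.
            box.insert(0, box.pop(position))
            return card, position + 1
    return None, len(box)


# A made-up box of cards, for illustration only.
box = [
    {"name": "Plumber", "phone": "555-0101"},
    {"name": "Dentist", "phone": "555-0102"},
    {"name": "Canadian restaurant", "phone": "555-0103"},
]

_, first_cost = find_card(box, "Canadian restaurant")   # examines 3 cards
_, second_cost = find_card(box, "Canadian restaurant")  # examines 1 card
print(first_cost, second_cost)
```

The first search has to look at every card, but the second finds the card right away. A search for a card that isn't in the box still has to examine every card, which is why this approach stops helping once the box gets large.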
A notebook (the cardboard and paper kind, not the small laptop kind) is small, easy to use, easy to carry, doesn't require electricity, and doesn't need to boot before you can use it. A notebook database is also easily extensible because you can buy another notebook to add to your collection when the first one is full. However, a notebook's contents are arranged sequentially. If you want to find information about a particular topic, you'll have to look through the pages one at a time until you find what you want. The more data you have, the harder this kind of search becomes.