80,99 €
A hands-on beginner’s guide to designing relational databases and managing data using Microsoft Access
Relational databases represent one of the most enduring and pervasive forms of information technology. Yet most texts covering relational database design assume an extensive, sophisticated computer science background. There are texts on relational database software tools like Microsoft Access that assume less background, but they focus primarily on details of the user interface, with inadequate coverage of the underlying design issues of how to structure databases. Growing out of Professor Jonathan Eckstein’s twenty years’ experience teaching courses on management information systems (MIS) at Rutgers Business School, this book fills this gap in the literature by providing a rigorous introduction to relational databases for readers without prior computer science or programming experience.
Relational Database Design for Business, with Microsoft Access helps readers to quickly develop a thorough, practical understanding of relational database design. It takes a step-by-step, real-world approach, using application examples from business and finance every step the way. As a result, readers learn to think concretely about database design and how to address issues that commonly arise when developing and manipulating relational databases. By the time they finish the final chapter, students will have the knowledge and skills needed to build relational databases with dozens of tables. They will also be able to build complete Microsoft Access applications around such databases. This text:
Introductory Relational Database Design for Business, with MicrosoftAccess is the definitive guide for undergraduate and graduate students in business, finance, and data analysis without prior experience in database design. While Microsoft Access is its primary “hands-on” learning vehicle, most of the skills in this text are transferrable to other relational database software such as MySQL.
Sie lesen das E-Book in den Legimi-Apps auf:
Seitenzahl: 487
Veröffentlichungsjahr: 2017
Cover
Title Page
Preface
Why Did We Write this Book?
1 Basic Definitions and Concepts
Basic Terms and Definitions
Types of Information Systems
2 Beginning Fundamentals of Relational Databases and MS Access
Beginning Fundamentals of MS Access
Introduction to Forms
Another Method to Create Forms
Introduction to Reports
Introduction to Queries
Common Datatypes in MS Access
Exercises
3 Introduction to Data Management and Database Design
Introduction to Data Management
General Data Management Issues
Classifying Information Systems Tasks: Transaction and Analytical Processing
What Is Wrong with Just One Table?
Repeating Groups
An Illustration of Multiple Tables and Foreign Keys
4 Basic Relational Database Theory
Tables and Their Characteristics
Primary Keys and Composite Keys
Foreign Keys and Outline Notation
Creating Entity‐Relationship (ER) Diagrams
Functional Dependency
Dependency Diagrams
Partial Dependency
Transitive Dependency
Database Anomalies
What Causes Anomalies?
How to Fix Anomalies
Good Database Design Principles
Normalization and Zip Codes
Expanding the Customer Loans Database
DVD Lending Library Example without Loan History
The DVD Lending Library Example with Loan History
Subtypes
Exercises
5 Multiple Tables in Access
The Relationships Window
and Referential Integrity
Nested Table View
Nested Forms
Queries with Multiple Tables
Multiple Joins
and Aggregation
Personnel: Database Design with Multiple Paths between Tables
Creating the Database in Access using Autonumber Keys
A Simple Query and a Different Way to Express Joins in SQL
Exercises
6 More about Forms and Navigation
More Capabilities of Forms
Packaging it Up – Navigation
Exercises
7 Many‐to‐Many Relationships
Focus Groups Example
The Plumbing Store
: Many‐to‐Many with an Additional Quantity Field
Hands‐On Exercise and More About Queries and SQL
Project Teams
: Many‐to‐Many with “Flavors” of Membership
The Library
Exercises
8 Multiple Relationships between the Same Pair of Tables
Commuter Airline Example
The College
Sports League Example
Multiple Relationships in Access
Exercises
9 Normalization
First Normal Form
Second Normal Form
Third Normal Form
More Normal Forms
Key Factors to Recognize 3NF
Example with Multiple Candidate Keys
Normalizing an Office Supplies Database
Summary of Guidelines for Database Design
Exercises
10 Basic Structured Query Language (SQL)
Using SQL in Access
The SELECT … FROM Statement
WHERE Conditions
Inner Joins
Cartesian Joins and a Different Way to Express Inner Joins
Aggregation
GROUP BY
HAVING
ORDER BY
The Overall Conceptual Structure of Queries
Exercises
11 Advanced Query Techniques
Outer Joins
Outer Joins and Aggregation
Joining Multiple Records from the Same Table: AS in the FROM Clause
Another Use for AS in the FROM Clause
An Introduction to Query Chaining
and Nesting
A More Complicated Example of Query Chaining: The League Standings
Subqueries and Back to the Plumbing Store Database
Practical Considerations and “Bending the Rules” Against Redundancy
Exercises
12 Unary Relationships
Employee Database
Setting Up and Querying a Unary Relationship in Access
The Course Catalog Database
Exercises
Further Reading
Database Theory and Design
SQL
Microsoft Access
Management Information Systems in General
Index
End User License Agreement
Chapter 02
Table 2.1 Example table of student data.
Table 2.2 Sample data entered into the STUDENT table.
Table 2.3 Common MS Access datatypes.
Chapter 04
Table 4.1 Sample table of YWCA activities, one activity permitted per member.
Table 4.2 Sample table of YWCA activities, multiple activities permitted per member.
Table 4.3 Sample table of car data.
Table 4.4 Sample data for states and cities.
Table 4.5 YWCA activity data again, multiple activities per member.
Table 4.6 Activities table again, one activity per member.
Chapter 05
Table 5.1 Output of the multi‐table query with record selection criteria.
Table 5.2 Output of the three‐table query.
Table 5.3 Result of three‐table query with aggregation.
Table 5.4 Result of aggregation query with criteria.
Table 5.5 Output of query aggregating by customer.
Table 5.6 Output of query grouping by customer IDs but not displaying them.
Table 5.7 Output of personnel database query.
Chapter 07
Table 7.1 Output of the first plumbing store query.
Table 7.2 Output of the plumbing store query with a computed field.
Table 7.3 Completed plumbing store query with total value of each order.
Chapter 09
Table 9.1 Investment holdings database that is not in first normal form.
Table 9.2 Example data in the normalized investment holdings database.
Table 9.3 Office supply data before normalization.
Table 9.4 Data for temporary employment record normalization exercise.
Table 9.5 Data for hospital blood test normalization exercise.
Table 9.6 Data for car wash normalization exercise.
Table 9.7 Data for fire and rescue squad normalization exercise.
Table 9.8 Data for temporary employment agency normalization exercise.
Table 9.9 Data for parking authorization normalization exercise.
Table 9.10 Data for marketing visit normalization exercise.
Table 9.11 Data for pizza delivery normalization exercise.
Table 9.12 Data for ad placement normalization exercise.
Chapter 10
Table 10.1 Output from first SQL inner join example query.
Table 10.2 Results of query selecting products appearing at least 10 times in the same order.
Table 10.3 Results of OR query after correcting inadvertent Cartesian join.
Table 10.4 Common SQL aggregation functions.
Table 10.5 Output of query computing revenue on each date.
Table 10.6 Query computing revenue on each date, using AS to obtain a more understandable column name.
Table 10.7 Output of query showing revenue aggregated by customer.
Table 10.8 Output of simpler query computing revenue per customer, but identifying customers only by
CustomerID
.
Table 10.9 Output of query identifying high‐spending customers.
Table 10.10 Output of query showing high‐spending customers sorted by expenditure.
Table 10.11 High‐spending customers sorted by their names.
Table 10.12 Orders sorted by customer name and date.
Chapter 11
Table 11.1 The COACH table.
Table 11.3 Result of a LEFT JOIN of the COACH and TEAM tables.
Table 11.4 Teams with no coaches do not appear in the query output when using a conventional join.
Table 11.5 Using an outer join causes teams with no coaches to appear in the query output.
Table 11.6 Immediate result of the outer join operation between TEAM and COACH.
Table 11.7 First seven rows of output of the initial version of low‐selling product query.
Table 11.8 Output of low‐selling product query using an outer join.
Table 11.9 The GAME table.
Table 11.10 Output of query showing games with the nicknames of both participating teams.
Table 11.11 Result of chained query showing each customer's largest order.
Table 11.12 Query output showing games won by each team, but omitting winless teams.
Table 11.13 Query output correctly showing wins for each team.
Table 11.14 Query output showing losses for each team.
Table 11.15 Results of the completed league standings query chain.
Chapter 12
Table 12.1 Query output showing names of employees and names of their supervisors.
Table 12.2 Query output showing names of employees and supervisors, using an outer join to avoid excluding the highest‐level supervisor.
Table 12.3 Output of query showing employees and supervisors in different locations.
Table 12.4 Output of query showing employees who directly supervise at least four others.
Table 12.5 Output of query showing employees and their supervisors’ supervisors.
Table 12.6 Data for employee database normalization exercise.
Chapter 01
Figure 1.1 Information systems and the levels of an organization.
Chapter 02
Figure 2.1 Objects within an Access file.
Figure 2.2 A table interacting with a form.
Figure 2.3 When Access opens.
Figure 2.4 A new Access table.
Figure 2.5 Naming a table in Access.
Figure 2.6 Table design view.
Figure 2.7 Completed Design View for the STUDENT table.
Figure 2.8 Design View of a form linked to the STUDENT table.
Figure 2.9 Completed student form with GPA display.
Figure 2.10 The Form Wizard.
Figure 2.11 Selecting “columnar” form layout.
Figure 2.12 Finding the combo box tool.
Figure 2.13 Completed student form with GPA display and combo box to select majors.
Figure 2.14 Grouping control in the Report Wizard.
Figure 2.15 Initial appearance of the “students by major” report.
Figure 2.16 Adding a report footer.
Figure 2.17 The Access query grid (Design View).
Figure 2.18 Student query results.
Figure 2.19 Sorted student query results.
Figure 2.20 Student query results with a calculated field.
Figure 2.21 Adding selection criteria to the query grid.
Figure 2.22 Desired appearance of the country form.
Figure 2.23 Desired appearance of the history book form.
Figure 2.24 Desired appearance of the county form.
Chapter 03
Figure 3.1 The CUSTOMER and LOAN entities.
Figure 3.2 Completed ER diagram for customers and loans.
Chapter 04
Figure 4.1 ER diagram for states and cities.
Figure 4.2 Dependency diagram for activity table, one activity per member.
Figure 4.3 Outline description and dependency diagram for activity table, multiple activities per member.
Figure 4.4 A partial dependency.
Figure 4.5 A transitive dependency.
Figure 4.6 Dependency diagram with a partial dependency.
Figure 4.7 Dependency diagram for normalized activity database, one activity per member.
Figure 4.8 Dependency diagram for a table with zip codes.
Figure 4.9 ER diagram for the loans and customers database.
Figure 4.10 One‐to‐many relationship between loans and payments.
Figure 4.11 ER diagram of completed loan database with payments.
Figure 4.12 Dependency diagram for entire DVD library database without loan histories.
Figure 4.13 ER diagram for DVD library database without loan histories.
Figure 4.14 ER diagram for DVD library database with a categories table but no loan histories.
Figure 4.15 ER diagram for DVD library database with a category table and loan histories.
Figure 4.16 ER diagram for economics department database, first version.
Figure 4.17 ER diagram for economics department database with a subtype.
Figure 4.18 ER diagram for economics department database with a subtype and two advisor relationships.
Figure 4.19 Example of a tree of subtypes.
Figure 4.20 ER diagram of economics department database, with FACULTY and STUDENT both subtypes of PERSON.
Chapter 05
Figure 5.1 ER diagram for loan database with payments.
Figure 5.2 Relationships Window before creating relationships.
Figure 5.3 The Edit Relationship dialog box.
Figure 5.4 Relationships Window with CUSTOMER‐LOAN relationship.
Figure 5.5 Completed Relationships Window for loans database with payments.
Figure 5.6 Access displays the LOAN table.
Figure 5.7 Trying to enter a nonexistent customer in the LOAN table.
Figure 5.8 Access displays the CUSTOMER table.
Figure 5.9 Table Design View, showing a validation rule.
Figure 5.10 Example of nested table view.
Figure 5.11 Using nested table view with multiple levels of nesting.
Figure 5.12 Access creates a nested form for customers and loans.
Figure 5.13 Fine‐tuning the design of the nested customer‐loan form.
Figure 5.14 How Access displays two related tables in Query Design View.
Figure 5.15 Design View of a query joining the CUSTOMER and LOAN tables.
Figure 5.16 Manipulating join properties in Query Design View.
Figure 5.17 How Access displays an outer join.
Figure 5.18 A multi‐table query with record selection criteria.
Figure 5.19 How Access displays three related tables in Query Design View.
Figure 5.20 Query Design View for a three‐table query.
Figure 5.21 Three‐table query with aggregation.
Figure 5.22 Adding criteria to the query with aggregation.
Figure 5.23 Aggregating by customer instead of by order.
Figure 5.24 Grouping by customer IDs without displaying them.
Figure 5.25 ER diagram for the personnel database, before including zip codes.
Figure 5.26 ER diagram for personnel database with zip codes.
Figure 5.27 Completed Relationships Window for personnel database.
Chapter 06
Figure 6.1 Form produced by the Form Wizard.
Figure 6.2 Employee form with combo boxes to select health plans and branch offices.
Figure 6.3 Nested form for branch offices.
Figure 6.4 Nested form for branch offices, after formatting adjustments.
Figure 6.5 Nested form for health providers.
Figure 6.6 Database navigation form.
Figure 6.7 Adding clip art to the database navigation form.
Chapter 07
Figure 7.1 Incomplete ER diagram for the focus groups database.
Figure 7.2 How the MODERATOR and GROUP tables are indirectly related through MEETING.
Figure 7.3 Introducing a new table to mediate the many‐to‐many relationship between GROUP and CONSUMER.
Figure 7.4 Shorthand depiction of a many‐to‐many relationship.
Figure 7.5 Completed ER diagram for the focus groups database.
Figure 7.6 Plumbing store database depicted with only one‐to‐many relationships.
Figure 7.7 Plumbing store database depicted as having a many‐to‐many relationship.
Figure 7.8 Sample invoice from the plumbing store.
Figure 7.9 Relationship Window for the plumbing store database.
Figure 7.10 Design View for the first query of the plumbing store database.
Figure 7.11 Adding a computed field to the plumbing store query.
Figure 7.12 Introducing aggregation to the computed field query.
Figure 7.13 Completed plumbing store query showing the total value of each order.
Figure 7.14 ER diagram for project teams database.
Figure 7.15 ER diagram for project teams database, showing two one‐to‐many relationships in place of the many‐to‐many relationship.
Figure 7.16 ER diagram for project teams database with two parallel many‐to‐many relationships (not recommended).
Figure 7.17 A subtype‐based approach to multiple “flavors” of membership.
Figure 7.18 ER diagram for the library database.
Chapter 08
Figure 8.1 ER diagram for initial design of commuter airline database.
Figure 8.2 Corrected ER diagram for commuter airline database.
Figure 8.3 ER diagram for the commuter airline database with an AIRPORT table.
Figure 8.4 Incorrect approach to adding airport information to the commuter airline database.
Figure 8.5 Alternative design of the commuter airline database with airports, using an intermediary table.
Figure 8.6 Intermediary table approach to the commuter airline database, depicted as a many‐to‐many relationship.
Figure 8.7 ER diagram for the college database.
Figure 8.8 ER diagram for sports league database.
Figure 8.9 Incomplete Relationships Window for sports league database.
Figure 8.10 Completed Relationships Window for sports league database.
Chapter 09
Figure 9.1 Hierarchy of normal forms.
Figure 9.2 Dependency diagram of the investment holdings database after conversion to first normal form.
Figure 9.3 Partial dependencies in the HOLDING table.
Figure 9.4 Dependency diagram of the investment holdings database after conversion to second normal form.
Figure 9.5 A transitive dependency in the CUSTOMER table.
Figure 9.6 ER diagram for the investment holdings database converted to third normal form.
Figure 9.7 Dependency diagram of the investment holdings database in third normal form.
Figure 9.8 Relationships Window for the normalized investment holdings database.
Figure 9.9 Dependency diagram for a table with multiple candidate keys.
Figure 9.10 Dependency diagram for the office supply table, before normalization.
Figure 9.11 Office supply dependency diagram after removing partial dependencies.
Figure 9.12 Office supply dependency diagram after conversion to third normal form.
Figure 9.13 ER diagram for normalized office supply database, showing only one‐to‐many relationships.
Figure 9.14 ER diagram for normalized office supply database, depicted with a many‐to‐many relationship.
Figure 9.15 Relationships Window for normalized office supply database.
Chapter 10
Figure 10.1 Relationships Window for plumbing supply store database.
Figure 10.2 Example of a Cartesian join.
Figure 10.3 Example of a Cartesian join being reduced to an inner join through a WHERE condition.
Figure 10.4 Conceptual order of query processing steps.
Chapter 11
Figure 11.1 Flow of information between tables and queries when computing the league standings.
Figure 11.2 ER diagram for the original design of the plumbing store database.
Figure 11.3 ER diagram for the plumbing store database with an additional table for prices.
Chapter 12
Figure 12.1 Entity‐relationship (ER) diagram for the employee database (incomplete).
Figure 12.2 ER diagram for the employee database with separate tables for employees and supervisors (not recommended).
Figure 12.3 More complicated ER diagram for employee database with separate tables for employees and supervisors (also not recommended).
Figure 12.4 Representing supervisors through a subtype.
Figure 12.5 ER diagram for the employee database with a unary relationship.
Figure 12.6 Relationships Window for the employee database, before creating the unary relationship.
Figure 12.7 Adding a second record from the EMPLOYEE table to the Relationships Window.
Figure 12.8 Completed employee database Relationships Window showing a unary relationship.
Figure 12.9 ER diagram of course catalog database without course prerequisite information.
Figure 12.10 ER diagram for the course catalog database using a unary many‐to‐many relationship.
Figure 12.11 Depicted with one‐to‐many relationships, a unary many‐to‐many relationship looks the same as a two‐to‐many relationship.
Cover
Table of Contents
Begin Reading
iii
iv
ix
x
xi
xii
1
2
3
4
5
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
43
44
45
46
47
48
49
50
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
301
302
303
304
305
306
307
308
309
Jonathan Eckstein
MSIS DepartmentRutgers Business SchoolUnited States
Bonnie R. Schultz
Schultz Writing ServicesPrinceton, New JerseyUnited States
This edition first published 2018© 2018 John Wiley & Sons Ltd
All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, except as permitted by law. Advice on how to obtain permission to reuse material from this title is available at http://www.wiley.com/go/permissions.
The right of Jonathan Eckstein and Bonnie R. Schultz to be identified as the author of this work has been asserted in accordance with law.
Registered Office(s)John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, USAJohn Wiley & Sons Ltd, The Atrium, Southern Gate, Chichester, West Sussex, PO19 8SQ, UK
Editorial Office9600 Garsington Road, Oxford, OX4 2DQ, UK
For details of our global editorial offices, customer services, and more information about Wiley products visit us at www.wiley.com.
Wiley also publishes its books in a variety of electronic formats and by print‐on‐demand. Some content that appears in standard print versions of this book may not be available in other formats.
Limit of Liability/Disclaimer of WarrantyWhile the publisher and authors have used their best efforts in preparing this work, they make no representations or warranties with respect to the accuracy or completeness of the contents of this work and specifically disclaim all warranties, including without limitation any implied warranties of merchantability or fitness for a particular purpose. No warranty may be created or extended by sales representatives, written sales materials or promotional statements for this work. The fact that an organization, website, or product is referred to in this work as a citation and/or potential source of further information does not mean that the publisher and authors endorse the information or services the organization, website, or product may provide or recommendations it may make. This work is sold with the understanding that the publisher is not engaged in rendering professional services. The advice and strategies contained herein may not be suitable for your situation. You should consult with a specialist where appropriate. Further, readers should be aware that websites listed in this work may have changed or disappeared between when this work was written and when it is read. Neither the publisher nor authors shall be liable for any loss of profit or any other commercial damages, including but not limited to special, incidental, consequential, or other damages.
Library of Congress Cataloging‐in‐Publication Data
Names: Eckstein, Jonathan, author. | Schultz, Bonnie R., author.Title: Introductory relational database design for business, with Microsoft Access / by Jonathan Eckstein and Bonnie R. Schultz.Description: Hoboken : Wiley, 2017. | Includes bibliographical references and index. | Identifiers: LCCN 2017019748 (print) | LCCN 2017028117 (ebook) | ISBN 9781119329428 (pdf) | ISBN 9781119329442 (epub) | ISBN 9781119329411 (hardback)Subjects: LCSH: Relational databases. | Microsoft Access. | BISAC: BUSINESS & ECONOMICS / Statistics. | COMPUTERS / Management Information Systems.Classification: LCC QA76.9.D3 (ebook) | LCC QA76.9.D3 E325 2017 (print) | DDC 005.75/65–dc23LC record available at https://lccn.loc.gov/2017019748
Cover Design: WileyCover Image: Relationships Windows: Courtesy of Jonathan Eckstein
This book arose from the first author’s experience of teaching an undergraduate management information systems (MIS) course in the business school of Rutgers University in Piscataway, NJ, United States. This experience consisted of teaching 20 different sections in 12 different semesters, spread over a 20‐year time span.
Rutgers’ undergraduate New Brunswick business program’s approach to teaching MIS differs from that of most business schools. Typically, MIS courses and textbooks stress superficial familiarity with dozens or even hundreds of aspects of information technology. The Rutgers approach, even before the first author arrived there, was different. At least two thirds of the course is spent achieving a relatively deep understanding of one of the most pervasive, durable, and persistent technologies in information technology: relational databases. Finding suitable textbooks was difficult, however. For some time, we used two books, one being a traditional MIS book and the other covering the Microsoft Access relational database product. This solution was expensive and not entirely satisfactory, and became less so over time. With each release of Access, the available Access books became increasingly focused on details of the user interface, and shied away from explaining the underlying design issues of how to structure databases. Giving such a book to somebody without solid prior experience in designing databases is like having somebody without a driver’s license read the owner’s manual of a feature‐laden luxury car: while they might learn how to set the climate control to keep the passenger and driver at different temperatures, they would be no closer to being able to properly use the car for its fundamental task of transportation. Books specifically about database design also exist but are primarily aimed at computer science majors. They are overly abstract and too technical for business students just beginning to learn about information technology.
This book, which began as a set of class notes, takes a different approach. It develops an understanding of relational databases step by step, through numerous compact but realistic examples that gradually build in complexity. While readers will not necessarily gain enough experience to design large‐scale organizational systems with hundreds or thousands of tables, they do get a thorough grounding in the technology and its applications, enough to build useful systems with dozens of tables. At every stage, the technology is presented through application examples from business, as well as other fields, giving the reader a chance to concretely think through the details and issues that often arise.
One may well ask, “why should one teach an introductory MIS course this way?” The main reasons are as follows:
Relatively lasting hands‐on knowledge of a pervasive and useful technology
Acquisition of immediately marketable skills
Development of analytical thinking and problem solving
The currently prevalent approach to teaching MIS stresses “buzzword”‐level knowledge of numerous currently popular technologies. But without the foundation of hands‐on application and problem solving, such material is quickly forgotten. Such knowledge may be useful for those in high‐level decision‐making positions, but by the time most undergraduate students might reach such positions, the knowledge will most likely be largely forgotten and outdated.
Relational databases are one of the most durable technologies in information systems. For decades, they have been the dominant way most organizations store most of their operational data. While databases have grown larger and data are being gathered at ever‐increasing rates, the basic concepts and techniques of the technology have remained stable (much more stable, in fact, than procedural programming languages). Once one is comfortable with basic productivity software such as e‐mail clients, word processors, spreadsheets, and presentation packages, there could scarcely be a more important or foundational technology to learn, even for manipulating data on one’s own personal computer. By designing dozens of (albeit relatively simple) databases and formulating dozens of queries, students using this book acquire an understanding of relational databases in a way that should be more durable than knowledge acquired by memorizing facts or concepts.
Being able to understand and work with relational databases is a marketable skill that students can put to work at the beginning of their careers in almost any industry. While we first introduce queries using Microsoft Access’ QBE (query‐by‐example) grid, most of this book’s coverage of queries is through SQL (Structured Query Language), which is used with minor variations in nearly all relational database systems. We have received positive feedback from students who used earlier versions of this text distributed as class notes, to the effect that they were able to “hit the ground running” in jobs or internships because they already understood how to formulate complex database queries in SQL. Superficial “survey” MIS courses do not provide such skills.
Designing a database is a highly analytical skill, involving breaking down a situation into its critical components such as things, people, and events, and clearly elucidating the relationships between these components. Learning such a skill develops the mind generally, fostering abilities in critical thinking and problem solving. Developing such abilities is an important component of any college education, regardless of students’ fields of study. Just because a course is in a business school does not mean it should convey only facts – students in business programs deserve to develop their fundamental thinking skills just as much as (for example) majors in philosophy, mathematics, or chemistry. Such considerations motivate our approach of not teaching just facts and trends, but of also covering relevant material that helps students learn new ways of thinking and solving problems. Relational database design is an ideal vehicle for such mental development. Compared to other cognitively demanding IT‐related skills like procedural computer programming, we have found that relational databases are relatively accessible and easily related to a wide range of nontrivial applications. The somewhat widespread notion that only computer scientists can or should design databases is simply not true. Almost any business student can learn how to design databases with up to a dozen or so tables, and for most people it is a much less frustrating means of cognitive development than learning, for example, Python or Java.
When embedded in packages such as Microsoft Access, relational database technology now allows the production of relatively sophisticated software applications with little or no computer programming in the traditional procedural sense. In fact, Access’ Form, Report, Navigation, and Query features allow construction of professional‐looking and useful applications without any “classical” programming whatsoever. Chapter 6 explores these abilities of Access, and its exercises provide a number of different mini‐projects for student assignments. Being able to completely build such an application gives students a feeling of mastery and accomplishment.
This book uses Microsoft Access as a vehicle for learning about relational databases because it is widely available and relatively easy to use. But this is not “an Access book.” We leave many features of Access uncovered and focus on basic skills that largely transfer to other relational database settings. Students need “hands‐on” experience, and Access is simply the most logical vehicle to use. For more exhaustive coverage of the many “nooks and crannies” of Access, numerous books are already available. However, they all assume that their readers already know how to design a database.
When we teach MIS, we also cover some material not included in this book. In the course of a typical 28‐class semester, we might have 6–7 lectures on other topics such as spreadsheets, network technology, security, and ethics. We chose not to include such material in this text because it is amply covered in other textbooks, especially at the level of detail that only 6–7 classes permit. Instead, this book focuses on what is unique about our approach to teaching MIS. Instructors are encouraged to combine this book with other books, excerpts from other books, or their own notes and lectures on topics not covered here.
Finally, while this book was conceived as a textbook for undergraduate business students, it could also be used in other educational situations or even outside the context of a graded course, as a relatively “friendly” introduction to database technology. We are not aware of other books, textbooks or otherwise, that develop relational database technology in the incremental, example‐rich manner that has proved effective at Rutgers over the past two decades.
This chapter covers the following topics:
Basic definitions and concepts in database technology
The role of computers and network technology in helping run businesses and other organizations
Common types of information processing systems in current use
There are some basic definitions and concepts that should provide useful context for understanding database design. Some of the terms we define are in common use but take on specific meaning in the information technology field.
Datum is a singular word, and data is its plural. A datum (sometimes called a “data item”) is a “particle” of information like “12” or “Q.”
Information refers to data that are structured and organized to be useful in making a decision or performing some task. Relational databases are currently the most common way data are organized into information; hence this book’s focus on relational databases.
Knowledge denotes understanding or evaluating information. An example could be when Casleton Corporation analyzes its recruiting data and concludes that recruits from Driftwood College tend to have good performance evaluations only if their GPAs are at least 3.0. Based on this “knowledge,” Casleton’s managers might choose to screen applicants from Driftwood College by their GPAs, interviewing only those graduates with at least a 3.0 GPA.
For this book, we will focus on representing information within computer systems. Note, however, that knowledge can also be represented within computers. One common kind of knowledge representation (KR) within computers is part of the field of artificial intelligence (AI). One common business application of AI in business is in automated business rules systems. Another recently popularized AI application is the “Siri” personal assistant on iPhones and iPads, or the similar “Google Voice” app on Android devices. Although its business uses are substantial and gradually expanding, we will not discuss AI, as relational database systems are simpler and far more ubiquitous.
Information systems consist of the ways that organizations store, move, organize, and manipulate/process their information. The components that implement information systems – in other words, information technology – consist of the following:
Hardware
– physical tools: computer and network hardware, but also low‐tech objects such as pens and paper
Software
– (changeable) instructions for the hardware (when applicable; the simplest hardware does not need software)
People
Procedures – instructions for people
Data/databases
Information systems existed before computers and networks – they just used relatively simple hardware that usually did not need software (at least as we know it today). For example, filing all sales receipts alphabetically by customer in a filing cabinet is a form of information system, although it is not electronic. Tax records kept on clay tablets by ancient civilizations were also a form of information system. Strictly speaking, this book is about an aspect of CBISs (computer‐based information systems). Because of the present ubiquity of computers in information systems, we usually leave out the “CB,” treating it as implicit.
Present‐day CBISs have the following advantages over older, manual information systems:
They can perform numerical computations and other data processing much more quickly, accurately, and cheaply than people.
They can communicate very quickly and accurately.
They can store large amounts of information quickly and cheaply, and information retrieval can often be very rapid.
They can, to varying degrees, automate tasks and processes that previously required human labor.
Information no longer needs to be “stuck” with particular things, locations, or people.
However, increasingly, automated systems can have drawbacks, such as the following:
Small errors can have a much wider impact than in a less automated system. For example, in March 2003, a minor software bug in some airport data collection code – which programmers were aware of but considered too small to cause operational problems – grounded all aircraft in Japan for two days.
Fewer people in the organization understand exactly how information is processed.
Sometimes, malfunctions may go unnoticed. For example, American Airlines once discovered a serious bug in its “yield management” software only after reporting quarterly results that were significantly lower than expected. (“Yield management” refers to the process of deciding how many aircraft seats to make available for sale at different fare levels.)
Information architecture is the particular way an organization has arranged its information systems: for example, a particular network of computers running particular software might support a firm’s marketing organization, while another network of computers running different software might support its production facilities, and so forth.
Information infrastructure consists of the hardware and software that support an organization’s information architecture, together with the personnel and services dedicated primarily to maintaining and developing that hardware and software.
Application and application program (nowadays sometimes simply “app”) are somewhat ill‐defined terms but typically denote computer software and databases supporting a particular task or group of tasks. For example, a firm’s human resource department might use one application to analyze benefit costs and usage, and another to monitor employee turnover.
A classic business IT problem is that applications, especially those used by different parts of an organization, may not communicate with one another effectively – for example, a new hire or retirement might have to be separately entered into both of the human resources systems described above because they do not communicate or share a common database.
Particular information systems may be intended for use at one or more levels of an organization, as follows (Figure 1.1):
Figure 1.1 Information systems and the levels of an organization.
The operational level
– day‐to‐day operations and routine decisions. In an airline, for example, an operational decision is whether to cancel a particular flight on a particular day, or what type of aircraft to schedule on a particular flight during the summer flying season. Operational events that that might need to be recorded could include a customer scanning her boarding pass as she boards a flight, or an aircraft arriving at its destination gate.
The strategic level
– the highest‐level, “big picture” decisions. In the example of an airline, whether to serve the Asia–US market, or whether to emphasize cost over service quality.
The tactical level
– decisions in between operational and strategic levels; for an airline, such a decision might be whether to increase or decrease service to a particular city.
In reality, the boundaries between these levels are typically somewhat indistinct: the levels form a continuous “spectrum.” But labeling different segments of this spectrum as “levels” is useful conceptually.
Organizations are also typically divided into functional areas, meaning that different parts of the organization have different functions (that is, they do different things). These divisions vary by organization, but Figure 1.1 shows a fairly standard division into accounting, finance, operations, marketing, and human resources.
Transaction processing systems(TPSs) gather data about everyday business events in “real time” as they occur. Examples:
You buy three items at a local store.
A shipment of coffee beans arrives at a local distribution center.
A passenger checks in for a flight.
A package is unloaded from a FedEx or UPS aircraft.
Although only one of the above events is a transaction in the classical economic sense, from an information systems perspective all of these events are examples of transactions that may be immediately tracked by a TPS. Often, technology like barcodes and scanners makes tracking such transactions quicker, cheaper, and more detailed than if their associated data were to be keypunched manually. TPS systems are always operational‐level systems, but they may also be used at other levels, or feed information to other systems at higher levels.
Functional area information systems (FAISs), also called departmental information systems (DISs), are designed to be operated within a single traditional functional department of an organization such as sales, human resources, or accounting. In the early days of CBIS, these were often the only kind of systems that were practical, because managing the data from more than one functional area would have required too much storage or computing power for a single system.
When an organization has multiple functional area systems, properly coordinating them becomes a potentially difficult issue. The systems may require overlapping data and can therefore become “out of sync” with one another. ERP(enterprise resource planning) systems are a relatively extreme reaction to the problem of poorly coordinated functional area systems, and are offered by vendors such as SAP and Oracle. They aim to support the entire organization’s needs with essentially one single integrated system. They have enormous potential benefits but are also notoriously tricky and expensive to configure and install. Note that the only really meaningful word in the ERP acronym is “enterprise,” denoting a system for the entire enterprise, and the reasons for “resource planning” in the acronym are historical. Such systems can perform resource planning but not particularly more than any other business function.
Some other common terms, some of which we will define in more detail later in the book, include the following:
MIS
–
management information system
– refers to a standard system that consolidates operational data into reports useful to managers.
DSS
–
decision support system
– refers to a system designed to help analyze and make specific kinds of decisions (at any level of the management hierarchy).
ES
–
expert system
– refers to a system that mimics the knowledge and behavior of human experts in particular domains, such as diagnosing problems with complicated equipment.
EIS
–
executive information system
Microsoft Access is an example of relational database software, usually called a relational database management system (RDBMS). Access is just one of many relational database offerings in the software marketplace. Others include packages such as Oracle and Ingres. Some database software, such as MySQL, is available free of cost, while other packages are sold by commercial vendors such as Oracle and IBM.
All of these database packages are conceptually similar to MS Access. The greatest difference is in the scale of operation each package supports, in terms of both the volume of data and the number of simultaneous users. User interfaces also differ from package to package.
It is important to note that not all databases are relational. Some older technologies are still in use in the business environment, and other modern approaches exist, such as object databases. However, relational databases are by far the most commonly used today, especially in business applications, which is why this textbook focuses on them.
In relational databases, all data are kept in tables, also called relations. Most relational databases contain more than one table, but for now we will keep things simple and consider only a single table. A database with only one table is often called a flat file database.
Table 2.1 shows an example of a data table pertaining to students.
Table 2.1 Example table of student data.
ID#
FirstName
LastName
ZipCode
14758993
Joseph
Ho
08765
23458902
Karen
Leigh
21678
89312199
Max
Saperstein
11572
90926431
Alex
Holmes
08743
82938475
Meera
Rajani
99371
19284857
Evan
Chu
34012
The rows of the table, also called tuples or records, correspond to things or events that we wish to store information about, such as people, orders, or products. In Table 2.1, each row corresponds to a student.
The columns of the table, also called attributes or fields, record various properties of the things or events being described. In this example, the attributes are the ID number, first name, last name, and zip code of each student.
Other kinds of software can store tables of data. For example, spreadsheet programs such as Microsoft Excel can store data tables. The biggest different between Excel and Access is in the way in which Access allows for relationships between multiple tables. However, other differences exist. For instance, each column in a relational database table has a fixed datatype. Here “datatype” refers to the kind of data being stored: for example, an amount of money, some other kind of number, or a character string like a person’s name. In a relational database, every datum stored in a column must have the same datatype; that is, every entry in the column must be a percentage, or every entry must be a character string, and so forth. In spreadsheets, you can have data of different types within the same column. For example, a name might be stored in a particular cell, but another cell in the same column might contain a percentage.
Another difference is that in relational databases, one identifies columns by a user‐specified attribute name (such as ID# or FirstName above) rather than by sequential letters (A, B, C,…), or column numbers as in a spreadsheet.
One more difference is that in spreadsheets, rows and columns are essentially symmetrical in their basic function. For example, it is no harder to add a column to a spreadsheet than it is to add a row. In relational database tables, rows and columns have fundamentally different roles. In relational databases, you can add or delete rows easily and quickly, whereas columns are largely static. Depending on the specific relational database software one is using, one might be able to add or delete columns in a table after it has been created, but if the table already contains a large amount of data, such an operation may be very time consuming and require significant computing resources.
Microsoft Access is both of the following:
A relational database system
A graphical, object‐oriented software development environment (but not an object database or object‐oriented database, which would imply a different, more flexible data‐storage model)
To develop an Access application, one uses various tools and “wizards” to create, customize, and link “objects” to suit one’s needs. It is also possible to write segments of computer programming code in the Visual Basic language and combine them with objects, but doing so is often unnecessary for simple Access applications. Thus, Access allows someone to build fairly sophisticated applications without engaging in classic text‐based computer programming.
Access’ most frequently used kinds of objects are tables, forms, queries, and reports. Access stores all the objects for a database in a single file with the type “.accdb” (or “.mdb” in earlier versions of Access).
Typically, Access keeps each database and its entire constituent objects inside a single operating‐system file (Figure 2.1). For different database software or operating systems other than Microsoft Windows, the situation might be different: the database or even a single table might be spread across multiple operating‐system files.
Figure 2.1 Objects within an Access file.
Access allows you to link objects in useful ways. We will start by examining the simplest such linkage, between a table and a form. Essentially, the table provides a way to store your basic data, and the form provides an alternative way to view that data on the screen. Information can flow in both directions between the table and form (Figure 2.2).
Figure 2.2 A table interacting with a form.
Let us suppose we want to track information about students at a small college. We want to keep the following information for each student:
ID number (9 digits)
Name (First, Last, and Middle name or initial)
Address, consisting of street address, city, state, and zip code
Major
Gender
Birth date (question: why is it better to store a person’s birth date than their age?)
Whether or not the student is on financial aid (for the purposes of this example, a simple yes/no)
Credits taken
Grade points amassed
Note that if you take a three‐credit class and get a B+, that means you get 3 × 3.5 = 10.5 grade points. Your GPA is the ratio of your grade points amassed to your total credits taken.
Let us create a database to store this information:
Open Microsoft Access from the “Start” or Windows menu at the bottom left of the screen (
Figure 2.3
).
Click the “Blank desktop database” icon (“Blank database” in earlier versions of Access).
In the resulting dialog box, provide a file name, for example, “students” (if Windows is configured not to display file types) or “students.accdb” (if Windows is configured to display file types).
Click “Create.”
Figure 2.3 When Access opens.
Access assumes that the first thing we want to do is to create a table. We see an empty table called “Table1.” Now, we want to define what information resides in this table. This step is called table design (Figure 2.4).
Figure 2.4 A new Access table.
Accordingly, we click the “View” button at the top left (make sure you have the “FIELDS” tab selected), and select “Design View.
”
A small dialog box appears requiring us to give the table a name – we call the table STUDENT and then click OK (
Figure 2.5
).
Figure 2.5 Naming a table in Access.
We now see a list of the attributes in the table (somewhat counterintuitively, the columns of the table appear as rows in this view; Figure 2.6).
Figure 2.6 Table design view.
We now specify the columns in our table. In designing a database, you should keep in mind the following guidelines:
Try to plan for the future and include all the data you are likely to need
. In Access, it is possible to change attribute datatypes or add attributes later, but if the database is already large or many people are using it simultaneously, adding attributes or modifying datatypes could be slow or could disrupt the use of your system.
Have a separate field for each division of the data you anticipate needing
. As a general rule, it is much easier to put data together than to take them apart. For example, if we store student’s names as three different fields (first, middle, and last names) then we can easily form a student’s whole name by concatenating (placing end‐to‐end) the contents of these fields. If we just store the name as one large field, then certain tasks, such as sorting students by their first or last names, may become unnecessarily difficult and prone to error.
Avoid storing calculated fields
. If you have already stored the total credits and grade points amassed, you can easily calculate GPA. There is no need to redundantly store the students’ GPAs. That will take up unnecessary space and create an opportunity for the fields of the database to become inconsistent with one another. In relational databases like Access, you should generally use table objects only for your “base” data. Calculations based on those data reside in other objects, such as queries, forms, and reports. This separation of base and calculated data is another way in which relational databases are different from spreadsheets, in which all “base” data coexist with calculations within the same two‐dimensional grid of cells.
Let us now proceed with the design of our database:
Access has already provided a field called “ID,” in keeping with the standard procedure of every table including a
primary key
attribute whose value uniquely identifies each row of a table. This is another difference from spreadsheets, which identify rows by simply numbering them sequentially. We will extensively discuss the choice and construction of primary keys later in this book.
Now, each student should already have a unique student ID number, so that even if there were two students named “John Smith,” each would have a different ID. Access defaults to a primary key field called “ID” with a datatype of “Autonumber,” which means that Access will assign IDs automatically. Let us assume for this example that the college registrar has already assigned 9‐digit ID numbers in a social‐security‐number‐like format. We do not want to create our own different ID numbers, but would prefer to use the same ID numbers assigned by the registrar, in the same format. Therefore, we change the datatype of the ID field to text, rather than Autonumber, by selecting “Short Text” (for Access 2013) or “Text” (for earlier versions) from the pull‐down menu under “Data Type.” Note that it is generally customary to use text fields to represent ID numbers and the like, which do not have any specific arithmetic meaning – for example, it makes no sense to add or subtract two students’ ID numbers.
Versions of Access prior to 2013 had two kinds of text fields, “Text” for standard, fixed‐length fields and “Memo” for potentially very long, free‐form, variable‐length fields. In Access 2013 the terminology for these two kinds of fields was changed to “Short Text” and “Long Text,” respectively. This terminology can be a bit confusing because “Short Text” has the potential to be quite long (255 characters), and a “Long Text” field does not necessarily have to be long. In this textbook, we will use the term “Text” to refer to standard, “short” text fields, and we will not use “Long Text” fields in our exercises. Therefore, we will refer to “Short Text” fields simply as “Text” from this point forward.
In the “field properties” at the bottom of the screen, we enter “9” in “field size” (to set the ID length to nine characters) and “000\‐00\‐0000” under “input mask.” This input mask allows you to enter only numbers (because of the “0” characters), and the placement of the “\‐” characters causes the IDs to display in a social‐security‐number‐like format. Note that the hyphens in the input mask are not actually stored in the database. Instead of just typing in the input mask, we can instead click on the “…” button in the input mask property, and select from commonly used input masks in the “input mask wizard” that then pops up. Finally, we can enter an explanation like “Student ID number from registrar” in the “description” area.
In the next row, we create a “FirstName” field by typing “FirstName”. The datatype defaults to “Short Text” (or, equivalently, “Text” in older versions of Access), with a length of 255 characters. Here, we may select a shorter length, like 40. Here and below, we will select some reasonable lengths for text fields, but there is nothing magical or “best” about the lengths chosen.
Create a “MiddleName” text field; set the length to 20.
Create a “LastName” text field; set its length to 50.
Create a “StreetAddress” text field; set its length 80.
Create a “City” text field; set its length to 50.
Create a “State” text field, and set its length to 2 (assuming we will use standard postal two‐letter codes for states).
Create a “ZipCode” text field, and set its length to 9 (for modern zip + 4 codes). We can click the “…” box on the right of “input mask,” and after saving the table, select a standard mask for zip codes. Note that adding or multiplying zip codes makes no sense, so we store them as text.
Create a “Gender” text field; set its length to 1. We will just store “M” or “F” in this field. Later in this chapter we will see how to ensure that a user does not enter other letters.
Create a “Major” text field; set its length to 30 (later in this book, we will see how to allow only real majors to be entered in situation like this).
Create a “BirthDate” field. Access has a special datatype for dates and times, called “Date/Time,
” which can store combined dates/times with an accuracy of seconds. Select this datatype. Under “field properties,” we can also select a format to use to display this information – for example, “short date.” Note that it is much better to store a student’s birth date, which is static, than their age, which would have to be periodically updated.
