36,59 €
Learn everything you need to know to build efficient SQL queries using this easy-to-follow beginner's guide
Key Features
Book Description
SQL is a powerful querying language that's used to store, manipulate, and retrieve data, and it is one of the most popular languages used by developers to query and analyze data efficiently.
If you're looking for a comprehensive introduction to SQL, Learn SQL Database Programming will help you to get up to speed with using SQL to streamline your work in no time. Starting with an overview of relational database management systems, this book will show you how to set up and use MySQL Workbench and design a database using practical examples. You'll also discover how to query and manipulate data with SQL programming using MySQL Workbench. As you advance, you'll create a database, query single and multiple tables, and modify data using SQL querying. This SQL book covers advanced SQL techniques, including aggregate functions, flow control statements, error handling, and subqueries, and helps you process your data to present your findings. Finally, you'll implement best practices for writing SQL and designing indexes and tables.
By the end of this SQL programming book, you'll have gained the confidence to use SQL queries to retrieve and manipulate data.
What you will learn
Who this book is for
This book is for business analysts, SQL developers, database administrators, and students learning SQL. If you want to learn how to query and manipulate SQL data for database administration tasks or simply extract and organize relevant data for analysis, you'll find this book useful. No prior SQL experience is required.
Das E-Book können Sie in Legimi-Apps oder einer beliebigen App lesen, die das folgende Format unterstützen:
Seitenzahl: 543
Veröffentlichungsjahr: 2020
Copyright © 2020 Packt Publishing
All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.
Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author, nor Packt Publishing or its dealers and distributors, will be held liable for any damages caused or alleged to have been caused directly or indirectly by this book.
Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.
Commissioning Editor: Brandon D'AbreoAcquisition Editor:Reshma RamanContent Development Editor:Nazia ShaikhSenior Editor: Ayaan HodaTechnical Editor: Utkarsha S. KadamCopy Editor: Safis EditingProject Coordinator:Aishwarya MohanProofreader: Safis EditingIndexer:Manju ArasanProduction Designer:Jyoti Chauhan
First published: May 2020
Production reference: 1290520
Published by Packt Publishing Ltd. Livery Place 35 Livery Street Birmingham B3 2PB, UK.
ISBN 978-1-83898-476-2
www.packt.com
Packt.com
Subscribe to our online digital library for full access to over 7,000 books and videos, as well as industry leading tools to help you plan your personal development and advance your career. For more information, please visit our website.
Spend less time learning and more time coding with practical eBooks and Videos from over 4,000 industry professionals
Improve your learning with Skill Plans built especially for you
Get a free eBook or video every month
Fully searchable for easy access to vital information
Copy and paste, print, and bookmark content
Did you know that Packt offers eBook versions of every book published, with PDF and ePub files available? You can upgrade to the eBook version at www.packt.com and as a print book customer, you are entitled to a discount on the eBook copy. Get in touch with us at [email protected] for more details.
At www.packt.com, you can also read a collection of free technical articles, sign up for a range of free newsletters, and receive exclusive discounts and offers on Packt books and eBooks.
Josephine Bush has over 10 years of experience as a Database Administrator. Her experience is extensive and broad-based, including in financial, business, and energy data systems using MySQL, SQL Server, Oracle, and PostgreSQL. She is a Microsoft Certified Solutions Expert: Data Management and Analytics. She holds a BS in Information Technology, an MBA in IT Management, and an MS in Data Analytics.
Starting out with the Microsoft stack, Frank Solomon gradually focused on SQL Server and database development. He then extended to writing and technical writing. He writes for SQL Shack, he blogs at Bit Vectors, and he had the lead co-author role for The SQL Workshop, a Packt book. Find Frank at LinkedIn, and reach out to him with writing / technical writing/development opportunities. He levers sharp software development and writing skills to build awesome products. He has plenty of remoting experience, and he uniquely relies on the active voice to build high-quality writing products.
Awni Al Saqqa is a Microsoft Technology Specialist in MS SQL Server and a certified solutions developer since 2007. He has over a decade of experience with database development and administration on SQL Server, Oracle, and MySQL. He is a solutions architect, who is hands-on in many enterprise projects for different business sectors, such as education, hospitality, retail, manufacturing, marketing, and more, which has given him the perfect combination between business and technical experience. Awni is also the Lead Author for The SQL Workshop book which is published by Packt
If you're interested in becoming an author for Packt, please visit authors.packtpub.com and apply today. We have worked with thousands of developers and tech professionals, just like you, to help them share their insight with the global tech community. You can make a general application, apply for a specific hot topic that we are recruiting an author for, or submit your own idea.
Title Page
Copyright and Credits
Learn SQL Database Programming
About Packt
Why subscribe?
Contributors
About the author
About the reviewers
Packt is searching for authors like you
Preface
Who this book is for
What this book covers
To get the most out of this book
Download the example code files
Download the color images
Conventions used
Get in touch
Reviews
Section 1: Database Fundamentals
Introduction to Relational Database Management Systems
Understanding SQL
Elements of SQL
Understanding databases
Tables
Fields
Records or rows
Columns
Understanding data integrity
Types of integrity
Entity integrity
Unique constraints
Not null constraints
The primary key
Referential integrity
Domain integrity
Database normalization 
The first normal form
The second normal form
The third normal form
Types of RDMS
Oracle
MySQL
SQL Server
PostgreSQL
RDMS SQL differences
Summary
Questions
Installing and Using MySQL Workbench
Technical requirements
Installing MySQL Workbench
Installing on Windows
Installing MySQL on Windows
Configuring MySQL on Windows 
Installing on Mac 
Installing MySQL Server on Mac 
Checking the status of MySQL Server on Mac
Installing MySQL Workbench on Mac 
Using MySQL Workbench
Connecting to your local instance
Connecting to another instance or setting up your local instance
Restoring a database 
Summary
Questions
Understanding Data Types
Understanding string data types
String data types in MySQL
MySQL string data type table summary
String data types in other RDMS
Oracle 
PostgreSQL
SQL Server 
String data types RDMS table comparison
Understanding numeric data types
Numeric data types in MySQL 
MySQL numeric data type table summary
Numeric data types in other RDMSes
SQL Server
Oracle
PostgreSQL
Numeric data types table comparison
Understanding date and time data types
Date and time data types in MySQL 
MySQL date and time data type table summary
Date and time data types in other RDMSes
Oracle 
PostgreSQL 
SQL Server 
Date and time data types table comparison
Understanding other data types
Other data types in MySQL 
Other data types in other RDMSes
Oracle 
PostgreSQL 
SQL Server 
Choosing the right data type
Examples of choosing a data type 
Summary
Questions
Designing and Creating a Database
Technical requirements
Creating a database
Guidelines for naming conventions
Avoiding keywords
Avoiding spaces 
Descriptive and accurate naming
Case and separating words 
Allowed characters when naming database objects 
Learning how to create a database
Creating a database via the MySQL Workbench interface
Creating a database via MySQL Workbench with a SQL script 
Understanding SQL code errors
Understanding table relationships
Understanding entity-relationship diagrams
Understanding one-to-one table relationships
Understanding one-to-many table relationships 
Understanding many-to-many table relationships
Creating a table in the database
Understanding how to apply data types and data integrity to your table
Learning to create a database table
Natural and surrogate primary keys 
Creating a database table via MySQL Workbench
Creating a database table via MySQL Workbench with SQL scripts 
Learning how to format SQL code for readability
Commenting SQL code
Understanding indexes
Understanding how indexing relates to data integrity 
Types of indexes
Understanding how indexing impacts performance
Understanding naming conventions for indexes 
Summary 
Further reading
Questions
Importing and Exporting Data
Technical requirements
Understanding table data import and export
Importing CSV files with table data import
Exporting to CSV files with table data export
Understanding SQL data import and export
Importing via data import in MySQL Workbench
Exporting via data export in MySQL Workbench
Understanding result data export
Exporting data directly from a result set
Understanding SQL syntax for importing and exporting data 
Importing with a SQL script
Exporting with a SQL script 
Summary
Further reading
Questions
Section 2: Basic SQL Querying
Querying a Single Table
Technical requirements
Using the SELECT statement and FROM clause
Understanding the SELECT statement and the FROM clause
Learning the correct order of other clauses you can use with SELECT
Understanding the different ways to query with a SELECT statement
Learning how to use column aliases
Using the USE statement
Learning how to use the DISTINCT clause 
Learning how to use the LIMIT clause
Limiting results on other Relational Database Management Systems (RDMSes)
Learning how to save a SQL query to a file
Learning how to open a SQL file
Learning how to add comments to your SQL code
Commenting code on other RDMSes 
Using the WHERE clause
Understanding how and when to use the WHERE clause to limit query results
Learning how to use the AND and OR operators
Learning how to use the NOT, IN, and BETWEEN operators
Learning how to use the LIKE operator and wildcards
Using the percent (%) wildcard 
Using the underscore (_) wildcard
Escaping wildcard values 
Differences between LIKE in other RDMSes
Learning how to filter on NULL values
Using the ORDER BY clause
Learning how to use the ORDER BY clause to order query results
Learning how to use the ORDER BY clause to sort by one or more columns
Using indexes with your queries 
Learning how to see what indexes your query is using
Summary
Questions
Further reading 
Querying Multiple Tables
Technical requirements
Understanding joins 
Understanding results returned with an inner join 
Understanding results returned with a left outer join 
Understanding results returned with a right outer join
Understanding results returned with a full outer join 
Using INNER JOIN 
Learning INNER JOIN syntax 
Learning how to use table aliases 
Using OUTER JOIN 
Learning LEFT OUTER JOIN syntax
Learning RIGHT OUTER JOIN syntax
Exploring differences in other relational data models
Using FULL OUTER JOIN
Using advanced joins 
Understanding what a CROSS JOIN is and how to use it 
Understanding what a NATURAL JOIN is and how to use it
Understanding what a SELF JOIN is and how to use it
Understanding set theory
Understanding what a UNION join is and learning how to use it in a SQL query
UNION
UNION ALL
Understanding what an intersect is and learning how to use it in a SQL query 
Looking at intersection in other RDMS
Understanding what difference is and learning how to use it in a SQL query 
Exploring differences in other RDMS
EXCEPT
MINUS
Using indexes with your queries 
Summary 
Questions
Further reading 
Modifying Data and Table Structures
Technical requirements
Inserting data into tables
Gathering information to insert, update, or delete data
Using the INSERT statement 
Single-row inserts
Multiple row inserts
Differences in other Relational Database Management Systems
Inserting data from one table into another table
Differences to other RDMSes
Deleting data from tables
Using the DELETE statement with a WHERE clause
Deleting all the data from a table
Learning an alternative way to delete data with the TRUNCATE statement 
Updating data in tables
Using the UPDATE statement with a WHERE clause
Updating all the data in a table 
Updating table data from another existing table 
Using transactions to save or revert changes
Understanding a SQL transaction
Learning the SQL syntax for SQL transactions
Differences in RDMS transaction syntax
Modifying the table structure
Adding a column
Dropping a column
Renaming a column
Changing the data type of a column
Adding or changing a column constraint
Dropping a constraint, key, or index 
Differences to other RDMS
Dropping a table
Summary 
Questions 
Further reading 
Section 3: Advanced SQL Querying
Working with Expressions
Technical requirements
Using expressions
Literal values 
Operators
Comparison operators 
Logical operators
Mathematical operators 
Operator precedence 
Column values
Built-in functions
String built-in functions
Differences in RDMS (Relational Database Management Systems) 
Numeric built-in functions 
Differences in RDMS
Datetime built-in functions 
Working with time zones 
Differences in RDMS
Advanced built-in functions 
Working with NULL values
Differences in advanced built-in functions in RDMS
Built-in functions and indexing 
Using statistical functions
Learning how to use built-in statistical functions
Exploring differences in RDMS 
Using generated columns 
Types of generated columns 
Creating a generated column 
Differences in RDMSes
Summary 
Questions 
Further reading 
Grouping and Summarizing Data
Technical requirements
Understanding aggregate functions
Numeric aggregate functions 
Statistical aggregate functions
Using the GROUP BY clause
Understanding how GROUP BY works without aggregate functions
Using WHERE with GROUP BY
Using ORDER BY with GROUP BY
Learning how to use the GROUP BY clause to group query results using aggregate functions
Learning how to use the ROLLUP modifier
Differences in RDBMSes 
Using the HAVING clause
Learning how to use the HAVING clause to limit query results
Understanding the difference between the HAVING and WHERE clauses
Understanding SQL query order of execution 
Summary 
Questions 
Advanced Querying Techniques
Technical requirements
Using subqueries
Understanding the different types of subqueries and their usage
Using non-correlated subqueries 
Using a non-correlated subquery in the WHERE clause
Using a non-correlated subquery in the SELECT clause
Using a non-correlated subquery in the FROM clause
Using INSERT, UPDATE, and DELETE with non-correlated subqueries
Differences between non-correlated subqueries in other relational database management systems (RDMSes)
Using correlated subqueries 
Using a correlated subquery in the WHERE clause
Using a correlated subquery in the SELECT clause
Using common table expressions
Using non-recursive CTEs
Non-recursive CTE with the SELECT statement 
Using recursive CTEs 
Differences between CTEs in other RDMSes
Using query hints and transaction isolation levels
Understand the concepts of locking, blocking, and deadlocking
Learning how to use index hints to improve queries
Learning how to use transaction isolation levels
Summary 
Questions 
Further reading 
Programmable Objects
Technical requirements
Creating and using views
Learning how to create and query a view
Learning how to modify data returned in a view
Updating data using a view
Updating data using a view that has multiple tables
Inserting data using a view
Inserting data using a view that has multiple tables
Deleting data using a view
Deleting data using a view that has multiple tables
Learning how to update or delete a view
Differences between views in other relational database management systems (RDBMSes) 
Creating and using variables
Learning how to create and assign values to variables
Learning how to use variables in SQL statements
Differences between variables in other RDBMSes
Creating and using stored procedures
Creating a stored procedure
Learning how to alter and drop stored procedures 
Using variables and parameters in stored procedures
IN parameter
OUT parameter
Using flow control statements
Understanding the different types of flow control statements
Understanding the difference between the IF and CASE statements and how to use them 
Understanding how to loop through statements
Using error handling
Understanding error handling syntax and how to implement error handling
Differences between stored procedures in other RDBMSes
Oracle 
Creating and calling a stored procedure in Oracle
Flow control in Oracle
Error handling in Oracle
PostgreSQL
Creating a stored procedure in PostgreSQL 
Flow control in PostgreSQL 
Error handling in PostgreSQL
SQL Server 
Creating and calling a stored procedure in SQL Server 
Flow control in SQL Server 
Error handling in SQL Server 
Creating and using functions
Understanding the difference between a function and a stored procedure
Learning how to create and use functions
Learning how to alter or delete functions 
Differences between functions in other RDBMSes
Oracle 
PostgreSQL 
SQL Server 
Creating and using triggers
Learning how to create and use a trigger 
Creating and using a trigger with one statement
Creating and using a trigger with multiple statements
Creating and using multiple triggers on the same table 
Deleting a trigger
Differences between triggers in other RDBMSes
Creating and using temporary tables
Learning how to create and use a temporary table
Learning how to delete a temporary table
Differences between temporary tables in other RDBMSes
Summary 
Questions
Further reading 
Section 4: Presenting Your Findings
Exploring and Processing Your Data
Technical requirements
Exploring your dataset
Getting to know your data using statistical identities
Detecting rare and outlier values
Detecting missing values
Detecting duplicate and erroneous values 
Consulting with experts or becoming the expert
Creating a data dictionary
Using regular expressions
Combining regular expression characters 
Processing your dataset
Fixing rare and outlier values
Fixing missing values
Removing or fixing duplicates 
Removing duplicates 
Fixing duplicates 
Fixing erroneous data 
Summary 
Questions 
Telling a Story with Your Data
Technical requirements
Finding a narrative
Types of data stories
Asking questions to find your narrative 
Using the statistical identity of your data to determine a narrative
Knowing your audience
Determining who your audience is
Creating a compelling presentation for your audience
Determining a presentation framework
Explaining the question
Explaining the answer
Explaining your methodology
Using visualizations
Common mistakes to avoid in visualizations 
Using data visualization tools
Summary 
Questions 
Section 5: SQL Best Practices
Best Practices for Designing and Querying
Technical requirements
Best practices for database design
Understanding data integrity
Naming conventions of database objects
Understanding what data types to use
Best practices for indexing
Understanding when to create indexes
Best practices for querying and modifying data
Understanding how to write clean code
Understanding query optimization 
Understanding best practices when querying data
Understanding best practices when modifying data
Summary 
Questions
SQL Appendix
SQL for designing databases
Syntax for creating a database
Syntax for creating and altering tables
Syntax for creating and altering indexes
SQL for selecting data
Syntax for selecting data 
Syntax for filtering data 
Syntax for ordering results
Syntax for joining tables 
Syntax for grouping results
Syntax for filtering grouped results 
Syntax for using aggregate functions 
SQL for modifying data
Syntax for inserting data
Syntax for updating data 
Syntax for deleting data 
Syntax for SQL transactions
SQL expressions
Types of expressions
Syntax for using generated columns 
Advanced query techniques
Syntax for subqueries
Syntax for common table expressions
Syntax for query hints
Syntax for transaction isolation level 
Programmable objects
Syntax for views
Syntax for variables
Syntax for stored procedures
Syntax for flow control statements
Syntax for error handling
Syntax for functions
Syntax for triggers
Syntax for temporary tables
Summary
Assessments
Chapter 1
Chapter 2
Chapter 3
Chapter 4
Chapter 5
Chapter 6 
Chapter 7
Chapter 8
Chapter 9
Chapter 10
Chapter 11
Chapter 12
Chapter 13
Chapter 14
Chapter 15
Other Books You May Enjoy
Leave a review - let other readers know what you think
SQL is a powerful querying language used to store, manipulate, and retrieve data, and is one of the most popular languages used by developers to query and analyze data efficiently. If you're looking for a comprehensive introduction to SQL, Learn SQL Database Programming will help you to get up to speed with using SQL to streamline your work in no time. Starting with an overview of relational database management systems, this book will show you how to set up and use MySQL Workbench and design a database using practical examples. You'll also discover how to query and manipulate data with SQL programming using MySQL Workbench. As you advance, you'll create a database, query single and multiple tables, and modify data using SQL querying. This SQL book covers advanced SQL techniques, including aggregate functions, flow control statements, error handling, and subqueries, and helps you process your data to present your findings. Finally, you'll implement best practices for writing SQL and designing indexes and tables.
By the end of this SQL programming book, you'll have gained the confidence to use SQL queries for retrieving and manipulating data.
This book is for business analysts, SQL developers, database administrators, and students learning SQL. If you want to learn how to query and manipulate SQL data for database administration tasks or to simply extract and organize relevant data for analysis, you'll find this book useful. No prior SQL experience is required.
Chapter 1,Introduction to Relational Database Management Systems, introduces the concepts required to understand the basics of relational database management systems. It introduces foundational topics such as understanding SQL, the relational model, data integrity, database normalization, and the various types of relational database management systems. It gives you fundamental knowledge about SQL and databases that will be required throughout the book.
Chapter 2,Installing and Using MySQL Workbench, covers how to install MySQL Workbench on Windows and Mac, including step-by-step instructions to help you walk through each part of the installation process. The instructions also include the configuration of MySQL Workbench on both Windows and Mac. We will walk through some examples of connecting to your local MySQL and also setting up connections to other MySQL servers. We conclude with a step-by-step explanation of how to restore a database to MySQL.
Chapter 3,Understanding Data Types, covers what data types are and how they are used. You will learn about specific data types and what data can be stored in each of them. The data types include string, numeric, and date and time. String data types include char and varchar, binary and varbinary, blob, enum, and text. Numeric data types include bit, int, float, double, and decimal. Date and time data types include date, time, datetime, timestamp, and year. You will learn from the perspective of MySQL data types, but where there are differences versus SQL Server, Oracle, and PostgreSQL, those differences will be noted. We will also go through some examples of types and values of data to see how to assign them correctly to data types, including an explanation of why you need to be careful when selecting a data type and how it can impact database performance.
Chapter 4,Designing and Creating a Database, introduces you to designing and creating a database. We'll walk through the guidelines for naming conventions and understand SQL code errors. You will learn how to format SQL code for readability and apply data types and data integrity to our tables. You will also learn about the different types of table relationships and how to build entity-relationship diagrams. Going further, we will discuss the concept and usage of indexing. You will gain an understanding of how indexing helps database performance. Finally, you will learn how to create a table in a database.
Chapter 5,Importing and Exporting Data, introduces you to importing and exporting data. There are many ways to import and export data in MySQL. You will learn how to import and export data using MySQL Workbench via table data from/to CSV files. We will also cover importing and exporting via SQL data with SQL scripts. An additional way to export data via result data and query results will also be covered. The final topic discussed is using SQL syntax to import and export data.
Chapter 6,Querying a Single Table, covers how to use the basic SQL SELECT statement and the FROM, WHERE, and ORDER BY clauses. This chapter also covers how to tell which index your query is using and whether you may need additional indexes. By the end of this chapter, you will understand how to query data using the SELECT statement and the FROM clause. You will also learn how to limit results with a WHERE clause, how to use ORDER BY to return results in a specified order, and how to see information about what indexes are being used or may be needed.
Chapter 7,Querying Multiple Tables, covers how to use SQL joins to join two or more tables together, including INNER, OUTER (LEFT, RIGHT, and FULL), and advanced joins (the cross and self joins). You will learn about set theory and how to combine queries using UNION and UNION ALL, and how to get the differences and intersections of different queries. Lastly, you will learn how to optimize queries when they contain multiple tables.
Chapter 8,Modifying Data and Table Structures, goes through how to modify data in tables. This includes learning how to use INSERT, UPDATE, and DELETE statements. You will also learn about SQL transactions, which help to control the modification of data. Finally, you will learn how to modify a table structure.
Chapter 9,Working with Expressions, covers how to use literals, operators, columns, and built-in functions to create expressions. You will learn about the different types of built-in functions, including string, numeric, date and time, and advanced functions, which include casting and converting to other data types. You will learn how to use statistical functions, including how to get and use variance and standard deviation. Finally, you will learn how to create a generated column based on an expression.
Chapter 10,Grouping and Summarizing Data, covers how to use aggregate functions to group and summarize data. Aggregate functions include math functions such as AVG, SUM, COUNT, MIN, and MAX. You will also learn how to use the GROUP BY and HAVING clauses in conjunction with the aggregate functions. Finally, you will learn how MySQL executes your query clauses.
Chapter 11,Advanced Querying Techniques, covers how to use two different kinds of subqueries, correlated and non-correlated. Then, you will learn about two different types of common table expressions, recursive and non-recursive. You will learn about query hints and how to choose which index your query will use. Finally, you will learn about isolation levels and concepts relating to how data is read from and written to tables.
Chapter 12,Programmable Objects, covers how to create and use views, which includes selecting data from views, and inserting, updating, and deleting data using views. You will learn how to create and use variables, which includes how to declare and assign values to variables. You will also learn how to create and use stored procedures, including how to use variables and parameters in stored procedures, as well as how to control flow and error handling. In addition to all that, you will learn how to create and use functions, triggers, and temporary tables.
Chapter 13,Exploring and Processing Your Data, covers how to explore and process data. By the end of this chapter, you will understand how to get to know data by creating a statistical identity, you will have learned how to detect and fix anomalous and missing values, and will know how to use regular expressions to match data value patterns.
Chapter 14,Telling a Story with Your Data, teaches you how to find a narrative, including what types of stories you can tell with data and how to use the statistical identity of your data to determine a story. You will also learn about knowing your audience, including deciding who they are and what would be a compelling presentation for them. Then, you will learn how to identify a presentation framework, including explaining the question, answer, and methodology. Finally, you will learn how to use visualizations in your presentations.
Chapter 15,Best Practices for Designing and Querying, covers database best practices for database design, indexing, and querying and modifying data. You learned about these topics in the previous chapters, and this chapter will summarize and give additional tips for best practices. This chapter will also provide a way for the more experienced among you to quickly reference best practices instead of having to go through each chapter.
Chapter 16,SQL Appendix, covers the SQL commands discussed, which are outlined for quick reference. It includes the syntax for querying data, modifying data, and designing databases and tables. This chapter will help you by providing a quick reference guide, so you won't have to go back through all the chapters to check the syntax, but if you require more details about how the syntax works, you can refer to the specific chapter for that information.
For this book to be useful, you either need access to MySQL Workbench and the ability to query a MySQL Server, or the ability to install them. To install them yourself, you will need elevated permissions. Installations for MySQL Workbench are found athttps://dev.mysql.com/downloads/workbench/and installations for MySQL server are found athttps://dev.mysql.com/downloads/mysql/. If you don't want or don't have MySQL installed, you can follow along in most chapters with SQL code that will work in Oracle, PostgreSQL, or SQL Server, as well.
If you are using the digital version of this book, we advise you to type the code yourself or access the code via the GitHub repository (link available in the next section). Doing so will help you avoid any potential errors related to the copying and pasting of code.
You can download the example code files for this book from your account at www.packt.com. If you purchased this book elsewhere, you can visit www.packtpub.com/support and register to have the files emailed directly to you.
You can download the code files by following these steps:
Log in or register at
www.packt.com
.
Select the
Support
tab.
Click on
Code Downloads
.
Enter the name of the book in the
Search
box and follow the onscreen instructions.
Once the file is downloaded, please make sure that you unzip or extract the folder using the latest version of:
WinRAR/7-Zip for Windows
Zipeg/iZip/UnRarX for Mac
7-Zip/PeaZip for Linux
The code bundle for the book is also hosted on GitHub at https://github.com/PacktPublishing/learn-sql-database-programming. In case there's an update to the code, it will be updated on the existing GitHub repository.
We also have other code bundles from our rich catalog of books and videos available at https://github.com/PacktPublishing/. Check them out!
We also provide a PDF file that has color images of the screenshots/diagrams used in this book. You can download it here: https://static.packt-cdn.com/downloads/9781838984762_ColorImages.pdf.
There are a number of text conventions used throughout this book.
CodeInText: Indicates code words in text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles. Here is an example: "BINARY is like CHAR, but stores byte strings instead of character strings."
A block of code is set as follows:
<
books
>
<book
>
<
name
>
Learn SQL Programming
<
/name
> <author>Josephine Bush</author>
<
/book
>
<
/books
>
Any command-line input or output is written as follows:
SELECT * FROM lahmansbaseballdb.appearances;
Bold: Indicates a new term, an important word, or words that you see onscreen. For example, words in menus or dialog boxes appear in the text like this. Here is an example: "Click Download on the DMG Archive."
Feedback from our readers is always welcome.
General feedback: If you have questions about any aspect of this book, mention the book title in the subject of your message and email us at [email protected].
Errata: Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you have found a mistake in this book, we would be grateful if you would report this to us. Please visit www.packtpub.com/support/errata, selecting your book, clicking on the Errata Submission Form link, and entering the details.
Piracy: If you come across any illegal copies of our works in any form on the Internet, we would be grateful if you would provide us with the location address or website name. Please contact us at [email protected] with a link to the material.
If you are interested in becoming an author: If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, please visit authors.packtpub.com.
Please leave a review. Once you have read and used this book, why not leave a review on the site that you purchased it from? Potential readers can then see and use your unbiased opinion to make purchase decisions, we at Packt can understand what you think about our products, and our authors can see your feedback on their book. Thank you!
For more information about Packt, please visit packt.com.
The objective of this section is to introduce you to relational database management systems, how to set up and use MySQL Workbench, how to use data types, how to design and create a database, and how to import and export data.
This section comprises the following chapters:
Chapter 1
,
Introduction to Relational Database Management Systems
Chapter 2
,
Installing and Using MySQL Workbench
Chapter 3
,
Understanding Data Types
Chapter 4
,
Designing and Creating a Database
Chapter 5
,
Importing and Exporting Data
This chapter introduces the concepts required to understand the basics of relational database management systems (RDMS). It will introduce foundational topics such as SQL, the relational model, data integrity, database normalization, and the types of relational database management systems. It will provide you with fundamentalknowledge about SQL and databases that will be required throughout this book.
In this chapter, we will cover the following topics:
Understanding SQL
Understanding databases
Understanding data integrity
Understanding database normalization
Types of RDMS
Structured Query Language, or SQL (pronounced see-quel), is the language that is used for querying and manipulating data and defining structures in databases. Initially developed at IBM in the early 1970s, SQL became an ANSI and ISO standard in 1986.
SQL is a powerful, yet simple language, and can do many things, such as execute queries, retrieve, insert, update, and delete data, create databases and tables, and much more.
These types of activities can be grouped into different subdivisions of SQL: Data Definition Language (DDL), Data Manipulation Language (DML), and Data Control Language (DCL):
Use DDL commands to specify database schema:
CREATE
: This is used to create a new database or objects in a database.
ALTER
: This is used to alter a database or objects in a database.
DROP
: This is
used to delete a database or objects in a database.
TRUNCATE
: This is
used to remove all data from a table instantaneously.
Use DML commands to query and modify data:
SELECT
: This is
used to retrieve data from a database.
INSERT
: This is
used to insert data into a database.
UPDATE
: This is
used to update data in a database.
DELETE
: This is
used to remove data from a database.
Use DCL commands to control permissions and translations:
GRANT
: This is
used to give access to a user.
REVOKE
: This is
used to take access away from a user.
COMMIT
: This is
used to save changes in a transaction.
ROLLBACK
: This is
used to remove the saved changes in a transaction.
The SQL language comprises several elements that will be explained in more depth in subsequent chapters. These elements include the following:
Queries that retrieve data based on specific criteria.
Clauses that are components of statements or queries.
Predicates that are logical conditions that evaluate to true or false. These help you to narrow down the results of your queries.
Expressions that produce either scalar values or tables of columns and rows. Expressions are a part of predicates.
Statements that are queries run against a database, comprised of clauses and, optionally, expressions and predicates.
White space that is generally ignored in SQL statements and queries, making it easier to format for readability because you don't have to worry so much about particular spacing for the SQL to run correctly.
The following diagram shows you the components of a SQL statement, which is also called a SQL query:
In the preceding diagram, you can see the different elements of a SQL statement. Each line in the preceding statement is considered a clause. Clauses use SQL keywords. Keywords are reserved words that have special significance in the SQL language—SELECT,FROM, andWHEREare just some of the keywords that are used. More information on keywords is provided inChapter 4,Designing and Creating a Database. The preceding diagram also shows an expression and predicate. A predicate helps you to narrow down your query results. The expression is a piece of a predicate that sets the value. The diagram also helps to illustrate the use of white space. You could write out your entire query on one line, but it's much easier to read when you add carriage returns and spaces. The details of the different elements of queries will be covered more in future chapters of this book.
A database is a collection of data. You store databases in a relational database management system (RDMS). The RDMS is the basis for modern database systems like MySQL, SQL Server, Oracle, PostgreSQL, and others. These will be covered in more detail later in this chapter.
In an RDMS, objects called tables store data. Tables are a collection of related data stored in columns and rows. The following screenshot is a cross-section of a table that contains data about baseball players' appearances in all-star games:
A NULL value in a table is a value that appears to be blank. It doesn't represent a string of blank spaces, zero, or a zero-length character string: it's a missing or unknown value.
A field is an intersection of a row and a column. This field could be any type of data, including a yearID, teamID, or a playerID field (using our example). Each red arrow in the following screenshot points to a value in a column that is considered a field:
A row contains values in a horizontal division of data. In this example case, it's a row or record from a table:
A column contains values in a vertical division of data. In this example case, it's the gameID column from a table:
To ensure that the data in your tables is consistent and accurate, you will need to understand data integrity. You will learn about data integrity in the next section.
Data integrity refers to the consistency and accuracy of the data. It is typically enforced by the procedures and guidelines in the database design phase. In RDMS, keys enforce data integrity. A key is user-defined and forces values in a table to conform to a specified standard. This standard will allow only certain kinds of values to be in the database.
Data integrity refers to the consistency and accuracy of data and table relationships. The following table lists the types of integrity you can use:
Entity integrity
Referential integrity
Domain integrity
Unique constraint
Foreign key
Check constraint
Not null constraint
Default constraint
Primary key
Each type of integrity and how each relates to one another is discussed in the following sections.
To ensure that each row in a table is identifiably unique, you use entity integrity. This is done with a few different types of keys or constraints, including unique, not null, and primary key constraints.
To ensure that all values in a column or columns are different from each other, you use a unique constraint. This type of key can be applied to any data type and is used to avoid duplicate data. You can apply a unique constraintto multiple columns so that it creates a unique value across those multiple columns. It can contain null values.
If you create a unique constraint on one column, it will force the table to have unique values in that specific column. If they are not unique, then the row will not be able to be inserted or updated.
In the following screenshot, the parkkey constraint is unique. All the other fields can have duplicate information as long as the parkkey unique constraint isn't violated:
If you create a unique constraint on a combination of columns in a table, it will force the table to have unique values in the combination of those columns in the unique constraint. If they are not unique, the row will not be able to be inserted or updated.
The following screenshot shows an example of a composite, unique constraint. In this case, playerID, yearID, and teamID would need to be unique for the row to be acceptable:
To ensure that all values in a column are not null, you use a not null constraint. This type of key can be applied to any data type and is used to avoid missing data. If you create a not null constraint on a column, it will force the table to have values in that specific column. If the values are null, then the row will not be inserted or updated.
In the following screenshot, you can see that the birthYear constraint is set to not null. The deathYearconstraint would allow nulls since not all people have a year of death:
The primary key is used to ensure that all values in a column are not null and unique. This key combines the unique and not null constraint properties into one key. This type of key can be applied to any data type and is used to avoid missing and duplicate data. You can only have one primary key per table.
If you create a primary key on a table, it will force the table to have unique, not null values in that specific column. If the values don't comply, then the row will not be able to be inserted or updated. You can also create a primary key on multiple columns. This is considered a composite key. In this case, the composite key would have to be unique for each row, otherwise the row could not be inserted or updated.
In the following screenshot, the playerID constraint would be the primary key because it's unique and not null for every row in the table:
In the following screenshot, theplayerID,yearID, and teamID constraints could be the composite primary key because the combination of those three columns is unique and not null for every row in the table:
Referential integrity refers to the consistency and accuracy between tables that can be linked together. By having a primary key on the parent table and a foreign key on the child table, you achieve referential integrity. A foreign key on the child table creates a link between one or more columns in the child table and the primary key on the parent table. When a foreign key is present, it must reference a valid, existing primary key in the parent table. This way, the data in both tables can maintain a proper relationship. You will learn more about this in the following example.
If you don't set up referential integrity, you wind up with orphaned records. For example, let's say that you delete a player from the first table here:
Now let's say that you didn't delete the corresponding record in the second table here. In this case, the second table's records would be orphaned:
If there was a foreign key constraint on the salary column, then the player could not be deleted from the parent table without first deleting the corresponding salary rows in the salary table. By having a foreign key constraint, we will also prevent users from adding rows to the child table without a corresponding parent row or changing values in a parent table that would result in orphaned child table records.
You won't get an error if there is incomplete data when you lack referential integrity constraints. It's basically like your records are lost in the database since they may never show up in reports or query results. This can cause all kinds of problems, such as strange results, lost orders, and potentially life-and-death situations where (for example) patients don't receive proper treatments.
When creating a foreign key constraint, the foreign key must reference a column in another table that is the primary key. It can be any data type and accept duplicate and null values by default. The foreign key constraint can maintain three types of table relationships (covered in more detail in Chapter 7, Querying Multiple Tables):
One-to-one
: This type of relationship is when one table has just one corresponding row in another table. An example of this could be a table with employees and computers. Each employee has one computer.
One-to-many
:
This type of relationship is when one table has none, one, or many corresponding rows in another table. An example of this could be a table with adults and children. An adult table row may have none, one, or many rows in the child table.
Many-to-many
:
This type of relationship is when many rows in one table correspond to many rows in another table. An example of this could be the
customers
and
products
tables. Customers can purchase many products.
In the following screenshots, the primary key would be on the first table as playerID. The second table would have a foreign key reference to playerID on the first table. In this case, there would be a one-to-many relationship between the first and second tables because there is one player in the first table and none, one, or many rows corresponding to that player in the second table.
If you had a foreign key setup on playerID in the second table, then you would not be able to delete the playerID value from the first table unless you deleted it in the second table beforehand. This key setup maintains the referential integrity and ensures that you won't have orphaned records in the second table:
To ensure that data values follow defined rules for formatting, range, and value using check and default constraints, you use domain integrity.
The check constraint is used to ensure that all values in a column are within a range of values. This type of key can be applied to any data type and is used to ensure that values aren't invalid. A check constraint is enforced with user-defined conditions and evaluates as either true or false. You can define a check constraint on a single column or a combination of columns in a table.
The following screenshot shows an example of a table where a check constraint would make sense on the inducted column. A player can either be inducted into the hall of fame or not. In this case, you could create a check constraint that only allows Y or N in that field. If the value isn't Y or N, then the row can't be updated or inserted:
The following screenshot shows an example of a table where a check constraint can be applied to multiple columns. For instance, you wouldn't wantdeathYearto be a year before thebirthYear, so you can set a check constraint that will only allow you to add or update abirthYearordeathYearthat follows a check constraint likebirthYear < deathYear:
To ensure that all rows in a column have a value, you use a default constraint. This type of key can be applied to any data type. A default constraint assigns a default value to a field. This is used to avoid having a null value for a field if a user doesn't specify a value.
The following screenshot shows an example of a table where a default constraint could make sense on theabcolumn:
A player can be in a game without having any at-bats. In this case, you could create a default constraint that sets theabcolumn to0if the user provides no value.
Database normalization is the process of putting your raw data into tables using rules to avoid redundant data, optimize database performance, and ensure data integrity.
Without proper normalization, not only can you have data redundancy, which uses additional storage space, but it can be more difficult to update and maintain the database without data loss.
Normalization requires forms. Forms are sets of rules to follow to normalize your data into database tables. There are three forms that we will discuss: the first normal form, the second normal form, and the third normal form. Each of these forms has a set of rules to ensure that your database complies with the form. Each of the forms builds on the previous forms.
The first normal form (1NF) is the first level of database normalization. You will need to complete this step before proceeding to other database normalization forms. The primary reason to implement 1NF is to eliminate repeating groups. This ensures that you can use simple SQL statements to query the data. It also ensures that you aren't duplicating data, which uses additional storage and computing time. This step will ensure that you are doing the following:
Defining data, columns, and data types and putting related data into columns
Eliminating repeating groups of data:
This means that you will not have repeating columns, such as
Year1
,
Year2
,
Year3
, but instead will have a column that is named
Year
, and each row in the table will be a different year.
Another example of this is not having multiple values in the same field, such as
1985
,
1987
,
1989
, but instead placing each year in a row.
This means that there are no exact duplicate rows. The example following this bullet list will explain this concept in more depth.
Creating a primary key for each table
In the following example, you could make the first column the primary key in the people table and the foreign key in the salaries table. In the salaries table, you could create a new primary key or create a composite key that is an amalgamation of multiple fields.
Here is a denormalized sample table:
Name
birthYear
Salary
SalaryYear
Jim Jones
1981
2750000, 4500000
2010, 2011
Joe Smith
1974
10600000
2014
There is a right way and wrong way to normalize this table. Let's go over the wrong way first:
Name
birthYear
Salary1
Salary2
SalaryYear1
SalaryYear2
Jim Jones
1981
2750000
4500000
2010
2011
Joe Smith
1974
10600000
2014
The preceding design has introduced new problems. Even though it doesn't have groups of repeating data in one column, the salary is limited to two values. What if a player has more than two salaries? You don't have anywhere to put another salary without adding a third column. This also wastes space for those players that only have one salary, and searching through this table for a player with a specific salary becomes difficult. The same goes for theSalaryYearcolumns.
The right way to normalize the denormalized table to the first normal form is to ensure that there aren't repeating groups, as shown in the following table. The people table with player information would look like the following:
playerID
nameFirst
nameLast
birthYear
jjones01
Jim
Jones
1981
jsmith01
Joe
Smith
1974
TheSalary value has been removed and placed in another table with theplayerID field linking them to each other; therefore, the salaries table will look like the following:
salaryID
playerID
salary
year
1
jjones01
2750000
2010
2
jjones01
4500000
2011
3
jsmith01
10600000
2014
Let's go through a denormalization example by looking at the following table:
playerID
namefirst
namelast
birthYear
franchID
franchname
teamID
RBI
rank
yearID
abbotpa01
Paul
Abbott
1967
PHI
Philadelphia
Phillies
PHI
2
2
2004
abreubo01
Bobby
Abreu
1974
PHI
Philadelphia Phillies
PHI
79
1
2000
abreubo01
Bobby
Abreu
1974
PHI
Philadelphia Phillies
PHI
110
3
2001
alcanar01
Arismendy
Alcantara
1991
CHI
Chicago Cubs
CHI
1
8
2015
almoral01
Albert
Almora
1994
CHI
Chicago Cubs
CHI
14
8
2016
almoral01
Albert
Almora
1994
CHI
Chicago Cubs
CHI
46
6
2017
alvarpe01
Pedro
Alvarez
1987
PIT
Pittsburg Pirates
PIT
77
17
2015
alvarto01
Tony
Alvarez
1979
PIT
Pittsburg Pirates
PIT
2
9
2002
alvarto01
Tony
Alvarez
1979
PIT
Pittsburg Pirates
PIT
8
1
2004
To meet the requirements of 1NF, you would need to split this table into multiple tables. Depending on the table you are trying to normalize, you might not need to split it if it's already following the rules of 1NF.
This table only contains the information about the player and has a primary key ofplayerID:
playerID
namefirst
namelast
birthYear
abbotpa01
Paul
Abbott
1967
abreubo01
Bobby
Abreu
1974
alcanar01
Arismendy
Alcantara
1991
almoral01
Albert
Almora
1994
alvarpe01
Pedro
Alvarez
1987
alvarto01
Tony
Alvarez
1979
The other table would contain the rest of the fields from the denormalized table. The following table has a foreign key relationship to the preceding table regarding playerID:
playerID
franchID
franchname
teamID
RBI
rank
yearID
abbotpa01
PHI
Philadelphia Phillies
PHI
2
2
2004
abreubo01
PHI
Philadelphia Phillies
PHI
79
1
2000
abreubo01
PHI
Philadelphia Phillies
PHI
110
3
2001
alcanar01
CHI
Chicago Cubs
CHI
1
8
2015
almoral01
CHI
Chicago Cubs
CHI
14
8
2016
almoral01
CHI
Chicago Cubs
CHI
46
6
2017
alvarpe01
PIT
Pittsburg Pirates
PIT
77
17
2015
alvarto01
PIT
Pittsburg Pirates
PIT
2
9
2002
alvarto01
PIT
Pittsburg Pirates
PIT
8
1
2004
The second normal form (2NF) is the second level of database normalization. You will need to complete 1NF before beginning this step. The primary reason to implement 2NF is to narrow tables down to a single purpose, which makes it easier to use and design tables. This step will ensure that you do the following:
Meet the requirements of 1NF
: You will need to implement 1NF before you can use 2NF.
Remove partial dependencies
: This will entail narrowing tables down to a single purpose where possible.
Starting with the tables from our 1NF example, you can break these down further into additional tables. You will still have the same player table from 1NF since it serves a single purpose of giving us player information. The franchise table has multiple purposes with RBI and rank, so since RBI isn't related to the franchise, you will split the franchise table into two.
The franchise table has all the franchise and team information in it now, and the RBI columns and related columns can be split out into a batting table. The franchise table still has a primary key of franchID and playerID with a foreign key referring back to the player table on playerID:
playerID
franchID
franchname
teamID
rank
yearID
abbotpa01
PHI
Philadelphia Phillies
PHI
2
2004
abreubo01
PHI
Philadelphia Phillies
PHI
1
2000
abreubo01
PHI
Philadelphia Phillies
PHI
3
2001
alcanar01
CHI
Chicago Cubs
CHI
8
2015
almoral01
CHI
Chicago Cubs
CHI
8
2016
almoral01
CHI
Chicago Cubs
CHI
6
2017
alvarpe01
PIT
Pittsburg Pirates
PIT
17
2015
alvarto01
PIT
Pittsburg Pirates
PIT
9
2002
alvarto01
PIT
Pittsburg Pirates
PIT
1
2004
The batting table has a primary key of playerID and teamID and has a foreign key of playerID to the player table and a foreign key of teamID to the franchise table:
playerID
teamID
RBI
yearID
abbotpa01
PHI
2
2004
abreubo01
PHI
79
2000
abreubo01
PHI
110
2001
alcanar01
CHI
1
2015
almoral01
CHI
14
2016
almoral01
CHI
46
2017
alvarpe01
PIT
77
2015
alvarto01
PIT
2
2002
alvarto01
PIT
8
2004
