Analytic SQL in SQL Server 2014/2016 - Riadh Ghlala - E-Book

Analytic SQL in SQL Server 2014/2016 E-Book

Riadh Ghlala

0,0
139,99 €

-100%
Sammeln Sie Punkte in unserem Gutscheinprogramm und kaufen Sie E-Books und Hörbücher mit bis zu 100% Rabatt.

Mehr erfahren.
Beschreibung

Business Intelligence (BI) has emerged as a field which seeks to support managers in decision-making. It encompasses the techniques, methods and tools for conducting analytically-based IT solutions, which are referred to as OLAP (OnLine Analytical Processing). Within this field, SQL has a role as a leader and is continuously evolving to cover both transactional and analytical data management. This book discusses the functions provided by Microsoft® SQL Server 2014/2016 in terms of business intelligence. The analytic functions are considered as an enrichment of the SQL language. They combine a series of practical functions to answer complex analysis requests with all the simplicity, elegance and acquired performance of the SQL language. Drawing on the wide experience of the author in teaching and research, as well as insights from contacts in the industry, this book focuses on the issues and difficulties faced by academics (students and teachers) and professionals engaged in data analysis with the SQL Server 2014/2016 database management system.

Sie lesen das E-Book in den Legimi-Apps auf:

Android
iOS
von Legimi
zertifizierten E-Readern

Seitenzahl: 146

Veröffentlichungsjahr: 2019

Bewertungen
0,0
0
0
0
0
0
Mehr Informationen
Mehr Informationen
Legimi prüft nicht, ob Rezensionen von Nutzern stammen, die den betreffenden Titel tatsächlich gekauft oder gelesen/gehört haben. Wir entfernen aber gefälschte Rezensionen.



Table of Contents

Cover

Introduction

1 Data Analysis Fundamentals with the SQL Language

1.1. Data at the heart of the information system

1.2. SQL Server and data analysis

2 Queries

2.1. Data filtering

2.2. Sorting data

2.3. Data pagination

2.4. Subqueries

2.5. Options for the FROM clause

3 Operators

3.1. Joins

3.2. Set operators

3.3. Pivoting operators

4 Functions

4.1. Predefined functions

4.2. Aggregation functions

4.3. Windowing functions

4.4. Analytic functions

References

Index

End User License Agreement

List of Tables

Chapter 1

Table 1.1. SQL standards

Table 1.2. SQL Server versions

Chapter 4

Table 4.1. The most commonly used aggregation functions

Table 4.2. Categories of analytic functions used with the windowing mechanism

List of Illustrations

Introduction

Figure I.1. Number of Indeed job offers by programming language (January 2019)

Figure I.2. Outline of the work

Figure I.3. Mind map of the introduction

Figure I.4. Steps of programming with a procedural language

Figure I.5. Data query with the SQL language

Figure I.6. OLTP versus OLAP

Figure I.7. Code-first versus DB-first

Chapter 1

Figure 1.1. Enterprise architecture

Figure 1.2. Mind map of the first section

Figure 1.3. Database engineering process

Figure 1.4. Features of a DBMS

Figure 1.5. The main relational DBMSs

Figure 1.6. Edgar Frank Codd (23 August 1923–18 April 2003, United Kingdom). Tur...

Figure 1.7. The relational model: a model based on set theory

Figure 1.8. SQL language commands

Figure 1.9. Mind map of the second section

Figure 1.10. DBMS ranking (top 10)

Figure 1.11. SQL Server services

Figure 1.12. SQL Server editions

Figure 1.13. The evolution of SQL analytic functions in Microsoft SQL Server

Figure 1.14. Integration of the R server into SQL Server version 2016

Figure 1.15. Integration of the Python language into SQL Server version 2017

Chapter 2

Figure 2.1. Mind map of the first section

Figure 2.2. Installation of the full-text search service

Figure 2.3. Mind map of the second section

Figure 2.4. Estimated execution plan of a request

Figure 2.5. Mind map of the third section

Figure 2.6. Mental map of the fourth section

Figure 2.7. Operation of autonomous subqueries

Figure 2.8. Operation of correlated subqueries

Figure 2.9. Order of writing and execution of the different clauses of a query

Figure 2.10. Mind map of the fifth section

Figure 2.11. Reasons for using views

Chapter 3

Figure 3.1. Mind map of the first section

Figure 3.2. Cartesian product of countries with continents

Figure 3.3. Relational and set operators of SQL language

Figure 3.4. Mind map of the second section

Figure 3.5. Mind map of the third section

Chapter 4

Figure 4.1. Mind map of the first section

Figure 4.2. Built-in SQL Server functions

Figure 4.3. Mind map of the second section

Figure 4.4. Filters in a query using aggregation functions with the GROUP BY cla...

Figure 4.5. The ROLLUP option

Figure 4.6. The CUBE option

Figure 4.7. The GROUPING SETS option

Figure 4.8. Dashboard with aggregation functions. For a color version of this fi...

Figure 4.9. Mind map of the third section

Figure 4.10. Data windowing with dataset partitioning

Figure 4.11. Mind map of the fourth section

Guide

Cover

Table of Contents

Begin Reading

Pages

v

iii

iv

ix

x

xi

xii

xiii

xiv

xv

xvi

xvii

1

2

3

4

5

6

7

8

9

10

11

12

13

14

15

16

17

18

19

20

21

22

23

24

25

26

27

28

29

30

31

32

33

34

35

36

37

38

39

40

41

42

43

44

45

46

47

48

49

50

51

52

53

54

55

56

57

58

59

60

61

62

63

64

65

66

67

68

69

70

71

72

73

74

75

76

77

78

79

80

81

82

83

84

85

86

87

88

89

90

91

92

93

94

95

96

97

98

99

100

101

102

103

104

105

106

107

108

109

110

111

112

113

114

115

116

117

118

119

120

121

122

123

124

125

126

127

128

129

130

131

132

133

134

135

137

138

139

140

141

142

143

144

145

146

147

148

149

151

Series Editor

Jean-Charles Pomerol

Analytic SQL in SQL Server 2014/2016

Riadh Ghlala

First published 2019 in Great Britain and the United States by ISTE Ltd and John Wiley & Sons, Inc.

Apart from any fair dealing for the purposes of research or private study, or criticism or review, as permitted under the Copyright, Designs and Patents Act 1988, this publication may only be reproduced, stored or transmitted, in any form or by any means, with the prior permission in writing of the publishers, or in the case of reprographic reproduction in accordance with the terms and licenses issued by the CLA. Enquiries concerning reproduction outside these terms should be sent to the publishers at the undermentioned address:

ISTE Ltd

27-37 St George’s Road

London SW19 4EU

UK

www.iste.co.uk

John Wiley & Sons, Inc.

111 River Street

Hoboken, NJ 07030

USA

www.wiley.com

© ISTE Ltd 2019

The rights of Riadh Ghlala to be identified as the author of this work have been asserted by him in accordance with the Copyright, Designs and Patents Act 1988.

Library of Congress Control Number: 2019939237

British Library Cataloguing-in-Publication Data

A CIP record for this book is available from the British Library

ISBN 978-1-78630-412-4

Introduction

I.1. Motivation

Nowadays, data analysis is a skill that is in high demand in order to develop indicators and metrics to monitor our information systems. In order to pursue this mission, several methods, techniques and tools have been developed to cover this need in various aspects.

Database Management Systems (DBMS), along with their querying language, Structured Query Language (SQL), are considered a very important path in data analysis. Based on analytic queries directly executed on the server side, a wide range of indicators can be developed without going through intermediate application layers.

Since its invention in the 1980s, SQL has demonstrated its dominance in data management. According to a study conducted by Indeed1, SQL has the highest ranking among skills sought by IT talent researchers.

Figure I.1 shows the position of SQL in relation to other computer languages.

The spread of SQL is explained by its characteristics of simplicity, high performance and diversity. Indeed, this language, which was previously designed for the definition and manipulation of data, has now become essential for analyzing this data.

Figure I.1.Number of Indeed job offers by programming language (January 2019)

I.2. Outline of the work

This book, composed of four chapters, studies SQL as a tool for data management and analysis.

Although the concepts of this language are standardized, this book presents them with the Microsoft SQL Server Database Management System (DBMS) and focuses primarily on the analytical aspect of this language.

Figure I.2.Outline of the work

I.3. Mind map of the introduction

This introduction will be devoted to the presentation of SQL as a querying language for transactional and/or analytical data processing. Figure I.3 presents a mind map of the different concepts that will be covered in this introduction.

Figure I.3.Mind map of the introduction

I.4. Programming language versus querying language

The world of information technology contains a variety of languages, each dealing with a particular area of our information system. Among these languages, we find that the SQL language is designated as a querying language as opposed to the majority of other programming languages such as C#, Java or Python.

I.4.1. Programming language

Programming languages, also known as procedural languages (we also consider object-oriented languages which are procedural languages with object-oriented mechanisms), are languages that allow developers to express their needs when automating particular tasks in the information system and also to specify to the machine (the computer) the method to apply to execution. Figure I.4 shows the process followed for programming with a procedural language.

Figure I.4.Steps of programming with a procedural language

I.4.2. Querying language

The strength of querying languages, and in particular SQL, is that they ask users to focus on expressing their needs and to delegate the development of execution plans reflecting methods of execution to the tool used, which is the DBMS. Figure I.5 shows the process followed for requesting data using SQL.

Figure I.5.Data query with the SQL language

I.5. Transactional processing versus analytical processing

Data represents the static axis of any information system. The processing of this data, which is done using the SQL language, is divided into two categories: definition and manipulation. In addition to the definition of data usually performed on the database server side, the manipulation of this data is entrusted to the business logic layer to execute transactions or perform queries.

I.5.1. Transactional processing

Transactional processing, also known as OnLine Transactional Processing (OLTP), consists of acting upon the data to change its status by addition, modification or deletion. This type of processing must be subject to very strict controls to ensure proper functioning in a multi-user and distributed context.

Figure I.6.OLTP versus OLAP

I.5.2. Analytical processing

Analytical processing, also known as OnLine Analytical Processing (OLAP), consists of interrogating the data in order to develop synthetic information that can be used as indicators for the decision-maker. Figure I.6 compares the two types of processing, OLTP and OLAP.

I.6. Code-first versus database-first

The IT solutions that manage our information systems present numerous variants of OLAP. Indeed, this analytical processing can be performed with SQL queries directly executed on the database server or by integrating these queries into the application layer programs.

I.6.1. Code-first

The term “code-first” refers to an approach that favors the use of the business logic layer for the development of a new need in the information system. In the case of analytical processing, this approach consists of developing the indicators requested by managers through programming or through graphical interfaces to generate the necessary programs. This approach has several disadvantages, such as the complexity of the programs to be developed, the security of data exchanges between these programs and the database, and the performance dilemma, especially when handling large amounts of information.

I.6.2. Database-first

The term “database-first” refers to an approach that promotes, for the development of a new need in the information system, the implementation of this need directly in the database via a functionality provided by the DBMS. In the case of analytical processing, this approach consists of developing the indicators requested by decision-makers through appropriate SQL queries designated by the analytic functions of SQL or simply analytic SQL. This approach is important in information systems because of the simplicity, performance and security of the data in these types of requests. The use of this approach requires the mastery of these analytic functionalities of the SQL language.

Figure I.7.Code-first versus DB-first

I.7. Conclusion

Data analytics is a crucial need in information systems. This need can be met by various IT solutions for the development of the different indicators required by managers to supervise their firm’s activities in a highly competitive economic environment. SQL, as a querying language, is constantly being enriched with new features to improve its analytical capacity and therefore to master this aspect of analytical processing (OLAP), especially in database-first mode.

1

Indeed

is a job search engine:

https://www.indeed.com/

.

1Data Analysis Fundamentals with the SQL Language

1.1. Data at the heart of the information system

1.1.1. Introduction

The development of IT solutions becomes more complicated from one day to the next due to the ever-increasing complexity of information systems and the need to align business needs with these IT solutions. In every generation of information system, data remains the cornerstone of these IT solutions. This importance is due to the static aspect of data in real-world modeling and the ability to provide any processing requested by application developers to meet the needs of the business logic layer.

Figure 1.1 shows the position of data in the urbanization of information systems according to the Enterprise Architecture (EA) vision1. This model puts two of the components on equal footing: data and applications.

Indeed, the role of data in meeting the needs of the business logic layer is essential, whether through the applications component or by direct access to the data component by means of a query language such as SQL.

Figure 1.1.Enterprise architecture

1.1.2. Mind map of the first section

This first section will be devoted to introducing the basics of databases, the SQL language and Microsoft SQL Server DBMS.

Figure 1.2 presents a mind map of the different concepts that will be covered in this section.

1.1.3. Concept of a database

A database (referred to as DB) is a collection of homogeneous data relating to a real-world domain (information system), stored in a structured way and with as little redundancy as possible to facilitate and accelerate access to these data later on.

These data must be accessible by different users, possibly simultaneously. Thus, the notion of database is generally coupled with that of computer networks in order to be able to share this information.

Figure 1.2.Mind map of the first section

The engineering process of a database goes through three essential steps: design, generation of the database schema in an appropriate formalization (in our case, the relational model) and finally implementation of this database.

Figure 1.3.Database engineering process

1.1.4. Database Management System (DBMS)

Database management consists of the implementation, operation and administration of the database. These tasks are performed by software using a set of commands written in a query language called SQL.

The range of DBMSs is very diverse. The choice of a DBMS is made according to several criteria: functionalities provided, cost, performance, security, etc.

Figure 1.4.Features of a DBMS

Figure 1.5.The main relational DBMSs

1.1.5. The relational model

In 1970, Edgar F. Codd, a researcher at IBM2, proposed in a mathematical thesis the representation of data, links and even the expected results of a search in a database in table form. However, IBM — which was then working on another type of database — did not begin to take an interest in it until 1978, when the concept interested Lawrence Ellison, the founder of a start-up that became Oracle Corporation.

This proposal is the basis of the relational data model, a model used by almost all DBMSs. These DBMSs are qualified as Relational DBMSs (RDBMSs). In the past, this label was obvious since the majority of DBMSs were relational. Today, with a very diversified landscape including SQL, NewSQL and NoSQL, it is essential to specify the qualification of the DBMS.

This book discusses the analytic functions of the SQL language in relational DBMSs. The illustrative examples are based on the use of Microsoft SQL Server 2014/2016 DBMS.

Figure 1.6.Edgar Frank Codd (23 August 1923–18 April 2003, United Kingdom). Turing Prize in 1981

Figure 1.7.The relational model: a model based on set theory

1.1.6. The SQL language

SQL (Structured Query Language) is a computer language used to communicate with a database. Unlike programming languages where the user is called upon to specify the execution algorithm, SQL makes it possible to provide results by simply formulating our needs in the form of queries close to natural languages.

1.1.6.1. SQL commands

The SQL language consists of a set of commands that can be classified into four sub-languages according to their roles in data management within the database.

Figure 1.8 shows the SQL language commands and their classifications which are:

— Data Definition Language (DDL) containing commands for creating, modifying and deleting objects managed in the database, the most important of which are the tables that represent the data storage structures;

— Data Manipulation Language (DML) containing commands to search, insert, modify, delete and merge data into tables;

— Data Control Language (DCL) containing the commands for managing access control (allocation, removal and prohibition) to data;

— Transaction Control Language (TCL) containing transaction control commands.

Figure 1.8.SQL language commands

1.1.6.2. SQL standards