SQL Server 2017 Integration Services Cookbook - Christian Cote - E-Book

SQL Server 2017 Integration Services Cookbook E-Book

Christian Coté

0,0
55,19 €

-100%
Sammeln Sie Punkte in unserem Gutscheinprogramm und kaufen Sie E-Books und Hörbücher mit bis zu 100% Rabatt.

Mehr erfahren.
Beschreibung

Harness the power of SQL Server 2017 Integration Services to build your data integration solutions with ease

About This Book

  • Acquaint yourself with all the newly introduced features in SQL Server 2017 Integration Services
  • Program and extend your packages to enhance their functionality
  • This detailed, step-by-step guide covers everything you need to develop efficient data integration and data transformation solutions for your organization

Who This Book Is For

This book is ideal for software engineers, DW/ETL architects, and ETL developers who need to create a new, or enhance an existing, ETL implementation with SQL Server 2017 Integration Services. This book would also be good for individuals who develop ETL solutions that use SSIS and are keen to learn the new features and capabilities in SSIS 2017.

What You Will Learn

  • Understand the key components of an ETL solution using SQL Server 2016-2017 Integration Services
  • Design the architecture of a modern ETL solution
  • Have a good knowledge of the new capabilities and features added to Integration Services
  • Implement ETL solutions using Integration Services for both on-premises and Azure data
  • Improve the performance and scalability of an ETL solution
  • Enhance the ETL solution using a custom framework
  • Be able to work on the ETL solution with many other developers and have common design paradigms or techniques
  • Effectively use scripting to solve complex data issues

In Detail

SQL Server Integration Services is a tool that facilitates data extraction, consolidation, and loading options (ETL), SQL Server coding enhancements, data warehousing, and customizations. With the help of the recipes in this book, you'll gain complete hands-on experience of SSIS 2017 as well as the 2016 new features, design and development improvements including SCD, Tuning, and Customizations.

At the start, you'll learn to install and set up SSIS as well other SQL Server resources to make optimal use of this Business Intelligence tools. We'll begin by taking you through the new features in SSIS 2016/2017 and implementing the necessary features to get a modern scalable ETL solution that fits the modern data warehouse.

Through the course of chapters, you will learn how to design and build SSIS data warehouses packages using SQL Server Data Tools. Additionally, you'll learn to develop SSIS packages designed to maintain a data warehouse using the Data Flow and other control flow tasks. You'll also be demonstrated many recipes on cleansing data and how to get the end result after applying different transformations. Some real-world scenarios that you might face are also covered and how to handle various issues that you might face when designing your packages.

At the end of this book, you'll get to know all the key concepts to perform data integration and transformation. You'll have explored on-premises Big Data integration processes to create a classic data warehouse, and will know how to extend the toolbox with custom tasks and transforms.

Style and approach

This cookbook follows a problem-solution approach and tackles all kinds of data integration scenarios by using the capabilities of SQL Server 2016 Integration Services. This book is well supplemented with screenshots, tips, and tricks. Each recipe focuses on a particular task and is written in a very easy-to-follow manner.

Sie lesen das E-Book in den Legimi-Apps auf:

Android
iOS
von Legimi
zertifizierten E-Readern

Seitenzahl: 353

Veröffentlichungsjahr: 2017

Bewertungen
0,0
0
0
0
0
0
Mehr Informationen
Mehr Informationen
Legimi prüft nicht, ob Rezensionen von Nutzern stammen, die den betreffenden Titel tatsächlich gekauft oder gelesen/gehört haben. Wir entfernen aber gefälschte Rezensionen.



SQL Server 2017 Integration Services Cookbook
ETL techniques to load and transform data from various sources using SQL Server 2017 Integration Services
Christian Cote
Matija Lah
Dejan Sarka

BIRMINGHAM - MUMBAI

 

SQL Server 2017 Integration Services Cookbook

 

Copyright © 2017 Packt Publishing

 

All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.

Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the authors, nor Packt Publishing, and its dealers and distributors will be held liable for any damages caused or alleged to be caused directly or indirectly by this book.

Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.

First published: June 2017

Production reference: 1300617

Published by Packt Publishing Ltd.
Livery Place
35 Livery Street
Birmingham 
B3 2PB, UK.

ISBN 978-1-78646-182-7

www.packtpub.com

Credits

AuthorsChristian Cote

Matija Lah

Dejan Sarka

Copy Editor Safis Editing

Reviewers Jasmin Azemovic

Marek Chmel

Tomaz Kastrun

Ruben Oliva Ramos

Project Coordinator Nidhi Joshi

Commissioning Editor Amey Varangaonkar

Proofreader Safis Editing

Acquisition Editor Vinay Agrekar

Indexer Pratik Shirodkar

Content Development Editor Cheryl Dsa

Graphics Tania Dutta

Technical Editor Dinesh Pawar

Production Coordinator Deepika Naik

 

About the Authors

Christian Cote is a database professional from Montreal, Quebec, Canada. For the past 16 years, he's been involved in various data warehouse projects and business intelligence projects. He has contributed to business intelligence solutions in various domains like pharmaceutical, finance, insurance, and many more. He's been a Microsoft Most Valuable Professional since 2009 and leads the Montreal PASS chapter.

Matija Lah has more than 15 years of experience working with Microsoft SQL Server, mostly from architecting data-centric solutions in the legal domain. His contributions to the SQL Server community have led to the Microsoft Most Valuable Professional award in 2007 (data platform). He spends most of his time on projects involving advanced information management, and natural language processing, but often finds time to speak at events related to Microsoft SQL Server where he loves to share his experience with the SQL Server platform.

Dejan Sarka, MCT and SQL Server Most Valuable Professional, is an independent trainer and consultant who focuses on the development of database and business intelligence applications, located in Ljubljana, Slovenia. Besides his projects, he spends around half of his time on training and mentoring. He is the founder of the Slovenian SQL Server and .NET users group. Dejan is the main author and coauthor of many books and courses about databases and SQL Server. He is a frequent speaker at many worldwide events.

About the Reviewers

Jasmin Azemovic is a university professor, active in the areas of database systems, information security, data privacy, forensic analysis, and fraud detection. His PhD degree was in the field of modeling design and developing an environment for the preservation of privacy inside database systems. He is the author of many scientific-research papers and two books: Writing T-SQL Queries for Beginners Using Microsoft SQL Server 2012 and Securing SQL Server 2012. He is an active member of the professional IT world: Microsoft MVP (Data Platform--eight years so far) and a security consultant. He is an active speaker at many IT professional and community conferences.

Marek Chmel is an IT consultant and trainer with more than 10 years of experience. He's a frequent speaker with a focus on Microsoft SQL Server, Azure ,and security topics. Marek writes for Microsoft's TechnetCZSK blog, and since 2012 he's an MVP: Data Platform. Marek is also recognized as a Microsoft Certified Trainer: Regional Lead for Czech Republic for a few years in a row, he holds many MCSE certifications, and on the top of that he's an ECCouncil Certified Ethical Hacker and holder of several eLearnSecurity certifications. Marek earned his MSc (Business and Informatics) degree from Nottingham Trent University. He started his career as a trainer for Microsoft Server courses. Later, he joined AT&T, as a sr. database administrator with a specialization in MSSQL Server, Data Platform, and Machine Learning.

Tomaz Kastrun is an SQL Server developer and data analyst. He has more than 15 years of experiences in the field of business warehousing, development, ETL, database administration, and query tuning. He also has more than 15 years of experience in the fields of data analysis, data mining, statistical research, and machine learning.He is Microsoft SQL Server MVP for data platforms and has been working with a Microsoft SQL Server since version 2000.Tomaz is a blogger, the author of many articles, the coauthor of a statistical analysis book, speaker at community and Microsoft events, and an avid coffee drinker.

Thanks to people who inspired me, the community, and the SQL family. Thank you, dear reader, for doing this. For endless inspiration, thank you Rubi.

Ruben Oliva Ramos is a computer systems engineer with a master's degree in computer and electronic systems engineering, teleinformatics, and networking specialization from University of Salle Bajio in Leon, Guanajuato, Mexico. He has more than five years of experience in developing web applications to control and monitor devices connected with Arduino and Raspberry Pi using web frameworks and cloud services to build IoT applications.

He is a mechatronics teacher at University of Salle Bajio and teaches students on the master's degree in Design and Engineering of Mechatronics Systems. He also works at Centro de Bachillerato Tecnologico Industrial 225 in Leon, Guanajuato Mexico, teaching subjects like: electronics, robotics and control, automation and microcontrollers at mechatronics technician career, consultant and developer projects in areas like: monitoring systems and datalogger data using technologies: Android, iOS, Windows Phone, HTML5, PHP, CSS, Ajax, JavaScript, Angular, ASP, .NET databases: SQlite, mongoDB, MySQL, web servers: Node.js, IIS, hardware programming: Arduino, Raspberry pi, Ethernet Shield, GPS and GSM/GPRS, ESP8266, control and monitor systems for data acquisition and programming.

 

I would like to thank God for helping me reviewing this book; my wife,Mayte, my sons, Ruben and Dario, for their support while writing this bookand in general for their support in all my projects. To my parents and mybrother and sister, whom I love.

 

www.PacktPub.com

For support files and downloads related to your book, please visit www.PacktPub.com.

Did you know that Packt offers eBook versions of every book published, with PDF and ePub files available? You can upgrade to the eBook version at www.PacktPub.com and as a print book customer, you are entitled to a discount on the eBook copy. Get in touch with us at [email protected] for more details.

At www.PacktPub.com, you can also read a collection of free technical articles, sign up for a range of free newsletters and receive exclusive discounts and offers on Packt books and eBooks.

https://www.packtpub.com/mapt

Get the most in-demand software skills with Mapt. Mapt gives you full access to all Packt books and video courses, as well as industry-leading tools to help you plan your personal development and advance your career.

Why subscribe?

Fully searchable across every book published by Packt

Copy and paste, print, and bookmark content

On demand and accessible via a web browser

Customer Feedback

Thanks for purchasing this Packt book. At Packt, quality is at the heart of our editorial process. To help us improve, please leave us an honest review on this book's Amazon page at https://www.amazon.com/dp/178646182X.

If you'd like to join our team of regular reviewers, you can e-mail us at [email protected]. We award our regular reviewers with free eBooks and videos in exchange for their valuable feedback. Help us be relentless in improving our products!

Table of Contents

Preface

What this book covers

What you need for this book

Who this book is for

Sections

Getting ready

How to do it...

How it works...

There's more...

See also

Conventions

Reader feedback

Customer support

Downloading the example code

Downloading the color images of this book

Errata

Piracy

Questions

SSIS Setup

Introduction

SQL Server 2016 download

Getting ready

How to do it...

Installing JRE for PolyBase

Getting ready

How to do it...

How it works...

Installing SQL Server 2016

Getting ready

How to do it...

SQL Server Management Studio installation

Getting ready

How to do it...

SQL Server Data Tools installation

Getting ready

How to do it...

Testing SQL Server connectivity

Getting ready

How to do it...

What Is New in SSIS 2016

Introduction

Creating SSIS Catalog

Getting ready

How to do it...

Custom logging

Getting ready

How to do it...

How it works...

There's more...

Create a database

Create a simple project

Testing the custom logging level

See also

Azure tasks and transforms

Getting ready

How to do it...

See also

Incremental package deployment

Getting ready

How to do it...

There's more...

Multiple version support

Getting ready

How to do it...

There's more...

Error column name

Getting ready

How to do it...

Control Flow templates

Getting ready

How to do it...

Key Components of a Modern ETL Solution

Introduction

Installing the sample solution

Getting ready

How to do it...

There's more...

Deploying the source database with its data

Getting ready

How to do it...

There's more...

Deploying the target database

Getting ready

How to do it...

SSIS projects

Getting ready

How to do it...

Framework calls in EP_Staging.dtsx

Getting ready

How to do it...

There's more...

Data Warehouse Loading Techniques

Introduction

Designing patterns to load dimensions of a data warehouse

Getting ready

How to do it...

There's more...

Loading the data warehouse using the framework

Getting ready

How to do it...

Near real-time and on-demand loads

Getting ready

How to do it...

There's more...

Using parallelism

Getting ready

How to do it...

There's more...

Dealing with Data Quality

Introduction

Profiling data with SSIS

Getting ready

How to do it...

Creating a DQS knowledge base

Getting ready

How to do it...

Data cleansing with DQS

Getting ready

How to do it...

Creating a MDS model

Getting ready

How to do it...

Matching with DQS

Getting ready

How to do it...

Using SSIS fuzzy components

Getting ready

How to do it...

SSIS Performance and Scalability

Introduction

Using SQL Server Management Studio to execute an SSIS package

Getting ready

How to do it...

How it works...

Using T-SQL to execute an SSIS package

How to do it...

How it works...

Using the DTExec command-line utility to execute an SSIS package

How to do it...

How it works...

There's more...

Scheduling an SSIS package execution

Getting ready

How to do it...

How it works...

Using the cascading lookup pattern

How to do it...

How it works...

Using the lookup cache

How to do it...

How it works...

Using lookup expressions

How to do it...

How it works...

Determining the maximum number of worker threads in a data flow

How to do it...

How it works...

Using the master package concept

How to do it...

How it works...

Requesting an execution tree in SSDT

How to do it...

How it works...

Monitoring SSIS performance

Establishing a performance monitor session

How to do it...

How it works...

Configuring a performance monitor data collector set

How to do it...

How it works....

Unleash the Power of SSIS Script Task and Component

Introduction

Using variables in SSIS Script task

Getting ready

How to do it...

Execute complex filesystem operations with the Script task

Getting ready

How to do it...

Reading data profiling XML results with the Script task

Getting ready

How to do it...

Correcting data with the Script component

Getting ready

How to do it...

Validating data using regular expressions in a Script component

Getting ready

How to do it...

Using the Script component as a source

How to do it...

How it works...

Using the Script component as a destination

Getting ready

How to do it...

How it works...

SSIS and Advanced Analytics

Introduction

Splitting a dataset into a training and test set

Getting ready

How to do it...

Testing the randomness of the split with a SSAS decision trees model

Getting ready

How to do it...

Preparing a Naive Bayes SSAS data mining model

Getting ready

How to do it...

Querying the SSAS data mining model with the data mining query transformation

Getting ready

How to do it...

Creating an R data mining model

Getting ready

How to do it...

Using the R data mining model in SSIS

Getting ready

How to do it...

Text mining with term extraction and term lookup transformations

Getting ready

How to do it...

On-Premises and Azure Big Data Integration

Introduction

Azure Blob storage data management

Getting ready

How to do it...

Installing a Hortonworks cluster

Getting ready

How to do it...

Copying data to an on-premises cluster

Getting ready

How to do it...

Using Hive – creating a database

Getting ready

How to do it...

There's more...

Transforming the data with Hive

Getting ready

How to do it...

There's more...

Transferring data between Hadoop and Azure

Getting ready

How to do it...

Leveraging a HDInsight big data cluster

Getting ready

How to do it...

There's more...

Managing data with Pig Latin

Getting ready

How to do it...

There's more...

Importing Azure Blob storage data

Getting ready

How to do it...

There's more...

Azure Data Factory and SSIS 

Extending SSIS Custom Tasks and Transformations

Introduction

Designing a custom task

Getting ready

How to do it...

How it works...

Designing a custom transformation

How to do it...

How it works...

Managing custom component versions

Getting ready

How to do it...

How it works...

Scale Out with SSIS 2017

Introduction

SQL Server 2017 download and setup

Getting ready

How to do it...

There's more...

SQL Server client tools setup

Getting ready

How to do it...

Configuring SSIS for scale out executions

Getting ready

How to do it...

There's more...

Executing a package using scale out functionality

Getting ready

How to do it...

Preface

SQL Server Integration Services is a tool that facilitates data extraction, consolidation, and loading options (ETL), SQL Server coding enhancements, data warehousing, and customizations. With the help of the recipes in this book, you'll gain hands-on experience of SSIS 2017 as well as the new 2016 features, design and development improvements including SCD, tuning, and customizations. At the start, you'll learn to install and set up SSIS as well other SQL Server resources to make optimal use of this business intelligence tool. We'll begin by taking you through the new features in SSIS 2016/2017 and implementing the necessary features to get a modern scalable ETL solution that fits the modern data warehouse. Through the course of the book, you will learn how to design and build SSIS data warehouses packages using SQL Server Data Tools. Additionally, you'll learn how to develop SSIS packages designed to maintain a data warehouse using the data flow and other control flow tasks. You'll also go through many recipes on cleansing data and how to get the end result after applying different transformations. Some real-world scenarios that you might face are also covered and how to handle various issues that you might face when designing your packages. At the end of this book, you'll get to know all the key concepts to perform data integration and transformation. You'll have explored on-premises big data integration processes to create a classic data warehouse, and will know how to extend the toolbox with custom tasks and transforms.

What this book covers

Chapter 1, SSIS Setup, contains recipes describing the step by step setup of SQL Server 2016 to get the features that are used in the book.

Chapter2, What Is New in SSIS 2016, contains recipes that talk about the evolution of SSIS over time and what's new in SSIS 2016. This chapter is a detailed overview of Integration Services 2016, new features.

Chapter3, Key Components of a Modern ETL Solution, explains how ETL has evolved over the past few years and will explain what components are necessary to get a modern scalable ETL solution that fits the modern data warehouse. This chapter will also describe what each catalog view provides and will help you learn how you can use some of them to archive SSIS execution statistics.

Chapter4, Data Warehouse Loading Techniques, describes many patterns used when it comes to data warehouse or ODS load. You will learn how to effectively load a data warehouse and process a tabular model, maintain data partitions and modern data refresh rates.

Chapter5, Dealing with Data Quality, focuses on how SSIS can be leveraged to validate and load data. You will learn how to identify invalid data, cleanse data and load valid data to the data warehouse.

Chapter6, SSIS Performance and Scalability, will talk about how to monitor SSIS package execution. It will also provide solutions to scale out processes by using parallelism. You will learn how to identify bottlenecks and how to resolve them using various techniques.

Chapter7, Unleash the Power of SSIS Script Task and Component, covers how to use scripting with SSIS. You will learn how script tasks and script components are very valuable in many situations to overcome the limitations of stock toolbox tasks and transforms.

Chapter8, SSIS and Advanced Analytics, talks about how SSIS can be used to prepare the data you need for further analysis. Here, you will learn how you can make use of SQL Server Analysis Services (SSAS) and R models in the SSIS data flow.

Chapter9, On-Premises and Azure Big Data Integration, describes the Azure feature pack that allows SSIS to integrate Azure data from blob storage and HDInsight clusters. You will learn how to use Azure feature pack components to add flexibility to their SSIS solution architecture and integrate on-premises Big Data can be manipulated via SSIS.

Chapter10, Extending SSIS Tasks and Transformations, talks about extending and customizing the toolbox using custom developed tasks and transforms and security features. You will learn the pros and cons of creating custom tasks to extend the SSIS toolbox and secure your deployment.

Chapter 11, Scale Out with SSIS 2017, talks about scaling out SSIS package executions on multiple servers. You will learn how SSIS 2017 can scale out to multiple workers to enhance execution scalability.

What you need for this book

This book was written using SQL Server 2016 and all the examples and functions should work with it. Other tools you may need are Visual Studio 2015, SQL Data Tools 16 or higher and SQL Server Management Studio 17 or later.

In addition to that, you will need Hortonworks Sandbox Docker for Windows Azure account and Microsoft Azure.

The last chapter of this book has been written using SQL Server 2017.

Who this book is for

This book is ideal for software engineers, DW/ETL architects, and ETL developers who need to create a new, or enhance an existing, ETL implementation with SQL Server 2017 Integration Services. This book would also be good for individuals who develop ETL solutions that use SSIS and are keen to learn the new features and capabilities in SSIS 2017.

Sections

In this book, you will find several headings that appear frequently (Getting ready, How to do it, How it works, There's more, and See also). To give clear instructions on how to complete a recipe, we use these sections as follows:

Getting ready

This section tells you what to expect in the recipe, and describes how to set up any software or any preliminary settings required for the recipe.

How to do it...

This section contains the steps required to follow the recipe.

How it works...

This section usually consists of a detailed explanation of what happened in the previous section.

There's more...

This section consists of additional information about the recipe in order to make the reader more knowledgeable about the recipe.

See also

This section provides helpful links to other useful information for the recipe.

Conventions

In this book, you will find a number of text styles that distinguish between different kinds of information. Here are some examples of these styles and an explanation of their meaning. Code words in text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles are shown as follows: "The last characters CI and AS are for case insensitive and accent sensitive, respectively." A block of code is set as follows:

USE DQS_STAGING_DATA; SELECT CustomerKey, FullName, StreetAddress, City, StateProvince, CountryRegion, EmailAddress, BirthDate, Occupation;

New terms and important words are shown in bold. Words that you see on the screen, for example, in menus or dialog boxes, appear in the text like this: "Click on the Sign in visible at the right (top) to log into Visual Studio Dev Essentials."

Warnings or important notes appear in a box like this.
Tips and tricks appear like this.

Reader feedback

Feedback from our readers is always welcome. Let us know what you think about this book-what you liked or disliked. Reader feedback is important for us as it helps us develop titles that you will really get the most out of. To send us general feedback, simply e-mail [email protected], and mention the book's title in the subject of your message. If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, see our author guide at www.packtpub.com/authors .

Customer support

Now that you are the proud owner of a Packt book, we have a number of things to help you to get the most from your purchase.

Downloading the example code

You can download the example code files for this book from your account at http://www.packtpub.com. If you purchased this book elsewhere, you can visit http://www.packtpub.com/support and register to have the files e-mailed directly to you. You can download the code files by following these steps:

Log in or register to our website using your e-mail address and password.

Hover the mouse pointer on the

SUPPORT

tab at the top.

Click on

Code Downloads & Errata

.

Enter the name of the book in the

Search

box.

Select the book for which you're looking to download the code files.

Choose from the drop-down menu where you purchased this book from.

Click on

Code Download

.

You can also download the code files by clicking on the Code Files button on the book's webpage at the Packt Publishing website. This page can be accessed by entering the book's name in the Search box. Please note that you need to be logged in to your Packt account. Once the file is downloaded, please make sure that you unzip or extract the folder using the latest version of:

WinRAR / 7-Zip for Windows

Zipeg / iZip / UnRarX for Mac

7-Zip / PeaZip for Linux

The code bundle for the book is also hosted on GitHub at https://github.com/PacktPublishing/SQL-Server-2017-Integration-Services-Cookbook. We also have other code bundles from our rich catalog of books and videos available at https://github.com/PacktPublishing/. Check them out!

Downloading the color images of this book

We also provide you with a PDF file that has color images of the screenshots/diagrams used in this book. The color images will help you better understand the changes in the output. You can download this file from https://www.packtpub.com/sites/default/files/downloads/SQLServer2017IntegrationServicesCookbook_ColorImages.pdf.

Errata

Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you find a mistake in one of our books-maybe a mistake in the text or the code-we would be grateful if you could report this to us. By doing so, you can save other readers from frustration and help us improve subsequent versions of this book. If you find any errata, please report them by visiting http://www.packtpub.com/submit-errata, selecting your book, clicking on the Errata Submission Form link, and entering the details of your errata. Once your errata are verified, your submission will be accepted and the errata will be uploaded to our website or added to any list of existing errata under the Errata section of that title. To view the previously submitted errata, go to https://www.packtpub.com/books/content/support and enter the name of the book in the search field. The required information will appear under the Errata section.

Piracy

Piracy of copyrighted material on the Internet is an ongoing problem across all media. At Packt, we take the protection of our copyright and licenses very seriously. If you come across any illegal copies of our works in any form on the Internet, please provide us with the location address or website name immediately so that we can pursue a remedy. Please contact us at [email protected] with a link to the suspected pirated material. We appreciate your help in protecting our authors and our ability to bring you valuable content.

Questions

If you have a problem with any aspect of this book, you can contact us at [email protected], and we will do our best to address the problem.

SSIS Setup

In this chapter, we will cover the following recipes:

SQL Server 2016 download

Installing JRE for PolyBase

Installing SQL Server 2016

SQL Server Management Studio installation

SQL Server Data Tools installation

Test SQL Server connectivity

Introduction

This chapter will cover the basics of how to install SQL Server 2016 to properly go through the examples in this book. The version of SQL Server used through out this book is the Developer edition of SQL Server 2016. It's available for free as long as you subscribe to Visual Studio Dev Essentials.

SQL Server 2016 download

Following are the steps to download and install SQL Server 2016.

Getting ready

You need to have access to the internet for this recipe.

How to do it...

Open your browser and paste this link:

https://www.visualstudio.com/dev-essentials/

. The following page appears in your browser:

Click on 

Sign in

visible at the right (top) to log in Visual Studio Dev Essentials. If you don't have an existing subscription, you can create one by clicking on the

Join or access now

button in the middle of the page, as shown in the following screenshot:

You are directed to the

My Information

page. Click on 

My Benefits

at the top of the page to access the download section as shown in the following screenshot:

Click on the

Download

link in the

Microsoft SQL Server Developer Edition

tile as highlighted in the following screenshot:

This will redirect you to the

SQL Server 2016 Developer Edition

page. Click on the green arrow to start downloading the ISO file as shown in the following screenshot:

Due to its pretty large size, the file may take some time to download. The following screenshot is shows 44% done and 10 seconds left to download. This is due to the fact that the file is being downloaded on an Azure VM. It might take longer for you to download it. Depending on your browser, you should see the file downloading as in the following screenshot:

Don't mount the ISO file for now. We have to install an external component described in the next section before we proceed with the installation of SQL Server.

Installing JRE for PolyBase

Java Runtime Engine (JRE) is required for PolyBase installations. SQL Server PolyBase is the technology that allows data integration from other sources other than SQL Server tables. PolyBase is used to access data stored in Hadoop File System (HFS) or Windows Azure Storage Blob (WASB).

As you will see later in this book, SSIS can now interact with these types of storage natively but having PolyBase handy can save us valuable time in our ETL.

Getting ready

For this recipe you will need to have access to the internet and have administrative rights on your PC to install JRE.

How to do it...

To download JRE, follow this link:

http://www.oracle.com/technetwork/java/javase/downloads/index.html

. You will see the screen shown in the following screenshot:

This directs you to the Java SE Download at Oracle.

Click the download link in the JRE section as shown in the following screenshot:

You must accept the license agreement to be able to select a file to download. Select

Accept License Agreement

as indicated in the following screenshot:

Since SQL Server 2016 only exists in a 64-bit version, download the 64-bit JRE. The version of Java SE runtime environment might be different

from

 the one show in the screenshot, which is the one available at the time this book was written:.

Once downloaded, launch the installer. Click on

Run

as shown in Edge browser. Otherwise, go to your

Downloads

folder and double-click on the file you just downloaded (

jre-8U102-windows-x64.exe

in our case); you will see the following window:

The Oracle JRE installation starts. Click on

Install

. The following screen appears. It indicates the progress of the JRE installation.

Once the installation is completed, click on

Close

to quit the installer:

You are now ready to proceed to install SQL Server 2016. We'll do that in the next section.

How it works...

Microsoft integrated PolyBase in SQL Server 2016 to connect almost natively to the Hadoop and NoSQL platforms. Here are the technologies it allows us to connect to:

HDFS (Hortonworks and Cloudera)

Azure Blob Storage

Since Hadoop is using Java technology, JRE is used to interact with its functionalities.

Installing SQL Server 2016

This section will go through the installation of SQL Server engine, which will host the database objects used throughout this book.

These are the features available for SQL Server setup:

Database engine

: It is the core of SQL Server. It manages the various database objects such as tables, views, stored procedures, and so on.

Analysis services

: It allows us to create a data semantic layer that eases data consumption by users.

Reporting services (native)

: It allow us to create various reports, paginated, mobile, and KPI's for data consumption.

Integration services

: It is the purpose of this book, SQL Server data movement service.

Management tools

: We'll talk about these in the next section.

SQL Server Data Tools

: We'll talk about these in the next section.

Getting ready

This recipe assumes that you have downloaded SQL Server 2016 Developer Edition and you have installed Oracle JRE.

SQL Server Management Studio installation

SQL Server Management Studio is a separate download from SQL Server. This program will allow us, among other things, to create database objects and query SQL Server. Without this tool, we wouldn't be able to manage SQL Server databases easily.

Getting ready

This section assumes that you have installed SQL Server 2016.

How to do it...

To download and install SQL Server Management Studio, click on Install SQL Server Management Tools, as shown in the following screenshot:

The SSMS download page opens in your browser. Click

Download SQL Server Management Studio

(the latest version) to start the download process. Once downloaded, run the installation as shown in the following screenshot:

Click

Install

, as shown in the following screenshot:

This will direct you to the Microsoft SQL Server Management Studio installation screen as follows. The installation is in progress; it may take several minutes to complete.

 Click

Close

to close the installation wizard, as shown in the following screenshot:

SQL Server Data Tools installation

The last part of our SQL Server 2016 setup is to install SQL Server Data Tools. This will install a Visual Studio Shell that contains BI templates necessary for the following:

SQL Server integration services

SQL Server analysis services

SQL Server reporting services

Database object management

Getting ready

We'll use SSDT throughout this book to create, deploy, and maintain our SSIS packages and some databases.

How to do it...

From the SQL Server 2016 setup utility, click on

SQL Server Data Tools (SSDT)

. This will open the

Download SQL Server Data Tools (SSDT)

download page in your browser as shown in the following screenshot:

Here, there are two choices:

Install SSDT only: This is the simplest scenario. It only installs SSDT and a development shell.

Install Visual Studio and SSDT: You choose this if you plan to use source control inside Visual Studio or when you want to implement different types of development (.NET, Python, and so on) such as SSIS/SSAS/SSRS development. Since we'll talk about custom components in this book, we'll install Visual Studio Community Edition. This version is free for individuals.

Click on the 

Download Visual Studio Community 2015

link to download the Visual Studio installer.

Once downloaded, click on

Run