Knight's Microsoft SQL Server 2012 Integration Services 24-Hour Trainer - Brian Knight - E-Book

Knight's Microsoft SQL Server 2012 Integration Services 24-Hour Trainer E-Book

Brian Knight

0,0
32,99 €

-100%
Sammeln Sie Punkte in unserem Gutscheinprogramm und kaufen Sie E-Books und Hörbücher mit bis zu 100% Rabatt.

Mehr erfahren.
Beschreibung

Book-and-video package gets novices up to speed on Microsoft SQL Server 2012 If you need a practical, hands-on introduction, especially to SQL Server Integration Services (SSIS), this book-and-video package from authority Brian Knight is the perfect solution. Each lesson includes three major components: a description of how each SSIS feature or process works, a tutorial that walks you through the process or technique, and an accompanying video lesson. It's a complete learning package that will give you the confidence you need to start your first SSIS project. * Guides novice database administrators and developers who are learning Microsoft SQL Server 2012 and SQL Server Integration Services (SSIS) * Provides expert instruction from leading SQL Server authority and author, Brian Knight * Includes a book and a video, complete instruction that includes lessons, hands-on tutorials, and video demonstrations by the author * Covers the very latest changes and updates in the SQL Server 2012 release Microsoft SQL Server 2012 Integration Services 24-Hour Trainer makes SQL Server 2012 and SSIS much less intimidating. Note: CD-ROM/DVD and other supplementary materials are not included as part of the e-book file, but are available for download after purchase.

Sie lesen das E-Book in den Legimi-Apps auf:

Android
iOS
von Legimi
zertifizierten E-Readern

Seitenzahl: 534

Veröffentlichungsjahr: 2012

Bewertungen
0,0
0
0
0
0
0
Mehr Informationen
Mehr Informationen
Legimi prüft nicht, ob Rezensionen von Nutzern stammen, die den betreffenden Titel tatsächlich gekauft oder gelesen/gehört haben. Wir entfernen aber gefälschte Rezensionen.



Table of Contents

Cover

Welcome to SISS

Section 1

Chapter 1: Moving Data with the Import and Export Wizard

Try It

Chapter 2: Installing SQL Server Integration Services

Chapter 3: Installing the Sample Databases

Try It

Chapter 4: Creating a Solution and Project

Try It

Chapter 5: Exploring SQL Server Data Tools

The Solution Explorer

Deployment Models

The Properties Window

The Toolbox

The SSDT Design Environment

Chapter 6: Creating Your First Package

Creating and Using Connection Managers

Using and Configuring Tasks

Exploring Package Encryption

Executing Packages

Try It

Chapter 7: Upgrading Packages to SQL Server 2012

Try It

Chapter 8: Upgrading to the Project Deployment Model

Try It

Section 2

Chapter 9: Using Precedence Constraints

Try It

Chapter 10: Manipulating Files with the File System Task

Try It

Chapter 11: Coding Custom Script Tasks

Try It

Chapter 12: Using the Execute SQL Task

Try It

Chapter 13: Using the Execute Process Task

Try It

Chapter 14: Using the Expression Task

Try It

Chapter 15: Using the Send Mail Task

Try It

Chapter 16: Using the FTP Task

Try It

Chapter 17: Creating a Data Flow

Try It

Section 3

Chapter 18: Extracting Data from Sources

Source Assistant

OLE DB Source

Excel Source

Flat File Source

Chapter 19: Loading Data to a Destination

Destination Assistant

OLE DB Destination

Flat File Destination

Excel Destination

Chapter 20: Changing Data Types with the Data Conversion Transform

Try It

Chapter 21: Creating and Replacing Columns with the Derived Column Transform

Try It

Chapter 22: Rolling Up Data with the Aggregate Transform

Try It

Chapter 23: Ordering Data with the Sort Transform

Try It

Chapter 24: Joining Data with the Lookup Transform

Cache Modes

The Cache Connection Manager and Transform

Chapter 25: Auditing Data with the Row Count Transform

Try It

Chapter 26: Combining Multiple Inputs with the Union All Transform

Try It

Chapter 27: Cleansing Data with the Script Component

Try It

Chapter 28: Separating Data with the Conditional Split Transform

Try It

Chapter 29: Altering Rows with the OLE DB Command Transform

Try It

Chapter 30: Handling Bad Data with the Fuzzy Lookup

Try It

Chapter 31: Removing Duplicates with the Fuzzy Grouping Transform

Try It

Section 4

Chapter 32: Making a Package Dynamic with Variables

Try It

Chapter 33: Making a Package Dynamic with Parameters

Try It

Chapter 34: Making a Connection Dynamic with Expressions

Try It

Chapter 35: Making a Task Dynamic with Expressions

Try It

Section 5

Chapter 36: Loading Data Incrementally

Try It

Chapter 37: Using the CDC Components in SSIS

CDC Control Task

CDC Source Task

CDC Splitter Task

Try It

Chapter 38: Using Data Quality Services

Try It

Chapter 39: Using the DQS Cleansing Transform

Try It

Chapter 40: Creating a Master Package

Try It

Section 6

Chapter 41: Using Sequence Containers to Organize a Package

Try It

Chapter 42: Using For Loop Containers to Repeat Control Flow Tasks

Try It

Chapter 43: Using the Foreach Loop Container to Loop Through a Collection of Objects

Try It

Section 7

Chapter 44: Easing Deployment with Configuration Tables

Try It

Final Deployment

Chapter 45: Easing Deployment with Configuration Files

Try It

Final Deployment

Chapter 46: Configuring Child Packages

Configuring an Execute Package Task

Configuring a Child Package

Try It

Section 8

Chapter 47: Logging Package Data

Try It

Chapter 48: Using Event Handlers

Creating Event Handlers

Common Uses for Event Handlers

Try It

Chapter 49: Troubleshooting Errors

Working in the Progress Tab

Troubleshooting Steps

Try It

Chapter 50: Using Data Viewers

Try It

Chapter 51: Using Breakpoints

Try It

Section 9

Chapter 52: Creating and Configuring the SSIS Catalog

Creating the Catalog

Configuring the Catalog

Creating and Using Folders

Try It

Chapter 53: Deploying Packages to the Package Catalog

Using the Deployment Wizard

Deploying Packages in the Package Deployment Model

Try It

Chapter 54: Configuring the Packages

Creating Environments

Configuring the Package

Try It

Chapter 55: Configuring the Service

Try It

Chapter 56: Securing SSIS Packages

Securing Packages in the Package Deployment Model

Securing Packages in the Project Deployment Model

Try It

Chapter 57: Running SSIS Packages

Executing Packages in the Package Deployment Model

Running Packages in the Project Deployment Model

Try It

Chapter 58: Running Packages in T-SQL and Debugging Packages

Running the Package

Debugging When Something Goes Wrong

Try It

Chapter 59: Scheduling Packages

Using Proxy Accounts

Try It

Section 10

Chapter 60: Dimension Load

Try It

Chapter 61: Fact Table Load

Try It

Section 11

Chapter 62: Bringing It All Together

Lesson Requirements

Hints

Step-by-Step

Appendix A: SSIS Component Crib Notes

When to Use Control Flow Tasks

When to Use Data Flow Transforms

Appendix B: Problem and Solution Crib Notes

Appendix C: What’s on the DVD?

System Requirements

Using the DVD

What’s on the DVD

Troubleshooting

Customer Care

Preface

John Wiley & Sons, Inc. End-User License Agreement

Welcome to SSIS

SQL Server Integration Services (SSIS) is one of the most powerful applications in your arsenal for moving data in and out of various databases and files. Like the rest of the business intelligence (BI) suite that comes with SQL Server, SSIS is already included in your SQL Server license when you pay for the Standard, BI, or Enterprise editions of SQL Server. Even though SSIS is included in SQL Server, you don’t even need to have SQL Server installed to make it function. Because of that, even if your environment is not using a lot of SQL Server, you can still use SSIS as a platform for data movement.

Though ultimately this book is more interactive in nature, this introduction first walks you through a high-level tour of SSIS so you have a life preserver on prior to jumping in the pool. Each topic touched on in this introduction is covered in much more depth throughout the book in lesson form and in the supporting videos on the DVD.

Import and Export Wizard

If you need to move data quickly from almost any data source to a destination, you can use the SSIS Import and Export Wizard (shown in Figure 1). The wizard is a quick way to move the data and perform very light transformations of data, such as casting of the data into new data types. You can quickly check any table you want to transfer, as well as write a query against the data to retrieve only a selective amount of data.

Figure 1

SQL Server Data Tools

SQL Server Data Tools (SSDT) is the central tool that you’ll spend most of your time in as an SSIS developer (really as a SQL Server developer). Like the rest of SQL Server, the tool’s foundation is the Visual Studio 2010 interface (shown in Figure 2), and SSDT is installed when you install SQL Server 2012. The nicest thing about the tool is that it’s not bound to any particular SQL Server. In other words, you won’t have to connect to a SQL Server to design an SSIS package. You can design the package disconnected from your SQL Server environment and then deploy it to your target SQL Server or the filesystem on which you’d like it to run.

Architecture

Although SSIS has been a major extraction, transformation, and loading (ETL) platform for several releases of SQL Server, SQL Server 2012 has simplified the platform for developers and administrators. Because of its scalability and lower cost, SSIS is also a major player in the ETL market. What’s especially nice about SSIS is its price tag, which is free with the purchase of SQL Server. Other ETL tools can cost hundreds of thousands of dollars based on how you scale the software.

Figure 2

The SSIS architecture consists of five main components:

The SSIS service (there for legacy SSIS packages)

The SSIS runtime engine and the runtime executables

The SSIS catalog

The SSIS Data Flow engine and the Data Flow components

The SSIS clients

Let’s boil this down to the essentials that you need to know to do your job. The SSIS service (for packages running in legacy mode) and now the SSIS catalog handle the operational aspects of SSIS. The service is a Windows service that is installed when you install the SSIS component of SQL Server 2012, and it tracks the execution of packages (a collection of work items) and helps with the storage of the packages. You don’t need the SSIS service to run SSIS packages, but if the service is stopped, all the SSIS packages that are currently running will, in turn, stop by default.

This service is mainly used for packages stored in the older style of storing packages, the package deployment model. The new model, the project deployment model, uses something called the package catalog. The catalog is the newer way of storing packages that gives you many new options, like running packages with T-SQL. The catalog also stores basic operational information about your package.

The SSIS runtime engine and its complimentary programs actually run your SSIS packages. The engine saves the layout of your packages and manages the logging, debugging, configuration, connections, and transactions. Additionally, it manages handling your events to send you e-mails or log in to a database when an event is raised in your package. The runtime executables provide the following functionality to a package; these are discussed in more detail throughout this book:

Containers

—Provide structure and scope to your package

Tasks

—Provide the functionality to your package

Event handlers

—Respond to raised events in your package

Precedence constraints

—Provide an ordinal relationship between various items in your package

Packages

A core component of SSIS is the notion of a package. A package best parallels an executable program in Windows. Essentially, a package is a collection of tasks that execute in an orderly fashion. Precedence constraints help manage the order in which the tasks will execute. A package can be saved onto a SQL Server, which in actuality is saved in the msdb or package catalog database. It can also be saved as a .dtsx file, which is an XML structured file much like .rdl files are to Reporting Services. The end result of the package looks like what’s displayed in Figure 2, which was shown earlier.

Tasks

A task can best be described as an individual unit of work. Tasks provide functionality to your package, much like a method does in a programming language. A task can move a file, load a file into a database, send an e-mail, or write a set of .NET code for you, to name just a few of the things it can do. A small subset of the common tasks available to you comprises the following:

Bulk Insert Task

—Loads data into a table by using the BULK INSERT SQL command.

Data Flow Task

—This is the most important task that loads and transforms data into an OLE DB Destination.

Execute Package Task

—Enables you to execute a package from within a package, making your SSIS packages modular.

Execute Process Task

—Executes a program external to your package, like one to split your extract file into many files before processing the individual files.

Execute SQL Task

—Executes a SQL statement or stored procedure.

File System Task

—This task can handle directory operations like creating, renaming, or deleting a directory. It can also manage file operations like moving, copying, or deleting files.

FTP Task

—Sends or receives files from an FTP site.

Script Task

—Runs a set of VB.NET or C# coding inside a Visual Studio environment.

Send Mail Task

—Sends a mail message through SMTP.

Analysis Services Processing Task

—This task processes a SQL Server Analysis Services cube, dimension, or mining model.

Web Service Task

—Executes a method on a web service.

WMI Data Reader Task

—This task can run WQL queries against the Windows Management Instrumentation (WMI). This enables you to read the event log, get a list of applications that are installed, or determine hardware that is installed, to name a few examples.

WMI Event Watcher Task

—This task empowers SSIS to wait for and respond to certain WMI events that occur in the operating system.

XML Task

—Parses or processes an XML file. It can merge, split, or reformat an XML file.

These are only a few of the many tasks you have available to you. You can also write your own task or download a task from the web that does something else. Writing such a task only requires that you learn the SSIS object model and know VB.NET or C#. You can also use the Script Task to do things that the native tasks can’t do.

Data Flow Elements

Once you create a Data Flow Task, the Data Flow tab in SSDT is available to you for design. Just as the Control Flow tab handles the main workflow of the package, the Data Flow tab handles the transformation of data. Every package has a single Control Flow, but can have many Data Flows. Almost anything that manipulates data falls into the Data Flow category. You can see an example of a Data Flow in Figure 3, where data is pulled from an OLE DB Source and transformed before being written to a Flat File Destination. As data moves through each step of the Data Flow, the data changes based on what the transform does. For example, in Figure 3, a new column is derived using the Derived Column Transform and that new column is then available to subsequent transformations or to the destination.

Figure 3

You can add multiple Data Flow Tasks onto the Control Flow tab. You’ll notice that after you click on each one, it jumps to the Data Flow tab with the Data Flow Task name you selected in the drop-down box right under the tab. You can toggle between Data Flow Tasks easily by selecting the next Data Flow Task from that drop-down box.

Sources

A source is where you specify the location of your source data to pull from in the data flow. Sources will generally point to a connection manager in SSIS. By pointing them to the connection manager, you can reuse connections throughout your package because you need only change the connection in one place. Here are some of the common sources you’ll be using in SSIS:

OLE DB Source

—Connects to nearly any OLE DB Data Source like SQL Server, Access, Oracle, or DB2, to name just a few.

Excel Source

—Source that specializes in receiving data from Excel spreadsheets. This source also makes it easy to run SQL queries against your Excel spreadsheet to narrow the scope of the data that you want to pass through the flow.

Flat File Source

—Connects to a delimited or fixed-width file.

XML Source

—Can retrieve data from an XML document.

ODBC Source

—The ODBC Source enables you to connect to common data sources that don’t use OLE DB.

Destinations

Inside the Data Flow, destinations accept the data from the data sources and from the transformations. The flexible architecture can send the data to nearly any OLE DB–compliant data source or to a flat file. Like sources, destinations are managed through the connection manager. Some of the more common destinations in SSIS and available to you are as follows:

Excel Destination

—Outputs data from the Data Flow to an Excel spreadsheet that must already exist.

Flat File Destination

—Enables you to write data to a comma-delimited or fixed-width file.

OLE DB Destination

—Outputs data to an OLE DB data connection like SQL Server, Oracle, or Access.

SQL Server Destination

—The destination that you use to write data to SQL Server most efficiently. To use this, you must run the package from the destination.

Transformations

Transformations (or transforms) are a key component to the Data Flow that change the data to a format that you’d like. For example, you may want your data to be sorted and aggregated. Two transformations can accomplish this task for you. The nicest thing about transformations in SSIS is they are all done in-memory, and because of this they are extremely efficient. Memory handles data manipulation much faster than disk IO does, and you’ll find if disk paging occurs, your package that ran in 20 minutes will suddenly take hours. Here are some of the more common transforms you’ll use on a regular basis:

Aggregate

—Aggregates data from a transform or source similar to a GROUP BY statement in T-SQL.

Conditional Split

—Splits the data based on certain conditions being met. For example, if the State column is equal to Florida, send the data down a different path. This transform is similar to a CASE statement in T-SQL.

Data Conversion

—Converts a column’s data type to another data type. This transform is similar to a CAST statement in T-SQL.

Derived Column

—Performs an in-line update to the data or creates a new column from a formula. For example, you can use this to calculate a Profit column based on a Cost and SellPrice set of columns.

Fuzzy Grouping

—Performs data cleansing by finding rows that are likely duplicates.

Fuzzy Lookup

—Matches and standardizes data based on fuzzy logic. For example, this can transform the name Jon to John.

Lookup

—Performs a lookup on data to be used later in a transformation. For example, you can use this transformation to look up a city based on the ZIP code.

Multicast

—Sends a copy of the data to an additional path in the workflow and can be used to parallelize data. For example, you may want to send the same set of records to two tables.

OLE DB Command

—Executes an OLE DB command for each row in the Data Flow. Can be used to run an UPDATE or DELETE statement inside the Data Flow.

Row Count

—Stores the row count from the Data Flow into a variable for later use by, perhaps, an auditing solution.

Script Component

—Uses a script to transform the data. For example, you can use this to apply specialized business logic to your Data Flow.

Slowly Changing Dimension

—Coordinates the conditional insert or update of data in a slowly changing dimension during a data warehouse load.

Sort

—Sorts the data in the Data Flow by a given column and removes exact duplicates.

Union All

—Merges multiple data sets into a single data set.

Unpivot

—Unpivots the data from a non-normalized format to a relational format.

SSIS Capabilities Available in Editions of SQL Server 2012

The features in SSIS and SQL Server that are available to you vary widely based on what edition of SQL Server you’re using. As you can imagine, the higher-end edition of SQL Server you purchase, the more features are available. As for SSIS, you’ll have to use at least the Standard Edition to receive the bulk of the SSIS features. In the Express and Workgroup Editions, only the Import and Export Wizard is available to you. You’ll have to upgrade to the Enterprise or Developer Editions to see some features in SSIS. The advanced transformations available only with the Enterprise Edition are as follows:

Data Mining Query Transformation

Fuzzy Lookup and Fuzzy Grouping Transformations

Term Extraction and Term Lookup Transformations

Data Mining Model Training Destination

Dimension Processing Destination

Partition Processing Destination

Change Data Capture components

Higher speed data connectivity components such as connectivity to SAP or Oracle

Summary

This introduction exposed you to the SQL Server Integration Services (SSIS) architecture and some of the different elements you’ll be dealing with in SSIS. Tasks are individual units of work that are chained together with precedence constraints. Packages are executable programs in SSIS that are a collection of tasks. Finally, transformations are the Data Flow items that change the data to the form you request, such as sorting the data the way you want. Now that the overview is out of the way, it’s time to start the first section and your first set of lessons, and time for you to get your hands on SSIS.

As mentioned earlier, the print book comes with an accompanying DVD containing hours of instructional supporting video. At the end of each lesson in the book, you will find a box like this one pointing you to a video on the DVD that accompanies that lesson. In that video, one of us will walk you through the content and examples contained in that lesson. So, if seeing something done and hearing it explained helps you understand a subject better than just reading about it does, this text and video combination provides exactly what you need. There’s even an Introduction to SSIS video that you can watch to get started. Simply select the Intro to SSIS lesson on the DVD. You can also view the instructional videos online at www.wrox.com/go/ssis2012video.

Section 1

Installation and Getting Started

Lesson 1: Moving Data with the Import and Export Wizard

Lesson 2: Installing SQL Server Integration Services

Lesson 3: Installing the Sample Databases

Lesson 4: Creating a Solution and Project

Lesson 5: Exploring SQL Server Data Tools

Lesson 6: Creating Your First Package

Lesson 7: Upgrading Packages to SQL Server 2012

Lesson 8: Upgrading to the Project Deployment Model

Section 1

Installation and Getting Started

Lesson 1: Moving Data with the Import and Export Wizard

Lesson 2: Installing SQL Server Integration Services

Lesson 3: Installing the Sample Databases

Lesson 4: Creating a Solution and Project

Lesson 5: Exploring SQL Server Data Tools

Lesson 6: Creating Your First Package

Lesson 7: Upgrading Packages to SQL Server 2012

Lesson 8: Upgrading to the Project Deployment Model

Chapter 1

Moving Data with the Import and Export Wizard

The Import and Export Wizard is the easiest method to move data from sources like Excel, Oracle, DB2, SQL Server, and text files to nearly any destination. This wizard uses SSIS as a framework and can optionally save a package as its output prior to executing. The package it produces will not be the most elegant, but it can take a lot of the grunt work out of package development and provide the building blocks that are necessary for you to build the remainder of the package. Oftentimes as an SSIS developer, you’ll want to relegate the grunt work and heavy lifting to the wizard and do the more complex coding yourself. The wizard does no transformations or cleansing, but instead only moves data from point A to point B.

As with most SQL Server wizards, you have numerous ways to open the tool:

To open the Import and Export Wizard, right-click the database you want to import data from or export data to SQL Server Management Studio and select Tasks > Import Data (or Export Data based on what task you’re performing).

You can also open the wizard by right-clicking SSIS Packages in SQL Server Data Tools (SSDT) and selecting SSIS Import and Export Wizard.

Another common way to open it is from the Start menu under SQL Server 2012 by choosing Import and Export Data.

The last way to open the wizard is by typing

dtswizard.exe

at the command line or Run prompt.

Regardless of whether you need to import or export the data, the first few screens in the wizard look very similar.

Once the wizard comes up, you see the typical Microsoft wizard welcome screen. Click Next to begin specifying the source connection. If you opened the wizard from Management Studio by selecting Export Data, this screen is prepopulated. In this screen, you specify where your data is coming from in the Source drop-down box. Once you select the source, the rest of the options on the dialog box may vary based on the type of connection. The default source is SQL Native Client, and it looks like Figure 1-1. You have OLE DB Sources like SQL Server, Oracle, and Access available out of the box. You can also use text files and Excel files. After selecting the source, you have to fill in the provider-specific information.

Figure 1-1

For SQL Server, you must enter the server name (localhost means go to your local machine’s SQL Server instance, if applicable) and the username and password you want to use. If you’re going to connect with your Windows account, simply select Use Windows Authentication. Windows Authentication will pass your Windows local or domain credentials into the data source. Lastly, choose a database that you’d like to connect to. For most of the examples in this book, you use the AdventureWorks2012 database. You can see Lesson 3 of this book for more information on installing this sample database.

NOTE You can find the sample databases used for this book at the Wrox website at www.wrox.com/go/SQLSever2012DataSets.
NOTE Additional sources such as Sybase and DB2 can also become available if you install the vendors’ OLE DB or ODBC providers. You can download additional providers for free if you’re using Enterprise Edition by going to the SQL Server 2012 Feature Pack on the Microsoft website. You also have ODBC and ADO.NET providers available to you in SQL Server 2012.

After you click Next, you are taken to the next screen in the wizard, where you specify the destination for your data. The properties for this screen are exactly identical to those for the previous screen with the exception of the database. On the next screen, if you select the Copy data from one or more tables or views option, you can simply check the tables you want. If you select the Write a query to specify the data to transfer option, you can write an ad hoc query (after clicking Next) addressing where to select the data from or what stored procedure to use to retrieve your data.

The next screen enables you to select the table or tables you want to copy over and which table names you want them to be transferred to. If you want, you can click the Edit button to go to the Column Mappings dialog box (shown in Figure 1-2) for each table. Here you can change the mapping between each source and destination column. For example, if you want the DepartmentID column to go to the DepartmentID2 column on the destination, simply select the Destination drop-down box for the DepartmentID column and point it to the new column, or choose <ignore > to ignore the column altogether. By checking the Enabled identity insert box, you allow the wizard to insert into a column that has an identity (or autonumber) value assigned. If the data types don’t match between the source and destination columns, the wizard will add the necessary components to convert the data to a proper data type if possible.

Figure 1-2

The next screen enables you to save the package or just choose to run it immediately. You can uncheck Execute Immediately to just save the package for later modification and execution. You can open the package that executed in SQL Server Data Tools (SSDT) if you’d like. You do this by creating a project in SSDT and adding the package to the project. You cannot edit the package without an SSDT project to contain the package. We discuss how to create a project in Lesson 4 later in this book. The final screen executes the process and shows you the output log.

Try It

In this Try It, you learn how to quickly load a flat file into a database using the Import and Export Wizard. After this lesson, you’ll have a clear understanding of how the Import and Export Wizard is the easiest way to load data into almost any destination and how it is accessed from Management Studio or SSDT.

You can find the file associated with Lesson 1 on the companion website for this book at www.wrox.com.

Lesson Requirements

Load the ZipCodeExtract.csv file (which you can download at this book’s website at www.wrox.com) into any database of your choosing. We are using the AdventureWorks2012 database as our target, but that’s not a dependency. Note: The file’s first row holds the column names.

Hints

One of the fastest ways to access the Import and Export Wizard to load the data is through Management Studio. Right-click the target database and select Tasks > Import Data.

Step-by-Step

1. Open SQL Server Management Studio in the SQL Server 2012 program group.
2. Right-click the target database of your choosing (like AdventureWorks2012) and select Tasks > Import Data.
3. For the Data source, select Flat File Source, as shown in Figure 1-3. For the File name property, select the ZipCodeExtract.csv file that you can download from this book’s website at www.wrox.com. Check the Column names in the first data row option to read the column names from the first row of data from the flat file. Click the Columns page in the left pane to confirm that the file is delimited by commas.
4. Click Next to configure the destination. Point to any server and database you want.
5. On the Select Source Tables and Views screen, click Edit Mappings to go to the Column Mappings page. Change the StateAbbr to a size of 2 and the Population column to an int data type, as shown in Figure 1-4. Normally, you would evaluate each column to use the proper data length in an effort to save space.

Figure 1-3

Figure 1-4

6. Click OK to leave the Column Mappings page and then click Next to review any data type mapping warnings. The Data Mapping Warnings screen shows you where you have any columns for which the data types don’t match. You can ignore those warnings for the time being and click Next a few times to execute the package. If you are successful, you should see a total of 29,470 rows. You will see a truncation warning, which is a warning that you receive when you try to insert a 50-character string into a smaller sized column like a varchar(2), that you can also ignore.
Please select Lesson 1 on the DVD, or online at www.wrox.com/go/ssis2012video, to view the video that accompanies this lesson.

Chapter 2

Installing SQL Server Integration Services

This book requires that you have SQL Server Data Tools (SSDT) and the SQL Server Integration Services (SSIS) service installed. To develop SSIS, you cannot use SQL Express. The SSIS run time to run packages does ship with all editions, but on some of the lower editions that run time may not work with all SSIS components.

On the subject of editions of SQL Server, you have a decision to make as to which edition you want to install: Standard Edition, Business Intelligence (BI), or Enterprise Edition. Developer Edition is also available. It contains all the components of Enterprise Edition at a tiny fraction of the cost but is licensed for development only. Enterprise Edition gives you a few additional SSIS components that you may be interested in for SQL Server 2012:

Data Mining components

Fuzzy Lookup and Group transforms

Dimension and Partition Processing destinations

Term Extraction and Lookup transforms

Higher performance components for ODBC, Oracle, and SAP

Change Data Capture components

Additionally, the Enterprise Edition of SQL Server gives you database engine features that complement or may affect SSIS. One such feature is the Change Data Capture (CDC) feature, which enables you to easily synchronize two systems by querying SQL Server 2012 for only the changes that have occurred after a given date or time. Data compression is another key feature that may speed up your database reads and reduce your disk cost by 60–75 percent.

Oftentimes, if you care about the Enterprise Edition features enough, but don’t need Enterprise Edition for the database engine, you might decide to license an SSIS server with just the minimum number of client access licenses (CALs) instead of doing a per-core license. This approach reduces your SQL Server licensing cost sizably, but you now have new hardware cost to add.

When you’re installing SQL Server, you need to ensure that the SQL Server Data Tools, Integration Services, and Management Tools - Complete check boxes are selected in the Feature Selection screen (shown in Figure 2-1). The Integration Services option installs the run time and service necessary to run the packages, and is likely all you would need on a production server. The SQL Server Data Tools option installs the designer components, and the Management Tools option installs the DBA tools necessary to manage the packages later.

Figure 2-1

After you complete the Feature Selection screen, SQL Server installs all the necessary components without any wizard configuration required for SSIS. Once the installation is complete, open the configuration file located at C:\Program Files\Microsoft SQL Server\110\DTS\Binn\MsDtsSrvr.ini.xml. This file configures the SSIS service. Change the <ServerName> node where it currently says “.” to your SQL Server’s instance name where you want to store your packages. You can also change the directory from ..\Packages to the directory of your choice.

<

?xml

version=”1.0” encoding=”utf-8”?>

<DtsServiceConfiguration xmlns

:xsd

=”http://www.w3.org/2001/XMLSchema”

xmlns:xsi

=”http://www.w3.org/2001/XMLSchema-instance”>

<StopExecutingPackagesOnShutdown>true</StopExecutingPackagesOnShutdown>

<TopLevelFolders>

<Folder xsi

:type

=”SqlServerFolder”>

<Name>MSDB</Name>

<ServerName>

.<

/ServerName>

</Folder>

<Folder xsi

:type

=”FileSystemFolder”>

<Name>File System</Name>

<StorePath>

..

\Packages</StorePath>

</Folder>

</TopLevelFolders>

</DtsServiceConfiguration>

Once you modify this file, you need to restart the SSIS service from the SQL Server Configuration Manager under the SQL Server 2012 node in the Start menu or the Services applet.

Please select Lesson 2 on the DVD, or online at www.wrox.com/go/ssis2012video, to view the video that accompanies this lesson.

Chapter 3

Installing the Sample Databases

You will need two sample databases for the future tutorials in this book and for many tutorials on the web; both are provided by Microsoft. The AdventureWorks2012 database is an example database that simulates a bike retailer. It contains HR, accounting, and sales data for online transactions and store sales. The AdventureWorksDW2012 database is an example data warehouse for the same bike reseller.

The two sample databases are not installed by default with SQL Server 2012. You can download the sample databases used for this book at the Wrox website at www.wrox.com/go/SQLSever2012DataSets.

To use the AdventureWorks2012 database, you must enable the Full Text Search feature and enable the FileStream feature in SQL Server 2012. In addition, the SQL Server Full Text service must be running. You can still install the AdventureWorksDW2012 database without these, but not the AdventureWorks2012 database.

To install the Full Text Search feature, you must go back to the SQL Server Installation Center under SQL Server 2012 > Configuration Tools > SQL Server Installation Center. Walk through the installation wizard again as if you were doing a new installation, but when you get to the Feature Selection screen, ensure Full Text Search and Semantic Extractions for Search are selected.

The FileStream feature enables you to store files quickly and easily on the filesystem of the server, but they are treated like columns in a table. When you back up the database, it also backs up all files to which the table may refer. The feature is initially enabled in the installation wizard, but you can also enable it after the installation in the SQL Server Configuration Manager under SQL Server 2012 > Configuration Tools. Once the Configuration Manager is open, double-click the SQL Server database instance on which you want to enable the feature. This opens up the properties of the service, where you can go to the FILESTREAM tab to enable the feature, as shown in Figure 3-1.

Figure 3-1

Enabling FileStream requires that you restart the SQL Server instance.

To install the sample databases, download the AdventureWorks2012 and AdventureWorksDW2012 (data warehouse) MDF data files from the Wrox website and attach them to your server using the CREATE DATABASE command, as shown in the following code:

Try It

In this Try It, you download and install the necessary example databases to work through the rest of the lessons in this book.

Lesson Requirements

To do the examples in the book, you’ll need at least 300 MB of hard drive space and the SQL Server 2012 database engine installed.

Hints

Navigate to

www.wrox.com/go/SQLSever2012DataSets

to download the sample databases and make sure the Full Text service is installed and running prior to the installation.

Step-by-Step

WARNING Prior to installation, open the SQL Server Configuration Manager to start the SQL Full-text Filter Daemon Launcher for your instance. Failure to do this will cause the installation to fail.
4. The sample databases are now installed and ready to use in Management Studio and for the rest of the examples.
Please select Lesson 3 on the DVD, or online at www.wrox.com/go/ssis2012video, to view the video that accompanies this lesson.

Chapter 4

Creating a Solution and Project

You cannot create an SSIS package in SQL Server Data Tools (SSDT) without first having a solution and project. Additionally, for execution of the package in debug mode, which you use when troubleshooting, your package must be in a project and solution. Projects and solutions are containers for your packages that help you keep every component together and make you a more efficient SSIS developer.

SSDT is the program in which you’re going to develop your SSIS packages. In SQL Server 2012, SSDT is a Visual Studio 2010 shell. You can either open SSDT by itself under the SQL Server 2012 program group or open it by opening the full Visual Studio 2012 program.

An SSIS project is a container of one or more packages and other SSIS components. All the Visual Studio suite of products use the project construct to hold their files. For example, Reporting Services uses projects to hold its reports, and VB.NET uses projects to hold its VB.NET class files. In general, you want to align an SSIS project with a business project you’re working on. For example, you may have an SSIS project called “Data warehouse ETL.”

Projects mean much more in SQL Server 2012 than they did in SQL Server 2005 and 2008. This is because you now deploy projects, not packages, to production if your project is using the project deployment model. If you want to use the legacy deployment model where you deploy a package at a time, you will use the package deployment model. The new project deployment model is where many of the new SQL Server 2012 features that are discussed later in this book are used.

A solution is a container of one or more projects. Solutions enable many disparate types of projects to live under one container. For example, you may have a solution called “Enterprise Data Warehouse” with a SQL Server Reporting Services (SSRS) project called “Data warehouse reports,” another project for SSIS called “Data warehouse ETL,” and a final one for C# called “SharePoint code.” All of those projects could live under one roof, so if a report developer makes a change in his SSRS project, the SSIS developer is aware of that change.

When you create a project in SSDT, a solution is automatically created at the same time. To create a project, you can open SSDT and select File > New > Project. As you can see in Figure 4-1, the solution name is “Enterprise Data Warehouse” and its project is called “Datawarehouse Load.”

Figure 4-1

At first, the solution will not appear in your Solution Explorer because you have only a single project. Once you add a second project, it will appear. You can add subsequent projects into the same solution by going back to File > New > Project and selecting Add to Solution from the Solution drop-down box (which is shown in Figure 4-3 in the “Step-by-Step” later in this chapter). When you create your first project, you’ll notice in the Solution Explorer, which shows you all the projects and files, that there appears to be no solution. This is because solutions are hidden from you when you have only a single project in the solution. You can choose to always see the solution file in the Solution Explorer by going to Tools > Options and checking the Always show solution option in the Projects and Solutions page (shown in Figure 4-2).

Try It

In this Try It, you learn how to create your first solution and project, which you’ll be using throughout the rest of the book.

You can download examples of completed package, project, and solution files for this lesson from the book’s website at www.wrox.com.

Figure 4-2

Lesson Requirements

To successfully complete this lesson, you need to create a solution called Personal Trainer Solution and a project called Personal Trainer SSIS Project that will be used throughout this book.

Hints

To create the project, open SQL Server Data Tools and select File > New > Project.

Step-by-Step

1. Open SSDT from the SQL Server 2012 program group.
2. Click File > New > Project.
3. Select Business Intelligence Projects for the project type.
4. Select Integration Services Project for the template.
5. Type Personal Trainer SSIS Project for the Name property, as shown in Figure 4-3.
6. Type C:\projects\ for the Location property.
7. Type Personal Trainer Solution for the Solution Name property.

Figure 4-3

Please select Lesson 4 on the DVD, or online at www.wrox.com/go/ssis2012video, to view the video that accompanies this lesson.

Chapter 5

Exploring SQL Server Data Tools

SQL Server Data Tools (SSDT) is a Visual Studio 2010 tool that helps you create, debug, and execute SSIS packages. When you’re a business intelligence developer, it can also help you create reports in SQL Server Reporting Services (SSRS) or design cubes in SQL Server Analysis Services (SSAS). You’ll be using SSDT extensively throughout this book, so it’s important that in this lesson, you learn everything you need to know to make your life easier in this critical environment.

NOTE Because this is a more exploratory, introductory lesson, it doesn’t have a task-based tutorial as the other lessons have.

You can open SSDT through the SQL Server 2012 program group. Depending on your PC, SSDT may take some time to open.

NOTE One hint that you can use to reduce your load time is to eliminate the splash screen. To eliminate the SSDT splash screen and reduce your load time by a few seconds each time, right-click the SSDT shortcut and select Properties. Next, add the -NOSPLASH switch at the end of the shortcut as shown here:
“C:\Program Files (x86)\Microsoft Visual Studio 10.0\Common7\IDE\ devenv.exe” -NOSPLASH

The Solution Explorer

Once you create your project from Lesson 4, you’re ready to begin exploration of the environment. The most important pane, the Solution Explorer, is on the right. The Solution Explorer is where you can find all of your created SQL Server Integration Services (SSIS) packages, shared connection managers, and parameters. As discussed in Lesson 4, a solution is a container that holds a series of projects. Each project holds a myriad of objects for whatever type of project you’re working in. For SSIS, it holds your packages, shared parameters, and shared connections (the latter two are available only in the project deployment model, which is discussed in this lesson shortly in the “Deployment Models” section). Once you create a solution, you can store many projects inside of it. For example, you might have a solution that has your VB.NET application and all the SSIS packages that support that application. In this case, you would probably have two projects: one for VB and another for SSIS.

Figure 5-1

After creating a new project, your Solution Explorer window contains a series of empty folders and a single package in the Packages folder. Figure 5-1 shows you a partially filled Solution Explorer. In this screenshot, you see a solution named Enterprise Data Warehouse with one project, an Integration Services project called Datawarehouse Load. Inside the project, you’ll find the single default package, Package.dtsx.

If you don’t see the solution name in your Solution Explorer, it’s because solutions are hidden when you have only a single project. In this scenario, the solution won’t appear by default. To always show the solution, you can select Tools > Options to open the Visual Studio options pane. Under Projects and Solutions, check Always show solution, as shown in Figure 5-2.

Figure 5-2

If you look into the directory that contains your solution and project files, you can see all the files that are represented in the Solution Explorer window. Some of the base files you might see will have the following extensions:

.dtsx

—An SSIS package

.ds

—A shared data source file

.sln

—A solution file that contains one or more projects

.dtproj

—An SSIS project file

.params

—A shared parameter file

.conmgr

—A shared connection manager

If you copy any file that does not match the .params, .conmgr, or .dtsx extension, it is placed in the Miscellaneous folder. This folder is used to hold any files such as Word documents that describe the installation of the package or requirements documents. Anything you’d like can go into that folder, and it can all potentially be checked into a source control system like SourceSafe with the code.

Deployment Models

In SQL Server 2012, you have two models for developing and deploying packages: package and project deployment models.

The

package deployment model

used to be the only deployment model that existed in SQL Server 2005 and 2008 and was where you could deploy only a package at a time to the server. It also had ways of configuring the packages to change properties like connections with XML files or tables.

With the new

project deployment model

, you can only deploy the entire project of packages, and packages can be configured by the database administrator (DBA) through parameters.

You can switch back and forth between these models, but the new project deployment model is much more robust with features. You can switch back and forth between the models by right-clicking the project in the Solution Explorer and selecting Convert to Project (or Package) Deployment Model. You learn much more about this functionality in Lessons 52 and 53.

The Properties Window

Figure 5-3

The Properties window (shown in Figure 5-3) is where you can customize almost any item that you have selected. For example, if you select a task in the design pane, you receive a list of properties to configure, such as the task’s name and what query it’s going to use. The view varies widely based on what item you have selected. Figure 5-3 shows the properties of the Execute SQL Task. You can also click the white background of the Control Flow tab to see the package properties in the Properties window. Sometimes, you can see some more advanced properties in the Properties pane than what the task’s editor user interface provides you.

The Toolbox

Figure 5-4

The Toolbox contains all the items that you can use in the particular tab’s design pane at any given point in time. For example, the Control Flow tab has a list of tasks and containers (a partial list is shown in Figure 5-4). This list may grow based on what custom tasks are installed. The list is completely different when you’re in a different tab, such as the Data Flow tab. Many of the core tasks you see in Figure 5-4 are covered in Section 2 of this book in much more detail.

The Toolbox is organized into sections such as Common, Containers, and Other Tasks. These tabs can be collapsed and expanded for usability. As you use the Toolbox, you may want to customize your view by moving items to your favorites by right-clicking a given task or container and selecting Add to Favorites. Also, after you install a custom component, it automatically shows up in your Toolbox. When you select a component like a task from the Toolbox, notice that below the Toolbox pane, an interactive help section appears that enables you to see samples and a short description of the component.

NOTE At some point, you may accidentally close a window like the Properties window. If this happens to you, you can bring that window back through the View menu. You can also click the pushpin on any particular window to hide the window because real estate is at a premium when you begin development of SSIS.

The SSDT Design Environment

The SSDT environment contains two key tabs for designing packages: the Control Flow and Data Flow tabs. Each of these handles different parts of your packages. The Control Flow tab controls the execution of the package and the Data Flow tab handles the movement of data.

The Control Flow tab orchestrates the execution of your package, dictating that one task, such as an FTP Task, should execute ahead of another; for example, an Execute SQL Task. Inside the tab are tasks and containers you can drag over from the Toolbox onto the design pane. Each of those tasks has its own user interface that you can use to configure the task, and you can access it by double-clicking the component.

Each package has only a single Control Flow, but can have many Data Flows. The user interface for the Data Flow task is quite different. Its user interface is the Data Flow tab. In the Data Flow tab, you can configure one or more Data Flow tasks by dragging over sources, transforms, and destinations onto the design pane. Each Control Flow can have any number of Data Flow tasks, each of which results in a new item in the Data Flow tab’s drop-down list of tasks. The Data Flow is essentially where you’re going to configure the movement of your data from nearly any source to nearly any destination.

When you execute a package by right-clicking it in the Solution Explorer and selecting Execute Package, you enter debug mode. Notice a new tab called Progress immediately opens. The Progress tab is where you go to debug when a package has a problem. You can also go to the Output window below to see a textual view of the same Progress tab. Once you stop debug mode by clicking the Stop button or by going to Debug > Stop Debugging, the Progress tab changes to an Execution Results tab, which shows you the last run of a package. Each of those tabs shows you more than the Output window at the bottom, which shows you only critical issues.

One other handy thing you can do from within SSDT is open Server Explorer. Server Explorer enables you to create a connection to a SQL Server database that you can manage just as you would in Management Studio. You can do this by selecting Tools > Connect to Database. Type in the credentials for the database, and then you’re ready to run queries against the database, create stored procedures, or redesign tables, to name just a few things you can do.

Now that you’ve taken a look at the SSDT environment, Lesson 6 covers using the environment to create your first package.

Please select Lesson 5 on the DVD, or online at www.wrox.com/go/ssis2012video, to view the video that accompanies this lesson.

Chapter 6

Creating Your First Package

Creating packages in SQL Server Integration Services (SSIS) is a bit like LEGO-block programming. You drag various tasks over, configure the tasks, chain them together, and then voila, execute. Well, it’s not quite that easy, but you’ll find it much easier than writing any program. In this lesson, you learn how to create your first SSIS package. Granted, the package does very little here, but it shows you many of the concepts that will be critical throughout the rest of the book. Many of the concepts may not make complete sense yet when it comes to configuring various components, but no worries—the concepts are deeply covered throughout the rest of the book.

To create your first package, you need an SSIS project. Creating a project is covered extensively in Lesson 4. After you create your first project, a package called Package.dtsx is automatically created. If you want to rename this package, simply right-click the package in Solution Explorer and select Rename, leaving the .dtsx extension.

To create a new package, you can also right-click SSIS Packages in the Solution Explorer and select New Package. This action creates a new package that you will want to rename as soon as it’s created because it, too, will be called Package.dtsx or some variation of it. The final result will resemble Figure 6-1, which shows a partially complete SSIS project.

Figure 6-1

Creating and Using Connection Managers

To design a package, you want to first create connections, which are called connection managers in SSIS. A connection manager is a connection that can be leveraged and consumed once or many times in a package. To create a connection manager, right-click in the Connection Manager pane at the bottom of the screen in SSDT and select New <type of connection>. Any connection that you would use in SSIS, whether to a file or a database, will be stored as a connection manager here. Some of those common items would include the connections in the following table.

Type of Connection

Connection Manager

Database

OLE DB Connection Manager for Oracle, SQL Server, DB2.ADO.NET and ODBC Connection Manager for ODBC types of connections and in some cases OLE DB Data Sources.

File

Flat File Connection Manager when you want to load the file using a Data Flow Task. There is also an additional connection manager called the File Connection Manager that you can use if all you want to do is rename, delete, or perform some other type of file operation.

Excel

Excel Connection Manager.

Internet Connection

SMTP Connection Manager for mail servers. FTP Connection Manager for FTP servers. HTTP Connection Manager for websites or web services.

You can access some of the connections by right-clicking in the Connection Manager pane and selecting New Connection. This brings up a list of all the available connection managers (shown in Figure 6-2), including third-party ones that you have installed. The handy thing about connection managers is that they’re externally available to a DBA at run time. In other words, when a DBA goes to schedule this package, he or she can point the connection to a new database or file on-the-fly for that one job.

Figure 6-2

Once you create an OLE DB connection, it is available to you anywhere in the package from any component that can use the connection. If you’d like, you can create a connection that can be leveraged from multiple packages by creating a project connection manager. To do this, right-click Connection Managers in the Solution Explorer and select New Connection Manager. These data sources can be leveraged from multiple packages in the project and can be changed by the DBA later. By creating a connection here, you type the password one time for your connection, and if you ever change any type of connection information, it changes across any package using that connection. No negative consequences result from doing this, so generally speaking, it’s a great design-time practice to use project connection managers if you see yourself using the connection a few times.

NOTE At any time, you can convert a regular connection manager to a project connection manager by right-clicking the connection in the Connection Manager pane.

Using and Configuring Tasks

Your package would be nothing without tasks. Tasks in the Control Flow tab orchestrate the work that you want to do in the package. For example, one task may copy a file over from a different server while another task may load the file into a database. To use a task, simply drag it onto the design pane in the Control Flow tab from the Toolbox. A common task that you’ll use in this book is the Script Task because it requires no configuration, which makes it a great training tool.

Figure 6-3

Until most tasks are configured by double-clicking the task, you may see a yellow warning or red error indicator on the task. After you configure the task, you can link it to other tasks by using precedence constraints. Once you click the task, you’ll notice a green arrow (the precedence constraint) pointing down from the task, as shown in Figure 6-3. This precedence constraint controls the execution order of the various tasks in your package, and you can use it by dragging the green arrow to the next task that you want to chain together. You read more about most of the core tasks and the topic of precedence constraints in Section 2 of this book.

You should never keep the default name of your tasks. Instead, you should rename them to something that you can recognize in the log later. We prefer to name all of our tasks with some two- or three-digit qualifier, such as SCR for a Script Task, and then the purpose of the task such as SCR - Encrypt File. This name then shows up in your logs when a problem occurs and can also help you self-document your package.

Exploring Package Encryption

A package is essentially an XML file behind the scenes. To prove this, you can right-click any package and select View Code to see the package’s XML. As you can imagine, though, storing secure information inside an XML file could create some security problems. Luckily, Microsoft already thought of that problem and has a solution—encrypting your packages.

Microsoft encrypts your package by default with your Windows user key, which is a key that protects your Windows user credentials on your PC. You can look at the property that encrypts your package by going to the Properties pane and looking at the ProtectionLevel package-level property. This property is set to EncryptSensitiveWithUserKey by default, which means that all the usernames, passwords, or any other sensitive data are locked down with your credentials. If you were to pass the package to another user, the package’s encrypted data would not be visible, and the user would have to retype the secure information, such as the login information.

Another option is to change the property to EncryptSensitiveWithPassword, which locks down the package with a password instead. You can also use EncryptAllWithPassword (or UserKey). This property value locks down the entire package to where no one can open it without a proper password.

WARNING This property is usually one of the top reasons why packages fail in production. For example, if your package has sensitive information inside of it to connect to a database, the package would potentially fail when you ran the job because it was running under the SQL Server Agent’s (SQL Server’s scheduler) service account. You can also avoid this problem by setting the property to EncryptAllWithPassword and simply pass in the password when running the package or scheduling it.

Executing Packages