Knight's 24-Hour Trainer - Brian Knight - E-Book

Knight's 24-Hour Trainer E-Book

Brian Knight

0,0
34,99 €

-100%
Sammeln Sie Punkte in unserem Gutscheinprogramm und kaufen Sie E-Books und Hörbücher mit bis zu 100% Rabatt.
Mehr erfahren.
Beschreibung

A unique book-and-video package for learning Microsoft SQLServer Integration Services If you need a practical, hands-on introduction to Microsoft SQLServer 2008 Integration Services (SSIS), this book and videopackage from foremost SSIS authority Brian Knight gets youthoroughly up to speed. Each lesson in the book includes threemajor components: a thorough description of how each SSIS featureor process works; a tutorial that walks you through the process ortechnique; and a video lesson that demonstrates it. If you take full advantage of this comprehensive package, youwill gain enough experience to tackle your first SSIS project withconfidence. * SQL Server Integration Services 2008 (SSIS) builds on therevolutionary database product suite first introduced by Microsoftin 2005 and is a powerful tool for performing extraction,transformation, and load (ETL) operations essential to datawarehousing * If you're just learning SSIS, the step-by-step tutorials inthis book and DVD training package will ready you to tackle yourown first project * Every lesson in the book is supplemented by instructional videoon the DVD Note:As a part of this title, video lessons are included onDVD. For the e-book versions, video lessons can be accessed atwrox.com using a link provided in the interior of thee-book.

Sie lesen das E-Book in den Legimi-Apps auf:

Android
iOS
von Legimi
zertifizierten E-Readern

Seitenzahl: 403

Veröffentlichungsjahr: 2011

Bewertungen
0,0
0
0
0
0
0
Mehr Informationen
Mehr Informationen
Legimi prüft nicht, ob Rezensionen von Nutzern stammen, die den betreffenden Titel tatsächlich gekauft oder gelesen/gehört haben. Wir entfernen aber gefälschte Rezensionen.



Contents

About the Authors

Acknowledgments

Preface

Welcome to SSIS

Import and Export Wizard

The Business Intelligence Development Studio

Architecture

Editions of SQL Server 2008

Summary

Section I: Installation and Getting Started

Chapter 1: Moving Data with the Import and Export Wizard

Try It

Chapter 2: Installing Sql Server Integration Services

Chapter 3: Installing the Sample Databases

Try It

Chapter 4: Creating a Solution and Project

Try It

Chapter 5: Exploring Business Intelligence Development Studio

The Solution Explorer

The Properties Window

The Toolbox

The BIDS Design Environment

Chapter 6: Creating Your First Package

Creating and Using Connection Managers

Using and Configuring Tasks

Exploring Package Encryption

Executing Packages

Try It

Section II: Tasks

Chapter 7: Connecting Control Flow Tasks with Precedence Constraints

Try It

Chapter 8: Manipulating Files with the File System Task

Try It

Chapter 9: Coding Custom Script Tasks

Try It

Chapter 10: Using the Execute SQL Task

Try It

Chapter 11: Using the Send Mail Task

Try It

Chapter 12: Using the FTP Task

Try It

Chapter 13: Creating a Data Flow

Try It

Section III: Data Flow

Chapter 14: Extracting Data from Sources

OLE DB Source

Excel Source

Flat File Source

Chapter 15: Loading Data to a Destination

OLE DB Destination

Flat File Destination

Excel Destination

Chapter 16: Changing Data Types with the Data Conversion Transform

Try It

Chapter 17: Creating and Replacing Columns with the Derived Column Transform

Try It

Chapter 18: Rolling Up Data with The Aggregate Transform

Try It

Chapter 19: Ordering Data with the Sort Transform

Try It

Chapter 20: Joining Data with the Lookup Transform

Cache Modes

The Cache Connection Manager and Transform

Chapter 21: Auditing Data with the Row Count Transform

Try It

Chapter 22: Combining Multiple Inputs with the Union All Transform

Try It

Chapter 23: Cleansing Data with the Script Component

Try It

Chapter 24: Separating Data with the Conditional Split Transform

Try It

Chapter 25: Altering Rows with the OLE DB Command Transform

Try It

Chapter 26: Handling Bad Data with the Fuzzy Lookup Transform

Try It

Chapter 27: Removing Duplicates with the Fuzzy Grouping Transform

Try It

Section IV: Variables and Expressions

Chapter 28: Making a Package Dynamic with Variables

Try It

Chapter 29: Making a Connection Dynamic with Expressions

Try It

Chapter 30: Making a Task Dynamic with Expressions

Try It

Section V: Containers

Chapter 31: Using Sequence Containers to Organize a Package

Try It

Chapter 32: Using For Loop Containers to Repeat Control Flow Tasks

Try It

Chapter 33: Using the Foreach Loop Container to Loop Through a Collection of Objects

Try It

Section VI: Configuration

Chapter 34: Easing Deployment with Configuration Tables

Try It

Chapter 35: Easing Deployment with Configuration Files

Try It

Chapter 36: Configuring Child Packages

Configuring an Execute Package Task

Configuring a Child Package

Try It

Section VII: Troubleshooting and Logging

Chapter 37: Logging Package Data

Try It

Chapter 38: Using Event Handlers

Creating Event Handlers

Common Uses for Event Handlers

Try It

Chapter 39: Troubleshooting Errors

Working in the Progress Tab

Troubleshooting Steps

Try It

Chapter 40 Using Data Viewers

Try It

Chapter 41: Using Breakpoints

Try It

Section VIII: Deployment and Administration

Chapter 42: Deploying Packages

Deploying Packages Using a Deployment Utility

Other Methods of Deploying Packages

Try It

Chapter 43: Configuring the SSIS Service

Try It

Chapter 44: Securing SSIS Packages

Try It

Chapter 45: Running SSIS Packages

Try It

Chapter 46: Scheduling Packages

Using Proxy Accounts

Try It

Section IX: Data Warehousing

Chapter 47: Loading a Dimension Table

Try It

Chapter 48: Loading a Fact Table

Try It

Section X: Wrap Up and Review

Chapter 49: Bringing It All Together

Lesson Requirements

Hints

Step-by-Step

Appendix A: SSIS Component Crib Notes

Appendix B: Problem and Solution Crib Notes

Appendix C: What’s on the DVD?

Wiley Publishing, Inc. End-User License Agreement

Index

Download CD/DVD content

Advertisement

Knight’s 24-Hour Trainer: Microsoft® SQL Server® 2008 Integration Services

Published by

Wiley Publishing, Inc.

10475 Crosspoint Boulevard

Indianapolis, IN 46256

www.wiley.com

Copyright © 2009 by Wiley Publishing, Inc., Indianapolis, Indiana

Published simultaneously in Canada

ISBN: 978-0-470-49692-3

No part of this publication may be reproduced, stored in a retrieval system or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning or otherwise, except as permitted under Sections 107 or 108 of the 1976 United States Copyright Act, without either the prior written permission of the Publisher, or authorization through payment of the appropriate per-copy fee to the Copyright Clearance Center, 222 Rosewood Drive, Danvers, MA 01923, (978) 750-8400, fax (978) 646-8600. Requests to the Publisher for permission should be addressed to the Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, (201) 748-6011, fax (201) 748-6008, or online at http://www.wiley.com/go/permissions.

Limit of Liability/Disclaimer of Warranty: The publisher and the author make no representations or warranties with respect to the accuracy or completeness of the contents of this work and specifically disclaim all warranties, including without limitation warranties of fitness for a particular purpose. No warranty may be created or extended by sales or promotional materials. The advice and strategies contained herein may not be suitable for every situation. This work is sold with the understanding that the publisher is not engaged in rendering legal, accounting, or other professional services. If professional assistance is required, the services of a competent professional person should be sought. Neither the publisher nor the author shall be liable for damages arising herefrom. The fact that an organization or Web site is referred to in this work as a citation and/or a potential source of further information does not mean that the author or the publisher endorses the information the organization or Web site may provide or recommendations it may make. Further, readers should be aware that Internet Web sites listed in this work may have changed or disappeared between when this work was written and when it is read.

For general information on our other products and services please contact our Customer Care Department within the United States at (877) 762-2974, outside the United States at (317) 572-3993 or fax (317) 572-4002.

Wiley publishes in a variety of print and electronic formats and by print-on-demand. Some material included with standard print versions of this book may not be included in e-books or in print-on-demand. If this book refers to media such as a CD or DVD that is not included in the version you purchased, you may download this material at http://booksupport.wiley.com. For more information about Wiley products, visit www.wiley.com.

Library of Congress Control Number: 2009928170

Trademarks: Wiley, the Wiley logo, Wrox, the Wrox logo, Wrox Programmer to Programmer, and related trade dress are trademarks or registered trademarks of John Wiley & Sons, Inc. and/or its affiliates, in the United States and other countries, and may not be used without written permission. Microsoft and SQL Server are registered trademarks of Microsoft Corporation in the United States and/or other countries. All other trademarks are the property of their respective owners. Wiley Publishing, Inc., is not associated with any product or vendor mentioned in this book.

To my always supporting wife and children. Thank you for making me who I am today.

—Brian Knight

James 1:17

—Devin Knight

I dedicate this book to my wife Jessica; without her support and love I would never have accomplished so much. To my son, daughter, and wife for being understanding when I work late writing instead of playing Rock Band with them. To my mother who raised me to be the man I am today.

—Mike Davis

Credits

Executive Editor

Robert Elliott

Senior Project Editor

Kevin Kent

Technical Editors

Bob Bojanic

Dustin Ryan

Carla Sabotta

Stefan Steiner

Production Editor

Kathleen Wisor

Copy Editor

Kim Cofer

Editorial Director

Robyn B. Siesky

Editorial Manager

Mary Beth Wakefield

Production Manager

Tim Tate

Vice President and Executive Group Publisher

Richard Swadley

Vice President and Executive Publisher

Barry Pruett

Associate Publisher

Jim Minatel

Project Coordinator, Cover

Lynsey Stanford

Proofreader

Dr. Nate Pritts, Word One

Indexer

Robert Swanson

About the Authors

Brian Knight, SQL Server MVP, MCITP, MCSE, MCDBA, is the owner and founder of Pragmatic Works. He is also the co-founder of SQLServerCentral.com and JumpstartTV.com. He runs the local SQL Server users group in Jacksonville (JSSUG). Brian is a contributing columnist at several technical magazines and does regular webcasts at Jumpstart TV. He is the author of ten SQL Server books. Brian has spoken at conferences like PASS, SQL Connections, and TechEd and many Code Camps. His blog can be found at http://www.pragmaticworks.com. Brian watches his 401k wither away in Jacksonville, Florida.

Devin Knight is a BI consultant at Pragmatic Works. Previously, he has tech edited the book Professional Microsoft SQL Server 2008 Integration Services (John Wiley & Sons, Inc.). Devin has spoken at past conferences like PASS and at several SQL Saturday events. He is a contributing member to the Business Intelligence Special Interest Group (SIG) for PASS as a leader in the SSIS Focus Group. Making his home in Jacksonville, Florida, Devin is a participating member of the local users’ group (JSSUG). Visit his blog at http://pragmaticworks.com/community/blogs/.

Mike Davis (Jacksonville, Florida) is an author, developer, consultant, and trainer with Pragmatic Works. He has expertise in many areas of Business Intelligence including Integration Services, Reporting Services, Analysis Services, Database Administration, and .NET Software Development. Mike has created enterprise level BI and software solutions for numerous corporations. As a trainer he travels the world teaching classes on SQL Server and BI. He also participates as a speaker at events like SQL Saturdays, SQL Server launches, SQL server user groups, and SQL PASS. In his spare time Mike plays guitar and darts. He is a member of the North Florida Dart League.

Acknowledgments

Thanks to everyone who made this book possible. Devin and Mike were an excellent team to push this rewarding project forward. Thanks also to the voices in my head who talk to me at night when the caffeine is kicking into overdrive. Twinkles, that means you. Thank you to the wonderful team at Wrox, especially Bob Elliott, who gave this book concept a chance when it was still a drunken idea on a napkin, and Kevin Kent, who made our ramblings much more concise. As always, I owe a huge debt to my wife Jenn for putting up with my late nights and my children, Colton, Liam, and Camille for being so patient with their tired dad.

— Brian Knight

I must give thanks to God; without Him in my life I would not have such blessings. To my wife Erin who has only helped in strengthening my relationship with God. She’s had the patience during the late nights of writing, editing, and video recording. She will be a great mother to our future child. Thanks to my brother Brian Knight. If it weren’t for him an opportunity like this would have never entered my life. He’s truly an inspiration in his work and personal life and I hope I can become as good a husband and dad as he is. Lastly, I would like to thank personal eating trainer, Takeru Kobayashi. With your help my hope is to join the International Federation of Competitive Eating and break the world record for hot dog eating.

— Devin Knight

Thanks to Brian Knight for giving me the opportunity to author a book with him and teaching me many things about SQL Server. Thanks to Devin and Brian Knight for being co-authors on this book and all of their help and advice. Thanks to technical editors Dustin Ryan, Bob Bojanic, Carla Sabotta, and Stefan Steiner who were a huge help in making sure I was accurate in my writing. Thanks to the entire Wiley staff for helping me with all parts of the book, especially Kevin Kent for putting up with all my last minute writings. And finally, thanks to the Flying Spaghetti Monster; with him all things are possible.

— Mike Davis

Preface

If you’ve picked up this book, Knight’s 24-Hour Trainer: Microsoft SQL Server 2008 Integration Services, you’ve decided to learn one of SQL Server’s most exciting applications, SQL Server Integration Services (SSIS). SSIS is a platform to move data from nearly any data source to nearly any destination and helps you by orchestrating a workflow to organize and control the execution of all these events. Most who dive into SSIS use it weekly, if not daily, to move data between partners, departments, or customers. It’s also a highly in-demand skill — even in the worst of economic environments, jobs are still posted for SSIS developers. This is because no matter what happens in an economy, people still must move and transform data.

This book, then, is your chance to start delving into this powerful and marketable application. And what’s more, this is not just a book you’re holding right now. It’s a video learning tool, as well. We became passionate about video training a number of years ago when we realized that in our own learning we required exposure to multiple teaching techniques to truly understand a topic — a fact that is especially true with tutorial books like this one. So, you’ll find hours of videos on the DVD in this book to help you learn SSIS even better than reading about the topic alone could and to help demonstrate to you the various tutorials in the book.

Who This Book Is For

This is a beginner book and assumes only that you know SQL Server 2005 or 2008 to run queries against the database engine. Because this book is structured for a beginner, providing many tutorials and teaching you only what you’ll likely use at work, it is not a reference book filled with a description of every property in a given task. It instead focuses on only the essential components for you to complete your project at work or school.

What This Book Covers

This book covers SQL Server 2008 and assumes no knowledge of SQL Server 2005. The differences in SQL Server 2005 and SQL Server 2008 are minimal, so if you are using this book as a 2005 learning tool, you will find only a few small user interface differences.

By the time you’ve completed this book, you’ll know how to load and synchronize database systems using SSIS by using some of the new SQL Server 2008 features. You’ll also know how to load data warehouses, which is a very hot and specialized skill. Even in warehousing, you’ll find features in the new SQL Server 2008 release that you’ll wonder how you lived without!

How This Book Is Structured

Our main principle in this book is to teach you only what we think you need to perform your job task. Because of that, it’s not a comprehensive reference book. You won’t find a description of every feature of SSIS in here. Instead the book blends small amounts of description, a tutorial, and videos to enhance your experience. Each lesson walks you through how to use components of SSIS and contains a tutorial. In this tutorial, called “Try It,” you can choose to read the requirements to complete the lesson, the hints of how to go about it, and begin coding, or you can read the step-by-step instructions if you learn better that way. Either way if you get stuck or want to see how one of us does the solution, watch the video on the DVD to receive further instruction.

What This Book Covers

This book contains 49 lessons, which are broken into 10 sections. The lessons are usually only a few pages long and focus on the smallest unit of work in SSIS that we could work on. Each section has a large theme around a given section in SSIS:

Section 1: Installation and Getting Started — This section covers the basic installation of SSIS and the sample databases that you’ll use throughout this book. If you already have SSIS and the sample databases installed, you can review this section quickly.Section 2: Tasks — This section explains how to use tasks in the Control Flow of SSIS.Section 3: Data Flow — Seventy-five percent of your time as an SSIS developer is spent in the Data Flow tab. This section focuses on the configuration of the core sources, transforms, and destinations.Section 4: Variables and Expressions — Now that you’ve created your first package, you must make it dynamic. This section covers how you can use variables and expressions to make your package change at run time.Section 5: Containers — This section covers one of the key Control Flow items, containers, which control how SSIS does looping and grouping.Section 6: Configuration — Here you learn how to configure your packages externally through configuration files, tables, and other ways.Section 7: Troubleshooting and Logging — No sooner do you have an SSIS package developed than you start experiencing problems. This section shows you how to troubleshoot these problems.Section 8: Deployment and Administration — Now that your package is developed, here you learn how to deploy and configure the service.Section 9: Data Warehousing — A little more on the advanced side, this section teaches you how to load a data warehouse using SSIS.Section 10: Wrap Up and Review — This section was one of our favorites to write. It contains a lesson to bring everything together and also Appendixes A and B, which contain crib notes for quick reference. As trainers and consultants, we are constantly asked to leave behind a quick page of crib notes of common code. In these appendixes, you find guides on when to use which SSIS components and useful solutions and code snippets that address common situations you might face.

Instructional Videos on DVD

As mentioned earlier in this preface, because we believe strongly in the value of video training, this book has an accompanying DVD containing hours of instructional video. At the end of each lesson in the book, you will find a reference to an instructional video on the DVD that accompanies that lesson. In that video, one of us will walk you through the content and examples contained in that lesson. So, if seeing something done and hearing it explained helps you understand a subject better than just reading about it does, this book and DVD combination is just the thing for you to get started with SSIS.

Conventions

To help you get the most from the text and keep track of what’s happening, we’ve used a number of conventions throughout the book.

Boxes like this one hold important, not-to-be forgotten information that is directly relevant to the surrounding text.

Notes, tips, hints, tricks, and asides to the current discussion are offset and placed in italics like this.

References like this one point you to the DVD with the print book, or online at www.wrox.com/go/ssis08vid to watch the instructional video that accompanies a given lesson.

As for styles in the text:

We highlight new terms and important words when we introduce themWe show URLs and code within the text like so: persistence.propertiesWe present code in the following way:
We use a monofont type for code examples.

Supporting Packages and Code

As you work through the lessons in this book, you may choose either to type in all the code and create all the packages manually or to use the supporting packages and code files that accompany the book. All the packages, code, and other support files used in this book are available for download at http://www.wrox.com. Once at the site, simply locate the book’s title (either by using the Search box or by using one of the title lists) and click the Download Code link on the book’s detail page to obtain all the source code for the book.

Because many books have similar titles, you may find it easiest to search by ISBN; this book’s ISBN is 978-0-470-49692-3.

Once you download the code, just decompress it with your favorite compression tool. Alternatively, you can go to the main Wrox code download page at http://www.wrox.com/dynamic/books/download.aspx to see the code available for this book and all other Wrox books.

Errata

We make every effort to ensure that there are no errors in the text or in the code. However, no one is perfect, and mistakes do occur. If you find an error in one of our books, like a spelling mistake or faulty piece of code, we would be very grateful for your feedback. By sending in errata you may save another reader hours of frustration and at the same time you will be helping us provide even higher quality information.

To find the errata page for this book, go to http://www.wrox.com and locate the title using the Search box or one of the title lists. Then, on the Book Search Results page, click the Errata link. On this page you can view all errata that has been submitted for this book and posted by Wrox editors.

A complete book list including links to errata is also available at www.wrox.com/misc-pages/booklist.shtml.

If you don’t spot “your” error on the Errata page, click the Errata Form link and complete the form to send us the error you have found. We’ll check the information and, if appropriate, post a message to the book’s errata page and fix the problem in subsequent editions of the book.

p2p.wrox.com

For author and peer discussion, join the P2P forums at p2p.wrox.com. The forums are a Web-based system for you to post messages relating to Wrox books and related technologies and interact with other readers and technology users. The forums offer a subscription feature to e-mail you topics of interest of your choosing when new posts are made to the forums. Wrox authors, editors, other industry experts, and your fellow readers are present on these forums.

At http://p2p.wrox.com you will find a number of different forums that will help you not only as you read this book, but also as you develop your own applications. To join the forums, just follow these steps:

1. Go to p2p.wrox.com and click the Register link.

2. Read the terms of use and click Agree.

3. Complete the required information to join as well as any optional information you wish to provide and click Submit.

4. You will receive an e-mail with information describing how to verify your account and complete the joining process.

You can read messages in the forums without joining P2P but in order to post your own messages, you must join.

Once you join, you can post new messages and respond to messages other users post. You can read messages at any time on the Web. If you would like to have new messages from a particular forum e-mailed to you, click the Subscribe to this Forum icon by the forum name in the forum listing.

For more information about how to use the Wrox P2P, be sure to read the P2P FAQs for answers to questions about how the forum software works as well as many common questions specific to P2P and Wrox books. To read the FAQs, click the FAQ link on any P2P page.

Welcome to SSIS

SQL Server Integration Services (SSIS) is one of the most powerful applications in your arsenal for moving data in and out of various databases and files. Like the rest of the business intelligence (BI) suite that comes with SQL Server, it’s not part of SQL Server other than being included on the DVD and in the license that you’ve already purchased. Because of that, even if your environment is not using a lot of SQL Server, SSIS can still be used as a platform for data movement.

Though ultimately this book is more interactive in nature, this introductory chapter first walks you through a high-level tour of SSIS so you have a life preserver on prior to jumping in the pool. Each topic touched on in this introduction is covered much deeper throughout the book in lesson form and in the supporting videos on the DVD.

Import and Export Wizard

If you need to move data quickly from almost any OLE DB–compliant data source to a destination, you can use the SSIS Import and Export Wizard (shown in Figure 1). The wizard is a quick way to move the data and perform very light transformations of data, such as casting of the data into new data types. You can quickly check any table you want to transfer as well as write a query against the data to retrieve only a selective amount of data.

Figure 1

The Business Intelligence Development Studio

The Business Intelligence Development Studio (BIDS) is the central tool that you’ll spend most of your time in as an SSIS developer. Like the rest of SQL Server, the tool’s foundation is the Visual Studio 2008 interface (shown in Figure 2). The nicest thing about the tool is it’s not bound to any particular SQL Server. In other words, you won’t have to connect to a SQL Server to design an SSIS package. You can design the package disconnected from your SQL Server environment and then deploy it to your target SQL Server or the file system you’d like it to run on.

Figure 2

Architecture

SQL Server 2008 has truly evolved SSIS into a major player in the extraction, transformation, and loading (ETL) market. What’s especially nice about SSIS is its price tag, which is free with the purchase of SQL Server. Other ETL tools can cost hundreds of thousands of dollars based on how you scale the software. The SSIS architecture has also expanded dramatically since the SQL Server 2000 days, as you can see in Figure 3. The SSIS architecture consists of four main components:

The SSIS serviceThe SSIS runtime engine and the runtime executablesThe SSIS Data Flow engine and the Data Flow componentsThe SSIS clients

Figure 3

Let’s boil this down to the essentials that you need to know to do your job. The SSIS service handles the operational aspects of SSIS. It is a Windows service that is installed when you install the SSIS component of SQL Server 2008, and it tracks the execution of packages (a collection of work items) and helps with the storage of the packages. You don’t need the SSIS service to run SSIS packages, but if the service is stopped, all the SSIS packages that are currently running will in turn stop by default.

The SSIS runtime engine and its complimentary programs actually run your SSIS packages. The engine saves the layout of your packages and manages the logging, debugging, configuration, connections, and transactions. Additionally, it manages handling your events to send you emails or log into a database when an event is raised in your package. The runtime executables provide the following functionality to a package that we discuss in more detail throughout this book:

Containers — Provide structure and scope to your packageTasks — Provide the functionality to your packageEvent handlers — Respond to raised events in your packagePrecedence constraints — Provider ordinal relationship between various items in your package

Packages

A core component of SSIS is the notion of a package. A package best parallels an executable program in Windows. Essentially, a package is a collection of tasks that execute in an orderly fashion. Precedence constraints help manage which order the tasks will execute in. A package can be saved onto a SQL Server, which in actuality is saved in the msdb database. It can also be saved as a .DTSX file, which is an XML structured file much like .RDL files are to Reporting Services. The end result of the package looks like Figure 2 that was shown earlier.

Tasks

A task can best be described as an individual unit of work. Tasks provide functionality to your package, much like a method does in a programming language. A task can move a file, load a file into a database, send an email, or write a set of VB.NET code for you, to name just a few of the things it can do. A small subset of the common tasks available to you comprises the following:

Bulk Insert Task — Loads data into a table by using the BULK INSERT SQL commandData Flow Task — This very specialized task loads and transforms data into an OLE DB destinationExecute Package Task — Allows you to execute a package from within a package, making your SSIS packages modularExecute Process Task — Executes a program external to your package, like one to split your extract file into many files before processing the individual filesExecute SQL Task — Executes a SQL statement or stored procedureFile System Task — This task can handle directory operations like creating, renaming, or deleting a directory. It can also manage file operations like moving, copying, or deleting filesFTP Task — Sends or receives files from an FTP siteScript Task — Runs a set of VB.NET or C# coding inside a Visual Studio environmentSend Mail Task — Sends a mail message through SMTPAnalysis Services Processing Task — This task processes a SQL Server Analysis Services cube, dimension, or mining modelWeb Service Task — Executes a method on a web serviceWMI Data Reader Task — This task can run WQL queries against the Windows Management Instrumentation (WMI). This allows you to read the event log, get a list of applications that are installed, or determine hardware that is installed to name a few examplesWMI Event Watcher Task — This task empowers SSIS to wait for and respond to certain WMI events that occur in the operating systemXML Task — Parses or processes an XML file. It can merge, split, or reformat an XML file

These are only a few of the many tasks you have available to you. You can also write your own task or download a task from the web that does something else. Writing such a task only requires that you learn the SSIS object model and know VB.NET or C#.

Data Flow Elements

Once you create a Data Flow Task, the Data Flow tab in BIDS is then available to you for design. Just as the Control Flow tab handles the main workflow of the package, the Data Flow tab handles the transformation of data. Almost anything that manipulates data falls into the Data Flow category. You can see an example of a Data Flow in Figure 4, where data is pulled from an OLE DB Source and transformed before being written to a Flat File Destination. As data moves through each step of the Data Flow, the data changes based on what the transform does. For example, in Figure 4 a new column is derived using the Derived Column Transform and that new column is then available to subsequent transformations or to the destination.

Figure 4

You can add multiple Data Flow Tasks onto the Control Flow tab. You’ll notice that after you click on each one, it jumps to the Data Flow tab with the Data Flow Task name you selected in the drop-down box right under the tab. You can toggle between Data Flow Tasks easily by selecting the next Data Flow Task from that drop-down box.

Sources

A source is where you specify the location of your source data to pull from in the data pump. Sources will generally point to a connection manager in SSIS. By pointing them to the connection manager, you can reuse connections throughout your package, because you need only change the connection in one place. Here are some of the common sources you’ll be using in SSIS:

OLE DB Source — Connects to nearly any OLE DB data source like SQL Server, Access, Oracle, or DB2 to name just a fewExcel Source — Source that specializes in receiving data from Excel spreadsheets. This source also makes it easy to run SQL queries against your Excel spreadsheet to narrow the scope of the data that you want to pass through the flowFlat File Source — Connects to a delimited or fixed-width fileXML Source — Can retrieve data from an XML documentADO.NET Source — The ADO.NET source is an ADO.NET connection much like you see in the .NET Framework when you use the DataReader interface in your application code to connect to a database for communication to ODBC

Destinations

Inside the Data Flow, destinations accept the data from the data sources and from the transformations. The flexible architecture can send the data to nearly any OLE DB–compliant data source or to a flat file. Like sources, destinations are managed through the connection manager. Some of the more common destinations in SSIS and available to you are as follows:

Excel Destination — Outputs data from the Data Flow to an Excel spreadsheet that must already existFlat File Destination — Enables you to write data to a comma delimited or fixed-width fileOLE DB Destination — Outputs data to an OLE DB data connection like SQL Server, Oracle, or AccessSQL Server Destination — The destination that you use to write data to SQL Server most efficiently. To use this, you must run the package from the destination

Transformations

Transformations (or transforms) are a key component to the Data Flow that change the data to a format that you’d like. For example, you may want your data to be sorted and aggregated. Two transformations can accomplish this task for you. The nicest thing about transformations in SSIS is they are all done in-memory, and because of this they are extremely efficient. Memory handles data manipulation much faster than disk IO does, and you’ll find if disk paging occurs, your package that ran in 20 minutes will suddenly take hours. Here are some of the more common transforms you’ll use on a regular basis:

Aggregate — Aggregates data from a transform or source similar to a GROUP BY statement in T-SQLConditional Split — Splits the data based on certain conditions being met. For example, if the State column is equal to Florida, send the data down a different path. This transform is similar to a CASE statement in T-SQLData Conversion — Converts a column’s data type to another data type. This transform is similar to a CAST statement in T-SQLDerived Column — Performs an in-line update to the data or creates a new column from a formula. For example, you can use this to calculate a Profit column based on a Cost and SellPrice set of columnsFuzzy Grouping — Performs data cleansing by finding rows that are likely duplicatesFuzzy Lookup — Matches and standardizes data based on fuzzy logic. For example, this can transform the name Jon to JohnLookup — Performs a lookup on data to be used later in a transformation. For example, you can use this transformation to look up a city based on the zip codeMulticast — Sends a copy of the data to an additional path in the workflow and can be used to parallelize data. For example, you may want to send the same set of records to two tablesOLE DB Command — Executes an OLE DB command for each row in the Data Flow. Can be used to run an UPDATE or DELETE statement inside the Data FlowRow Count — Stores the row count from the Data Flow into a variable for later use by perhaps an auditing solutionScript Component — Uses a script to transform the data. For example, you can use this to apply specialized business logic to your Data FlowSlowly Changing Dimension — Coordinates the conditional insert or update of data in a slowly changing dimension during a data warehouse loadSort — Sorts the data in the Data Flow by a given column and removes exact duplicatesUnion All — Merges multiple data sets into a single data setUnpivot — Unpivots the data from a non-normalized format to a relational format

Editions of SQL Server 2008

The features in SSIS and SQL Server that are available to you vary widely based on what edition of SQL Server you’re using. As you can imagine, the higher end edition of SQL Server you purchase, the more features are available. As for SSIS, you’ll have to use at least Standard Edition to receive the bulk of the SSIS features. In the Express and Workgroup editions, only the Import and Export Wizard is available to you. There are some features in SSIS that you’ll have to upgrade to Enterprise or Developer Editions to see. The advanced transformations available only with Enterprise edition are as follows:

Data Mining Query TransformationFuzzy Lookup and Fuzzy Grouping TransformationsTerm Extraction and Term Lookup TransformationsData Mining Model Training DestinationDimension Processing DestinationPartition Processing Destination

Summary

This chapter introduced you to the SQL Server Integration Services (SSIS) architecture and some of the different elements you’ll be dealing with in SSIS. Tasks are individual units of work that are chained together with precedence constraints. Packages are executable programs in SSIS that are a collection of tasks. Finally, transformations are the Data Flow items that change the data to the form you request, such as sorting the data the way you want. Now that that the overview is out of the way, it’s time to start the first section and your first set of lessons, and time for you to get your hands on SSIS.

As mentioned earlier, the print book comes with an accompanying DVD containing hours of instructional supporting video. At the end of each lesson in the book, you will find a box like this one pointing you to a video on the DVD with the print book, or online at www.wrox.com/go/ssis08vid, that accompanies that lesson. In that video, one of us will walk you through the content and examples contained in that lesson. So, if seeing something done and hearing it explained helps you understand a subject better than just reading about it does, this text and video combination provides exactly what you need. There’s even an Introduction to SSIS video that you can watch to get started. Simply select the Intro to SSIS lesson on the DVD or online.

Section I

Installation and Getting Started

Lesson 1: Moving Data with the Import and Export Wizard

Lesson 2: Installing SQL Server Integration Services

Lesson 3: Installing the Sample Databases

Lesson 4: Creating a Solution and Project

Lesson 5: Exploring Business Intelligence Development Studio

Lesson 6: Creating Your First Package

Chapter 1

Moving Data with the Import and Export Wizard

The Import and Export Wizard is the easiest method to move data from sources like Excel, Oracle, DB2, SQL Server, and text files to nearly any destination. This wizard uses SSIS as a framework and can optionally save a package as its output prior to executing. The package it produces may not be the most elegant, but it can take a lot of the grunt work out of package development and provide the building blocks that are necessary for you to build the remainder of the package. Oftentimes as an SSIS developer, you’ll want to relegate the grunt work and heavy lifting to the wizard and do the more complex coding yourself.

As with most SQL Server wizards, you have numerous ways to open the tool:

To open the Import and Export Wizard, right-click the database you want to import data from or export data to SQL Server Management Studio and select Tasks ⇒ Import Data (or Export Data based on what task you’re performing)You can also open the wizard by right-clicking SSIS Packages in Business Intelligence Development Studio (BIDS) and selecting SSIS Import and Export WizardAnother common way to open it is from the Start Menu under SQL Server 2008 by choosing Import and Export DataThe last way to open the wizard is by typing dtswizard.exe at the command line or Run prompt

Regardless of whether you need to import or export the data, the first few screens in the wizard look very similar.

Once the wizard comes up, you see the typical Microsoft wizard welcome screen. Click Next to begin specifying the source connection. If you opened the wizard from Management Studio by selecting Export Data, this screen is prepopulated. In this screen you specify where your data is coming from in the Source drop-down box. Once you select the source, the rest of the options on the dialog box may vary based on the type of connection. The default source is SQL Native Client, and it looks like Figure 1-1. You have OLE DB sources like SQL Server, Oracle, and Access available out of the box. You can also use text files and Excel files. After selecting the source, you have to fill in the provider-specific information.

Figure 1-1

For SQL Server, you must enter the server name (localhost means go to your local machine’s SQL Server instance if applicable) and the user name and password you want to use. If you’re going to connect with your Windows account, simply select “Use Windows Authentication.” Windows Authentication will pass your Windows local or domain credentials into the data source. Lastly, choose a database that you’d like to connect to. For most of the examples in this book, you use the AdventureWorks2008 database. You can download this database as an optional installation on CodePlex.com, or you can see Lesson 3 of this book for more information on installing this sample database.

You can also find the sample databases at the Wrox website at http://www.wrox.com/go/SQLServer2008RTMDataSets.

Additional sources such as Sybase and DB2 can also become available if you install the vendors’ OLE DB providers. You can download the OLE DB Provider for DB2 for free if you’re using Enterprise Edition by going to the SQL Server 2008 Feature Pack on the Microsoft website. You also have ADO.NET providers available to you in SQL Server 2008.

After you click Next, you are taken to the next screen in the wizard, where you specify the destination for your data. The properties for this screen are exactly identical to those for the previous screen with the exception of the database. On the next screen, if you select “Copy data from one or more tables or views,” you can simply check the tables you want. If you select “Write a query to specify the data to transfer,” you can write an ad hoc query (after clicking Next) addressing where to select the data from or what stored procedure to use to retrieve your data.

The next screen allows you to select the table or tables you want to copy over and which table names you want them to be transferred to. If you want, you can click the Edit button to go to the Column Mappings dialog box (shown in Figure 1-2) for each table. Here you can change the mapping between each source and destination column. For example, if you want the DepartmentID column to go to the DepartmentID2 column on the destination, simply select the Destination drop-down box for the DepartmentID column and point it to the new column, or choose <ignore > to ignore the column altogether. By checking “Enabled identity insert,” you allow the wizard to insert into a column that has an identity (or autonumber) value assigned. If the data types don’t match between the source and destination columns, the wizard will add the necessary components to convert the data to a proper data type if possible.

Figure 1-2

The next screen allows you to save the package or just choose to run it immediately. You can uncheck “Execute Immediately” to just save the package for later modification and execution. You can open the package that executed in the Business Intelligence Development Studio (BIDS) if you’d like. You do this by creating a project in BIDS and adding the package to the project. You cannot edit the package without a BIDS project to contain the package. We discuss how to create a project in Lesson 4 later in this book. The final screen executes the process and shows you the output log.

Try It

In this lesson, you learn how to quickly load a flat file into a database using the Import and Export Wizard. After this lesson, you’ll have a clear understanding of how the Import and Export Wizard is the easiest way to load data into almost any destination and how it is accessed from Management Studio or BIDS.

Lesson Requirements

Load the file BelgiumExtract.txt (which you can download at this book’s website at www.wrox.com) into any database of your choosing. We are using AdventureWorks2008 database as our target, but that’s not a dependency. Note: The file’s first row holds the column names.

Hints

One of the fastest ways to access the Import and Export Wizard to load the data is through Management Studio. Right-click on the target database and select Tasks ⇒ Import Data.

Step-by-Step

1. Open SQL Server Management Studio in the SQL Server 2008 program group.

2. Right-click the target database of your choosing (like AdventureWorks2008) and select Tasks ⇒ Import Data.

3. For the Data Source, select Flat File Source, as shown in Figure 1-3. For the “File Name” property, select the BelgiumExtract.txt file that you can download from this book’s website on www.wrox.com. Check the “Column names in the first data row” option to read the column names from the first row of data from the flat file. Click the Columns page in the left pane to confirm that the file is delimited by commas.

Figure 1-3

4. Click Next to configure the destination. Point to any server and database you want.

5. On the Select Source Tables and Views screen, click Edit Mappings to go to the Column Mappings page. Change the PurchaseOrderID column to an int data type. Change the OrderDate, ShipDate, and ModifiedDate columns to a datetime data type. Finally, change the SubTotal, TaxAmt, Freight, TotalDue columns all to a decimal(12,4) data type where 12 is the precision and 4 is the scale, as shown in Figure 1-4.

Figure 1-4

6. Click OK to leave the Column Mappings page and then click Next to review any Data Type Mapping warnings. The data mapping warnings screen shows you where you have any columns where the data types don’t match. You can ignore those warnings for the time being and click Next a few times to execute the package. If you are successful, you should see a total of 4000 rows.

Please select Lesson 1 on the DVD with the print book, or online at www.wrox.com/go/ssis08vid to view the video that accompanies this lesson.

Chapter 2

Installing Sql Server Integration Services

This book requires that you have Business Intelligence Development Studio (BIDS) and the SQL Server Integration Services (SSIS) service installed. To develop SSIS, you cannot use Workgroup Edition or SQL Express. The SSIS runtime to run packages does ship with all editions, but that runtime on some of the lower editions may not work with all SSIS components.

On the subject of editions of SQL Server, you have a decision to make as to which edition you want to install: Standard Edition or Enterprise Edition. Developer Edition is also available. It contains all the components of Enterprise Edition at a tiny fraction of the cost but is licensed for development only. Enterprise Edition gives you a few additional SSIS components that you may be interested in for SQL Server 2008:

Data Mining componentsFuzzy Lookup and Group transformsDimension and Partition Processing DestinationTerm Extraction and Lookup transforms