Oracle Warehouse Builder 11g: Getting Started - Bob Griesemer - E-Book

Description

In today's economy, businesses and IT professionals cannot afford to lag behind the latest technologies. Data warehousing is a critical area to the success of many enterprises, and Oracle Warehouse Builder is a powerful tool for building data warehouses. It comes free with the latest version of the Oracle database.
Written in an accessible, informative, and focused manner, this book will teach you to use Oracle Warehouse Builder to build your data warehouse. Covering warehouse design, the import of source data, the ETL cycle and more, this book will have you up and running in next to no time.
This book will walk you through the complete process of planning, building, and deploying a data warehouse using Oracle Warehouse Builder. By the book's end, you will have built your own data warehouse from scratch.
Starting with the installation of the Oracle Database and Warehouse Builder software, this book then covers the analysis of source data, designing a data warehouse, and extracting, transforming, and loading data from the source system into the data warehouse. You'll follow the whole process with detailed screenshots of key steps along the way, alongside numerous tips and hints not covered by the official documentation.

You can read this e-book in the Legimi apps or in any app that supports the following format:

EPUB

Number of pages: 551

Year of publication: 2009




Table of Contents

Oracle Warehouse Builder 11g Getting Started
Credits
About the Author
About the Reviewers
Preface
What this book covers
What you need for this book
Who this book is for
Conventions
Reader feedback
Customer support
Downloading the example code for the book
Errata
Piracy
Questions
1. An Introduction to Oracle Warehouse Builder
Introduction to data warehousing
Introduction to our fictional organization
What is a data warehouse?
Where does OWB fit in?
Installation of the database and OWB
Downloading the Oracle software
A word about hardware and operating systems
Installing Oracle database software
Configuring the listener
Creating the database
Installing the OWB standalone software
OWB components and architecture
Configuring the repository and workspaces
Summary
2. Defining and Importing Source Data Structures
Preliminary analysis
ACME Toys and Gizmos source data
The POS transactional source database
The web site order management database
An overview of Warehouse Builder Design Center
Importing/defining source metadata
Creating a project
Creating a module
Creating an Oracle Database module
Creating a SQL Server database module
Creating a SQL Server database connection
Configure Oracle to connect to SQL Server
Creating a heterogeneous service configuration file
Editing the listener.ora file
Creating the Warehouse Builder ODBC module for SQL Server
Importing source metadata from a database
Defining source metadata manually with the Data Object Editor
Importing source metadata from files
Summary
3. Designing the Target Structure
Data warehouse design
Dimensional design
Cube and dimensions
Implementation of a dimensional model in a database
Relational implementation (star schema)
Multidimensional implementation (OLAP)
Designing the ACME data warehouse
Identifying the dimensions
Designing the cube
Data warehouse design in OWB
Creating a target user and module
Create a target user
Create a target module
OWB design objects
Summary
4. Creating the Target Structure in OWB
Creating dimensions in OWB
The Time dimension
Creating a Time dimension with the Time Dimension Wizard
The Product dimension
Product Attributes (attribute type)
Product Levels
Product Hierarchy (highest to lowest)
Creating the Product dimension with the New Dimension Wizard
The Store dimension
Store Attributes (attribute type), data type and size, and (Identifier)
Store Levels
Store Hierarchy (highest to lowest)
Creating the Store dimension with the New Dimension Wizard
Creating a cube in OWB
Creating a cube with the wizard
Using the Data Object Editor
Summary
5. Extract, Transform, and Load Basics
ETL
Manual ETL processes
Staging
To stage or not to stage
Configuration of a staging area
Mappings and operators in OWB
The canvas layout
OWB operators
Source and target operators
Data flow operators
Pre/post-processing operators
Summary
6. ETL: Putting it Together
Designing and building an ETL mapping
Designing our staging area
Designing the staging area contents
Building the staging area table with the Data Object Editor
Designing our mapping
Review of the Mapping Editor
Creating a mapping
Adding source tables
Adding a target table
Connecting source to target
Joiner operator attribute groups
Connecting operators to the Joiner
Defining operator properties for the Joiner
Adding an Aggregator operator
Summary
7. ETL: Transformations and Other Operators
STORE mapping
Adding source and target operators
Adding Transformation Operators
Using a Key Lookup operator
Creating an external table
Creating and loading a lookup table
Retrieving the key to use for a Lookup Operator
Adding a SUBSTR Transformation Operator
Adding a Constant operator
Adding a TO_NUMBER transformation
Adding a Key Lookup operator
PRODUCT mapping
SALES cube mapping
Dimension attributes in the cube
Measures and other attributes in the cube
Mapping values to cube attributes
Mapping measures' values to a cube
Mapping PRODUCT and STORE dimension values to the cube
Mapping DATE_DIM values to the cube
Mapping an Expression operator
Features and benefits of OWB
Summary
8. Validating, Generating, Deploying, and Executing Objects
Validating
Validating in the Design Center
Validating from the editors
Validating in the Data Object Editor
Validating in the Mapping Editor
Generating
Generating in the Design Center
Generating from the editors
Generating in the Data Object Editor
Generating in the Mapping Editor
Default operating mode of the mapping
Selecting the generation style
Deploying
The Control Center Service
Deploying in the Design Center and Data Object Editor
The Control Center Manager
The Control Center Manager window overview
The Object Details window
The Control Center Jobs window
Deploying in the Control Center Manager
Executing
Deploying and executing remaining objects
Deployment Order
Execution order
Summary
9. Extra Features
Additional editing features
Metadata change management
Recycle Bin
Cut, copy, and paste
Snapshots
Metadata Loader (MDL) exports and imports
Synchronizing objects
Changes to tables
Updating object definitions
Synchronizing
Inbound or outbound
Matching and synchronizing strategy
Viewing the synchronization plan
Changes to dimensional objects and auto-binding
Warehouse Builder online resources
Summary
Index

Oracle Warehouse Builder 11g Getting Started

Bob Griesemer

Copyright © 2009 Packt Publishing

All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.

Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author, nor Packt Publishing and its dealers and distributors will be held liable for any damages caused or alleged to be caused directly or indirectly by this book.

Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.

First published: August 2009

Production Reference: 1300709

Published by Packt Publishing Ltd.

32 Lincoln Road

Olton

Birmingham, B27 6PA, UK.

ISBN 978-1-847195-74-6

www.packtpub.com

Cover Image by Parag Kadam (<[email protected]>)

Credits

Author

Bob Griesemer

Reviewers

Anitha Kadaru

Yasodarani Venkatesan

Acquisition Editor

James Lumsden

Development Editor

Swapna V. Verlekar

Technical Editors

Arani Roy

Reshma Sundaresan

Copy Editor

Sneha Kulkarni

Editorial Team Leader

Abhijeet Deobhakta

Project Team Leader

Lata Basantani

Project Coordinators

Ashwin Shetty

Neelkanth Mehta

Indexer

Rekha Nair

Proofreader

Chris Smith

Production Coordinator

Adline Swetha Jesuthas

Cover Work

Adline Swetha Jesuthas

About the Author

Bob Griesemer has over 27 years of software and database engineering/DBA experience in both government and industry, solving database problems, designing and loading data warehouses, developing code, leading teams of developers, and satisfying customers. He has been working in various roles involving database development and administration with the Oracle Database with every release since Version 6 of the database from 1993 to the present. He has also been performing various tasks, including data warehouse design and implementation, administration, backup and recovery, development of Perl code for web-based database access, writing Java code utilizing JDBC, migrating legacy databases to Oracle, and developing Developer/2000 Oracle Forms applications. He is currently an Oracle Database Administrator Certified Associate, and is employed by the Northrop Grumman Corporation, where he is the Senior Database Engineer and primary data warehouse ETL specialist for a large data warehouse project.

I would like to thank my two co-workers, Anitha Kadaru and Yasodarani Venkatesan, who were kind enough to review this book. With their wealth of knowledge of data warehousing and Business Intelligence, they provided invaluable comments that helped me keep the book on track. I'd like to thank David Allan of the Oracle Warehouse Builder development team at Oracle for putting up with my numerous questions and requests for clarification about various aspects of the software. Lastly and most importantly, I'd like to thank my wife Lynn and children Robby, Melanie, Hilary, Christina, Millie, and Mikey for doing without a husband and dad for major periods of time over the last year, while I worked on this book. Your understanding and support have been a big help!

About the Reviewers

Anitha Kadaru is employed with Northrop Grumman and has more than 12 years of experience in leading and supporting Information Technology (IT) development, including 10 years of experience of directly supporting the Decision Support Systems (DSS). She provides expertise in a broad range of Common Off-The-Shelf (COTS) applications for Business Intelligence (BI), data integration, and data architectures, and she is expert in all phases of system lifecycle development for the DSS applications. She has in-depth technical knowledge and exceptional analytical skills with implementing the COTS solutions in data warehousing, the ETL, and the BI technical areas. She has expertise in data engineering with years of data analysis, data design, dimensional modeling, and data management expertise.

Yasodarani Venkatesan is employed by Northrop Grumman as a Data Warehouse Analyst on a Healthcare project. In the past 11 years, she has worked on several large and small data warehousing projects in sales, logistics, finance, healthcare, and HR domain areas. She has expertise in designing and modeling star and snowflake schema design, designing and implementing the ETL processes for converting/transforming data, designing and implementing metadata layers in the Business Intelligence (BI) applications, and quality assurance.

Preface

Competing in today's world requires a greater emphasis on strategy, long-range planning, and decision making, and this is why businesses are building data warehouses. Data warehouses are becoming more and more common as businesses have realized the need to mine the information that is stored in electronic form. Data warehouses provide valuable insight into the operation of a business and how best to improve it. Organizations need to monitor these processes, define policy, and at a more strategic level, define the visions and goals that will move the company forward in the future. If you are new to data warehousing in general, and to Extract, Transform, and Load (ETL) in particular, and need a way to get started, the Oracle Warehouse Builder is a great application to use to build your warehouse. The Oracle Warehouse Builder (OWB) is a tool provided by Oracle that can be used at every stage of the implementation of a data warehouse, right from the initial design and creation of the table structure to ETL and data-quality auditing.

We will build a basic data warehouse using Oracle Warehouse Builder. It supports all phases of the implementation of a data warehouse: designing the source and target information, defining the mappings that move data from source to target, specifying the transformations needed on the data, and building the code that implements the mappings to load the data. You are free to use any or all of the features in your own implementation.

What this book covers

This book is an introduction to the Oracle Warehouse Builder (OWB). It is an introductory, hands-on book, so it covers the features of Oracle Warehouse Builder that we will need to build our first data warehouse.

The chapters are in chronological order to flow through the steps required to build a data warehouse. So if you are building your first data warehouse, it is a good idea to read through each chapter sequentially to gain maximum benefit from the book. Those who have already built a data warehouse and just need a refresher on some basics can skip around to whatever topic they need at that moment.

We'll use a fictional toy company, ACME Toys and Gizmos, to illustrate the concepts that will be presented throughout the book. This will provide some context to the information presented to help you apply the concepts to your own organization. We'll actually be constructing a simple data warehouse for the ACME Toys and Gizmos company. At the end of the book, we'll have all the code, scripts, and saved metadata that was used. So we can build a data warehouse for practice, or use it as a model for building another data warehouse.

Chapter 1: An Introduction to Oracle Warehouse Builder starts off with a high-level look at the architecture of OWB and the steps for installing it. It covers the schemas created in the database that are required by OWB, and touches upon some installation topics to provide some further clarification that is not necessarily found in the Oracle documentation. Most installation tasks can be found in the Oracle README files and installation documents, and so they won't be covered in depth in this book.

Chapter 2: Defining and Importing Source Data Structures covers the initial task of building a data warehouse from scratch, that is, determining what the source of the data will be. OWB needs to know the details about what the source data structures look like and where they are located in order to properly pull data from them using OWB. This chapter also covers how to define the source data structures using the Data Object Editor and how to import source structure information. It talks about three common sources of data—flat files, Oracle Databases, and Microsoft SQL Server databases—while discussing how to configure Oracle and OWB to connect to these sources.

Chapter 3: Designing the Target Structure explains designing of the data warehouse target. It covers some options for defining a data warehouse target structure using relational objects (star schemas and snowflake schemas) and dimensional objects (cubes and dimensions). Some of the pros and cons of the usage of these objects are also covered. It introduces the Warehouse Builder for designing and starts with the creation of a target user and module.

Chapter 4: Creating the Target Structure in OWB implements the design of the target using the Warehouse Builder. It has step-by-step explanations for creating cubes and dimensions using the wizards provided by OWB.

Chapter 5: Extract, Transform, and Load Basics introduces the ETL process by explaining what it is and how to implement it in OWB. It discusses whether to use a staging table or not, and describes mappings and some of the main operators in OWB that can be used in mappings. It introduces the Warehouse Builder Mapping Editor, which is the interface for designing mappings.

Chapter 6: ETL: Putting it Together is about creating a new mapping using the Mapping Editor. A staging table is created with the Data Object Editor, and a mapping is created to map data directly from the source tables into the staging table. This chapter explains how to add and edit operators, and how to connect them together. It also discusses operator properties and how to modify them.

Chapter 7: ETL: Transformations and Other Operators expands on the concept of building a mapping by creating additional mappings to map data from the staging table into cube and dimensions. Additional operators are introduced for doing transformations of the data as it is loaded from source to target.

Chapter 8: Validating, Generating, Deploying, and Executing Objects covers in great detail the validating of mappings, the generation of the code for mappings and objects, and deploying the code to the target database. This chapter introduces the Control Center Service, which is the interface with the target database for controlling this process, and explains how to start and stop it. The mappings are then executed to actually load data from source to target. It also introduces the Control Center Manager, which is the user interface for interacting with the Control Center Service for deploying and executing objects.

Chapter 9: Extra Features covers some extra features provided in the Warehouse Builder that can be very useful for more advanced implementations as mappings get more numerous and complex. The metadata change management features of OWB are discussed for controlling changes to mappings and objects. This includes the recycle bin, cutting/copying and pasting objects to make copies or backups, the snapshot feature, and the metadata loader facility for exporting metadata to a file. Keeping objects synchronized as changes are made is discussed, and so is the auto-binding of tables to dimensional objects. Lastly, some additional online references are provided for further study and reference.

What you need for this book

The following software is required for this book:

Oracle Warehouse Builder 11g

Microsoft SQL Server 2008 Express with Advanced Services

Who this book is for

If you are new to data warehousing and you have to build your first data warehouse using OWB, or have implemented a data warehouse using another tool and are now using OWB for the first time, this book is for you. You can also use it as a refresher if you are a more advanced user. An ever-increasing number of businesses are implementing data warehouses, and if you are reading this book, then yours has most likely chosen to implement one as well.

This book is for anyone tasked with building a data warehouse and loading data into it using Oracle Warehouse Builder. It is primarily aimed at database administrators and engineers who are new to data warehousing and are building a data warehouse for the first time using OWB. This book can also be used as a refresher on basic OWB features. Think of it as a beginner's guide to OWB. It can be helpful for any IT professional looking to broaden his or her knowledge about data warehousing in general and Oracle Warehouse Builder in particular.

Reader feedback

Feedback from our readers is always welcome. Let us know what you think about this book—what you liked or may have disliked. Reader feedback is important for us to develop titles that you really get the most out of.

To send us general feedback, simply send an email to <[email protected]>, and mention the book title via the subject of your message.

If there is a book that you need and would like to see us publish, please send us a note in the SUGGEST A TITLE form on www.packtpub.com or email <[email protected]>.

If there is a topic that you have expertise in and you are interested in either writing or contributing to a book on, see our author guide on www.packtpub.com/authors.

Customer support

Now that you are the proud owner of a Packt book, we have a number of things to help you to get the most from your purchase.

Downloading the example code for the book

Visit http://www.packtpub.com/files/code/5746_Code.zip to directly download the example code.

The downloadable files contain instructions on how to use them.

Errata

Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you find a mistake in one of our books—maybe a mistake in the text or the code—we would be grateful if you would report this to us. By doing so, you can save other readers from frustration, and help us to improve subsequent versions of this book. If you find any errata, please report them by visiting http://www.packtpub.com/support, selecting your book, clicking on the let us know link, and entering the details of your errata. Once your errata are verified, your submission will be accepted and the errata added to any list of existing errata. Any existing errata can be viewed by selecting your title from http://www.packtpub.com/support.

Piracy

Piracy of copyright material on the Internet is an ongoing problem across all media. At Packt, we take the protection of our copyright and licenses very seriously. If you come across any illegal copies of our works, in any form, on the Internet, please provide us with the location address or web site name immediately so that we can pursue a remedy.

Please contact us at <[email protected]> with a link to the suspected pirated material.

We appreciate your help in protecting our authors, and our ability to bring you valuable content.

Questions

You can contact us at <[email protected]> if you are having a problem with any aspect of the book, and we will do our best to address it.

Chapter 1. An Introduction to Oracle Warehouse Builder

The Oracle Warehouse Builder (OWB) is what this book is all about, so let's start discussing it by looking at it from a high level. We'll talk about some installation topics and the various components that compose this application. Oracle provides some detailed installation documentation and user guides that give you step-by-step instructions on how to install the product and the prerequisites we need to have in place. So we will focus more on some general topics that will help us understand the installation better. We'll walk through a basic installation that can be followed along and actually performed while reading. We'll be accepting most of the defaults during the installation for simplicity. For more advanced installation requirements, dig into the Oracle installation documentation to get familiar with the options that are available. You can find this at http://www.oracle.com/pls/db111/homepage by clicking on the Installing and Upgrading link in the left-hand frame.

Introduction to data warehousing

Although you may not be familiar with data warehousing, you have probably at least heard the term. Data warehouses are becoming increasingly common as businesses have realized the need to mine the information they have stored in electronic form in order to gain valuable insight into the operation of their business and how best to improve it. Organizations need to monitor these processes, define policies, and, at a more strategic level, define the visions and goals that will move the company forward in the future. Operational transactional systems have greatly benefited the daily functioning of the enterprise. But now, organizations are shifting toward decision-support requirements for their computing platforms and are looking to build data warehouses. This is where OWB enters the picture to help organizations with the task of building that data warehouse.

Introduction to our fictional organization

The manuals that Oracle supplies with its database and applications contain a great deal of information. However, it can be hard to relate that information to the real-world ways of implementing the database and applications. Anyone who has ever tried to read a technical user guide or reference provided with a database or application will know what that means. It is a great benefit to be able to learn about a new software tool by seeing how that tool is actually used within the context of an actual organization conducting a business. This is precisely the focus of this book. We'll be building an actual data warehouse using a fictional organization as an example.

Before we talk about what a data warehouse is, let's get introduced to the fictional organization we'll be using to demonstrate the use of the Warehouse Builder to build a data warehouse. Throughout this book, we will illustrate the concepts involved by referring to a fictional, sales-oriented organization named ACME Toys and Gizmos. It is an entirely made-up organization, and any similarity to a real company is completely coincidental. This book will provide explanations throughout on how to use the OWB tool to build a data warehouse within the context of this invented company, which is involved in storefront and online Internet sales. Thus, it will demonstrate practical ways of implementing a data warehouse that can be directly applied in the real world.

ACME Toys and Gizmos will have stores all over the United States as well as a number of other countries, and will also have an online storefront for Internet sales. Online transaction processing (OLTP) systems play a huge role in the functioning of any business today, especially in the operation of a sales-oriented business. So this makes a good example to illustrate the subject matter of data warehousing and how to take information from those OLTP systems to load our warehouse.

Although we'll be using a sales organization for our examples, the concepts we'll discuss can apply to any business and will be as generic as possible to assist in doing that.

What is a data warehouse?

We've discussed the business case for implementing a data warehouse by showing how companies these days need information to support strategic-level decision making. We've also introduced the fictional organization that we'll use to provide examples of the concepts we'll be presenting. But we've not yet explained what a data warehouse is.

We will not be dealing in detail with the concept of a data warehouse as that topic would encompass the entire contents of a book by itself. There are a number of good books already written about that topic. Therefore, we will touch upon some high-level concepts only as an introduction and to provide a context for using OWB to build a data warehouse.

Fundamentally, a data warehouse is a decisional database system. It is designed to support the decision makers in the organization in ways a transactional processing system is ill-equipped to handle, such as the strategic-level goals and visions of an organization. To think strategically, a large amount of data over long periods of time is needed. Transactional systems are concerned with the day-to-day operations such as: How many dolls did we sell today and will we need to restock the inventory? How many orders were processed today? How many balls were shipped out today? The strategic thinkers are more concerned with questions such as: How many dolls did we sell today compared to the same time period in the last year? How has our inventory level been for the last few months?

To support that level of information, we need more data than what is provided by the day-to-day transactions. We'll need much more information compiled over greater time periods and this is where the data warehouse comes in. As a data warehouse is different from a transactional database, there are some unique terms used to describe the data it contains. There are also other techniques that should be employed for designing the database for a data warehouse, which would not be a good idea for a transactional database.

The data in a data warehouse is composed of facts (actual numerical measures) and dimensions (descriptive data about those measures) that place the facts in a context that is understandable to the end-user decision maker. For instance, a customer makes a purchase of a toy with ACME Toys and Gizmos on a particular day over the Internet, which results in a dollar amount of the transaction. The dollar amount becomes the fact and the toy purchased, the customer, and the location of the purchase (the Internet in this case) become the dimensions that provide a scope of the fact measurement and give it a meaning.
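The distinction between a fact and its dimensions can be sketched in a few lines of Python. This is only an illustration of the idea described above, not anything generated by OWB; the class names, field names, and sample values (`SaleFact`, `Product`, `Customer`, `Channel`, the 19.99 amount) are all assumptions made up for the example.

```python
from dataclasses import dataclass
from datetime import date


# Hypothetical dimension records: descriptive data that gives a measure its context.
@dataclass(frozen=True)
class Product:
    name: str
    category: str


@dataclass(frozen=True)
class Customer:
    name: str


@dataclass(frozen=True)
class Channel:
    name: str  # where the purchase happened, e.g. "Internet" or a store


# Hypothetical fact record: one numeric measure plus the dimensions that scope it.
@dataclass(frozen=True)
class SaleFact:
    amount_usd: float  # the fact itself (the dollar amount of the transaction)
    product: Product
    customer: Customer
    channel: Channel
    sale_date: date


def describe(fact: SaleFact) -> str:
    """Render the measure together with the dimensions that give it meaning."""
    return (
        f"{fact.amount_usd:.2f} USD for {fact.product.name} "
        f"via {fact.channel.name} on {fact.sale_date.isoformat()}"
    )


# Made-up sample purchase, mirroring the Internet toy sale in the text.
sale = SaleFact(
    amount_usd=19.99,
    product=Product("Toy Robot", "Toys"),
    customer=Customer("Pat Example"),
    channel=Channel("Internet"),
    sale_date=date(2009, 8, 1),
)

print(describe(sale))
```

On its own, 19.99 says very little; it only becomes useful to a decision maker once the product, customer, channel, and date are attached to it.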

The design of a data warehouse should be different from that of a transactional database. The data warehouse must handle large amounts of data, and must be simple for end users to query and understand. While relational techniques and normalization are excellent database design methods for transactional systems to ensure data integrity, they can make understanding a data warehouse difficult for the end users. They can also bog down a data warehouse with long-running queries that have to make use of many joins (queries that combine two or more tables sharing a common data element in order to look up additional data).

A much better means of representing the data is to de-normalize it, so that users will not have to be concerned with retrieving the data from multiple tables. The use of foreign keys (columns that reference a row in another table) should be restricted in a data warehouse. The outcome is a fact table with foreign keys only to each of the dimension tables. The diagram of the database structure has a fact table in the middle surrounded by dimension tables, resulting in something that looks like a star. Thus, the term star schema is used to refer to this representation of a data warehouse. It is also possible that these dimensions may themselves have other tables surrounding them, resulting in something akin to a snowflake. Thus, the term snowflake schema is also used. This is the dimensional modeling technique of representing a data warehouse.
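As a rough sketch of the star shape, the following Python script builds a tiny schema in an in-memory SQLite database (used here only because it needs no installation; the book itself targets an Oracle database). The table and column names (`sales_fact`, `product_dim`, `store_dim`, `date_dim`) are assumptions chosen for illustration and are not the objects OWB will generate.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical star schema: three dimension tables and one fact table.
conn.executescript(
    """
    CREATE TABLE product_dim (
        product_key INTEGER PRIMARY KEY,
        name        TEXT,
        category    TEXT
    );
    CREATE TABLE store_dim (
        store_key INTEGER PRIMARY KEY,
        name      TEXT,
        region    TEXT
    );
    CREATE TABLE date_dim (
        date_key      INTEGER PRIMARY KEY,
        calendar_date TEXT,
        quarter       TEXT
    );
    -- The fact table holds the measures and one foreign key per dimension.
    CREATE TABLE sales_fact (
        product_key INTEGER REFERENCES product_dim(product_key),
        store_key   INTEGER REFERENCES store_dim(store_key),
        date_key    INTEGER REFERENCES date_dim(date_key),
        quantity    INTEGER,
        amount      REAL
    );
    """
)

# Column 2 of each PRAGMA foreign_key_list row is the referenced table.
referenced = sorted(
    row[2] for row in conn.execute("PRAGMA foreign_key_list(sales_fact)")
)

# In a star (as opposed to a snowflake), the dimensions reference nothing else.
dimension_fks = {
    table: conn.execute(f"PRAGMA foreign_key_list({table})").fetchall()
    for table in ("product_dim", "store_dim", "date_dim")
}

print(referenced)
print(dimension_fks)
```

The fact table is the only table with foreign keys, and each one points at a dimension. Giving a dimension its own foreign key to a further lookup table would turn this star into the snowflake variant mentioned above.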

This design lends itself extremely well to the task of querying large amounts of data by the end users. Users do not have to be bothered with queries involving complicated joins with multiple tables to get the descriptive information they need. This is because the information is included directly in the dimension tables in a de-normalized fashion. If a manager for ACME Toys and Gizmos needs to know what products sold well in the last quarter, the query will only involve two tables—the main fact table containing the data on number of items sold and the product dimension table that contains all the information about the product. The de-normalization means the manager will not have to be concerned with looking up product information in any other tables, as all the details about the product will be included in the one dimension table.
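The manager's question can be sketched as a single join between the fact table and the product dimension. The example below again uses an in-memory SQLite database for convenience, and every table name, column name, and row is invented for illustration. To keep it short, it assumes the fact rows already cover only the quarter of interest, so no time filter is shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical, de-normalized product dimension and a minimal fact table.
conn.executescript(
    """
    CREATE TABLE product_dim (
        product_key INTEGER PRIMARY KEY,
        name        TEXT,
        category    TEXT,
        brand       TEXT
    );
    CREATE TABLE sales_fact (
        product_key INTEGER REFERENCES product_dim(product_key),
        quantity    INTEGER
    );
    INSERT INTO product_dim VALUES
        (1, 'Toy Robot', 'Toys',   'ACME'),
        (2, 'Gizmo Kit', 'Gizmos', 'ACME');
    INSERT INTO sales_fact VALUES (1, 5), (2, 2), (1, 7), (2, 1);
    """
)

# One join is enough: all descriptive product details live in product_dim.
top_sellers = conn.execute(
    """
    SELECT p.name, p.category, SUM(f.quantity) AS total_sold
    FROM sales_fact AS f
    JOIN product_dim AS p ON p.product_key = f.product_key
    GROUP BY p.product_key, p.name, p.category
    ORDER BY total_sold DESC
    """
).fetchall()

for name, category, total_sold in top_sellers:
    print(name, category, total_sold)
```

In a normalized transactional design, the category and brand would typically sit in separate tables, and the same report would need additional joins to reach them.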

All this is great background information on data warehouses, but you can read any number of other books for much more detailed material on the topic. Our purpose in this book is to introduce the Oracle Warehouse Builder and use it to design and build our first data warehouse. So, let's see how it fits in to this discussion of data warehousing.

Where does OWB fit in?

The Oracle Warehouse Builder is a tool provided by Oracle, which can be used at every stage of the implementation of a data warehouse, from initial design and creation of the table structure to the ETL process and data-quality auditing. So, the answer to the question of where it fits in is—everywhere. It is provided as a part of the Oracle Database Release 11g installation. For the previous Oracle Database Releases, it can be downloaded and installed from Oracle's web site as a free download.

We can choose to use any or all of the features as needed for our project, so we do not need to use every feature. Simple data warehouse implementations will use a subset of the features and as the data warehouse grows in complexity, the tool provides more features that can be implemented. It is flexible enough to provide us a number of options for implementing our data warehouse as we'll see in the remainder of the book.

Installation of the database and OWB

We'll be using the latest version of the database as of this writing—Oracle Database 11g Release 1—and the corresponding version of OWB that (as of this release) is included with the database install. If you have that version of the database installed already, you can skip this section and move right on to the next. If not, then keep reading as we discuss the installation of the database software.

Downloading the Oracle software

We can download the Oracle database software from Oracle's web site, provided we adhere to their license agreement. This agreement basically says we agree to use the database and the accompanying software either for development of a prototype of our application or for our own learning purposes. If we proceed to use this application internally or make it commercially available, then we will need to purchase a license from Oracle. For the purpose of working through the contents of this book to learn OWB, we need to install the database, which is covered under the license agreement for the free download.

We can find the database on the Oracle Technology Network web site (http://www.oracle.com/technology). The main database download is usually the first download listed under FEATURED DOWNLOADS on the main page. We need to register on the site, in order to create an account, before it lets us download any files, but there is no charge for that. The download files are classified by the platform on which they can be executed, so we'll choose the one for the system we'll be hosting the database on. We'll have to accept the license agreement first before the web page will let us download the file. The download files are anywhere from 1.7 GB to 2.3 GB in size, depending on the platform we'll be hosting it on. So we do not want to attempt this download unless we have a broadband Internet connection (that is, cable, DSL, and so on). We'll download the install file and unzip it to a folder on a drive with enough available space. The installation files are temporary and are not needed after the installation is done, so we'll be able to delete them to free up space if needed.

A word about hardware and operating systems

When installing software of this magnitude, we have to decide whether we'll have to buy additional hardware and a different operating system to run the database and OWB. OWB will run in the following databases:

Oracle Database 11g R1 Standard Edition
Oracle Database 11g R1 Enterprise Edition
Oracle Database 10g R2 Standard Edition
Oracle Database 10g R2 Enterprise Edition

This list is for the most recent version of OWB, which we'll be using throughout this book. We can download older versions of OWB that will run on older versions of the database, but we will not have the benefit of the improvements as in the latest version of the software. Much of what we'll be doing with the software throughout the course of the book can also be done on previous versions of the software. However, due to the changes made to things such as the interface, it would be easiest to follow along using the most recent version.

For this book, the platform is Windows Vista with Oracle Database 11g Release 1 (11.1.0.6) Enterprise Edition (which is the most recent version as of this writing), which is available from the download site. The Enterprise Edition of the database was chosen because it allows us to make full use of the features of the Warehouse Builder, especially in the area of dimensional modelling. There are some errors that will be generated by the client software when running in the Standard Edition installation due to code dependencies. These code dependencies are in libraries that are installed with the Enterprise Edition, but not the Standard Edition. We could use OWB with the Standard Edition, but then we would be limited in the type of objects we could deploy. For instance, dimensions and cubes would be problematic, and without using them we'd be missing out on a major functionality provided by the tool. If we want to develop any reasonably-sized data warehouse, the Enterprise Edition is the way to go.

Everything that we'll work through in this book was done on a laptop personal computer with an Intel Core 2 processor running at 1.67 GHz and 2 GB of RAM. Oracle says 1 GB of RAM will suffice, but it is always good to have more to provide better performance. Minimum specifications usually result in underpowered systems for all but the very basic processing. In terms of hard disk space, Oracle specifies that 4.5 GB is required for the basic database installation. We'll need about 2 GB just to save the installation files, so to make sure we have plenty of space, we should plan for something between 10 GB and 15 GB of available disk space just to be safe. We don't want to install the database software and then find that we don't have any space on our hard drive.
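
Before downloading, the free space on the target drive can be checked with a quick script. The following is a minimal sketch using Python's standard library; the function name and the 10 GB default are our own choices based on the planning figure above, not anything the Oracle installer provides.

```python
import shutil

BYTES_PER_GB = 1024 ** 3  # binary gigabytes, a close enough approximation here


def has_enough_space(path, required_gb=10.0):
    """Return (enough, free_gb) for the drive that contains `path`."""
    free_bytes = shutil.disk_usage(path).free
    return free_bytes >= required_gb * BYTES_PER_GB, free_bytes / BYTES_PER_GB


if __name__ == "__main__":
    # "." means the drive holding the current folder; pass "D:\\" and so on
    # to check a different drive on Windows.
    enough, free_gb = has_enough_space(".", required_gb=10.0)
    print(f"Free space: {free_gb:.1f} GB, enough for the install: {enough}")
```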

Oracle supports its database installed on Windows and Unix. For Windows, it supports Windows XP Professional or Windows Vista (Business Edition, Enterprise Edition, or Ultimate Edition) as well as Windows Server 2003. The system mentioned above that was used for writing this book and working through all the examples, is running Windows Vista Home Premium Edition with Service Pack 1 and the database installed runs on it. We certainly would not want to use this configuration for large production databases, but it works fine for simple databases and learning purposes. The installation program will first do a prerequisite check of the computer and will flag any problems that it sees, such as not enough memory or an incorrect version of the operating system. For working through this book on our own to learn about the Warehouse Builder, we should be OK as long as we are running XP or Vista. However, for business users who would be installing the Oracle Database and OWB for use at work using Windows, it would be a good idea to stick with the recommended configurations of Windows XP Professional, Windows Vista (Business Edition, Enterprise Edition, and Ultimate Edition), or Windows Server.

Tip

Server versus workstation

We don't have to use a computer that is configured as a server to host the Oracle database. It will get installed on a regular workstation as long as the minimum system requirements are met. However, we might encounter a minor issue. A workstation is usually configured to use Dynamic Host Configuration Protocol (DHCP) to obtain its IP address. This means the address is not specified as a fixed address and can change the next time the system boots up. The Oracle database requires a fixed address to be assigned, and it can install on a system with DHCP. But it will also require the Microsoft Loopback Adapter to be installed as the primary network interface to provide that fixed address. If this situation is encountered, the installer prerequisite checks will alert us to that and give us instructions on how to proceed. It will not harm our existing network configuration to install that option. That is the way the laptop mentioned above was configured for this book project.

Installing Oracle database software

So far we've decided what system we're going to host the database on, downloaded the appropriate install file for that system, and unzipped the install files into a folder to begin the installation. We'll navigate to that folder and run the setup.exe file located there. This will launch the Oracle Universal Installer program to begin the installation.

We are installing the full database, which now automatically includes the Warehouse Builder client and database components. If we had an older version of the database (10g R2 for example) that did not include the Warehouse Builder software, or if we wanted to run the client on a different workstation than where the database software is installed, then there is the option to install the Warehouse Builder by itself.

Note

A separately downloadable install for the standalone option is available at http://www.oracle.com/technology/software/products/warehouse/index.html. Skip ahead to the section titled Installing the OWB standalone software if just the Warehouse Builder software is needed.

One of the first questions the installer will ask us is about setting up our ORACLE_HOME—the destination to install the software on the system and the name of the home location. Oracle uses this information when running to determine where to find its files on the system. It will store the information in the registry on Windows. It will suggest a default name, which can be changed. We'll leave it set to the default—OraDb11g_home1.

The ORACLE_BASE and ORACLE_HOME locations will have suggested paths filled in for us. It is a good idea to leave the path names as they are and only change the drive designation if we'd like to install to a different hard drive. The install program will suggest a drive for the installation, but we might have a different preference.

Oracle recommends a convention for naming folders and files that they call the Optimal Flexible Architecture (OFA). This is described in Appendix B of the Oracle Database Installation Guide for Microsoft Windows, which can be found at the following URL: http://download.oracle.com/docs/cd/B28359_01/install.111/b32006/ofa.htm#CBBEDHEB. It is a good idea to follow their recommendations for standardization so that others who have to work with the database files will know where to find them, and to save us from problems with possible conflicts with other Oracle products we may have installed. If we keep the default folder locations intact and only change the drive letter, we will adhere to the standard. We'll be asked to choose our installation method and whether to install a starter database. We're not going to let it install a starter database for us because it's going to default to a transactional database and we want a data warehouse. So on the Select Installation Method screen, we'll check off the Basic Installation type and uncheck the box for installing a starter database. The Select Installation Method screen should look similar to the following:

Tip

Basic versus advanced install

The Basic Installation method is the quickest and easiest, but makes many decisions for us that the Advanced Installation option will ask us about. For the purpose of working through the examples in this book, we will be OK with the basic installation. But if we were installing for a production environment, we would want to read through the Oracle Database Installation Guide (http://www.oracle.com/technology/documentation/database.html; click on View Library to view the documentation online or click on Download to download the documentation) to familiarize ourselves with the various situations that would require us to use the advanced installation option. This would ensure that we don't end up with a database installation that will not support our needs.

We will click on the Next button to continue and the install program will perform its prerequisite checks to ensure our system is capable of running the database. That should show a status of Succeeded for all the checks. If any of the checks do not pass, we have to correct them and start over before continuing. When everything reports a status of Succeeded, we can click on the Next button.

The install screens will proceed to the Summary screen where we can verify the options that we selected for the installation before actually doing it. So here we can make any last minute changes.

If we expand the New Installations entry, it will list all of the database products and features that will be installed. This includes the feature we are most interested in, the Oracle 11g Warehouse Builder Server option, which is included automatically in 11g database installations. The following image illustrates what will appear in the list for OWB and the option we are interested in is circled:

Now that we've specified our installation method and verified the options and components to be installed, we will click on the Install button to proceed with the installation. We'll be presented with the progress screen as it performs the installation with a progress bar that proceeds to the right as it installs.

Tip

Location of install results

A good idea is to pay particular attention to a message at the bottom of the install progress screen, which tells us where we can find a log of the installation. The logs that the installer keeps are stored in the Oracle folder on the system drive in the following subfolder: C:\Program Files\Oracle\Inventory\logs. The files are named with the following convention: installActionsYYYY-MM-DD_HH-MI-SSPM where YYYY is the year, MM the month, DD the day, HH the hour, MI the minutes, SS the seconds of the time the installation was performed, and PM is either AM or PM. The files will have a .log extension. This information may come in useful later to see just what products were installed. The folder also will contain any errors encountered during the installation in files with a file extension of .err and any output generated by the installer in files with a file extension of .out.
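
The naming convention can be turned back into a timestamp when sorting through several install attempts. The sketch below assumes the file name starts with installActions with no space, that the hour is on a 12-hour clock, and that the .err and .out files share the same name stem as the .log file; all three are assumptions on our part. The helper name is hypothetical.

```python
import re
from datetime import datetime

# Assumed pattern: installActionsYYYY-MM-DD_HH-MI-SSPM.<log|err|out>
LOG_NAME = re.compile(
    r"^installActions(\d{4}-\d{2}-\d{2}_\d{2}-\d{2}-\d{2}[AP]M)\.(log|err|out)$"
)


def parse_install_log_name(filename):
    """Return (timestamp, extension) for an installer file, or None if no match."""
    match = LOG_NAME.match(filename)
    if match is None:
        return None
    # %I is the 12-hour clock hour and %p is the AM/PM marker.
    stamp = datetime.strptime(match.group(1), "%Y-%m-%d_%I-%M-%S%p")
    return stamp, match.group(2)


if __name__ == "__main__":
    print(parse_install_log_name("installActions2009-03-14_02-05-09PM.log"))
```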

When it completes we'll be presented with the final screen with a little reminder similar to the following where bob is the login name of the user running the installation:

This is important information because our database could be rendered inoperable if files are deleted. Now that it's installed, it's time to proceed with creating a database. But there is one step we have to do first—we need to configure the listener.

Configuring the listener

The listener is the utility that runs constantly in the background on the database server, listening for client connection requests to the database and handling them. It can be installed either before or after the creation of a database, but there is one option during the database creation that requires the listener to be configured—so we'll configure it now, before we create the database.

Run Net Configuration Assistant to configure the listener. It is available under the Oracle menu on the Windows Start menu as shown in the following image:

The welcome screen will offer us four tasks that we can perform with this assistant. We'll select the first one to configure the listener, as shown here:

The next screen will ask us what we want to do with the listener. The four options are as follows:

Add
Reconfigure
Delete
Rename

Only the Add option will be available since we are installing Oracle for the first time. The remainder of the options will be grayed out and will be unavailable for selection. If they are not, then there is a listener already configured and we can proceed to the next section—Creating the database.

For those of us installing for the first time on our machines, we need to proceed with the configuration. The next screen will ask us what we want to name the listener. It will have LISTENER entered by default and that's a fine name, which states exactly what it is, so let's leave it at that and proceed.

The next screen is the protocol selection screen. It will have TCP already selected for us, which is what most installations will require. This is the standard communications protocol in use on the Internet and in most local networks. Leave that selected and proceed to the next screen to select the port number to use. The default port number is 1521, which is standard for communicating with Oracle databases and is the one most familiar to anyone who has ever worked with an Oracle database. So, change it only if you want to annoy the Oracle people in your organization who have all memorized the default Oracle port of 1521.

Tip

To change or not change the default listener port

Putting aside the annoyance the Oracle people might have to suffer, there are valid security reasons why we might want to change that port number. Since it is so common, the people accustomed to working with the Oracle database aren't the only people who know that port number. Hackers looking to break into an Oracle database are going to go straight for that port number, so if we change it to something obscure, the database will be harder to find on the network for people with malicious intent. If it does get changed, be sure to make a note of the assigned number.

There also may be firewall issues that allow only certain port numbers to be open through the firewall, which means communication on any of the other port numbers would be blocked. 1521 might be allowed by default since it is common for the Oracle database. It would be a good idea to check with the network support personnel to get their recommendation.
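
Whether a listener port is reachable from a given machine, through any firewall in between, can be checked with a plain TCP connection attempt. This is a minimal sketch using Python's standard socket module; the function name is our own, and a successful connection only shows that something is accepting connections on that port, not that it is an Oracle listener.

```python
import socket


def port_is_open(host, port=1521, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # 1521 is the default listener port discussed above; change it here
    # if a different port was assigned during listener configuration.
    print(port_is_open("localhost", 1521))
```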

That is the last step. It will ask us if we want to configure another listener. Since we only need one, we'll answer "no" and finish out the screens by clicking on the Finish button back on the main screen.

Creating the database

So far we have the Oracle software installed and a listener configured, but we have not created a database. We chose not to install the starter database because that defaults to a general purpose transactional database, and we want one that is oriented toward a data warehouse.

We can install a new database using Database Configuration Assistant, which Oracle provides to walk us step-by-step through the process of creating a database. It is launched from the Windows Start menu as shown in the following image:

Running this application may require patience as we have to wait for the application to load after it's selected. Depending on the system it is running on, it can take over a minute to display, during which there is no indication that anything is happening. It may be tempting to just select it again from the Start menu because it appears it didn't work the first time, but don't as that will just end up running two instances of the program. It will appear soon. The following are steps in the creation process:

The first step is to specify what action to take. Since we do not have a database created, we'll select the Create a Database option in Step 1. If there was a database already created, the options for configuring a database or deleting a database would be selectable. The Database Configuration Assistant can also manage templates, which are files containing preset options for various database configurations. Pre-supplied templates are provided with the application, and the application has the ability to custom-build templates.

Automatic Storage Management can be configured as well. It is Oracle's feature for automatically managing the layout and storage of database files on the system. These are both topics for a more advanced book on the Oracle Database. We will be creating a database using an existing template.

This step will offer the following three options for a database template to select:
General Purpose or Transaction Processing
Custom Database
Data Warehouse

We are going to choose the Data Warehouse option for our purposes. If we already had a database installed that we wanted to use for learning OWB, but that's not configured as a data warehouse, it's not a problem. We can still run OWB hosted on it and create the data warehouse schema (database user and tables), which we'll be creating as we proceed through the book. This would be fine for learning purposes, but for production-ready data warehouses a database configured specifically as a data warehouse should be used.

This step of the database creation will ask for a database name. The name of the database must be one to eight characters in length. Any more than that will generate an error when trying to proceed to the next screen. This is an Oracle database limitation. The database name can also include the network domain name of the domain of the host it is running on, to further uniquely identify it. Follow the name with a period and then the domain, which itself can include additional periods.

If this database is being created for business use, a good naming scheme would reflect the purpose of the database. Since we're creating this database for the data warehouse of ACME Toys and Gizmos Company, we'll choose a name that reflects this—ACME for the company name and DW for data warehouse, resulting in a database name of ACMEDW. It is important to remember this name as it will be a part of any future connections to the database.

As the database name is typed in, the SID (or Oracle System Identifier) is automatically filled in to match it. If the domain is added to the database name, the SID will stop pre-populating after the first period is entered. The end result is that the SID becomes the same as the first part of the database name.
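
The naming rules from the last few paragraphs can be summarized in a small helper. This is only a sketch of the rules as described here (a name part of one to eight characters, an optional domain after the first period, and an SID that matches the first part); the function name is hypothetical, and the real Database Configuration Assistant may apply further checks that are not modeled.

```python
def split_database_name(full_name):
    """Split a database name into (sid, domain) using the rules described above.

    The part before the first period is the database name proper and
    becomes the SID; anything after it is treated as the network domain.
    """
    name, _, domain = full_name.partition(".")
    if not 1 <= len(name) <= 8:
        raise ValueError(
            f"database name {name!r} must be one to eight characters long"
        )
    return name, domain


if __name__ == "__main__":
    print(split_database_name("ACMEDW"))              # no domain given
    print(split_database_name("ACMEDW.example.com"))  # hypothetical domain
```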

This step of the database creation process asks whether we want to configure Enterprise Manager. The box is checked by default, and we'll leave it as is. This is a web-based utility Oracle provides for controlling a database, and as it is very useful to have, we will want to enable it. There are two options for controlling a database: registering with Grid Control or local management. Grid Control is Oracle's centralized feature for controlling a grid, a network of loosely coupled modular hardware and software components that can be joined and rejoined together on demand to meet business needs. That is what the "g" in Oracle Database 11g stands for. If your network is not configured in a grid architecture, or you are installing on a standalone machine, then choose the local management option. The assistant will automatically detect a Grid Control agent that is running locally, and if it doesn't find one, the Grid Control option will be grayed out anyway. In that case, you will only be able to select local management.

When the Next button is clicked, the following message may appear:

That means a listener was not configured before creating the database. If this happens, we'll have to just pause our database creation and go back to the previous section about installing the listener and then come right back to this spot. There is no need to exit out of the database install window while doing this; just leave it on step 4. When we've completed the listener configuration, this screen will allow us to proceed to the next screen without that warning popping up again.

On this screen (step 5) we can set the database passwords on the system accounts using a different one for each account, or by choosing one password for all four. We're going to set a single password on all four, but for added security in a production environment, it is a good idea to make a different password for each. Click on the option to Use the Same Administrative Password for All Accounts and enter a password. This is very important to remember as these are key system accounts used for database administrative control.

This step is about storage. We'll leave it at the default of File System for storage management. The other two options are for more advanced installations that have greater storage needs.

This step is for specifying the locations where database files are to be created. This can be left at the default for simplicity (which uses the locations specified in the template and follows the OFA standard for naming folders described above). A storage screen will come up where we'll be able to change the actual file locations if we want, for all but the Oracle-Managed Files option.

Note

The Oracle-Managed Files option is provided by the database so that we can let Oracle automatically name and locate our data files. A folder location is specified on the step 7 screen, which will become the default location for any files created using this option. This is why we won't be able to change any file locations later on during the installation if this option is chosen. However, files can still be created with explicit names and locations after the database is running.

The next screen is for configuring recovery options. We're up to step 8 now. If we were installing a production database, we would want to make sure to use the Flash Recovery option and to Enable Archiving. Flash Recovery is a feature Oracle has implemented in its database to provide a location that is managed by the database. It stores backups and files needed to recover a database in the event of disk failure. With Flash Recovery Area specified, we can recover data that would otherwise be lost in a system failure.

Enabling archiving turns on the archive log mode of the database, which causes it to archive the redo logs (files containing information that is used by the database to recover transactions in the event of a failure.) Having redo logs archived means you can recover your database up to the time of the failure, and not just up to the time of the last backup.

These recovery options will consume more disk space, but will provide a recovery option in the event of a failure. Each individual will have to make the call for their particular situation whether that is needed or not.

We'll specify Flash Recovery and for simplicity, we will just leave the default for size and location. We will not enable archiving at this point. These options can always be modified after the database is running, so this is not the last chance to set them.

This step is where we can have the installation program create some sample schemas in the database for our reference, and specify any custom scripts to run. The text on the screen can be read to decide whether they are needed or not. We don't need either of these for this book, so it doesn't matter which option we choose.

The next screen is for Initialization Parameters. These are the settings that are put in place to define various options for the database such as Memory options. There are over 200 different parameters and to go through all of them would take much more time and space than we have here. There is no need for that at this point as there are about 28 parameters that Oracle says are basic parameters that every database installation should set. We're just going to leave the defaults set on this screen, which will set the basic parameters for us based on the amount of memory and disk space detected on our machine. We'll just move on from here. Once again, these can all be adjusted later after the database is created and running if we need to make changes.

The next screen is for security settings. For the purposes of this book and its examples, we'll check the box to Revert to pre-11g security settings since we don't need the additional features. However, for a production environment, it is a good idea to leave the default checked to use Oracle's more advanced security features.

This step is automatic maintenance and we'll deselect that option and move on, since we don't need that additional functionality. Automatic Maintenance Tasks are tasks that run in predefined maintenance windows of time to perform various preconfigured maintenance operations on the database. Since the database for this book is only for learning purposes, it is not critical that these maintenance tasks be done automatically.

Automatic maintenance is designed to run during preset maintenance windows, which are usually in the middle of the night. So if the database system is shut down every day, there wouldn't be a good window to run the tasks on regularly anyway. If installing in a production environment with servers that will be running 24 hours a day every day, then consider setting up the automatic maintenance to occur. Oracle provides three pre-configured maintenance tasks to choose from—collecting statistics for the query optimizer (for improving performance of SQL queries), Automatic Segment Advisor for analyzing storage space for areas that can possibly be reclaimed for use, and the Automatic SQL Tuning Advisor for automatically analyzing SQL statements for performance improvements.

The next step (step 13 of 14) is the Database Storage screen referred to earlier. Here the locations of the pre-built data and control files can be changed if needed. They should be left set to the default for simplicity since this won't be a production database. For a production environment, we would want to consider storing datafiles on separate partitions for performance reasons, and to minimize the impact of disk failures on the running database if something goes wrong. If all the datafiles are on one drive and it goes bad, then the whole database is down.

The final step has the following three options, and any or all can be selected for creating the database:
Create the database directly
Save the creation options as a template for later use
Save database creation scripts that can be used later to create the database

We'll leave the first checkbox checked to go ahead and create the database.

The Next button is grayed out since this is the last screen. So click on the Finish