Table of Contents

IBM InfoSphere Replication Server and Data Event Publisher
Credits
About the Author
About the Reviewer
Preface
What this book covers
What you need for this book
Who this book is for
Conventions
Reader feedback
Customer support
Errata
Piracy
Questions
1. Q Replication Overview
Why do we want to replicate data
Overview of what is available today
The different replication options
Replication toolset
The Replication Center GUI
The ASNCLP command interface
Q replication constituent components
The different types of Q replication
Unidirectional replication
Replicating to a stored procedure
Bidirectional replication
Peer-to-peer replication
Tree replication
Replicating to a Consistent Change Data table
Event Publishing
DB2 replication sources
Replicating XML data types
Replicating compressed tables
Replicating large objects
Other DB2 objects
Q replication filtering and transformations
Filtering rows/columns
Before and After SQL—alternatives
Stored procedure processing
Q replication conflict detection
What is conflict detection?
When do conflicts occur?
Q replication and HADR
Q replication in a DPF environment
Tables with referential integrity
Table load and insert considerations
Summary
2. Q Replication Components
The DB2 database layer
Database/table/column name compatibility
The WebSphere MQ layer
The Q replication layer
What is a logical table?
What is a Replication/Publication Queue Map?
What is a Q subscription?
What is a subscription group?
Q subscription activation
The relationship between the components
The Q Capture and Q Apply programs
Q Capture internals
Q Apply internals
How do Q Capture and Q Apply communicate?
Summary
3. The DB2 Database Layer
Database creation
The control tables
The Q Capture control tables
The Q Apply control tables
The Replication Alert Monitor control tables
Where do the control tables go
DB2 to DB2 replication
DB2 to non-DB2 replication
Non-DB2 to DB2 replication
How are the Q replication control tables populated
Pruning of the control tables
The IBMQREP_SIGNAL control table
Sending signals using the IBMQREP_SIGNAL table
Sending signals using an INSERT statement
Sending signals using ASNCLP
Using a signal to determine where Q Capture is up to
Restricting access to IBMQREP_SIGNAL
Summary
4. WebSphere MQ for the DBA
An introduction to MQ
MQ queues
MQ queue naming standards
MQ queues required for different scenarios
WebSphere MQ commands
Create/start/stop a Queue Manager
Starting a Queue Manager
Checking that the Queue Manager is running
Stopping a Queue Manager
Deleting a Queue Manager
The Queue Manager configuration file
MQ logging
Issuing commands to a Queue Manager (runmqsc)
Displaying the attributes of a Queue Manager
Changing the attributes of a Queue Manager
MQ Listener management
Defining/Starting an MQ Listener
Displaying an MQ Listener
Stopping an MQ Listener
MQ Channel management
To define a Channel
To start a Channel
To display a list of Channels
To display the status of a Channel
To stop a Channel
MQ Queue management
To define a Local Queue
To display the attributes of a Local Queue
To alter the attributes of a Queue
To empty a Local Queue
To delete a Local Queue
To define a Remote Queue
To define a Model Queue
To define a Transmission Queue
To list Queues
WebSphere MQ sample programs—server
To put a message onto a Queue (amqsput)
To retrieve a message from a Queue (amqsget)
To browse a message on a Queue
WebSphere MQ sample programs—client
Dead Letter Queue handler (runmqdlq)
WebSphere MQ message format
MQ error messages
Summary
5. The ASNCLP Command Interface
The ASNCLP environment
The ASNCLP commands
Setting up the administration environment
Setting the environment session
Comments in an ASNCLP script
Possible header lines in a script
Common Q replication tasks
Creating or dropping Q Capture control tables on DB2A
Creating or dropping Q Apply control tables on DB2B
Creating Q Capture and Q Apply control tables in the same database
Queue Map maintenance
Creating a Replication Queue Map
Creating a Publication Queue Map
Dropping a Queue Map
Altering a Replication Queue Map
Creating Q subscriptions and Publications
Q subscription for unidirectional replication
Q subscription for bidirectional replication
Q subscription for P2P two-way replication
Q subscription for P2P three-way replication
Publication for Event Publishing
Q subscription maintenance
Checking the state of a Q subscription
Stopping a Q subscription
Dropping a Q subscription
Altering a Q subscription
Starting a Q subscription
Sending a signal using ASNCLP
Validating the WebSphere MQ environment
Validating WSMQ for the Capture schema
Validating WSMQ for the Apply schema
Validating a Replication Queue Map
Validating a Publication Queue Map
Validating a Q subscription
Validation error messages
Summary
6. Administration Tasks
Defining the MQ queues
Create/drop the Q replication control tables
Create/drop the Q Capture control tables
Create/drop the Q Apply control tables
Registering a table for Q replication
Managing Queue Maps
Creating a Queue Map
Altering a Replication Queue Map
Drop/delete a Queue Map
Listing the RQM for a Receive Queue
Q subscription maintenance
Creating a Q subscription
Altering a Q subscription
Dropping a Q subscription
Reinitializing a Q subscription
Checking the status of a Q subscription
Stopping a Q subscription
Determining when a Q subscription became inactive
Listing the attributes of a Q subscription
Listing all Q subscriptions using a RQM
Specifying a table as the initial load source
Source table maintenance
Adding a column to a Q subscription
Removing a column from a replicated source table
Altering the column attributes of a replicated source table
Performing a reorganization on the source table
Collecting statistics on the source table
Performing a load on the source table
Importing data into the source table
Adding a new source table to Q replication
Stop replicating (remove) a table
Administrative commands/tasks
Viewing messages using asnqmfmt
Retrieving Q Capture Restart Queue information
Q Capture and Q Apply administration
Q Capture administration
Starting Q Capture
Stopping Q Capture
Querying the status of Q Capture
Altering a running Q Capture
Starting Q Capture from a point in the DB2 log
Starting Q Capture without triggering a load
Taking a Q Capture trace (asntrc)
Q Apply administration
Starting Q Apply
Stopping Q Apply
Querying the status of Q Apply
Starting a Receive Queue
The password file
Copying (promoting) Q replication environments
The ASNCLP PROMOTE procedure
Summary
7. Monitoring and Reporting
The database layer
The WebSphere MQ layer
Checking that the Queue Managers are running
Checking the state of the Listeners
Checking the state of the Channels
Checking the state of the Receive Queue
Checking that the Q subscription is active
The Q replication layer
Monitoring Q Capture start up
Monitoring Q Apply start up
Checking that Q Capture and Q Apply are active
Checking the Q Capture and Q Apply log files
Checking the APPLYTRACE and CAPTRACE tables
How far is Q Capture behind the DB2 log
How far is Q Apply behind Q Capture
Listing Q subscription status
Listing Receive Queue status
Table synchronization
The different latencies
The base monitor tables
The Q Capture tables
The Q Apply tables
Collection of data for historical analysis
Historical monitoring of Q Capture
Historical monitoring of Q Apply
To determine the row throughput
Manual monitoring
Monitoring using the Replication Alert Monitor
Q Capture alert conditions
Q Apply alert conditions
Creating the RAM control tables
Setting up e-mail notification
Monitoring Q Capture
Monitoring Q Apply
Starting the Replication Alert Monitor
Monitor management
Checking which monitors are active
Changing or reinitializing a monitor
Stopping a monitor
Suspending or resuming a monitor
The ibmsnap_alerts table
Other tools available to monitor Q replication
The database layer
Optim Data Studio
The DB2 Health Center
The WebSphere MQ layer
The WebSphere MQ Explorer
The WebSphere MQSC interface (runmqsc)
The rfhutil utility
The Q replication layer
The Replication Dashboard
Tivoli Monitoring
The asnqanalyze command
Some what happens if ... scenarios
If MQ is stopped on each server
If the Receive Queue is stopped
If Q Apply is not running
If the Q Apply Queue Manager is not running
If the Receive Queue fills up
If the Q Apply Dead Letter Queue fills up
If a Dead Letter Queue has not been defined
What happens if—summary diagram
Q replication performance considerations
The DB2 database layer
The WebSphere MQ layer
Q Capture
Q Apply
Some error messages
Q Capture: ASN0569E on starting
Q Capture: ASN7094E
Q Apply: hangs when starting
How to handle an ASN7551E message
Q Apply: ASN7094E
Q Apply: ASN7505E
Summary
Index

IBM InfoSphere Replication Server and Data Event Publisher

IBM InfoSphere Replication Server and Data Event Publisher

Copyright © 2010 Packt Publishing

All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.

Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author, nor Packt Publishing, and its dealers and distributors will be held liable for any damages caused or alleged to be caused directly or indirectly by this book.

Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.

First published: August 2010

Production Reference: 1120810

Published by Packt Publishing Ltd.

32 Lincoln Road

Olton

Birmingham, B27 6PA, UK.

ISBN 978-1-849681-54-4

www.packtpub.com

Cover Image by Sandeep Babu (<[email protected]>)

Credits

Author

Pav Kumar-Chatterjee

Reviewer

Rich Briddell

Acquisition Editor

Rashmi Phadnis

Development Editor

Akash Johari

Technical Editors

Ishita Dhabalia

Aditi Suvarna

Indexer

Monica Ajmera Mehta

Editorial Team Leader

Gagandeep Singh

Project Team Leader

Lata Basantani

Project Coordinator

Poorvi Nair

Proofreader

Lesley Harrison

Graphics

Nilesh Mohite

Production Coordinator

Alwin Roy

Cover Work

Alwin Roy

About the Author

Pav Kumar-Chatterjee (Eur Ing, CEng, MBCS) has been involved in DB2 support on the mainframe platform since 1991, and on midrange platforms since 2000. Before joining IBM, he worked as a database administrator in the airline industry and at various financial institutions in the UK and Europe. He has held various positions during his time at IBM, including roles in the Software Business Services team and the global BetaWorks organization. He is currently a DB2 technical specialist in the Software Business. He has been involved with Information Integrator (the forerunner of Replication Server) since its inception, has helped numerous customers design and implement Q replication solutions, and has spoken about Q replication at various conferences.

Pav Kumar-Chatterjee has co-authored the DB2 pureXML Cookbook, published in August 2009.

This book would not have been possible without the help that the following people have given us over the years:

Anuradha I Pariti (IBM US), Beth Hamel (IBM US), David Tolleson (IBM US), Jayanti Mahapatra (IBM US), Kevin Lau (IBM US), Neale Armstrong (IBM UK), Nikola Slavicic (IBM Slovenia), Ray Houle (Canada), Sean Byrd (IBM US).

Any mistakes or omissions in this book are the sole responsibility of the author.

About the Reviewer

Rich Briddell is a Senior Managing Consultant with IBM Software Group, Replication Center of Competency, specializing in Q Replication on z/OS and distributed platforms. He works in the St. Louis, MO area and has 19 years of experience in the DBMS field. He has been an application development team lead and an independent consultant. He holds a Master of Arts degree in Operations Management from the University of Arkansas and a Master of Science in Computer Resources and Information Management from Webster University. He is an IBM Certified Solutions Expert for Informix Dynamic Server V10, V9, and V7, an IBM Certified Advanced Database Administrator for DB2 UDB database systems, an IBM Certified Application Developer for the DB2 UDB Family, and an IBM Certified Specialist for WebSphere Application Server. He is co-author of the IBM Redbook DB2 Information Integrator Q Replication: Fast Track Implementation Scenarios.

I would like to dedicate this book to Carrie and Scott—you never know where life’s journey will take you, but you will always be in the thoughts of those who love you

Preface

Business planning is no longer just about defining goals, analyzing critical issues, and then creating strategies. You must aid business integration by linking changed-data events in DB2 databases on Linux, UNIX, and Windows with EAI solutions, message brokers, data transformation tools, and more.

This book will accompany you throughout your Q replication journey. Compiled from many of the author's successful projects, it brings you some of the best practices to help you implement your project smoothly and on schedule. The book has in-depth coverage of Event Publisher, which publishes changed-data events that can feed updated data to crucial applications, assisting your business integration processes. Event Publisher also eliminates the hand coding typically required to detect DB2 data changes made by operational applications.

In this book, we start with a brief discussion of what replication is and the Q replication offerings currently available. We then go on to explore the world of Q replication in more depth. The later chapters cover all the Q replication components and then discuss the different layers that need to be implemented—the DB2 database layer, the WebSphere MQ layer, and the Q replication layer. We conclude with a chapter on how to troubleshoot a problem. The Appendix (available online) demonstrates the implementation of 13 Q replication scenarios with step-by-step instructions.

What this book covers

Chapter 1, Q Replication Overview, describes why we want to replicate data and what is available today in the IBM world of data replication. It introduces the architecture of Q replication and different types of Q replication available and discusses various DB2 replication sources including XML data and compressed data, and looks at filtering and transformations.

Chapter 2, Q Replication Components, discusses the three layers that make up a Q replication solution (the DB2 database layer, the WebSphere MQ layer, and the Q replication layer) and shows the relationship between Replication/Publication Queue Maps, Q subscriptions, and subscription groups.

Chapter 3, The DB2 Database Layer, looks at creating the databases used in Q replication, the Q replication control tables, and their structure.

Chapter 4, WebSphere MQ for the DBA, illustrates the working and setup of WebSphere MQ.

Chapter 5, The ASNCLP Command Interface, illustrates the working and setup of the ASNCLP Command Interface, guides you through some of the Q replication setup tasks, and shows you how to perform them using ASNCLP commands.

Chapter 6, Administration Tasks, focuses on the administrative tasks that we need to perform to set up and administer a Q replication environment.

Chapter 7, Monitoring and Reporting, looks at monitoring and reporting on the Q replication setup. It also describes the Replication Alert Monitor and how to use monitors.

Appendix A, Setup Procedures: Steps to Follow, describes the tools available to set up Q replication, goes through various scenarios, and gives step-by-step instructions. This can be downloaded from http://www.packtpub.com/sites/default/files/downloads/1544_Appendix.zip

What you need for this book

In the course of this book, you will need the following software utilities to try out various code examples listed:

InfoSphere Replication Server 9.7.1
WebSphere MQ V6.0 or V7.0

Who this book is for

If you are a professional who needs to set up and administer a Q replication or Event Publishing environment, then this is the book you need. The book will give you a clear understanding of how to implement Q replication. The examples are based on a Linux, UNIX, or Windows operating system, but the principles are equally applicable to Q replication on z/OS.

Conventions

In this book, you will find a number of styles of text that distinguish between different kinds of information. Here are some examples of these styles, and an explanation of their meaning.

Code words in text are shown as follows: "A target database is created using the DB2 script file as shown next."

A block of code is set as follows:

CONNECT RESET;
DROP DB db2b;
CREATE DB db2b;
UPDATE DB CFG FOR db2b USING logarchmeth1 disk:c:\temp;
BACKUP DATABASE db2b TO c:\temp;

When we wish to draw your attention to a particular part of a command-line block, the relevant lines or item(s) are set in bold:

SUBNAME    S STATE_TIME
---------- - --------------------------
T10001     A 2006-02-22-18.32.22.112000
T10002     I 2006-02-22-18.22.14.468000
T10003     I 2006-02-22-18.22.14.478002

Any command-line input or output is written as follows:

$ db2 "select substr(subname,1,10) as subname, state as S, state_time from asn.ibmqrep_subs"

New terms and important words are shown in bold. Words that you see on the screen, in menus or dialog boxes for example, appear in the text like this: "There is also a Sample entries button, which will give us some default queue names."

Note

Warnings or important notes appear in a box like this.

Tip

Tips and tricks appear like this.

Reader feedback

Feedback from our readers is always welcome. Let us know what you think about this book—what you liked or may have disliked. Reader feedback is important for us to develop titles that you really get the most out of.

To send us general feedback, simply send an e-mail to <[email protected]>, and mention the book title via the subject of your message.

If there is a book that you need and would like to see us publish, please send us a note in the SUGGEST A TITLE form on www.packtpub.com or e-mail <[email protected]>.

If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, see our author guide on www.packtpub.com/authors.

Customer support

Now that you are the proud owner of a Packt book, we have a number of things to help you to get the most from your purchase.

Tip

Downloading the example code for this book

You can download the example code files for all Packt books you have purchased from your account at http://www.PacktPub.com. If you purchased this book elsewhere, you can visit http://www.PacktPub.com/support and register to have the files e-mailed directly to you.

Errata

Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you find a mistake in one of our books—maybe a mistake in the text or the code—we would be grateful if you would report this to us. By doing so, you can save other readers from frustration and help us improve subsequent versions of this book. If you find any errata, please report them by visiting http://www.packtpub.com/support, selecting your book, clicking on the errata submission form link, and entering the details of your errata. Once your errata are verified, your submission will be accepted and the errata will be uploaded on our website, or added to any list of existing errata, under the Errata section of that title. Any existing errata can be viewed by selecting your title from http://www.packtpub.com/support.

Piracy

Piracy of copyright material on the Internet is an ongoing problem across all media. At Packt, we take the protection of our copyright and licenses very seriously. If you come across any illegal copies of our works, in any form, on the Internet, please provide us with the location address or website name immediately so that we can pursue a remedy.

Please contact us at <[email protected]> with a link to the suspected pirated material.

We appreciate your help in protecting our authors, and our ability to bring you valuable content.

Questions

You can contact us at <[email protected]> if you are having a problem with any aspect of the book, and we will do our best to address it.

Chapter 1. Q Replication Overview

Welcome to the start of your journey along the Q replication road. Any journey can be a bumpy ride, but after reading this book and going through the numerous examples, your journey will be a smoother one! In this first chapter, we will take you through the following discussion points:

Why we want to replicate data.
What is available today in the IBM world of data replication.
The toolsets available to set up and administer a replication environment, and the code that we need to install for a functioning Q replication solution.
The architecture of Q replication. We look at the different types of replication available, namely the base replication methods of unidirectional, bidirectional, and peer-to-peer, and the replication architectures built on these base methods.
Replicating XML data types and compressed tables. We look at some of the design points to consider when replicating compressed tables.
Q replication conflict detection.
The transformation processing available for both regular and XML data.

Why do we want to replicate data

Much has been written about why we need to replicate data, so we will keep this short. What's wrong with just storing our data in one place? Well, in today's 24x7 world, where being without data for even a short period of time could be catastrophic to our business, we need a way to take a copy of our data (possibly more than one copy) and store it securely in a different location. This copy should be complete and stored as many miles away as possible. Also, the amount of data to be stored is ever increasing and is being generated at a fast rate, so our method needs to be able to handle large volumes of data very quickly.

Overview of what is available today

In the IBM software world today, there are a number of options available to replicate data:

InfoSphere (formerly WebSphere) Replication Server
InfoSphere CDC (formerly the DataMirror suite of programs)
The DB2 High Availability Disaster Recovery (HADR) functionality
Traditional log shipping

In this book, we will cover the first option, InfoSphere Replication Server, which from now on we will refer to as DB2 replication. The other options are outside the scope of this book.

The different replication options

In the world of DB2 replication, we have two main options—SQL replication and Q replication, both of which involve replicating between source and target tables. Event publishing is a subset of Q replication, in that the target is not a table but a WebSphere MQ queue. The choice of replication solution depends on a number of factors, of which the fundamental ones are:

Type of source
Type of target
Operating system support

The DB2 Information Center contains a table, which compares the three types of replication. This table can be used as a quick checklist for determining the best solution to a given business requirement (http://publib.boulder.ibm.com/infocenter/db2luw/v9r7/topic/com.ibm.swg.im.iis.db.repl.intro.doc/topics/iiyrcintrsbsc.html).

The following figure shows the basic operations of SQL replication:

Updates to any tables are logged, and if the table is a registered table (TAB1 and TAB2), then the SQL Capture program (Capture for short) reads the information from the DB2 log and inserts the row into a change data table (CD_<table-name>)—there is one of these for each registered source table. The SQL Apply program (Apply for short) reads from these change data tables and updates the target tables (TAB3 and TAB4).

In Q replication, we do not have the concept of change data tables, as shown in the following figure:

Any updates to registered tables, which the Q Capture program (Q Capture for short) detects are put onto a WebSphere MQ queue. The Q Apply program (Q Apply for short) then reads from these queues and updates the target tables.

In Event Publishing, there is no Q Apply and no target tables as shown in the following diagram:

Q Capture puts messages onto the WebSphere MQ queues, and it is up to the consuming applications to read from these queues, either destructively or non-destructively.
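For example, while testing an Event Publishing setup, the published messages can be inspected with one of the WebSphere MQ sample programs covered in Chapter 4, WebSphere MQ for the DBA. In the following sketch, the queue name PUBQ1 and Queue Manager name QM1 are illustrative, and the default Linux installation path for the samples is assumed:

/opt/mqm/samp/bin/amqsbcg PUBQ1 QM1

The amqsbcg program browses the queue non-destructively and prints the message descriptor and body of each message it finds.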

Replication toolset

We have three ways of administering a replication environment. We can use:

The Replication Center GUI
The ASNCLP command interface
Standard SQL

We recommend that, when you are new to replication, you use the Replication Center and, once you are confident with the process, progress to the ASNCLP interface. For defining production systems, we recommend using the ASNCLP interface, because the commands can be scripted.

The ASNCLP interface generates SQL, which is run against the appropriate control tables to define and administer the replication environment. Therefore, in theory, it is possible for us to write our own SQL to do this. However, the SQL can be complicated and manual coding could result in errors, and therefore we recommend not using this method.

The Replication Center GUI

The Replication Center GUI can be used to set up and administer a Q replication scenario. See Chapter 6, Administration Tasks, for details on accessing and using the Replication Center. The launchpad screen is shown next.

The Replication Center has a series of wizards, which are very useful if we are new to replication. The default wizard screen is the launchpad screen, and it can be accessed from the main replication screen through Replication Center | Launchpad. The wizards take us through all the steps necessary to set up a replication environment.

Using the Replication Center, it is possible to generate an SQL script for a particular function. The ability for the Replication Center to generate ASNCLP scripts in addition to SQL scripts is being planned for a future release.

The ASNCLP command interface

The ASNCLP interface (discussed in detail in Chapter 5, The ASNCLP Command Interface) allows us to enter commands from the command line and, more importantly, allows us to combine various commands into a script file, which can then be run from the command line. In this book, we will focus on ASNCLP scripts. ASNCLP is supported on the Linux, UNIX, and Windows platforms, and it is also supported natively on z/OS through USS. We can also administer replication on z/OS from a Linux, UNIX, or Windows system if we catalog the z/OS databases on that system.
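As a small taster of Chapter 5, the following is a minimal sketch of the header of an ASNCLP script; the database name DB2A, the schema ASN, and the script file name are illustrative, and the exact statements will depend on your environment:

ASNCLP SESSION SET TO Q REPLICATION;
SET RUN SCRIPT NOW STOP ON SQL ERROR ON;
SET SERVER CAPTURE TO DB DB2A;
SET CAPTURE SCHEMA SOURCE ASN;

Saved in a file, such a script would then be run from the command line with a command along the lines of asnclp -f qcapture_setup.asnclp.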

In the next section, we will look at the constituent components of Q replication.

Q replication constituent components

We like to think that the Q replication architecture is made up of three interconnected layers: the DB2 database layer, the WebSphere MQ layer, and finally the Q replication layer—each layer needs to be set up and tested before we move on to the next layer. An overview of the Q replication process is shown in the following diagram:

The basic operating architecture of a Q replication system involves:

An application processing a row in a table and DB2 logging this operation
A Q Capture program calling the DB2 log reader to "process" the DB2 log and place rows that have been committed and that it wants to replicate onto a WebSphere MQ queue
A Q Apply program "reading" from this queue and applying the row to the target table (a minimal start-up sketch follows)
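In day-to-day operation, this translates into running a Q Capture process against the source database and a Q Apply process against the target database. The following is a minimal sketch, assuming a source database DB2A, a target database DB2B, and the default ASN schema; the asnqcap and asnqapp commands are covered in detail in Chapter 6, Administration Tasks:

asnqcap capture_server=DB2A capture_schema=ASN
asnqapp apply_server=DB2B apply_schema=ASN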

In this setup, we have two components that need to be installed—the Q replication code and the WebSphere MQ code. We will discuss the installation of both of these in some detail.

With the current packaging, the Q replication code for homogeneous replication already comes bundled with the base code for DB2—all we have to install is a replication license key. The license for InfoSphere Replication Server is called isrs.lic and for InfoSphere Data Event Publisher the license is called isep.lic. Use the DB2 db2licm command to install the license key and to confirm that the license key has been successfully applied.
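As an illustration, assuming the appropriate license file has been copied to the current directory, the key can be applied and then verified as follows:

db2licm -a isrs.lic
db2licm -l

The db2licm -l output should then list the replication product with a permanent license entry.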

Turning to the WebSphere MQ component, we can use either WebSphere MQ V6 or V7 with DB2 replication.

The WebSphere MQ V6 Information Center can be found at http://publib.boulder.ibm.com/infocenter/wmqv6/v6r0/index.jsp.

The WebSphere MQ V7 Information Center can be found at http://publib.boulder.ibm.com/infocenter/wmqv7/v7r0/index.jsp.

For the procedure to install WebSphere MQ, consult the WebSphere MQ Information Center and search for install server. As an example, we will now take you through installing WebSphere MQ V6 on x86 64-bit Linux (which comes packaged as C87RUML.tar.gz). We need to perform the following tasks:

As root, use gunzip and tar to unpack the WebSphere MQ packages:
# gunzip C87RUML.tar.gz
# tar -xvf C87RUML.tar
As root, the first task we need to perform is to accept the MQ license, as shown next:
# ./mqlicense.sh
Now we can install the base packages. As root, issue the following commands:
# rpm -U MQSeriesRuntime-6.0.1-0.x86_64.rpm
# rpm -U MQSeriesServer-6.0.1-0.x86_64.rpm
# rpm -U MQSeriesSDK-6.0.1-0.x86_64.rpm
If we want the WebSphere MQ sample programs, which include amqsput, amqsget, amqsgbr, amqsbcg, and so on (which we do!), then we have to install the following package:
# rpm -U MQSeriesSamples-6.0.1-0.x86_64.rpm

For future reference, before uninstalling WebSphere MQ we can check what is currently installed as follows:

We can check which packages are installed using the following command:
# rpm -q -a | grep MQSeries
We can check what version of WebSphere MQ we are running by using the following command:
# dspmqver

This should give us an output similar to the following:

Name: WebSphere MQ
Version: 6.0.0.0
CMVC level: p000-L080610
BuildType: IKAP - (Production)

On UNIX systems, if we are running on a 64-bit system, then we need to add the WebSphere MQ library to the LD_LIBRARY_PATH environment variable. If we do not do this, then when we try to start Q Capture (or Q Apply), we will see the following messages in the process log file:

2009-09-02-12.47.39.730985 <ASNMQLOD:MQCONN> ASN0584E "Q Capture" : "ASN" : "AdminThread" : An error occurred while the program was dynamically loading the WebSphere MQ library "libmqm_r.so". Error code: "0x90000076", "Cannot load the specified library". Environment variable ASNUSEMQCLIENT is set to "".

The ASN0584E message tells us to set the LD_LIBRARY_PATH environment variable. To check the current setting of this variable, we can either list the current values of all environment variables, using the env command, or we can list the value of this specific variable by using the echo command and prefixing the variable name with a dollar sign:

echo $LD_LIBRARY_PATH

We can temporarily set the value of this parameter (for the duration of the session in which the command was issued), using the following command:

LD_LIBRARY_PATH=/opt/mqm/lib64:$LD_LIBRARY_PATH
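To make the setting permanent, the export can be added to the DB2 instance owner's profile. A minimal sketch, assuming the default 64-bit MQ library path and the standard instance user profile script under sqllib:

echo 'export LD_LIBRARY_PATH=/opt/mqm/lib64:$LD_LIBRARY_PATH' >> ~/sqllib/userprofile

The userprofile script is sourced whenever the instance environment is set up, so the variable will be in place the next time Q Capture or Q Apply is started.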

If we ever need to remove the packages, we would remove them in the reverse order of installation (so that dependent packages are removed before the runtime), using the following commands:

# rpm -ev MQSeriesSamples-6.0.1-0.x86_64
# rpm -ev MQSeriesSDK-6.0.1-0.x86_64
# rpm -ev MQSeriesServer-6.0.1-0.x86_64
# rpm -ev MQSeriesRuntime-6.0.1-0.x86_64

The Q Capture and Q Apply programs are discussed in detail in Chapter 2, Q Replication Components. Typically, these programs will be installed on different servers, in which case we have to pay attention to the machine clock time on those servers.

Note

The machine clock time on all servers involved in replication should be synchronized.

The times on all servers need to be synchronized, because each captured record has a timestamp associated with it, and Q Apply will not apply a row whose timestamp is in the future. Therefore, if the Q Capture server's clock is ahead of the Q Apply server's clock, then Q Apply will wait until its own clock reaches the timestamp in the replicated record before applying it. If the Q Apply server's clock is ahead of the Q Capture server's clock, then we will not hit the "apply in the future" problem, but the latency figures will be out by the time difference.
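In practice, this usually means running an NTP client on every server in the replication topology. A quick, illustrative check, assuming the standard ntpq and date utilities are available on each server:

ntpq -p
date -u

The ntpq -p command lists the NTP peers the server is synchronizing with, and comparing the output of date -u across the servers gives a rough measure of any clock offset.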

In the next section, we will look at the different types of Q replication.

The different types of Q replication

There are four basic types of Q replication:

Unidirectional
Bidirectional
Peer-to-peer
Event Publishing

Replicating to a stored procedure or to a Consistent Change Data (CCD) table is a subset of unidirectional replication.

Let's look at each of these in more detail. In the following sections, we talk about source and target tables. You may be wondering: what about views, triggers, and so on? You should check the Pre-setup evaluation section of Appendix A for a list of objects to check for before deciding whether Q replication is the correct solution.

Unidirectional replication

In unidirectional replication, we can replicate all of the rows and columns of a source table or we can just replicate a subset of the rows and columns. We cannot really perform any transformation on this data. If we want to perform some sort of transformation, then we would need to replicate to a stored procedure, which we will discuss in detail in Appendix A.

Replicating to a stored procedure

Stored procedure replication is a subset of unidirectional replication in which the target is not a table as such, but a stored procedure, as shown in the following diagram:

A stored procedure can transform the data and output the results to a target table. This target table is not known to Q Apply. These stored procedures can be written in SQL, C, or Java. An example of replicating to a stored procedure is shown in the Replication to a stored procedure section of Appendix A.

Note

Prior to DB2 9.7, the source table and the stored procedure must have the same name, and the target table name can be any name we like.

Bidirectional replication

In bidirectional replication, we replicate copies of tables between two servers, each of which has a copy of the table. Note that we can only set up bidirectional replication between two servers. Unlike unidirectional replication, it is not possible to replicate just a subset of rows and columns. The tables on both servers can have different names, but must have the same number of rows and columns. The columns must have identical column names and compatible data types. It is not possible to do any data transformation using this type of replication.

Because we are updating records on both servers, it is possible that the same record will be updated at the same time on both servers.

Note

Although Q replication provides a conflict detection mechanism, we strongly advise that the driving application should be written or modified in such a way that such conflicts be avoided. The conflict detection provided by Q replication should be treated as a safety net and not the primary conflict resolution mechanism.

This mechanism allows us to choose which data values are used to detect conflicts (key column values only, changed column values, or all column values) and which server should win if such a conflict is detected. The row on the losing server is rolled back and the record is written to the IBMQREP_EXCEPTIONS table for review. Conflict detection is discussed in detail in the Q replication conflict detection section.
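Rows that lose a conflict can be reviewed directly in that control table. A minimal sketch, assuming the default ASN Q Apply schema (the exact column list varies slightly between versions):

db2 "select exception_time, recvq, subname, reason, sqlcode from asn.ibmqrep_exceptions order by exception_time"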

One of the related subjects to conflict detection is the concept of which server takes precedence in a conflict, or to put it more bluntly, which server is the master and which is the slave! If there is a conflict, then whichever server takes precedence will not apply changes from the other server. This ensures that the servers remain in sync. There is a more egalitarian option, which is that no server takes precedence. In this situation, rows are applied irrespective of whether or not there is a conflict, which ultimately leads to a divergence of the contents of the databases, which is not good!

There are two types of bidirectional replication—the first type is where we have an active/passive setup and the second type is where we have an active/active setup. The type of replication you choose will have implications on which server is defined as the master and which as the slave and what to do if a Q subscription is inadvertently inactivated.

In an active/passive setup, the passive server should be made the master. In an active/active setup, the choice of which system is the master is a decision you have to make. See the Conflict detection: update/delete conflict section of Appendix A for further discussion.

Peer-to-peer replication

Peer-to-peer replication allows us to replicate data between two or more servers. This is different from bidirectional replication, which is only between two servers. Each server has a copy of the table (which can have a different schema and name), but must have the same number of rows and columns and these columns must have identical column names and compatible data types. It is not possible to do any data transformation using this type of replication.

In peer-to-peer replication, there is no such thing as a master or slave server—each server will have the most recent copy of the table—eventually! What this means is that there will be a slight delay between the first server having a copy of the table and the last server having that copy. This is an asynchronous process, so at any one time the tables might be different, but once applications stop updating them, then the tables will converge to the most recently updated value. This type of processing means that there isn't any "manual" conflict detection as such (it is handled automatically by Q Apply), because the latest update will always win.

If two applications update the same record at exactly the same time, then Q replication uses the server number allocated when the peer-to-peer environment was set up to determine the winner. This type of processing means that two columns are added to each of the tables in the Q replication environment, where the first column is a timestamp of when the row was last updated (GMT) and the second column is the machine number. These updates are performed through triggers on the tables.
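Once the peer-to-peer Q subscriptions have been created, these extra columns are visible in the table definition. A quick, illustrative way to check is to describe the table; the schema and table names here are hypothetical, and in recent versions the added columns are named ibmqrepVERTIME and ibmqrepVERNODE:

db2 describe table db2admin.department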

Tree replication

Tree replication comes in two flavors: bidirectional, which we call B-Tree replication, and unidirectional, which we call U-Tree replication.

A variation on the bidirectional replication theme is the requirement to replicate from one master to two or more slaves in a bidirectional manner. This requirement is addressed by B-Tree replication.

A B-Tree replication structure is one that looks like a tree (as shown in the preceding diagram). DB2A replicates with DB2B, DB2C, and DB2<n> in a bidirectional manner, but DB2B, DB2C, and DB2<n> do not replicate directly with each other—they have to replicate through DB2A, which is what differentiates B-Tree replication from peer-to-peer replication.

For B-Tree replication, we can replicate between one master and many slaves in a bidirectional manner. In SQL replication terms, this was called Update Anywhere. Note that it is not possible to set up B-Tree replication using the Replication Center—we need to use ASNCLP commands, which are described in detail in the Bidirectional replication to two targets (B-Tree) section of Appendix A.

We can also replicate from one source to many targets in a unidirectional scenario, which we call a U-Tree scenario. In the preceding figure, DB2A replicates with DB2B, DB2C, and DB2<n> in a unidirectional manner (we can have more than three targets). Note that there is no radio button in the Replication Center to set up unidirectional U-Tree replication. What we have to do is set up unidirectional replication from DB2A to DB2B, then from DB2A to DB2C, and so on. It is easier to use ASNCLP commands, which are described in detail in the Unidirectional replication to two targets (U-Tree) section of Appendix A.

Replicating to a Consistent Change Data table

Let's first look at the definition of Consistent Change Data (CCD) replication. CCD table replication is a subset of unidirectional replication, in which the target is a CCD table, which contains a row for each insert/delete/update that occurs on the source table. These CCD tables can be complete and/or condensed (this will be explained later).

There are three main uses of CCD table replication:

To populate an operational data store
To keep a history of changes made to the source table for audit purposes
To enable multi-target update

Consider the situation where we want to populate an Operational Data Store (ODS) with data from our live system. We want to replicate all operations apart from delete operations. Before the introduction of CCD tables, our only option was to use a stored procedure. One of the parameters that the Q Apply program passes to a stored procedure is the operation (insert, delete, and so on) that occurred on the source system. See the Replication to a stored procedure section of Appendix A.

We can use CCD tables to keep a history of changes made to a table, or as a feed to InfoSphere DataStage.

The multi-target update scenario uses Q replication to populate the CCD table and then uses SQL Replication to populate the multiple target tables as shown in the following diagram:

In the Replicating to a CCD table section of Appendix A, we go through the steps needed to set up replication to a CCD table.

We can only specify that a target table be a CCD table in a unidirectional setup. A CCD target table is made up of the following columns (only four of the metadata columns are compulsory, the other four are optional, and the order of the columns does not matter):

<user key columns>, <user nonkey columns>, <user computed columns>

Compulsory auditing columns: IBMSNAP_INTENTSEQ, IBMSNAP_OPERATION, IBMSNAP_COMMITSEQ, IBMSNAP_LOGMARKER

Optional auditing columns: IBMSNAP_AUTHID, IBMSNAP_AUTHTKN, IBMSNAP_PLANID, IBMSNAP_UOWID

The IBMSNAP_INTENTSEQ column is a sequence number that uniquely identifies a change and is ascending within a transaction. The IBMSNAP_OPERATION column is a flag that indicates the type of operation for a record. The IBMSNAP_COMMITSEQ column is a sequence number that provides transactional order. The IBMSNAP_LOGMARKER column is the time that the data were committed.

The IBMSNAP_AUTHID column is the authorization ID that is associated with the transaction. This column is used for Linux, UNIX, Windows, and z/OS. For z/OS, this is the primary authorization ID. The IBMSNAP_AUTHTKN column is the authorization token that is associated with the transaction. This column is for z/OS only and is the correlation ID—it will be NULL for Linux, UNIX, and Windows. The IBMSNAP_PLANID column is the plan name that is associated with the transaction. This column is for z/OS only—it will be NULL for Linux, UNIX, and Windows. And finally, the IBMSNAP_UOWID column is the unit-of-work identifier from the log record for this unit of work. This column is used for Linux, UNIX, Windows, and z/OS.

The <user computed columns> columns will be the before-image columns and must be NULLABLE (there is no before image when we insert a row, so the before-image value is NULL).
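To make this structure concrete, here is a minimal DDL sketch for a CCD target over a hypothetical two-column source table. The table and column names are illustrative, before-image and optional auditing columns are omitted, and the metadata data types shown are those typically used on Linux, UNIX, and Windows (in practice, the administration tools generate this DDL for you):

CREATE TABLE CCD_DEPARTMENT (
    DEPTNO CHAR(3) NOT NULL,
    DEPTNAME VARCHAR(36),
    IBMSNAP_INTENTSEQ VARCHAR(16) FOR BIT DATA NOT NULL,
    IBMSNAP_OPERATION CHAR(1) NOT NULL,
    IBMSNAP_COMMITSEQ VARCHAR(16) FOR BIT DATA NOT NULL,
    IBMSNAP_LOGMARKER TIMESTAMP NOT NULL
);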

Now let's look at what data is stored in a CCD table. With CCD tables, we can specify that the target table is COMPLETE or NONCOMPLETE, and CONDENSED or NONCONDENSED. These options are interpreted as follows:

Complete (COMPLETE=Y): A complete CCD table contains every row of interest from the source table and is initialized with a full set of source data. All target table loading options are valid for complete CCDs (automatic, manual, or no load).

Noncomplete (COMPLETE=N): A noncomplete CCD table contains only changes to the source table and starts with no data. A noncomplete CCD table records all UPDATE operations at the source. The only valid load option for noncomplete CCD tables is no load.

Condensed (CONDENSED=Y): A condensed CCD table contains one row for every key value in the source table and contains only the latest value for the row. For condensed CCD tables, a primary key is required, which is used in case of an update conflict. If such a conflict occurs, all the source columns are forced into the row (CONFLICT_ACTION=F).

Noncondensed (CONDENSED=N): A noncondensed CCD table contains multiple rows with the same key value, one row for every UPDATE, INSERT, or DELETE operation at the source table. When added to the CCD table, all of the rows become INSERT operations. No primary key is required.

The options for handling unexpected conditions at the target are limited for CCD tables:

For condensed and complete, two choices are available:

Force the source change into the target table (CONFLICT_ACTION=F)
Ignore the condition and continue (CONFLICT_ACTION=I)

For any combination other than condensed and complete, the only valid option is to force the change into the target table.

For all CCD table types, the only valid conflict rule is to check only key columns (CONFLICT_RULE=K).

Before we move on, let's quickly look at CCD tables in an SQL Replication environment. In SQL Replication, there is the concept of internal and external CCD tables, which does not exist in Q replication. In SQL Replication terminology, all Q replication CCD tables are external CCD tables.

Event Publishing

The Event Publishing functionality captures changes to source tables and converts committed transactional data to messages in an XML format. Each message can contain an entire transaction or only a row-level change. These messages are put on WebSphere MQ message queues and read by a message broker or other applications. We can publish subsets of columns and rows from source tables so that we publish only the data that we need.

DB2 replication sources

In this section, we cover the various DB2 objects that can be used as replication sources, such as XML data types, compressed tables, and large objects.

Replicating XML data types

From DB2 9.5 onwards, we can replicate tables that contain columns of the XML data type; an example is shown in the Unidirectional replication for an XML data type section of Appendix A. We can set up unidirectional, bidirectional, and peer-to-peer replication for such tables.