Thursday, April 5, 2018

ERROR OGG-01161 Bad column index- How to run a defgen for goldengate replication

So golden gate has two kinds of replication. 
1. With DDL replication enabled  
2. Without DDL replication enabled

If without DDL replication enabled is configured then we have to run something called defgen i.e. if there is a change in the structure of the source tables. It is actually generating the definition of the source tables so the replication does not freak out for a difference in what it knows about the table’s past compared to what was changed on the source table and is at present.  

A good example is let’s say, there are 20 columns in a table and last time we ran defgen it was 20 columns and a CR is executed or is going to be executed which will change the source table columns to 21 from 20. That means the structure of the table has changed. In such a case we have to run DEFGEN utility of Golden gate. Which generated a file stating that the table has these many column in our case 21 now. 

The defgen file is kept in replicat parameter file and lives at GG home location ./dirdef/excfblq2.def. It will make sure the replicat knows the structure of the source table so it replicates accordingly.  Anytime DDL structure is change on source table defgen needs to be run for replicat unless DDL replication is enabled. I will run an example here. 

Below are my GG processes running. 
GGSCI (server1) 1> info all

Program     Status      Group       Lag at Chkpt  Time Since Chkpt

MANAGER     RUNNING

EXTRACT     RUNNING     EXCFBLQ2    00:00:05      00:00:04
REPLICAT    RUNNING     RPMPCLQ2    00:00:00      00:00:04



The CR will execute the below commands where source table structure will be changed and column will be added:

spool alter_service_order.log

ALTER SESSION SET CURRENT_SCHEMA = DLEUSER;

alter table dleuser.service_order add WF_API_GUID VARCHAR2(100);

SPOOL OFF;

In the golden gate replication we are using service_order table so since the structure will change, the replicat above should have abended.

Let’s check after the execution of script the replicat has abended:

GGSCI (server1) 2> info all

Program     Status      Group       Lag at Chkpt  Time Since Chkpt

MANAGER     RUNNING
EXTRACT     RUNNING     EXCFBLQ2    00:00:05      00:00:06
REPLICAT    ABENDED     RPMPCLQ2    00:00:00      00:04:02

Now let’s check the report file of the replicat on ggsci command line and go at the end:

GGSCI (server1) 2>view report RPMPCLQ2
At the very end you will see below error:
018-04-05 15:14:57 ERROR   OGG-01161 Bad column index (43) specified for table DLEUSER.SERVICE_ORDER, max columns = 43.

***********************************************************************
*                   ** Run Time Statistics **                         *
***********************************************************************

Last record for the last committed transaction is the following:


This error is because of the change in the definition of the source table.

Solution:
1.     Check which table’s definition will be generated when defgen will be run:
Catalog the file: make sure table name exists in the file in my case it does for the table because of which replicat abended.
[server1: 12c]$ cat ./dirprm/defgen.prm
defsfile ./dirdef/excfblq2.def
userid ggsuser@OGG password XXXX ==>> get in server to see password  
TABLE dleuser.service_order; =====è I see the one I am looking for
TABLE dleuser.state ;
TABLE dleuser.customer_registered ;
TABLE dleuser.customer_address ;
TABLE dleuser.address ;
TABLE dleuser.appointment ;
TABLE dleuser.contact_mechanism ;
TABLE dleuser.customer_contact_mechanism ;
TABLE dleuser.telecom_number ;
TABLE dleuser.electronic_address ;
TABLE dleuser.claim ;
TABLE dleuser.claim_folder ;
TABLE dleuser.claim_folder_display ;
TABLE dleuser.event ;
TABLE dleuser.customer_application ;
TABLE DLEUSER.DOCUMENT_CACHE ;
TABLE DLEUSER.CUSTOMER_PREFERENCE ;
That looks good. Service table is included in the parameter file. Let’s run the command which generated the defgen. 
NOTE: when I run the defgen you will see the file created as listed in the param file. You will have to use the same name in replicat parameter file so keep an eye on the name. 

2.     Rename the already existing file created from the past.
Make sure to rename the already generated file from the past else you will get process abending the file already exists. 
[server1: 12c]$ cd dirdef
[server1: dirdef]$ mv excfblq2.def excfblq2_old.def
[server1: dirdef]$ ls -ltrh
total 195K
-rw-r----- 1 ggpusr2 dba  56K Aug  5  2015 excfblq2_old.def
-rw-r----- 1 ggpusr2 dba 6.6K Aug 11  2015 excfblq1.def
3.     Let’s run defgen for the extract table or tables

./defgen PARAMFILE ./dirprm/defgen.prm

Definitions generated for 17 tables in ./dirdef/excfblq2.def.

4.     Okay so the new file with the listed table or tables has been generated. Now let’s add it to the replicat parameter file.
NOTE: If the replicat is on a different server winscp or ssh it to the destination server and add it to the ./dirdef/ directory with the same name as its used in the replicate parameter file.
In my case it’s as ./dirdef/excfblq2.def and my replicat is running from same OGG home and server. 

5.     Once I verified or performed step 4, its time to start the replicat.
I only have to start the replicat since I kept the same name as it existed in replicat. Let’s start and check.

 [server1-12c]$ ./ggsci

Oracle GoldenGate Command Interpreter for Oracle
Version 12.1.2.1.5 20635622 OGGCORE_12.1.2.1.0OGGBP_PLATFORMS_150320.0454_FBO
Linux, x64, 64bit (optimized), Oracle 11g on Apr 22 2015 02:38:13
Operating system character set identified as UTF-8.

Copyright (C) 1995, 2015, Oracle and/or its affiliates. All rights reserved.

GGSCI (server1) 1> info all

Program     Status      Group       Lag at Chkpt  Time Since Chkpt

MANAGER     RUNNING

EXTRACT     RUNNING     EXCFBLQ2    00:00:03      00:00:05
REPLICAT    ABENDED     RPMPCLQ2    00:00:00      00:25:44

I see there is a lag of 25 min lets start replicat and it should work because it knows about the 21st column added and will not complain now. 

GGSCI (server1) 2> start RPMPCLQ2

Sending START request to MANAGER ...
REPLICAT RPMPCLQ2 starting

GGSCI (server1) 3> info all

Program     Status      Group       Lag at Chkpt  Time Since Chkpt

MANAGER     RUNNING
EXTRACT     RUNNING     EXCFBLQ2    00:00:04      00:00:01
REPLICAT    RUNNING     RPMPCLQ2    00:00:00      00:00:00

The lag caught up and the process is running. 






That is it. Hope the document helps. 


No comments:

Post a Comment