Illinois School of Technology IST: How to perform roll forward recovery for standby oracle database?

Oracle DBA Documentation: Standby Recover Forward with RMAN

TO FIX GAPS - Do NOT run backups during operational hours, unless you get approval
from the NOC!!!

Note: I really really like to use "screen -RD" when performing this procedure, so I can come
back to my session when I get disconnected from the network. If you don't know how to
use screen, there are many tutorials on the web.

1. Stop the redo on the standby database

Since our standby databases use Oracle DataGuard Broker,
it is easiest to simply disable the broker on the standby side.

SQL> alter system set dg_broker_start=FALSE;

This will stop and prevent broker from starting by updating memory and the spfile or
the database. You can also cancel recovery, but this should not be running after
broker was stopped.

SQL> alter database recover managed standby database cancel;

2. Determine the current SCN for the standby database. Run both of these SQL
statements in SQL Plus on the Standby.

col CURRENT_SCN for 9999999999999

select current_scn from v$database;

select min(fhscn) current_scn from x$kcvfh;

Take the lesser of the 2, minus 10000 to be sure.

3. Use the value for the current SCN to take your incremental backup on the primary

Check the /DD_RMAN/rman/[database] folder or your backup folder and make sure
there is not a "control_file.ctl" file already there. I recommend you remove all previous
backups, just in case. Just "rm" them or copy to another place/folder.

run {

allocate channel disk1 type disk format '/DD_RMAN/rman/%d/%d_%s_%p_%t_%U.dat';

allocate channel disk2 type disk format '/DD_RMAN/rman/%d/%d_%s_%p_%t_%U.dat';

allocate channel disk3 type disk format '/DD_RMAN/rman/%d/%d_%s_%p_%t_%U.dat';

BACKUP INCREMENTAL FROM SCN 275632204001 database tag 'FOR STANDBY';

backup current controlfile for standby format '/DD_RMAN/rman/%d/control_file.ctl';

}

4. Login to the standby and verify all the backupsets have been replicated to the
Standby Data Domain.

On Primary Node

ls -l /DD_RMAN/rman/[database]/

On Standby Node

ls -l /DD_RMAN/rman/[database]/

They should be the same and have the same file sizes. Once they are identical, proceed.

5. Go ahead and catalog the backupsets on the Standby

RMAN> catalog start with '/DD_RMAN/rman/[database]

Answer "YES" to catalog the backups.

I also like to crosscheck the backupsets and remove any old ones.

RMAN> allocate channel for maintenance device type disk;

RMAN> crosscheck backup;

RMAN> delete expired backup;

5. Recover the database from the incremental backup

RMAN> recover database noredo;

If when you run this, you see something like this, your incremental backup was
NOT from an old enough SCN. You need to re-run the backup from an OLDER SCN.
Please check your queries on standby.

RMAN> run {

2> allocate channel d1 type disk;

3> allocate channel d2 type disk;

allocate channel d3 type disk;

4> 5> recover database noredo;

6> }

allocated channel: d1

channel d1: SID=1032 instance=test1 device type=DISK

allocated channel: d2

channel d2: SID=1290 instance=test1 device type=DISK

allocated channel: d3

channel d3: SID=1549 instance=test1 device type=DISK

Starting recover at 21-APR-15

Finished recover at 21-APR-15

released channel: d1

released channel: d2

released channel: d3

If your output looks something like this, you are in good shape.

allocated channel: d1

channel d1: SID=1290 instance=test1 device type=DISK

allocated channel: d2

channel d2: SID=1549 instance=test1 device type=DISK

allocated channel: d3

channel d3: SID=1802 instance=test1 device type=DISK

Starting recover at 21-APR-15

channel d1: starting incremental datafile backup set restore

channel d1: specifying datafile(s) to restore from backup set

destination for restore of datafile 00006: +DATA/test/datafile/users.259.842540043

destination for restore of datafile 00010: +DATA/test/datafile/cccmonitor.564.842715095

destination for restore of datafile 00014: +DATA/test/datafile/procspuser_data.463.855351457

destination for restore of datafile 00015: +DATA/test/datafile/vis_data.460.858644267

destination for restore of datafile 00018: +DATA/test/datafile/undotbs5.350.866517573

destination for restore of datafile 00020: +DATA/test/datafile/flexnetprod_data.649.869682677

channel d1: reading from backup piece /DD_RMAN/rman/TEST/TEST_10841_1_877608247
_ipq4uf9n_1_1.dat

channel d2: starting incremental datafile backup set restore

channel d2: specifying datafile(s) to restore from backup set

destination for restore of datafile 00002: +DATA/test/datafile/sysaux.266.842540047

destination for restore of datafile 00004: +DATA/test/datafile/undotbs2.271.842540045

destination for restore of datafile 00005: +DATA/test/datafile/undotbs3.270.842540045

destination for restore of datafile 00009: +DATA/test/datafile/sp_data.269.842540047

destination for restore of datafile 00016: +DATA/test/datafile/hps_data.462.859231897

destination for restore of datafile 00017: +DATA/test/datafile/undotbs4.353.866515623

destination for restore of datafile 00019: +DATA/test/datafile/esdprod_data.648.869682669

6. Restore the STANDBY controlfile to the standby:

RMAN> shutdown immediate;

RMAN> startup nomount

RMAN> RESTORE STANDBY CONTROLFILE FROM '/DD_RMAN/rman/[database]/control_file.ctl';

RMAN> alter database mount;

7. Now you need to syncronize the controlfile with the datafile location of the standby.

RMAN> catalog start with '+DATA/[database]/datafile';

RMAN> switch database to copy;

If you get an error when switching the datafile to copy, it may be that the database,
during startup has switched to a "newer" incarnation, which is not applicable to the standby
recovery.

Here is an example from test

RMAN> list incarnation;

using target database control file instead of recovery catalog

List of Database Incarnations

DB Key Inc Key DB Name DB ID STATUS Reset SCN Reset Time

------- ------- -------- ---------------- --- ---------- ----------

1 1 TEST 162103565 ORPHAN 1 13-MAR-13

2 2 TEST 162103565 CURRENT 265249491116 07-FEB-15

Let's look at the primary - TEST

RMAN> list incarnation;

using target database control file instead of recovery catalog

List of Database Incarnations

DB Key Inc Key DB Name DB ID STATUS Reset SCN Reset Time

------- ------- -------- ---------------- --- ---------- ----------

1 1 TEST 162103565 CURRENT 1 13-MAR-13

Notice how the primary current is not the same as the standby?

To fix this, execute the following on the standby.

RMAN> reset database to incarnation 1;

Now re-run the command to switch the datafile copies.

RMAN> switch database to copy;

This should list all the datafile copies.

Through all of this, if you are watching the alert log on the standby, you WILL see a lot of errors.
It may even look like the datafiles don't exist. This is OK!

8. Now stop the standby database with srvctl

# srvctl stop database -d [database]

Then start it

# srvctl start database -d [database]

9. Watch the alert log, to ensure the database comes up and applies any archive

logs that are available locally. If you get an "FAL gap" error, which can not be

resolved, you will have to check the archive retention and start over with your incremental
backup from the new SCN.

Make sure the local "archive delete" scripts on "node 1" of each cluster are retaining the archives
long enough for you to do all this stuff. If you recover forward clean. Messages like this:

Tue Apr 21 13:37:40 2015

Archived Log entry 1363 added for thread 2 sequence 44117 ID 0x9a9530d dest 1:

Tue Apr 21 13:37:44 2015

RFS[9]: Assigned to RFS process 10194

RFS[9]: Opened log for thread 4 sequence 6201 dbid 162103565 branch 809981197

Archived Log entry 1364 added for thread 4 sequence 6201 rlc 809981197 ID 0x9a9530d dest 2:

Tue Apr 21 13:37:46 2015

RFS[10]: Assigned to RFS process 10234

RFS[10]: Opened log for thread 5 sequence 6084 dbid 162103565 branch 809981197

Archived Log entry 1365 added for thread 5 sequence 6084 rlc 809981197 ID 0x9a9530d dest 2:

You are in good shape.

10. Restart the MRP process and broker on the standby.

sql> alter system set dg_broker_start=TRUE;

In the alert log, you should see something like this.

Tue Apr 21 13:47:53 2015

ALTER SYSTEM SET dg_broker_start=TRUE SCOPE=BOTH;

Starting Data Guard Broker (DMON)

Tue Apr 21 13:47:55 2015

INSV started with pid=84, OS id=21325

Tue Apr 21 13:47:59 2015

NSV0 started with pid=85, OS id=21389

Tue Apr 21 13:48:05 2015

RSM0 started with pid=86, OS id=21439

ALTER SYSTEM SET log_archive_trace=0 SCOPE=BOTH SID='test1';

ALTER SYSTEM SET log_archive_format='%t_%s_%r.dbf' SCOPE=SPFILE SID='test1';

Tue Apr 21 13:48:25 2015

Managed Standby Recovery starting Real Time Apply

Tue Apr 21 13:50:29 2015

Start MRP with archive log current also if needed.

11. Now check dgmgrl and confirm the standby is in good shape. You may get

some timeouts, just keep trying the command.

DGMGRL> show database '[database]';

It should come back showing no transfer gap and perhaps it is still recovering. It will also
show you the apply instance, so you can confirm broker is getting archives and managing
things via that local alert log.

Illinois School of Technology IST

Monday, July 15, 2019

How to perform roll forward recovery for standby oracle database?

No comments:

Post a Comment