Thursday, May 16, 2019

Pre and post steps for Oracle Platinum Exadata patching.

Note: The actual patch apply is performed by an Oracle engineer. Our job is to get the
environment ready for patching and then hand it back to the customer after patching;
in my case this is for Exadata.

Pre-Patching steps

1. Create a blackout in Enterprise Manager for the hosts being patched.
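If the blackout is started from the agent side, a minimal sketch looks like this (the blackout name and duration are placeholders, and your site may instead use emcli against the OMS):

cd /tools/agents/oracle/occ/agent12c/agent_inst/bin   # same agent home as in step 5
./emctl start blackout exa_patching_blackout -nodeLevel -d 08:00
./emctl status blackout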

2. Check that we have enough "disk repair time" and check for rebalance:
login as root
su - oracle
menu -- choose 1 ASM
sqlplus / as sysasm  -- be careful: log in as sysasm, NOT sysdba

set linesize 100
column name format a30
column value format a30
SELECT group_number, name, value FROM v$asm_attribute
where name = 'disk_repair_time' ORDER BY group_number, name;
If the output shows a value that is too low (less than 12h), change it:
ALTER DISKGROUP DBFS_DG SET ATTRIBUTE 'disk_repair_time' = '12h';
ALTER DISKGROUP DATA_MAEX SET ATTRIBUTE 'disk_repair_time' = '12h';
ALTER DISKGROUP RECO_MAEX SET ATTRIBUTE 'disk_repair_time' = '12h';
ALTER DISKGROUP DATA2_MAEX SET ATTRIBUTE 'disk_repair_time' = '12h';

Re-execute the select to make sure it changed.
-- This is important: if we don't have enough repair time, the offlined disks will be
dropped by ASM and will have to be added back and rebalanced, which takes a long time.
3. Save a snapshot of the file systems on all nodes as the root user; you will compare
against this in the post steps.
   login as root on each node
   df -h > node1filesystem   -- do this on all nodes
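If dcli is configured for the database nodes, the same snapshot can be taken from one node (a sketch; it assumes a dbs_group file listing the DB servers, similar to the cell_group file used in the post steps):

dcli -l root -g dbs_group "df -h" > /root/filesystems_before_patching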
4. Save a snapshot of the database services distribution:

srvctl status service -d <db_name>   -- run this for every database on the server
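A small loop, run as oracle with the environment set, can capture this for every registered database in one file (the output file name is just an example):

for db in $(srvctl config database); do
  echo "=== $db ==="
  srvctl status service -d $db
done > /home/oracle/services_before_patching.txt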
5. This is done on the day of patching: stop the OMS agent.
The agent home depends on your environment; find where the agent is running:
ps -ef | grep agent

and see where it is running. It is usually under a path like /u01/app/oracle/agent12c/agent_inst/bin, but it depends on how your environment was configured.

In my case:
/tools/agents/oracle/occ/agent12c/agent_inst/bin
./emctl status agent
./emctl stop agent
6. Check for an ongoing rebalance. A rebalance won't prevent Oracle from patching, but sometimes they will ask:
select * from gv$asm_operation;
7. Save the cluster health.   
login as root
. oraenv
+ASM1
crsctl check cluster > check_cluster
crsctl stat res -t > cluster_res_t
srvctl status nodeapps
srvctl status database -d <db_name>   -- for each database, so you can later check
whether any did not come back online
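A minimal sketch that saves all of this to files for the post-patching comparison (run as root with the ASM environment set; file names are just examples):

crsctl check cluster > /root/check_cluster
crsctl stat res -t > /root/cluster_res_t
srvctl status nodeapps > /root/nodeapps_before
for db in $(srvctl config database); do
  echo "=== $db ==="
  srvctl status database -d $db
done > /root/db_status_before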

Note: you will compare the cluster health after patching using this saved
output.
 
8. At this point you are all done. Notify Oracle by email and update the SR that the
Exadata is ready for them to patch.
==================================================================
Post-patching steps: Oracle has come back and confirmed patching is done.
(Oracle will notify by email once they are done with patching.)
1. Check cell disks:
Log in to test1.com as root:
# dcli -l root -g cell_group "cellcli -e list griddisk attributes name,asmmodestatus,asmdeactivationoutcome | grep -v OFFLINE"
   -- Make sure everything is online and nothing is offline; otherwise bring the grid disks online.
If you don't know how, notify the engineer who did the patching and they can help (one approach is sketched below).
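One common way to bring inactive grid disks back and then watch them come online is sketched below; confirm with the Oracle engineer before changing anything on the cells:

dcli -l root -g cell_group "cellcli -e alter griddisk all active"
# re-run until asmmodestatus shows ONLINE for every grid disk
dcli -l root -g cell_group "cellcli -e list griddisk attributes name,asmmodestatus"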
2. Check cluster
login as root
. oraenv
+ASM1
crsctl check cluster
crsctl stat res -t
srvctl status nodeapps
srvctl status database -d <db_name>   -- for each database, to see if any did not come back online

-- The file systems will not be fully mounted until the DBFSB database is up and online, so
in my case check the DBFSB database first. Check whether you have the same database.

a- Check the DBFSB database:
srvctl status database -d DBFSB

b- Start the dbfs resource in CRS:

crsctl start resource dbfs_mount   --- (this is what I have in the Prod env)
crsctl start resource mount_dbfs   --- (this is what I have in the CT env)

If it does not start, you might need to unmount and remount it manually:
[oracle@test1 ~]$ fusermount -u -z /dbfs

fusermount: entry for /dbfs not found in /etc/mtab

--> Before performing the unmount, you need to clear all directories and their contents residing under the dbfs mount point, or else you may encounter the above error.

Then reissue the command:
--> unmount the file system using the following command

fusermount -u -z /dbfs

--> start dbfs using crsctl
crsctl start resource dbfs_mount  ---(This is in Prod env)
crsctl start resource mount_dbfs    ---(This is in CT env)


3. Check that every file system is properly mounted, comparing with the output saved in the pre steps.

Make sure the file systems come back on all nodes, as root:
df -h   -- compare with the df -h output saved in root's home directory before patching.
If anything is missing, mount it. (A quick way to compare is sketched below.)
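A minimal way to compare, assuming the pre-patching snapshot was saved as node1filesystem in root's home directory as described in the pre steps:

df -h > /root/node1filesystem_post
diff /root/node1filesystem /root/node1filesystem_post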
4. Relocate services using the output saved before patching started. Make sure you are in the right directory.
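For example, a service that ended up on the wrong instance during patching can be moved back like this (a sketch; the database, service, and instance names are placeholders):

srvctl status service -d MYDB -s batch_svc
srvctl relocate service -d MYDB -s batch_svc -i MYDB2 -t MYDB1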

5. Check whether the standby databases are supposed to be mounted, and verify that they are mounted.
If they are read-only ones, open them read only.
For mounted standbys:
srvctl config database -d <database_name>   ==>> make sure the start option is set to mount for standby databases

srvctl status database -d Database_name
srvctl start database -d Database_name
sqlplus / as sysdba
select status from gv$instance;

For read-only open:
srvctl stop database -d Database_name
srvctl start database -d Database_name -o 'read only'
6. Check Data Guard to make sure there are no errors in the configuration and that log shipping
and apply look good.
set environment
dgmgrl sys/password
show configuration
enable database 'dbname';   #if needed
enable configuration;   #if needed

In case Apply is not ON:
Make sure the intended state is APPLY-ON, not APPLY-OFF; otherwise bring it on:
edit database 'dbname' set state='APPLY-ON';
It should then show Intended State: APPLY-ON

In case MRP is not working:

To stop MRP:
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL;

To start MRP:
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE DISCONNECT;

To restart MRP:
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE DISCONNECT FROM SESSION;


If you are still getting errors, test connectivity:
Test the Net service between primary and standby, and vice versa.
i. From the primary (all nodes if it is RAC)

sqlplus sys/<password>@<standby_db_service> as sysdba
select name, open_mode, host_name from v$database,v$instance;

sqlplus sys/<password>@<Primary_db_service> as sysdba
select name, open_mode, host_name from v$database,v$instance;

ii. From the standby (all nodes if it is RAC)

sqlplus sys/<password>@<Primary_db_service> as sysdba
select name, open_mode, host_name from v$database,v$instance;

sqlplus sys/<password>@<standby_db_service> as sysdba
select name, open_mode, host_name from v$database,v$instance;
7. Check the standby databases to see if MRP is running and logs are being applied:
sqlplus / as sysdba
select inst_id, process, status from gv$managed_standby where process='MRP0';
select process "StdbyProcess", client_process "PrimaryProcess", status, thread#, sequence# from gv$managed_standby;
Make sure you see output like the following:
StdbyProcess PrimaryProcess STATUS          THREAD#  SEQUENCE#
------------ -------------- ------------ ---------- ----------
RFS          LGWR           IDLE                  6      67958
RFS          LGWR           IDLE                  8      92721
RFS          LGWR           IDLE                  1      69432
RFS          LGWR           IDLE                  7      94216
RFS          ARCH           IDLE                  0          0
RFS          ARCH           IDLE                  0          0
RFS          ARCH           IDLE                  0          0
MRP0         N/A            APPLYING_LOG          4     161541

DGMGRL> show database test
Database - test
  Role:            PHYSICAL STANDBY
  Intended State:  APPLY-ON
  Transport Lag:   31 minutes 26 seconds
  Apply Lag:       20 hours 37 minutes 17 seconds
  Real Time Query: OFF
  Instance(s):
    test1 (apply instance)
    test2
  
Database Status:
SUCCESS
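As a quick extra sanity check that redo is actually being applied, the last applied sequence per thread can be checked on the standby (a sketch run in SQL*Plus on the standby):

sqlplus / as sysdba
select thread#, max(sequence#) last_applied
from v$archived_log
where applied = 'YES'
group by thread#
order by thread#;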

8. Check whether there is a rebalance going on; if there is, notify the application team since
performance may be impacted:
su - oracle
menu -- asm
sqlplus / as sysdba
  select * from gv$asm_operation;

9. This is done on the day of patching: start the OMS agent.
The agent home depends on your environment:
cd /u01/tools/agents/oracle/occ/agent12c/agent_inst/bin
./emctl status agent
./emctl start agent
10. Verify the encryption wallet is open.

If there are encrypted databases, there is a bug: if we do not run the following select,
the database will not open to the public:
   
sqlplus / as sysdba
set linesize 100
column WRL_PARAMETER format a40
select * from gv$encryption_wallet;
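If the wallet shows CLOSED, on 11g it can usually be opened as below (a sketch, assuming a password-protected software wallet rather than an auto-login wallet; run on each instance that shows it closed):

ALTER SYSTEM SET ENCRYPTION WALLET OPEN IDENTIFIED BY "wallet_password";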

11. There are several files that need to be set up or that may have been changed during a patch upgrade.
a. Verify that $ORACLE_HOME/rdbms/admin/externaljob.ora contains the following lines:
run_user = oracle
run_group = dba
and has the following file permissions (it must be owned by root, and only root can write to it):

-rw-r----- 1 root dba 1531 Mar 26 20:06 /u01/app/oracle/product/11.2.0.4/dbhome_1/rdbms/admin/externaljob.ora

b. Verify that the $ORACLE_HOME/bin/extjob file is owned by root and has the setuid bit set:
-rwsr-x--- 1 root dba 1254458 Mar 4 17:40 /u01/app/oracle/product/11.2.0.4/dbhome_1/bin/extjob
To set the permissions, as the root user:
chown root extjob
chmod 4750 extjob
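A quick way to verify both items at once (a sketch; the paths follow the 11.2.0.4 home shown above):

grep -E '^run_user|^run_group' $ORACLE_HOME/rdbms/admin/externaljob.ora
ls -l $ORACLE_HOME/rdbms/admin/externaljob.ora
ls -l $ORACLE_HOME/bin/extjob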
12. Verify dos2unix is installed; if not, install it.

Run the following as root on all nodes:
yum install dos2unix
Enter option y when prompted, and repeat on the other nodes.
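To check all database nodes from one place, a sketch (again assuming a dbs_group file listing the DB servers):

dcli -l root -g dbs_group "rpm -q dos2unix"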


13. Notify system support to bring up the apps. We are all done with our post-patching verification.
