Search This Blog

2021-04-09

Oracle - Startup Failure

Product: Oracle RDBMS
Version: 10.2 - 19.2
OS: Linux

While helping others to troubleshoot Linux systemd in auto startup Oracle RDBMS, encountering following failure when trying to login as SYSTEM (or any application DB user) in the Oracle server OS itself:

[oracle]$ sqlplus system/non_default_password SQL*Plus: Release 12.1.0.2.0 Production on Fri Apr 9 13:59:13 2021 Copyright (c) 1982, 2014, Oracle. All rights reserved. ERROR: ORA-27140: attach to post/wait facility failed ORA-27300: OS system dependent operation:invalid_egid failed with status: 1 ORA-27301: OS failure message: Operation not permitted ORA-27302: failure occurred at: skgpwinit6 ORA-27303: additional information: startup egid = 500 (oracle), current egid = 501 (dba)

However, if shutdown, then startup Oracle daemon manually, SYSTEM and other application DB user can login without issue.

Diagnostic:

1. Checks the OS username and group name for $ORACLE_HOME/bin/oracle

$ ls -ld $ORACLE_HOME/bin/oracle

-rwxr-x---. 1 oracle dba 323762270 Mar 24 19:34 /dbf/ora12.1/bin/oracle

2. Note down the username, and group name, which is oracle and dba
3. Checks the current running Oracle PMON daemon (it can be any other Oracle daemon) by displaying the OS user and OS group:
$  ps -eo euser,egroup,cmd | grep -i ora_pmon_$ORACLE_SID
oracle   oinstall      ora_pmon_SUPP121
4. Compare the output of ls and ps to ensure that both user & group are identical
5. Typically, the custom auto startup script has wrong OS user group name

6. For example, systemd script could have following wrong entry:

Group=oracle

7. Changes above to the OS user group as shown in #1, which is "dba"

Advance Diagnostic

Oracle daemons allocate "shared memory segment," and "semaphore arrays" in Linux (UNIX in general), which has OS user group which leads to this login issue:

1. Checks "shared memory segment" and "semaphore arrays" allocated to specific ORACLE_SID

$ sysresv -l RMANCAT
IPC Resources for ORACLE_SID "RMANCAT" :
Maximum shared memory segment size (shmmax): 18446744073692774399 bytes
Total system shared memory (shmall): 18446744004990070784 bytes
Total system shared memory count (shmmni): 4096
*********************** Dumping ipcs output ********************
------ Message Queues --------
key        msqid      owner      perms      used-bytes   messages
------ Shared Memory Segments --------
key        shmid      owner      perms      bytes      nattch     status
0x00000000 655360     oracle     640        4096       0
0x00000000 688129     oracle     640        4096       0
0x00000000 720898     oracle     640        4096       0
0xd787ced8 753667     oracle     640        16384      70
------ Semaphore Arrays --------
key        semid      owner      perms      nsems
0x720acc58 98304      oracle     640        125
0x720acc59 131073     oracle     640        125
0x720acc5a 163842     oracle     640        125
0x720acc5b 196611     oracle     640        125
0x720acc5c 229380     oracle     640        125
0x720acc5d 262149     oracle     640        125
0x720acc5e 294918     oracle     640        125
0x720acc5f 327687     oracle     640        125
0x720acc60 360456     oracle     640        125
*********************** End of ipcs command dump **************

***************** Dumping Resource Limits(s/h) *****************
core file size                         0 KB/0 KB
data seg size                     UNLIMITED/UNLIMITED
scheduling priority                    0 KB/0 KB
file size                         UNLIMITED/UNLIMITED
pending signals                       30 KB/30 KB
max locked memory                 UNLIMITED/UNLIMITED
max memory size                   UNLIMITED/UNLIMITED
open files                            64 KB/64 KB
POSIX message queues                 800 KB/800 KB
real-time priority                     0 KB/0 KB
stack size                            10 MB/10 MB
cpu time                          UNLIMITED/UNLIMITED
max user processes                    16 KB/16 KB
virtual memory                    UNLIMITED/UNLIMITED
file locks                        UNLIMITED/UNLIMITED
***************** End of Resource Limits Dump ******************
Total /dev/shm size: 14524801024 bytes, used: 2097008640 bytes
Shared Memory:
ID              KEY
688129          0x00000000
720898          0x00000000
655360          0x00000000
753667          0xd787ced8
Semaphores:
ID              KEY
98304           0x720acc58
131073          0x720acc59
163842          0x720acc5a
196611          0x720acc5b
229380          0x720acc5c
262149          0x720acc5d
294918          0x720acc5e
327687          0x720acc5f
360456          0x720acc60
Oracle Instance alive for sid "RMANCAT"

2. Let's pick the first semaphore array ID of 98304 to find out the OS user group using ipcs command:

$ ipcs -si 98304 | head -20

Semaphore Array semid=98304
uid=500  gid=501         cuid=500        cgid=501
mode=0640, access_perms=0640
nsems = 125
otime = Fri Apr  9 16:34:50 2021
ctime = Fri Apr  9 16:34:50 2021
semnum     value      ncount     zcount     pid
0          1          0          0          3252
1          25427      0          0          3252
2          4619       0          0          3252
3          32764      0          0          3252
4          0          0          0          0
5          0          0          0          0
6          0          1          0          3307
7          0          1          0          3309
8          0          0          0          0
9          0          1          0          3318
10         0          1          0          3320
11         0          1          0          3376

3. The "gid" of the semaphore array must match the ORACLE_HOME/bin/oracle program name. Uses following command to show the file owner in gid. Second column is the gid, which is 501:

[oracle]$ ls -ldn $ORACLE_HOME/bin/oracle
-rwxr-x---. 1 500 501 323762270 Mar 24 19:34 /dbf/ora12.1/bin/oracle

4. This value is configured in /etc/passwd and /etc/group file in UNIX

No comments: