Oracle Clusterware 11g addNode.sh ONS bug (with GDB fun & solution)

Yesterday I’ve added new node for one of Clusterware installations. I’ve decided to use addNode.sh script. Unfortunatley after adding new node (prac2) to cluster, kernel reported every few seconds segmentation fault of ONS daemon. Here’s how one can use some in depth knowledge to solve such problems (OS: Oracle Enterprise Linux 5.1, Clusterware: 11.1.0.6, platform: x86_64).

Some messages after Clusterware shutdown — most important are first two lines here.

[root@prac2 conf]# tail /var/log/messages
Jan 25 20:42:14 prac2 kernel: ons[1923]: segfault at 0000000000000060 rip 000000000040cb56 rsp 0000000043204028 error 4
Jan 25 20:42:14 prac2 kernel: ons[1927]: segfault at 0000000000000060 rip 000000000040cb56 rsp 0000000043204028 error 4
Jan 25 20:42:14 prac2 logger: Oracle clsomon shutdown successful.
Jan 25 20:42:14 prac2 logger: Oracle CSS family monitor shutting down gracefully.
Jan 25 20:42:15 prac2 logger: Oracle CSSD graceful shutdown
Jan 25 20:42:15 prac2 logger: Oprocd received graceful shutdown request. Shutting down.
[root@prac2 conf]#

Next, we are going to unlimit core-dump files (by default they are disabled). Later we create special directory for central dumping cores, and also configure kernel to use it.
[root@prac2 ~]# ulimit -c unlimited
[root@prac2 ~]# ulimit -c
unlimited
[root@prac2 ~]# mkdir /tmp/cores
[root@prac2 ~]# chmod 1777 /tmp/cores
[root@prac2 ~]# sysctl kernel.core_pattern='/tmp/cores/core.%e.%p'
kernel.core_pattern = /tmp/cores/core.%e.%p
[root@prac2 ~]#

After bringing up Clusterware (crsctl start crs) we are going to get some cores:
[root@prac2 cores]# ls
core.ons.4990 core.ons.5038 core.ons.5078 core.ons.5123 core.ons.5163 core.ons.5203
core.ons.5004 core.ons.5045 core.ons.5090 core.ons.5135 core.ons.5175 core.ons.5210
core.ons.5011 core.ons.5052 core.ons.5097 core.ons.5142 core.ons.5182 core.ons.5217
core.ons.5018 core.ons.5064 core.ons.5104 core.ons.5149 core.ons.5189 core.ons.5224
core.ons.5031 core.ons.5071 core.ons.5116 core.ons.5156 core.ons.5196
[root@prac2 cores]# /u01/app/crs/bin/crsctl stop crs
Stopping resources.
This could take several minutes.
Successfully stopped Oracle Clusterware resources
Stopping Cluster Synchronization Services.
Shutting down the Cluster Synchronization Services daemon.
Shutdown request successfully issued.
[root@prac2 cores]#

We must find, if that ONS daemon dumps core each time at the same address (rip = instruction pointer), we’ll take core from PID 4990:
[root@prac2 cores]# grep 'ons\[4990\]' /var/log/messages
Jan 25 20:07:12 prac2 kernel: ons[4990]: segfault at 0000000000000060 rip 000000000040cb56 rsp 0000000043204028 error 4
Jan 25 20:16:49 prac2 kernel: ons[4990]: segfault at 0000000000000060 rip 000000000040cb56 rsp 0000000043204028 error 4
Jan 25 20:34:13 prac2 kernel: ons[4990]: segfault at 0000000000000060 rip 000000000040cb56 rsp 0000000043204028 error 4
[root@prac2 cores]#

OK, so we’ve got bad guy at 0x000000000040cb56. Now we have to know why it happens (we want to find the problem, only if we know exact reason, we’are going to fix it — that’s my personal philosophy). Some output has been trimmed for clarity.
[root@prac2 cores]# file core.ons.4990
core.ons.4990: ELF 64-bit LSB core file AMD x86-64, version 1 (SYSV), SVR4-style, from 'ons'
[root@prac2 cores]# gdb /u01/app/crs/opmn/bin/ons core.ons.4990
GNU gdb Red Hat Linux (6.5-25.el5rh)
Copyright (C) 2006 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
[..]

Core was generated by `/u01/app/oracle/product/11.1.0/crs/opmn/bin/ons -d'.
Program terminated with signal 11, Segmentation fault.
#0 0x000000000040cb56 in opmnHttpFormatConnect ()
(gdb) thread apply all bt

Thread 7 (process 4990):
#0 0x0000003ea6c0a496 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1 0x000000000041aa07 in main ()

Thread 6 (process 4998):
#0 0x0000003ea6c0a687 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1 0x0000000000418666 in opmnWorkerEntry ()
#2 0x0000003ea6c062e7 in start_thread () from /lib64/libpthread.so.0
#3 0x0000003ea60ce3bd in clone () from /lib64/libc.so.6

Thread 5 (process 4999):
#0 0x0000003ea6c0a687 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1 0x0000000000418666 in opmnWorkerEntry ()
#2 0x0000003ea6c062e7 in start_thread () from /lib64/libpthread.so.0
#3 0x0000003ea60ce3bd in clone () from /lib64/libc.so.6

Thread 4 (process 5000):
#0 0x0000003ea6c0a687 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1 0x0000000000418666 in opmnWorkerEntry ()
#2 0x0000003ea6c062e7 in start_thread () from /lib64/libpthread.so.0
#3 0x0000003ea60ce3bd in clone () from /lib64/libc.so.6

---Type to continue, or q to quit---
Thread 3 (process 5001):
#0 0x0000003ea60c7922 in select () from /lib64/libc.so.6
#1 0x0000000000411399 in opmnListenerEntry ()
#2 0x0000003ea6c062e7 in start_thread () from /lib64/libpthread.so.0
#3 0x0000003ea60ce3bd in clone () from /lib64/libc.so.6

Thread 2 (process 5003):
#0 0x0000003ea6095451 in nanosleep () from /lib64/libc.so.6
#1 0x0000003ea6095274 in sleep () from /lib64/libc.so.6
#2 0x0000000000413fb6 in opmnPerformance ()
#3 0x0000003ea6c062e7 in start_thread () from /lib64/libpthread.so.0
#4 0x0000003ea60ce3bd in clone () from /lib64/libc.so.6

Thread 1 (process 5002):
#0 0x000000000040cb56 in opmnHttpFormatConnect ()
#1 0x0000000000408d19 in connectionActive ()
#2 0x0000000000407774 in opmnConnectionEntry ()
#3 0x0000003ea6c062e7 in start_thread () from /lib64/libpthread.so.0
#4 0x0000003ea60ce3bd in clone () from /lib64/libc.so.6

(gdb)
(gdb) disassemble
Dump of assembler code for function opmnHttpFormatConnect:
[..]
0x000000000040cb4b : mov 2121046(%rip),%r11 # 0x6128a8
0x000000000040cb52 : pushq 0x60(%r11)
0x000000000040cb56 : movzwl 0x60(%rbp),%r9d
[..]
0x000000000040cba6 : lea 826995(%rip),%rsi # 0x4d6a20 <__STRING.28>
0x000000000040cbad : mov 0x78(%rsp),%rdi
0x000000000040cbb2 : callq 0x4073d0
[..]
0x000000000040cbf5 : retq
[..]
(gdb)

From the output we can conclude that this function is not calling other functions/syscalls except sprintf. The instruction that SIGSEGVs program movzwl is from familiy of “extended move”. It’s just optimised copy instruction. Something is wrong – probabbly we are loading from bad frame base pointer here? Bad memory pointer? NULL? Also format string for sprintf (for more info: man sprintf) is as follows (loaded by lea instruction):

(gdb) x/s 0x4d6a20
0x4d6a20 <__STRING.28>: "POST /connect HTTP/1.1\r\nVersion: 3\r\nContent-Length: 0\r\nONStarget: %u,%hu\r\nONSsource: %u,%hu,%hu,%hu\r\nONSinfo: %s!%s!%u!%u!%hu!%hu!%hu\r\nhostName: %s\r\nclusterId: %s\r\nclusterName: %s\r\ninstanceId: %s\r\nins"...
(gdb)

As Oracle’s documentation specified, ONS – by default – takes parameters from OCR. We can dump OCR for further analysis:
[root@prac2 cores]# /u01/app/crs/bin/ocrdump
[root@prac2 cores]#

This generates human readable file named OCRDUMPFILE. After viewing it (greping for ONS string) it appears that configuration for prac2 node is not there (and it should be). This can be corrected on prac1 node (where Clusterware is up):

[root@prac1 tmp]# /u01/app/crs/bin/racgons add_config prac2:6251
[root@prac1 tmp]# /u01/app/crs/bin/ocrdump
[root@prac1 tmp]# grep ONS OCRDUMPFILE
[DATABASE.ASM.prac1.+asm1.START_OPTIONS]
[DATABASE.ONS_HOSTS]
[DATABASE.ONS_HOSTS.prac1]
[DATABASE.ONS_HOSTS.prac1.PORT]
[DATABASE.ONS_HOSTS.prac2]
[DATABASE.ONS_HOSTS.prac2.PORT]
ORATEXT : CRS application for ONS on node
ORATEXT : CRS application for ONS on node
[root@prac1 tmp]#

Now we corrected it. Let’s start Clusterware on prac2 and verify how that format function performs. This somehow tricky, because main ONS daemon is child of parent ONS watchdog (21165), that function is being called only while initialising, so we’re going to kill child and follow code-path to the child.. :

[oracle@prac2 tmp]$ ps uaxw|grep opm
oracle 21165 0.0 0.0 15928 392 ? Ss 21:55 0:00 /u01/app/oracle/product/11.1.0/crs/opmn/bin/ons -d
oracle 21166 2.1 0.6 146920 10984 ? Sl 21:55 0:00 /u01/app/oracle/product/11.1.0/crs/opmn/bin/ons -d
oracle 21281 0.0 0.0 61112 700 pts/0 R+ 21:55 0:00 grep opm
[oracle@prac2 tmp]$ gdb -p 21165 # parent
GNU gdb Red Hat Linux (6.5-25.el5rh)
[..]
(gdb) set follow-fork-mode child
(gdb) b *opmnHttpFormatConnect
Breakpoint 1 at 0x40c9f2
(gdb) b *0x000000000040cbf5 # remember that retq address...?
Breakpoint 2 at 0x40cbf5
(gdb) cont
Continuing.

Now to the the other session, where we kill child (parent will restart it):
[oracle@prac2 tmp]$ kill -TERM 21166

On the GDB session we'll get this:
[New process 22004]
(no debugging symbols found)
(no debugging symbols found)
(no debugging symbols found)
(no debugging symbols found)
[New LWP 22015]
[Switching to LWP 22015]

Breakpoint 1, 0x000000000040c9f2 in opmnHttpFormatConnect ()
(gdb) cont
Continuing.

Breakpoint 2, 0x000000000040cbf5 in opmnHttpFormatConnect ()

(gdb) info registers rax
rax 0x1ab38960 447973728
(gdb) x/s 0x1ab38960
0x1ab38960: "POST /connect HTTP/1.1\r\nVersion: 3\r\nContent-Length: 0\r\nONStarget: 174260502,6251\r\nONSsource: 174260501,6251,6150,0\r\nONSinfo: !!174260501!0!6251!0!6150\r\nhostName: prac1\r\nclusterId: databaseClusterId\r\nc"...
(gdb) quit
The program is running. Exit anyway? (y or n) y
[oracle@prac1 prac1]$

Summary: at the breakpoint #2 we are at retq (return from subroutine) and we get in rax (64-bit) register address of memory which contains formated string. This is HTTP request that is sent from child ONS program. We’ve verified that it works correctly :)

2 Responses to “Oracle Clusterware 11g addNode.sh ONS bug (with GDB fun & solution)”

  1. [...] 25/01/2007: Cloning prac2 and adding it to Clusterware cluster (with prac1). Debugging blogged earlier Clusterware bug. [...]

  2. Dzivak says:

    You can read oracle article 239998.1, ithink it’s can help resolve this problem also.