Issue Type: | Bug |
---|---|
Priority: | 4 - Normal |
Status: | Resolved |
Created at: | 2014-06-12T21:18:13.000Z |
Updated at: | 2019-08-29T00:28:13.934Z |
Created by: | Former user |
---|---|
Reported by: | Former user |
Assigned to: | Former user |
Fixed: A fix for this issue is checked into the tree and tested.
(Resolution Date: 2014-06-23T17:32:15.000Z)
2014-06-26 Robocop (Release Date: 2014-06-26)
In the lx brand, file usr/src/lib/brand/lx/librtld_db/common/lx_librtld_db.c has function lx_ldb_get_dyns32 used by mdb as part of collecting its data. Part of mdb's code path for lx brand cores passes through fake_elf32, which passes NULL into dynpp_sz, causing a null pointer at line 210.
illumos-joyent commit 3d496fe (branch master, by DJ Hoffman)
OS-3103 lx_ldb_get_dyns32 doesn't properly handle NULL pointer inputs
illumos-joyent commit 9c8327a (branch master, by DJ Hoffman)
OS-3103 lx_ldb_get_dyns32 doesn't properly handle NULL pointer inputs (back out)
source file for code that was having file size limit exceeded problems in lx brand zones.
This investigation started with the core dump of a test run in lx brand zones for preadv and pwritev. The attacked tests64.c file was compiled in a linux vm and run in an lx brand zone
root@ubuntu12:~/lx-syscall-tests/syscall-tests# ./tests64 File size limit exceeded (core dumped)
Interestingly, running mdb on this core dump caused mdb to segmentation fault:
[root@headnode (coal) /zones/ub12/cores]# mdb ./core.tests64.81667 Segmentation Fault (core dumped)
Using mdb on mdb's core dump gave a stack trace. I also disassembled the function mdb crashed in:
[root@headnode (coal) /cores]# mdb core.mdb.81729 mdb: debugger failed with error: fatal signal received Loading modules: [ libumem.so.1 libc.so.1 libproc.so.1 ld.so.1 ] > $c lx_librtld_db.so.1`lx_ldb_get_dyns32+0xa3(8134b60, 8048000, 804565c, 0) librtld_db.so.1`rd_get_dyns+0x31(81125b0, 8048000, 804565c, 0) libproc.so.1`fake_elf32+0x47(8129008, 81c7688, 8048000, 80456b8, 8, 80456ec) libproc.so.1`fake_elf+0xa0(8129008, 81c7688, 8129008, fef16000, 8129008, 8129008) libproc.so.1`build_fake_elf+0x1f(8129008, 81c7688, 8045800, 80457e0, 80457e4, 8045840) libproc.so.1`Pbuild_file_symtab+0x18a(8129008, 81c7688, 40, 400) libproc.so.1`core_iter_mapping+0x462(8046518, 8129008, 8046524, 10) lx_librtld_db.so.1`lx_ldb_loadobj_iter32+0x109(8134b60, fecd1969, 8129008, 80465ac) librtld_db.so.1`_rd_loadobj_iter32+0x49(81125b0, fecd1969, 8129008, 390, fecd17cc, fecd1969) librtld_db.so.1`rd_loadobj_iter+0x2d(81125b0, fecd1969, 8129008, 80467e0) libproc.so.1`Pfgrab_core+0xc15(5, 80468f0, 8046b3c, 0, 6f63002e, 742e6572) libproc.so.1`proc_grab_common+0x347(8047e3c, 80468f0, 3, 0, 8046b3c, 0) libproc.so.1`proc_arg_xgrab+0x1f(8047e3c, 0, 3, 0, 8046b3c, 0) mdb_proc_tgt_create+0x1ad(8112f10, 2, 8047030, 6f632f2e) mdb_tgt_create+0x99(8090872, 0, 2, 8047030, 0, 8047e3c) main+0x12a2(8047d1c, fef20888, 8047d5c, 8063db7, 2, 8047d68) _start+0x83(2, 8047e38, 8047e3c, 0, 8047e51, 8047e5d) > <rip::dis mdb: variable 'rip' is not defined near "<rip" > <eip::dis lx_librtld_db.so.1`lx_ldb_get_dyns32+0x82: pushl %eax lx_librtld_db.so.1`lx_ldb_get_dyns32+0x83: call -0x204 <PLT=ps_plog> lx_librtld_db.so.1`lx_ldb_get_dyns32+0x88: movl $0x0,%eax lx_librtld_db.so.1`lx_ldb_get_dyns32+0x8d: addl $0x10,%esp lx_librtld_db.so.1`lx_ldb_get_dyns32+0x90: jmp +0x101 <lx_librtld_db.so.1`lx_ldb_get_dyns32+0x196> lx_librtld_db.so.1`lx_ldb_get_dyns32+0x95: movl -0x50(%ebp),%edx lx_librtld_db.so.1`lx_ldb_get_dyns32+0x98: movl 0x10(%ebp),%eax lx_librtld_db.so.1`lx_ldb_get_dyns32+0x9b: movl %edx,(%eax) lx_librtld_db.so.1`lx_ldb_get_dyns32+0x9d: movl -0x54(%ebp),%edi lx_librtld_db.so.1`lx_ldb_get_dyns32+0xa0: movl 0x14(%ebp),%eax lx_librtld_db.so.1`lx_ldb_get_dyns32+0xa3: movl %edi,(%eax) (truncated) > $? no process SIGSEGV: Segmentation Fault %cs = 0x0043 %eax = 0x00000000 %ds = 0x004b %ebx = 0xfe793000 %ss = 0x004b %ecx = 0x0804533c %es = 0x004b %edx = 0x08137648 %fs = 0x0000 %esi = 0x08048000 %gs = 0x01c3 %edi = 0x000000f0 %eip = 0xfe780d9f lx_librtld_db.so.1`lx_ldb_get_dyns32+0xa3 %ebp = 0x08045588 %kesp = 0x00000000 %eflags = 0x00010246 id=0 vip=0 vif=0 ac=0 vm=0 rf=1 nt=0 iopl=0x0 status=<of,df,IF,tf,sf,ZF,af,PF,cf> %esp = 0x08045530 %trapno = 0xe %err = 0x6
From here I saw that the segmentation fault was caused by trying to move into the location pointed at by %eax, which was 0 at the time.
Orienting myself based on the location of calls in the assembly with the source code of the file usr/src/lib/brand/lx/librtld_db/common/lx_librtld_db.c it became clear that lines 208 and 209 of this file were the location where mdb had crashed. Looking at these lines:
*dynpp = dynp; *dynpp_sz = dynp_sz;
It is clear that is dynpp or dynpp_sz is NULL, then this function will cause a segfault.
Those two values happen to be passed in as arguments to the function.
Tracing these back where they came from, it seems that rd_get_dyns is mostly a function that passes through its arguments through to other functions without processing (lines 117, 122, and 126)
112 rd_err_e 113 rd_get_dyns(rd_agent_t *rap, psaddr_t addr, void **dynpp, size_t *dynpp_sz) 114 { 115 if (rap->rd_helper.rh_ops != NULL) 116 return (rap->rd_helper.rh_ops->rho_get_dyns( 117 rap->rd_helper.rh_data, addr, dynpp, dynpp_sz)); 118 119 #ifdef _LP64 120 if (rap->rd_dmodel == PR_MODEL_LP64) 121 return (_rd_get_dyns64(rap, 122 addr, (Elf64_Dyn **)dynpp, dynpp_sz)); 123 else 124 #endif 125 return (_rd_get_dyns32(rap, 126 addr, (Dyn **)dynpp, dynpp_sz)); 127 }
Because of that, I don't believe any changes are necessary in rd_get_dyns.
rd_get_dyns is only called in one place in fake_elf32
264 if (rd_get_dyns(P->rap, addr, (void **)&dp, NULL) != RD_OK)
As we can see, it explicitly puts NULL in as the final argument to rd_get_dyns, which means dynpp_sz will be NULL and cause a segfault in lx_ldb_get_dyns32 when that is called. The question here is, should that be a valid value to pass into rd_get_dyns?
Without looking at other similar functions, simply because I don't know where to look, I'd argue yes. Neither the code in rd_get_dyns nor in lx_ldb_get_dyns32 ever reads these values, they are purely return arguments. It functions should not be forced to allocate space for a pointer that they subsequently never use, so presenting the option of passing NULL to ignore that return argument seems like the best interface. This is common practice in other standard libraries I have familiarity with, specifically calls such as sigaction(2) where if, the third argument is not NULL, it is written to contain the previous sigaction struct. The point being that the third argument can be NULL, in which case this doesn't happen.
Adding more info on whether the NULL argument itself makes sense:
In a separate case in rd_get_dyns, it calls the function _rd_get_dyns32 which contains the lines
447 *dynpp = dynp; 448 if (dynpp_sz != NULL) 449 *dynpp_sz = phdr.p_filesz;
Where dynpp, and dynpp_sz are the same arguments as before. This provides support for the idea that at least dynpp_sz should be allowed to be NULL.
illumos-joyent commit 5eeac42 (branch master, by DJ Hoffman)
OS-3103 lx_ldb_get_dyns32 doesn't properly handle NULL pointer inputs
Reviewed by: Jerry Jelinek <jerry.jelinek@joyent.com>
Reviewed by: Robert Mustacchi <rm@joyent.com>