OS-6564: segvn concatenation causes use-after-free

Details

Issue Type:Bug
Priority:4 - Normal
Status:Resolved
Created at:2018-01-20T02:31:09.371Z
Updated at:2018-01-26T18:10:53.142Z

People

Created by:Former user
Reported by:Former user
Assigned to:Former user

Resolution

Fixed: A fix for this issue is checked into the tree and tested.
(Resolution Date: 2018-01-26T18:10:53.131Z)

Fix Versions

2018-02-01 Kalm (Release Date: 2018-02-01)

Description

As part of performance testing for an unrelated issue, I was observing process segfaults during the bootstrap of binutils as part of a smartos-live build cycle.  These would occur consistently while doing a fresh bootstrap on a DEBUG platform image.  Non-DEBUG images were free of the problem.

 Looking at the core file, it appears have died while attempting to extend the stack segment:

> ::status
debugging core file of bash (32-bit) from 32d7b938-4092-65b5-97eb-893809d9142f
file: /usr/bin/bash
initial argv: /bin/bash ./genscripts.sh . /usr/gnu/lib /usr/gnu /usr/gnu i386-pc-solaris2.11
threading model: native threads
status: process terminated by SIGSEGV (Segmentation Fault), addr=8042d7c
> ::mappings
    BASE    LIMIT     SIZE NAME
 8043000  8048000     5000 [ stack ]

Further tracing indicated that the cause of the fault was the process apparently exceeding its VSZ limit, despite seemingly modest allocations. While trying to determine the reason of this failure, I traced the p_as->a_size measure for bash processes during the build and observed what appeared to be an underflow of a_size. As it's an unsigned value, that would indeed cause the VSZ failures when new allocations were attempted.

Continuing to debug, I added custom SDT probes where a_size was modified, trying to track down where its accounting went wrong. A source of suspect activity was a cycle of brk mappings followed by unmaps, where only the unmap would alter the size. The work in OS-6323 does allow for seg_hole segments to be mapped in, adding nothing to the VSZ, but they also subtract nothing when they are unmapped. This special handling of the S_HOLE segment flag was of particular suspicion.

Some dtrace was drawn up to trace the status of this flag in the brk allocations of interest which pointed out the culprit:

383307 brk_internal      81e1000 0
383307 as_map_locked     ffffff1a096ca400 81a1000 40000
383307 seg_alloc         ffffff1a096ca400 81a1000 40000
383307 seg_alloc         -966184019256
383307 segvn_create      ffffff1f0af1dac8 ffffff007df6cdf0 ffffff1f0af1dac8
383307 SF segvn_create 0
383307 SF segvn_create deadbeef
383307 segvn_create      0
383307 SF as_map_locked deadbeef
383307 as_map_locked     0

Here we see entry of brk_internal going to allocate a fresh segvn segment to be concatenated onto the existing heap. On entry to segvn_create the s_flags are zeroed, but on exit, they've been filled with the standard freed pattern by the allocator. Because the segment was concatenated with the existing heap, it makes sense that the passed-in struct seg would have been freed. The deadbeef pattern happens to match S_HOLE in the check following the allocation:

                seg = seg_alloc(as, addr, size);
                if (seg == NULL) {
                        AS_LOCK_EXIT(as);
                        return (ENOMEM);
                }

                error = (*crfp)(seg, argsp);
                if (error != 0) {
                        seg_free(seg);
                        AS_LOCK_EXIT(as);
                        return (error);
                }
                /*
                 * Add size now so as_unmap will work if as_ctl fails.
                 * Not applicable to explicit hole segments.
                 */
                if ((seg->s_flags & S_HOLE) == 0) {
                        as->a_size += rsize;
                        as->a_resvsize += rsize;
                } else {
                        is_hole = B_TRUE;
                }

The freed segment size would be considered a "hole" and not added to the VSZ, causing the error in a_size accounting.

Comments

Comment by Former user
Created at 2018-01-23T23:11:12.123Z

For testing, I repeated the same workload on a patched DEBUG kernel which would previously cause segfaults due to the a_size underflow and observed that the problem was absent. Additionally, I wrote a small program to manually adjust the brk in a way similar to how the offending bash process was. In tracking the size with dtrace, I could see the concatenated segvn allocations appropriately adjusting a_size.


Comment by Jira Bot
Created at 2018-01-26T18:10:02.419Z

illumos-joyent commit b8fef42ed91e3b916b058421d1783c5cba2a73bf (branch master, by Patrick Mooney)

OS-6564 segvn concatenation causes use-after-free
Reviewed by: Mike Gerdts <mike.gerdts@joyent.com>
Reviewed by: Jerry Jelinek <jerry.jelinek@joyent.com>
Approved by: Dan McDonald <danmcd@joyent.com>