OS-7511: undefined behavior in lx_cap_update_priv() turns gcc7 demonic

Details

Issue Type:Bug
Priority:4 - Normal
Status:Resolved
Created at:2019-01-11T20:46:47.922Z
Updated at:2019-08-06T20:45:20.433Z

People

Created by:Former user
Reported by:Former user
Assigned to:Former user

Resolution

Fixed: A fix for this issue is checked into the tree and tested.
(Resolution Date: 2019-08-06T20:45:20.415Z)

Description

In running LX on a gcc7-compiled platform, systemd was observed to be enagaged in periodic heavy work. Looking at systemctl revealed some failed units:

root@f184bb79-8aef-c588-c2ec-8a3e33a1ed68:~# systemctl list-units --state=failed
  UNIT                            LOAD   ACTIVE SUB    JOB   DESCRIPTION
● dbus.service                    loaded failed failed start D-Bus System Message Bus
● systemd-random-seed.service     loaded failed failed       Load/Save Random Seed
● systemd-remount-fs.service      loaded failed failed       Remount Root and Kernel File Systems
● systemd-journald-dev-log.socket loaded failed failed       Journal Socket (/dev/log)
● systemd-journald.socket         loaded failed failed       Journal Socket

LOAD   = Reflects whether the unit definition was properly loaded.
ACTIVE = The high-level unit activation state, i.e. generalization of SUB.
SUB    = The low-level unit activation state, values depend on unit type.
JOB    = Pending job for the unit.

5 loaded units listed. Pass --all to see loaded but inactive units, too.
To show all installed unit files use 'systemctl list-unit-files'.

Looking at D-Bus in particular:

root@f184bb79-8aef-c588-c2ec-8a3e33a1ed68:~# systemctl -l status dbus.service
● dbus.service - D-Bus System Message Bus
   Loaded: loaded (/lib/systemd/system/dbus.service; static; vendor preset: enabled)
   Active: failed (Result: core-dump) since Fri 2019-01-11 00:05:18 UTC; 9h ago
     Docs: man:dbus-daemon(1)
  Process: 13417 ExecStop=/bin/true (code=exited, status=0/SUCCESS)
  Process: 8474 ExecStart=/usr/bin/dbus-daemon --system --address=systemd: --nofork --nopidfile --systemd-activation (code=dumped, signal=SEGV)
 Main PID: 8474 (code=dumped, signal=SEGV)

And indeed, we have quite a few coredumps from dbus:

[root@headnode (east-bay-1) ~]# ls -1 /zones/f184bb79-8aef-c588-c2ec-8a3e33a1ed68/cores/*dbus*
/zones/f184bb79-8aef-c588-c2ec-8a3e33a1ed68/cores/core.dbus-daemon.6363
/zones/f184bb79-8aef-c588-c2ec-8a3e33a1ed68/cores/core.dbus-daemon.6550
/zones/f184bb79-8aef-c588-c2ec-8a3e33a1ed68/cores/core.dbus-daemon.8474
/zones/f184bb79-8aef-c588-c2ec-8a3e33a1ed68/cores/core.dbus-daemon.84961

Looking at one of these:

[root@headnode (east-bay-1) ~]# mdb /zones/f184bb79-8aef-c588-c2ec-8a3e33a1ed68/cores/core.dbus-daemon.6363
Loading modules: [ libc.so.1 ld.so.1 libc.so.6 ]
> $c
LMfd`libc.so.1`ascii_strncasecmp+0x10(656e690036706475, 7fffef1d33d7, 5)
LMfd`libc.so.1`strncasecmp_l+0x75(656e690036706475, 7fffef1d33d7, 5, 7fffef1ef8c0)
LMfd`libc.so.1`strncasecmp+0x35(656e690036706475, 7fffef1d33d7, 5)
LMfd`libc.so.1`__priv_getbyname+0x4a(7fffef2b0890, 656e690036706475)
LMfd`libc.so.1`priv_getbyname+0x30(656e690036706475)
LMfd`libc.so.1`priv_ismember+0x16(7fffef28eb48, 656e690036706475)
LMfd`lx_brand.so.1`lx_cap_update_priv+0xb2(7fffef28eb48, 7fffef28eb98)
LMfd`lx_brand.so.1`lx_cap_to_priv+0x2f(7fffef28eb90, 7fffef28eb20)
LMfd`lx_brand.so.1`lx_capset+0xfb(7fffeefb0724, 7fffeefb072c)
LMfd`lx_brand.so.1`lx_emulate+0xc3(7fffef28eca0, 7e, 7fffef28ec70)

Note that we are calling priv_ismember from lx_cap_update_priv with a clearly bogus string.

Here's the code for lx_cap_update_priv:

static long
lx_cap_update_priv(priv_set_t *priv, const uint32_t cap[])
{
        int i, j; 
        boolean_t cap_set;
        boolean_t priv_set;
        boolean_t updated = B_FALSE;
        for (i = 0; i <= LX_CAP_MAX_CHECK; i++) {
                cap_set = LX_CAP_CAPISSET(i, cap);
                if (lx_cap_mapping[i] == NULL || i > LX_CAP_MAX_VALID) {
                        /* don't allow setting unsupported caps */
                        if (cap_set) {
                                /*
                                 * CAP_SETPCAP is a special capability, with
                                 * varying behavior, that can be used to
                                 * control if the process can change other
                                 * process's capabilities, or to control moving
                                 * capabilities between sets. For now we ignore
                                 * this if its passed in.
                                 */
                                if (i == LX_CAP_SETPCAP) {
                                        continue; 
                                }
                                lx_unsupported("set unsupported capability %d",
                                    i);
                                return (-1);
                        } else {
                                continue;
                        }
                }
                for (j = 0; lx_cap_mapping[i][j] != NULL; j++) {
                        priv_set = priv_ismember(priv, lx_cap_mapping[i][j]);
                        if (priv_set && !cap_set) {
                                VERIFY0(priv_delset(priv,
                                    lx_cap_mapping[i][j]));
                                updated = B_TRUE;
                        } else if (!priv_set && cap_set) {
                                VERIFY0(priv_addset(priv,
                                    lx_cap_mapping[i][j]));
                                updated = B_TRUE;
                        }
                }
        }
        if (updated)
                return (1);
        else
                return (0);
}

Investigation reveals that this string is sitting off the end of lx_cap_mapping (in an adjacent symbol):

> LMfd`lx_brand.so.1`lx_cap_mapping,0t41/nap
LMfd`lx_brand.so.1`lx_cap_mapping:
LMfd`lx_brand.so.1`lx_cap_mapping:              
LMfd`lx_brand.so.1`lx_cap_mapping:              LMfd`lx_brand.so.1`lx_cap_map_chown
LMfd`lx_brand.so.1`lx_cap_mapping+8:            LMfd`lx_brand.so.1`lx_cap_map_dac_override
LMfd`lx_brand.so.1`lx_cap_mapping+0x10:         LMfd`lx_brand.so.1`lx_cap_map_dac_read_search
LMfd`lx_brand.so.1`lx_cap_mapping+0x18:         LMfd`lx_brand.so.1`lx_cap_map_fowner
LMfd`lx_brand.so.1`lx_cap_mapping+0x20:         LMfd`lx_brand.so.1`lx_cap_map_fsetid
LMfd`lx_brand.so.1`lx_cap_mapping+0x28:         LMfd`lx_brand.so.1`lx_cap_map_kill
LMfd`lx_brand.so.1`lx_cap_mapping+0x30:         LMfd`lx_brand.so.1`lx_cap_map_setgid
LMfd`lx_brand.so.1`lx_cap_mapping+0x38:         LMfd`lx_brand.so.1`lx_cap_map_setuid
LMfd`lx_brand.so.1`lx_cap_mapping+0x40:         0               
LMfd`lx_brand.so.1`lx_cap_mapping+0x48:         LMfd`lx_brand.so.1`lx_cap_map_linux_immutable
LMfd`lx_brand.so.1`lx_cap_mapping+0x50:         LMfd`lx_brand.so.1`lx_cap_map_bind_service
LMfd`lx_brand.so.1`lx_cap_mapping+0x58:         0               
LMfd`lx_brand.so.1`lx_cap_mapping+0x60:         LMfd`lx_brand.so.1`lx_cap_map_net_admin
LMfd`lx_brand.so.1`lx_cap_mapping+0x68:         LMfd`lx_brand.so.1`lx_cap_map_net_raw
LMfd`lx_brand.so.1`lx_cap_mapping+0x70:         LMfd`lx_brand.so.1`lx_cap_map_ipc_lock
LMfd`lx_brand.so.1`lx_cap_mapping+0x78:         LMfd`lx_brand.so.1`lx_cap_map_ipc_owner
LMfd`lx_brand.so.1`lx_cap_mapping+0x80:         0               
LMfd`lx_brand.so.1`lx_cap_mapping+0x88:         0               
LMfd`lx_brand.so.1`lx_cap_mapping+0x90:         LMfd`lx_brand.so.1`lx_cap_map_sys_chroot
LMfd`lx_brand.so.1`lx_cap_mapping+0x98:         0               
LMfd`lx_brand.so.1`lx_cap_mapping+0xa0:         0               
LMfd`lx_brand.so.1`lx_cap_mapping+0xa8:         LMfd`lx_brand.so.1`lx_cap_map_sys_admin
LMfd`lx_brand.so.1`lx_cap_mapping+0xb0:         0               
LMfd`lx_brand.so.1`lx_cap_mapping+0xb8:         LMfd`lx_brand.so.1`lx_cap_map_sys_nice
LMfd`lx_brand.so.1`lx_cap_mapping+0xc0:         LMfd`lx_brand.so.1`lx_cap_map_sys_resource
LMfd`lx_brand.so.1`lx_cap_mapping+0xc8:         0               
LMfd`lx_brand.so.1`lx_cap_mapping+0xd0:         0               
LMfd`lx_brand.so.1`lx_cap_mapping+0xd8:         0               
LMfd`lx_brand.so.1`lx_cap_mapping+0xe0:         0               
LMfd`lx_brand.so.1`lx_cap_mapping+0xe8:         LMfd`lx_brand.so.1`lx_cap_map_audit_write
LMfd`lx_brand.so.1`lx_cap_mapping+0xf0:         LMfd`lx_brand.so.1`lx_cap_map_audit_control
LMfd`lx_brand.so.1`lx_cap_mapping+0xf8:         0               
LMfd`lx_brand.so.1`lx_cap_mapping+0x100:        0               
LMfd`lx_brand.so.1`lx_cap_mapping+0x108:        0               
LMfd`lx_brand.so.1`lx_cap_mapping+0x110:        0               
LMfd`lx_brand.so.1`lx_cap_mapping+0x118:        0               
LMfd`lx_brand.so.1`lx_cap_mapping+0x120:        0               
0x7fffef4fada8: 0               
0x7fffef4fadb0: 0               
0x7fffef4fadb8: 0               
LMfd`lx_brand.so.1`nca:         0x7fffef4e87c1  

And:

>  0x7fffef4e87c1/K
0x7fffef4e87c1: 656e690036706475 

So we've somehow managed to walk off the end of lx_cap_mapping, which doesn't seem possible from the code. Looking at the generated assembly it became clear that the generated assembly is incorrect -- which is in turn due to this undefined load:

        ...
        for (i = 0; i <= LX_CAP_MAX_CHECK; i++) {
                cap_set = LX_CAP_CAPISSET(i, cap);
                if (lx_cap_mapping[i] == NULL || i > LX_CAP_MAX_VALID) {
                        /* don't allow setting unsupported caps */
                        ...

lx_cap_mapping is LX_CAP_MAX_VALID + 1 in size; the load is of lx_cap_mapping[i] is technically undefined for sizes between LX_CAP_MAX_VALID + 1 and LX_CAP_MAX_CHECK. That said, this shouldn't affect the correctness of the code (much): assuming that lx_cap_mapping is mapped from LX_CAP_MAX_VALID + 1 to LX_CAP_MAX_CHECK, the load doesn't result in misbehavior because i is also greater than LX_CAP_MAX_VALID. But for some reason (read: compiler bug), the undefined behavior is causing the compiler to optimize away all other paths -- and it will iterate the loop well past LX_CAP_MAX_CHECK and into adjacent memory!

To see this in a much smaller example, take this C program:

#define NULL ((void *)0)

static char *arr[2] = { "nasal", "demons" };

long
func()
{
        int i;

        for (i = 0; i <= 2; i++) {
                if (arr[i] == NULL && i == 0)
                        return (0xbad);
        }

        return (0xfad);
}

Here is the code when generated with -O2 and GCC7:

% gcc --version
gcc (GCC) 7.3.0
Copyright (C) 2017 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
% gcc -O2 -c ub.c
% dis ub.o
disassembly for ub.o


section .text
func()
    func:     55                 pushl  %ebp
    func+0x1: b8 ad 0b 00 00     movl   $0xbad,%eax
    func+0x6: 89 e5              movl   %esp,%ebp
    func+0x8: 5d                 popl   %ebp
    func+0x9: c3                 ret    

Which definitely doesn't seem right at all! Fortunately, it's easy for us to rectify the undefined behavior here (by flipping the order in the conditional), which gets us out of the compiler issue.

Comments

Comment by Former user
Created at 2019-01-15T12:20:46.422Z

We probably need to think about the no-aggressive option outside of illumos-joyent as well. e.g. it's entirely possible that our very old openssl isn't GCC7-friendly...


Comment by Jira Bot
Created at 2019-02-05T19:28:01.884Z

illumos-joyent commit 5b248548c9ca0108aa36e7e3db7fbc46077bcf4b (branch master, by Bryan Cantrill)

OS-7498 __ctype_mask[EOF] has been working by accident
OS-7511 undefined behavior in lx_cap_update_priv() turns gcc7 demonic
OS-7517 -faggressive-loop-optimizations is too aggressive
OS-7518 array over-read in has_saved_fp()
OS-7535 allow platform builds on base-64 image
OS-7536 stop shipping poold(1M) and its cohorts
Reviewed by: Robert Mustacchi <rm@joyent.com>
Reviewed by: John Levon <john.levon@joyent.com>
Approved by: Robert Mustacchi <rm@joyent.com>