OS-5124: lxbrand sequential vfork exits causes segfault

Details

Issue Type:Bug
Priority:4 - Normal
Status:Resolved
Created at:2016-01-27T20:18:54.000Z
Updated at:2016-01-29T22:53:58.000Z

People

Created by:Former user
Reported by:Former user
Assigned to:Former user

Resolution

Fixed: A fix for this issue is checked into the tree and tested.
(Resolution Date: 2016-01-28T16:47:10.000Z)

Fix Versions

2016-02-04 Hurricane Karen (Release Date: 2016-04-02)

Related Issues

Labels

lxbrand

Description

Originally reported by Mark Saad via GH:

LX zones running on platforms after 20151126T062747Z cannot create or modify users, or modify their shadow or group files.

For example, running the lx-centos-6 image from 20150811 on 20151126T062747Z, I can correctly run useradd and vipw.

On 20160108T173524Z, useradd, groupadd, and vipw all generate a core file when run.

After conferring with rzezeski, we suspect this may be related to a change in how vfork works, as shown in commit 93c2b12.

Running strace on the affected binaries produces the following hang, which led Ryan to think it was due to vfork.

The strace output is here: http://pastebin.com/dkYtKKiZ

https://github.com/joyent/illumos-joyent/issues/96

The identified strace/vfork issue was fixed in OS-5086, but the segfault behavior is still present on a PI built from #master.

Comments

Comment by Former user
Created at 2016-01-28T00:24:25.000Z

Here's a relatively simple program to reproduce the issue:

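#include <spawn.h>

extern char **environ;

int main(void)
{
        pid_t pid;
        /* argv must be NULL-terminated */
        char *args[] = { "foo", "bar", NULL };

        /* The target does not exist, so each spawned child fails to exec and exits. */
        (void) posix_spawn(&pid, "/usr/sbin/does_not_exist", NULL, NULL, args, environ);

        (void) posix_spawn(&pid, "/usr/sbin/does_not_exist", NULL, NULL, args, environ);

        return (0);
}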

Evidence would indicate that the userspace emulation is somehow running on the LX stack for the second vfork.


Comment by Former user
Created at 2016-01-28T02:19:04.000Z

The issue lies in how the lx_is_vforked variable is handled. The lx_exit and lx_group_exit routines both check lx_is_vforked as an indication that they should embark on a "quick" exit, skipping certain follow-up logic to avoid trashing the address space. The problem is that both functions were decrementing lx_is_vforked themselves, causing underflow since the variable is in the shared address space.
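
To make the underflow concrete, here is a rough single-process model of the counter, not the actual lxbrand code. All names (is_vforked, sim_exit, sim_vfork_cycle, second_exit_is_quick) are hypothetical. The sketch assumes that the parent side of vfork also decrements the counter when it resumes, and that "vforked" is tested as a positive count; neither detail is confirmed above.

```c
#include <stdbool.h>

/*
 * Hypothetical stand-in for lx_is_vforked. A vfork child shares its
 * parent's memory, so one plain variable models both views of it.
 */
static int is_vforked = 0;

/* Child-side exit. Returns true if it took the "quick" exit path. */
static bool
sim_exit(bool exit_decrements)
{
        bool quick = (is_vforked > 0);  /* assumed form of the check */

        if (quick && exit_decrements)
                is_vforked--;           /* the extra decrement in the exit path */
        return (quick);
}

/* One vfork cycle: parent marks, child exits, parent unmarks on resume. */
static bool
sim_vfork_cycle(bool exit_decrements)
{
        bool quick;

        is_vforked++;                           /* parent, before vfork */
        quick = sim_exit(exit_decrements);      /* child, same memory */
        is_vforked--;                           /* parent, on resume (assumed) */
        return (quick);
}

/*
 * Runs two cycles from a clean state and reports whether the second
 * child exit was still recognized as a vfork exit.
 */
static bool
second_exit_is_quick(bool exit_decrements)
{
        is_vforked = 0;
        (void) sim_vfork_cycle(exit_decrements);
        return (sim_vfork_cycle(exit_decrements));
}
```

Under these assumptions, the first cycle leaves the counter at -1 when the exit path decrements it too. The second vfork only brings it back to 0, so the second child's exit no longer looks like a vfork exit and takes the full exit path in the shared address space. That is consistent with the two-spawn reproducer above failing on its second spawn.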


Comment by Former user
Created at 2016-01-28T16:45:46.000Z

illumos-joyent commit 67304e3 (branch master, by Patrick Mooney)

OS-5124 lxbrand sequential vfork exits causes segfault
Reviewed by: Jerry Jelinek <jerry.jelinek@joyent.com>


Comment by Former user
Created at 2016-01-28T16:55:19.000Z

Introduced by