Issue Type: | Bug |
---|---|
Priority: | 4 - Normal |
Status: | Resolved |
Created at: | 2015-06-10T15:02:12.000Z |
Updated at: | 2019-11-08T21:53:52.672Z |
Created by: | Former user |
---|---|
Reported by: | Former user |
Fixed: A fix for this issue is checked into the tree and tested.
(Resolution Date: 2019-05-03T20:40:29.336Z)
If you set the following in /etc/ssh/sshd_config on an lx-brand image:
UsePrivilegeSeparation sandbox
The sshd service will fail to start, causing the provision to fail. The following error shows up in the logs:
auth.crit sshd[4640]: fatal: Read from socket failed: Resource temporarily unavailable [preauth]
If you enable LogLevel DEBUG3 in sshd_config, you'll see the following:
May 12 13:41:49 817d00a4-6e1f-eabb-b821-fc254e55e77b auth.debug sshd[54876]: debug3: ssh_sandbox_init: preparing seccomp filter sandbox
May 12 13:41:49 817d00a4-6e1f-eabb-b821-fc254e55e77b auth.debug sshd[54876]: debug2: Network child is on pid 54877
May 12 13:41:49 817d00a4-6e1f-eabb-b821-fc254e55e77b auth.debug sshd[54876]: debug3: preauth child monitor started
May 12 13:41:49 817d00a4-6e1f-eabb-b821-fc254e55e77b auth.debug sshd[54876]: debug3: privsep user:group 22:22 [preauth]
May 12 13:41:49 817d00a4-6e1f-eabb-b821-fc254e55e77b auth.debug sshd[54876]: debug1: permanently_set_uid: 22/22 [preauth]
May 12 13:41:49 817d00a4-6e1f-eabb-b821-fc254e55e77b auth.debug sshd[54876]: debug3: ssh_sandbox_child: setting PR_SET_NO_NEW_PRIVS [preauth]
May 12 13:41:49 817d00a4-6e1f-eabb-b821-fc254e55e77b auth.debug sshd[54876]: debug1: ssh_sandbox_child: prctl(PR_SET_NO_NEW_PRIVS): Function not implemented [preauth]
May 12 13:41:49 817d00a4-6e1f-eabb-b821-fc254e55e77b auth.debug sshd[54876]: debug3: ssh_sandbox_child: attaching seccomp filter program [preauth]
May 12 13:41:49 817d00a4-6e1f-eabb-b821-fc254e55e77b auth.debug sshd[54876]: debug1: ssh_sandbox_child: prctl(PR_SET_SECCOMP): Function not implemented [preauth]
May 12 13:41:49 817d00a4-6e1f-eabb-b821-fc254e55e77b auth.debug sshd[54876]: debug1: list_hostkey_types: ssh-rsa,ssh-dss,ecdsa-sha2-nistp256,ssh-ed25519 [preauth]
May 12 13:41:49 817d00a4-6e1f-eabb-b821-fc254e55e77b auth.debug sshd[54876]: debug1: SSH2_MSG_KEXINIT sent [preauth]
May 12 13:41:49 817d00a4-6e1f-eabb-b821-fc254e55e77b auth.crit sshd[54876]: fatal: Read from socket failed: Resource temporarily unavailable [preauth]
Googling the above, it seems like an issue with sandbox_child so I changed the sshd setting:
UsePrivilegeSeparation sandbox
To:
UsePrivilegeSeparation yes
The sshd service starts and the issue goes away.
This seems to happen regardless of the version of Alpine and OpenSSH.
Also worth noting is that "UsePrivilegeSeparation yes" as a workaround is supposedly less secure than "UsePrivilegeSeparation sandbox".
Since we don't support the seccomp stuff at the moment, it would probably be best to set that as the default in our images until the functionality can be revisited.
OK, I'll stick with "UsePrivilegeSeparation yes" for now.
We would need to implement bpf seccomp support in order for sandbox to work, so we should continue to use the workaround in the config file indefinitely.
OpenSSH 7.5 has made UsePrivilegeSeparation sandbox compulsory and it can no longer be reconfigured. This is starting to appear in some mainstream Linux distros now.
FWIW, in sandbox-seccomp-filter.c, it looks like this:
void
ssh_sandbox_child(struct ssh_sandbox *box)
{
	struct rlimit rl_zero;
	int nnp_failed = 0;

	/* Set rlimits for completeness if possible. */
	rl_zero.rlim_cur = rl_zero.rlim_max = 0;
	if (setrlimit(RLIMIT_FSIZE, &rl_zero) == -1)
		fatal("%s: setrlimit(RLIMIT_FSIZE, { 0, 0 }): %s",
		    __func__, strerror(errno));
	if (setrlimit(RLIMIT_NOFILE, &rl_zero) == -1)
		fatal("%s: setrlimit(RLIMIT_NOFILE, { 0, 0 }): %s",
		    __func__, strerror(errno));
	if (setrlimit(RLIMIT_NPROC, &rl_zero) == -1)
		fatal("%s: setrlimit(RLIMIT_NPROC, { 0, 0 }): %s",
		    __func__, strerror(errno));

#ifdef SANDBOX_SECCOMP_FILTER_DEBUG
	ssh_sandbox_child_debugging();
#endif /* SANDBOX_SECCOMP_FILTER_DEBUG */

	debug3("%s: setting PR_SET_NO_NEW_PRIVS", __func__);
	if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0) == -1) {
		debug("%s: prctl(PR_SET_NO_NEW_PRIVS): %s",
		    __func__, strerror(errno));
		nnp_failed = 1;
	}
	debug3("%s: attaching seccomp filter program", __func__);
	if (prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &preauth_program) == -1)
		debug("%s: prctl(PR_SET_SECCOMP): %s",
		    __func__, strerror(errno));
	else if (nnp_failed)
		fatal("%s: SECCOMP_MODE_FILTER activated but "
		    "PR_SET_NO_NEW_PRIVS failed", __func__);
}
It looks like they made it non-fatal if both the PR_SET_NO_NEW_PRIVS and PR_SET_SECCOMP calls fail, I guess to continue working on kernels without seccomp-bpf. So we're not actually dying due to the seccomp calls here, at least not directly.
Maybe it's those setrlimit calls?
One of the calls that we're actually dying on later is a call to select(). Once we've set a zero rlimit for RLIMIT_NOFILE, it seems that select() (and some other functions maybe?) always returns an error. There's actually a configure.ac test looking for this in OpenSSH, but of course it's a build-time check and these binaries were all built on real Linux, where select/poll/read etc are always fine to use on an existing FD, even after you've set your RLIMIT_NOFILE to zero.
Gonna have to trace this down to where in the kernel we are when we deny this FD creation and see if we can match Linux's behaviour.
At https://github.com/joyent/illumos-joyent/blob/master/usr/src/uts/common/brand/lx/syscall/lx_poll.c#L175-L182 it seems that we copy-pasted from the native poll implementation, where we check the RLIMIT_NOFILE during poll or select. Linux does not do this (it only checks that rlimit when handling syscalls that the user expects to actually allocate a new FD -- open, pipe, dup etc). So select/poll are failing in this context where on native Linux they succeed just fine.
Confirmed this by hot-patching out that branch in the code for lx_poll_common and lx_select_common -- now OpenSSH is working fine in a new alpine image.
I tested the CR by creating an Alpine lx-branded zone with the following JSON:
{
  "brand": "lx",
  "kernel_version": "4.3",
  "image_uuid": "19aa3328-0025-11e7-a19a-c39077bfd4cf",
  "autoboot": true,
  "alias": "alpine",
  "hostname": "alpine",
  "max_physical_memory": 512,
  "max_swap": 1024,
  "nics": [
    {
      "nic_tag": "admin",
      "ip": "dhcp",
      "primary": true
    }
  ]
}
# cat /etc/alpine-release
3.5.2
Followed the upgrade procedure documented here:
https://wiki.alpinelinux.org/wiki/Upgrading_Alpine
# cat /etc/alpine-release
3.8.1
On a platform without the patch
link - zebes ~ $ ssh root@192.168.168.106 date
Connection closed by 192.168.168.106 port 22
On a platform with the patch from the CR
link - zebes ~ $ ssh root@192.168.168.106 date
Wed Sep 26 17:56:58 UTC 2018
Also built a simple little tester:
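#include <stdio.h>
#include <poll.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>	/* strerror() */

int
main(int argc, char *argv[])
{
	int rc;
	struct pollfd pfd;

	/*
	 * Deliberately pass an nfds far larger than the one-entry
	 * array, to probe how the kernel limits nfds.
	 */
	rc = poll(&pfd, 1024*1024, 0);
	printf("rc = %d (errno = %d / %s)\n", rc, errno, strerror(errno));
	return (0);
}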
Ran this with 1024*1024 and 1024*1024+1 -- it gets EFAULT and EINVAL, respectively, so the limit is working as expected.
illumos-joyent commit eee2da8296e69bdb47747ea36f4d186e6a133c96 (branch master, by Alex Wilson)
OS-4407 OpenSSH 7.5+ broken in lx brand
Reviewed by: Robert Mustacchi <rm@joyent.com>
Reviewed by: Mike Zeller <mike.zeller@joyent.com>
Approved by: Robert Mustacchi <rm@joyent.com>
I just tried the workaround described in the description with OpenSSH 8.1 on Alpine 3.11_alpha20190925 running on SmartOS joyent_20180816T001857Z and found it no longer works. It logs the following.
/etc/ssh/sshd_config line 96: Deprecated option UsePrivilegeSeparation
PIs with this fix are not broken. Only the workaround that is needed on 7.5 is broken with 8.1.