See illumos-joyent#148
6 -> lx_setsockopt_tcp 2017 Aug 25 07:41:06
lx_brand`lx_setsockopt+0x315
lx_brand`lx_syscall_enter+0x16f
unix`sys_syscall+0x142
6 | lx_setsockopt_tcp:entry
6 -> lx_sockopt_lookup
6 <- lx_sockopt_lookup Returns 0x1
6 -> socket_setsockopt
6 -> so_setsockopt
6 <- so_setsockopt Returns 0x16
6 <- socket_setsockopt Returns 0x16
6 <- lx_setsockopt_tcp Returns 0x16
In so_setsockopt
792 /* X/Open requires this check */
793 if (so->so_state & SS_CANTSENDMORE && !xnet_skip_checks) {
794 SO_UNBLOCK_FALLBACK(so);
795 if (xnet_check_print)
796 printf("sockfs: X/Open setsockopt check => EINVAL\n");
797 return (EINVAL);
798 }
Dan McDonald commented on 2017-08-25T15:43:18.000-0400 (edited 2017-12-14T12:01:32.150-0500):
If there's a way to determine whether the calling thread and/or sonode is from an LX zone, adding more checks to line 793 above should be sufficient. It's possible that the semantics for failing here EXIST in Linux but are merely different. If that's indeed the case, perhaps a check in the lx_brand kernel module that modstubs out to return 0 would be best.
Dan McDonald commented on 2020-01-30T14:39:54.120-0500 (edited 2020-01-31T10:20:00.890-0500):
Attaching a toy program.
MacOS X 10.15.2:
everywhere(/tmp)[0]% ./a.out
everywhere(/tmp)[0]% ./a.out asdf
a.out: setsockopt(TCP_NODELAY): Invalid argument
everywhere(/tmp)[255]%
SmartOS:
smartos-build(/tmp)[0]% ./a.out
smartos-build(/tmp)[0]% ./a.out asdf
a.out: setsockopt(TCP_NODELAY): Invalid argument
smartos-build(/tmp)[255]%
Centos (on bhyve):
centos(/tmp)[0]% ./a.out
centos(/tmp)[0]% ./a.out asdf
centos(/tmp)[0]%
LX Ubuntu (using compiled-on-CentOS binary):
root@7e738cf4-2f13-45a1-a51b-80f2be35fa78:/tmp# ./a.out
root@7e738cf4-2f13-45a1-a51b-80f2be35fa78:/tmp# ./a.out asdf
a.out: setsockopt(TCP_NODELAY): Invalid argument
root@7e738cf4-2f13-45a1-a51b-80f2be35fa78:/tmp#
FreeBSD (12.1 shown, also same on 11.3, thanks @Former user)
timf@puroto ./OS-6312
timf@puroto echo $?
0
timf@puroto ./OS-6312 asdf
timf@puroto echo $?
0
timf@puroto
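The attached program is not reproduced in this ticket. Here is a minimal sketch of something that could produce the results above. It assumes that passing any argument makes the program call shutdown(SHUT_WR) on a connected TCP socket before setsockopt(TCP_NODELAY); that mapping is a guess from the output and from the SS_CANTSENDMORE check in so_setsockopt(). The function name nodelay_after_shutdown() and the loopback setup are hypothetical.

```c
#include <errno.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <arpa/inet.h>

/*
 * Hypothetical reconstruction, not the attached program.
 * Connects a TCP socket to a loopback listener, optionally shuts down
 * the write side, then tries setsockopt(TCP_NODELAY).
 * Returns 0 if setsockopt() succeeded, its errno if it failed,
 * or -1 if the socket setup itself failed.
 */
int
nodelay_after_shutdown(int do_shutdown)
{
	struct sockaddr_in sin;
	socklen_t len = sizeof (sin);
	int lfd, cfd, on = 1, rc = -1;

	lfd = socket(AF_INET, SOCK_STREAM, 0);
	cfd = socket(AF_INET, SOCK_STREAM, 0);
	if (lfd < 0 || cfd < 0)
		goto out;

	/* Listen on an ephemeral loopback port and connect to it. */
	memset(&sin, 0, sizeof (sin));
	sin.sin_family = AF_INET;
	sin.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
	sin.sin_port = 0;
	if (bind(lfd, (struct sockaddr *)&sin, sizeof (sin)) != 0 ||
	    listen(lfd, 1) != 0 ||
	    getsockname(lfd, (struct sockaddr *)&sin, &len) != 0 ||
	    connect(cfd, (struct sockaddr *)&sin, sizeof (sin)) != 0)
		goto out;

	/* Assumed trigger: this is what sets SS_CANTSENDMORE on illumos. */
	if (do_shutdown && shutdown(cfd, SHUT_WR) != 0)
		goto out;

	if (setsockopt(cfd, IPPROTO_TCP, TCP_NODELAY, &on, sizeof (on)) == 0)
		rc = 0;
	else
		rc = errno;	/* EINVAL is the failure reported above */
out:
	if (cfd >= 0)
		(void) close(cfd);
	if (lfd >= 0)
		(void) close(lfd);
	return (rc);
}
```

A main() that calls nodelay_after_shutdown(argc > 1) and reports a nonzero result with err(3) would give output shaped like the transcripts above.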
Dan McDonald commented on 2020-01-30T15:49:07.282-0500:
As a workaround utter this in the global zone:
root@global# echo "xnet_skip_checks/W1" | mdb -kw
Change the '1' to '0' to restore default system behavior. NOTE that several X/Net socket safety checks (not just setsockopt() after shutdown) will be disabled with that switch.
Dan McDonald commented on 2020-02-18T17:55:43.709-0500:
There is a workaround. Given the corner-case nature of this, I'm inclined to suggest the workaround until a hard problem with it is found. A FreeBSD maintainer suggested this POSIX compliance never occurred to them, so it may be a case of "satisfying the standards lawyers by default".
servicenowjiraconnector commented on 2020-02-28T16:52:19.089-0500:
02-28-2020 14:52:18 - Mary Hood (Work notes (Internal Only))
ZD Ticket:
https://joyentcloud.zendesk.com/agent/tickets/229754
Dan McDonald commented on 2020-03-05T11:38:16.256-0500 (edited 2020-03-05T11:40:06.654-0500):
Hmmm. Truthfully, this might be a good thing to instantiate per-zone. OR (and this puts brand code into bigger paths) have a per-brand policy on this.
Dan McDonald commented on 2023-09-26T11:55:17.020-0400 (edited 2023-09-26T11:56:08.672-0400):
Reupping with a more detailed analysis:
What does xnet_skip_checks affect?
Actually, there are two printing globals beyond xnet_skip_checks. One
flags when an xnet_skip_checks check fires (xnet_check_print), the other
is for a specific case in socksyscalls.c where copyout_name() has a
kernel namelen greater than the user-provided one. When that happens ulen
bytes get copied out, BUT the value written to ulenp ("bytes received")
is klen, not ulen.
While these vars appear in sockcommon.c, they don't appear to be USED AT
ALL.
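To make the truncation case concrete, here is a small user-space model of the behavior described above. It is not the kernel's copyout_name(): the function name, argument order, and use of memcpy() in place of copyout() are assumptions for illustration, and only the klen > ulen case from this comment is modeled.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical user-space model of the truncation case.
 * Copies at most ulen bytes of the kernel-side name into the user
 * buffer, but reports the full kernel length through ulenp.
 * Returns the number of bytes actually copied.
 */
size_t
model_copyout_name(void *ubuf, size_t ulen, size_t *ulenp,
    const void *kbuf, size_t klen)
{
	size_t ncopy = (ulen < klen) ? ulen : klen;

	if (ncopy != 0)
		memcpy(ubuf, kbuf, ncopy);

	/* The reported length is klen, even when fewer bytes were copied. */
	*ulenp = klen;
	return (ncopy);
}
```

With a 3-byte user buffer and an 8-byte kernel name, the caller gets 3 bytes of data and a reported length of 8, which is how a caller can tell the name was truncated.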
xnet_skip_checks processing appears in socktpi.c & sockcommon_sops.c
(corresponding to over-stream and direct sockets).
Four cases in each file:
bind (sotpi_bindlisten() & so_bind()) when SS_CANTSENDMORE is checked (EINVAL)
shutdown on unconnected sockets (sotpi_shutdown() & so_shutdown()) (ENOTCONN)
getpeername (sotpi_getpeername() & so_getpeername()) when SS_CANTSENDMORE (EINVAL)
setsockopt (sotpi_setsockopt() & so_setsockopt()) when SS_CANTSENDMORE (EINVAL)
I can't see what problem any of these solve except perhaps the shutdown
on unconnected socket, but is no-error-on-NOP okay?
Two change local socket state (bind & setsockopt) only, why block on
SS_CANTSENDMORE?
The last one (getpeername) queries local socket state, why block on SS_CANTSENDMORE?
And finally, do we want this to be a big-global switch? Or per-zone?
IF we think per-zone is a good idea, we need to be prepared for how to do
this, as sockfs does not have ZSD today, but can probably at creation time
query a zone property (set with zonecfg?) and cache it somewhere (SM_* bits).
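As a rough illustration of the per-zone idea, here is a user-space model in which a zone property is read once at socket creation and cached in a mode bit that the setsockopt check then consults. Everything here is hypothetical: the struct layouts, the MODEL_* names, and the bit values are stand-ins, not sockfs definitions, and no such zone property exists today.

```c
#include <errno.h>
#include <stdbool.h>

#define	MODEL_SS_CANTSENDMORE	0x01	/* stand-in for SS_CANTSENDMORE */
#define	MODEL_SM_SKIPXNET	0x01	/* hypothetical cached-policy SM_* bit */

/* Hypothetical zone with a "skip X/Open socket checks" property. */
struct model_zone {
	bool mz_skip_xnet;
};

/* Hypothetical, heavily reduced sonode. */
struct model_sonode {
	unsigned ms_state;	/* models so_state */
	unsigned ms_mode;	/* models so_mode (SM_* bits) */
};

/* Models the existing global switch. */
static int model_xnet_skip_checks = 0;

/* At creation time, query the zone policy once and cache it. */
void
model_socreate(struct model_sonode *so, const struct model_zone *zone)
{
	so->ms_state = 0;
	so->ms_mode = zone->mz_skip_xnet ? MODEL_SM_SKIPXNET : 0;
}

/* The X/Open check, skipped by the global or the cached zone bit. */
int
model_setsockopt_check(const struct model_sonode *so)
{
	bool skip = model_xnet_skip_checks != 0 ||
	    (so->ms_mode & MODEL_SM_SKIPXNET) != 0;

	if ((so->ms_state & MODEL_SS_CANTSENDMORE) && !skip)
		return (EINVAL);
	return (0);
}
```

Caching the policy on the socket keeps the hot path to a single bit test and avoids needing ZSD in sockfs, at the cost that a property change would only affect sockets created afterwards.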