OS-6312

lx setsockopt TCP_NODELAY returns EINVAL when (so->so_state & SS_CANTSENDMORE)

Status:
Open
Created:
2017-08-25T15:34:27.000-0400
Updated:
2023-09-26T11:56:08.691-0400

Description

See illumos-joyent#148

6    -> lx_setsockopt_tcp 2017 Aug 25 07:41:06

              lx_brand`lx_setsockopt+0x315
              lx_brand`lx_syscall_enter+0x16f
              unix`sys_syscall+0x142

  6    | lx_setsockopt_tcp:entry 
  6        -> lx_sockopt_lookup 
  6        <- lx_sockopt_lookup Returns 0x1
  6        -> socket_setsockopt 
  6            -> so_setsockopt 
  6            <- so_setsockopt Returns 0x16
  6        <- socket_setsockopt Returns 0x16
  6    <- lx_setsockopt_tcp Returns 0x16

The 0x16 return value is EINVAL (22). It comes from this check in so_setsockopt:

792  	/* X/Open requires this check */
793  	if (so->so_state & SS_CANTSENDMORE && !xnet_skip_checks) {
794  		SO_UNBLOCK_FALLBACK(so);
795  		if (xnet_check_print)
796  			printf("sockfs: X/Open setsockopt check => EINVAL\n");
797  		return (EINVAL);
798  	}
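
To make the effect of the two tunables concrete, here is a small user-space model of the check above. It is only a sketch, not the sockfs code: the `model_` names are hypothetical, the flag value is a placeholder rather than the value from the kernel headers, and SO_UNBLOCK_FALLBACK() is left out.

```c
#include <errno.h>
#include <stdio.h>

/* Placeholder bit for illustration; not taken from the kernel headers. */
#define	MODEL_SS_CANTSENDMORE	0x10

/* Stand-ins for the sockfs tunables of the same (unprefixed) names. */
int model_xnet_skip_checks = 0;
int model_xnet_check_print = 0;

/*
 * Model of the X/Open check: once the send side has been shut down,
 * setsockopt fails with EINVAL unless the checks are skipped.
 */
int
model_setsockopt_check(unsigned int so_state)
{
	if ((so_state & MODEL_SS_CANTSENDMORE) && !model_xnet_skip_checks) {
		if (model_xnet_check_print)
			printf("model: X/Open setsockopt check => EINVAL\n");
		return (EINVAL);
	}
	return (0);
}
```

With the defaults, a state that includes the can't-send-more bit yields EINVAL. Setting `model_xnet_skip_checks` to 1 makes the same call return 0.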

Comments (7)

Dan McDonald commented on 2017-08-25T15:43:18.000-0400 (edited 2017-12-14T12:01:32.150-0500):

If there's a way to determine if the calling thread and/or sonode is from an LX zone, adding more checks to line 793 above should be sufficient. It's possible that the semantics for failing here EXIST but are merely different in Linux. If that's indeed the case, perhaps a check in the lx_brand kernel module that modsubs out to return 0 would be best.

Dan McDonald commented on 2020-01-30T14:39:54.120-0500 (edited 2020-01-31T10:20:00.890-0500):

Attaching a toy program.

macOS 10.15.2:

everywhere(/tmp)[0]% ./a.out
everywhere(/tmp)[0]% ./a.out asdf
a.out: setsockopt(TCP_NODELAY): Invalid argument
everywhere(/tmp)[255]%

SmartOS:

smartos-build(/tmp)[0]% ./a.out
smartos-build(/tmp)[0]% ./a.out asdf
a.out: setsockopt(TCP_NODELAY): Invalid argument
smartos-build(/tmp)[255]%

CentOS (on bhyve):

centos(/tmp)[0]% ./a.out
centos(/tmp)[0]% ./a.out asdf
centos(/tmp)[0]%

LX Ubuntu (using compiled-on-CentOS binary):

root@7e738cf4-2f13-45a1-a51b-80f2be35fa78:/tmp# ./a.out
root@7e738cf4-2f13-45a1-a51b-80f2be35fa78:/tmp# ./a.out  asdf
a.out: setsockopt(TCP_NODELAY): Invalid argument
root@7e738cf4-2f13-45a1-a51b-80f2be35fa78:/tmp#

FreeBSD (12.1 shown, same on 11.3, thanks @Former user):

timf@puroto ./OS-6312
timf@puroto echo $?
0
timf@puroto ./OS-6312 asdf
timf@puroto echo $?
0
timf@puroto
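
The attached toy program is not reproduced in this ticket text. Below is a minimal sketch of what such a reproducer might look like. It assumes that passing any argument makes the program call shutdown(SHUT_WR) before setsockopt(TCP_NODELAY). The function name, the loopback listener setup and the return convention are illustrative, not taken from the attachment.

```c
#include <arpa/inet.h>
#include <errno.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Connect a TCP socket to a loopback listener, optionally shut down its
 * send side, then try to set TCP_NODELAY.
 *
 * Returns 0 if setsockopt succeeded, the errno from setsockopt if it
 * failed, or -1 if the socket setup itself failed.
 */
int
nodelay_after_shutdown(int do_shutdown)
{
	struct sockaddr_in sin;
	socklen_t len = sizeof (sin);
	int lfd, cfd, on = 1, ret = -1;

	lfd = socket(AF_INET, SOCK_STREAM, 0);
	cfd = socket(AF_INET, SOCK_STREAM, 0);
	if (lfd < 0 || cfd < 0)
		goto out;

	memset(&sin, 0, sizeof (sin));
	sin.sin_family = AF_INET;
	sin.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
	sin.sin_port = 0;	/* let the kernel pick a free port */

	/* The listen backlog lets connect() complete without accept(). */
	if (bind(lfd, (struct sockaddr *)&sin, sizeof (sin)) != 0 ||
	    listen(lfd, 1) != 0 ||
	    getsockname(lfd, (struct sockaddr *)&sin, &len) != 0 ||
	    connect(cfd, (struct sockaddr *)&sin, sizeof (sin)) != 0)
		goto out;

	/* This puts the socket in the "can't send more" state. */
	if (do_shutdown && shutdown(cfd, SHUT_WR) != 0)
		goto out;

	if (setsockopt(cfd, IPPROTO_TCP, TCP_NODELAY, &on, sizeof (on)) == 0)
		ret = 0;
	else
		ret = errno;
out:
	if (cfd >= 0)
		(void) close(cfd);
	if (lfd >= 0)
		(void) close(lfd);
	return (ret);
}
```

A main() that passes `argc > 1` as `do_shutdown` and reports a non-zero result would mirror the `./a.out` and `./a.out asdf` runs above. Going by those transcripts, the shutdown case should return EINVAL on SmartOS, LX and macOS, and 0 on Linux and FreeBSD.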

Dan McDonald commented on 2020-01-30T15:49:07.282-0500:

As a workaround, run this in the global zone:

root@global# echo "xnet_skip_checks/W1" | mdb -kw

Change the '1' to '0' to restore default system behavior.  NOTE that several X/Net socket safety checks (not just setsockopt() after shutdown) will be disabled with that switch.

Dan McDonald commented on 2020-02-18T17:55:43.709-0500:

There is a workaround.  Given the corner-case nature of this, I'm inclined to suggest the workaround until a hard problem with it is found.  A FreeBSD maintainer suggested this POSIX compliance never occurred to them, so it may be a case of "satisfying the standards lawyers by default".

servicenowjiraconnector commented on 2020-02-28T16:52:19.089-0500:

02-28-2020 14:52:18 - Mary Hood (Work notes (Internal Only))
ZD Ticket:
https://joyentcloud.zendesk.com/agent/tickets/229754

Dan McDonald commented on 2020-03-05T11:38:16.256-0500 (edited 2020-03-05T11:40:06.654-0500):

Hmmm.  Truthfully, this might be a good thing to instantiate per-zone. OR (and this puts brand code into bigger paths) have a per-brand policy on this.

Dan McDonald commented on 2023-09-26T11:55:17.020-0400 (edited 2023-09-26T11:56:08.672-0400):

Reupping with a more detailed analysis:
What does xnet_skip_checks affect?

And finally, do we want this to be one global switch, or per-zone?
If we think per-zone is a good idea, we need to decide how to do it:
sockfs does not have zone-specific data (ZSD) today, but it can probably
query a zone property (set with zonecfg?) at creation time and cache it
somewhere (SM_* bits).