OS-4525

lxbrand implement getsockopt(TCP_INFO)

Status:
Resolved
Created:
2015-07-20T15:28:04.000-0400
Updated:
2024-01-25T16:38:34.881-0500

Description

I think I figured out the root cause. There is no log output for tutum/hello-world. In Linux, this just shows a blank line.

This is illustrated with this simple command:

➜  haproxy-bug  docker run -it tutum/hello-world
[20-Jul-2015 19:40:44] ERROR: failed to retrieve TCP_INFO for socket: Protocol not available (92)

Original steps, left for historical purposes:

1. Used docker-compose to spin up two containers that are linked to each other.
2. Attempted to view the logs on one container (dekobon/bouncy-haproxy). There were no problems.
3. Attempted to view the logs on the other container (tutum/hello-world). There were problems as listed below.

haproxy-bug  docker-compose -p hatest up -d
Creating hatest_site_1...
Creating hatest_lb_1...
➜  haproxy-bug  # ok. This is a huge divergence.
➜  haproxy-bug  # Docker logs should show all of the logs since container start...
➜  haproxy-bug  # let's see if we can get them without using docker-compose
➜  haproxy-bug  docker logs hatest_lb_1
*** Running /etc/my_init.d/00_regen_ssh_host_keys.sh...
*** Running /etc/rc.local...
*** Booting runit daemon...
*** Runit started as PID 5863
ok: run: syslog-ng: (pid 5874) 0s
Jul 20 18:50:02 215639fdfa0c syslog-ng[5874]: syslog-ng starting up; version='3.5.3'
Jul 20 18:50:04 127.0.0.1 haproxy[6335]: Proxy stats started.
Jul 20 18:50:04 127.0.0.1 haproxy[6335]: Proxy stats started.
Jul 20 18:50:04 127.0.0.1 haproxy[6335]: Proxy http-in started.
Jul 20 18:50:04 127.0.0.1 haproxy[6335]: Proxy http-in started.
Jul 20 18:50:04 127.0.0.1 haproxy[6335]: Proxy servers started.
Jul 20 18:50:04 127.0.0.1 haproxy[6335]: Proxy servers started.
➜  haproxy-bug  docker logs hatest_site_1
[20-Jul-2015 18:48:38] ERROR: failed to retrieve TCP_INFO for socket: Protocol not available (92)
➜  haproxy-bug  docker logs hatest_site_1
[20-Jul-2015 18:48:38] ERROR: failed to retrieve TCP_INFO for socket: Protocol not available (92)
➜  haproxy-bug  # That's weird...
quote>
➜  haproxy-bug  docker-compose -p hatest ps
    Name                   Command               State                    Ports
-------------------------------------------------------------------------------------------------
hatest_lb_1     /sbin/my_init                    Up      0.0.0.0:443->443/tcp, 0.0.0.0:80->80/tcp
hatest_site_1   /bin/sh -c php-fpm -d vari ...   Up      80/tcp
➜  haproxy-bug  docker logs hatest_site_1
[20-Jul-2015 18:48:38] ERROR: failed to retrieve TCP_INFO for socket: Protocol not available (92)
➜  haproxy-bug  docker ps
CONTAINER ID        IMAGE                    COMMAND                CREATED             STATUS              PORTS                                      NAMES
215639fdfa0c        dekobon/bouncy-haproxy   "/sbin/my_init"        2 minutes ago       Up 2 minutes        0.0.0.0:80->80/tcp, 0.0.0.0:443->443/tcp   hatest_lb_1
f3e0956d21cb        tutum/hello-world        "/bin/sh -c 'php-fpm   4 minutes ago       Up 3 minutes        80/tcp                                     hatest_site_1
➜  haproxy-bug  docker logs f3e0956d21cb
[20-Jul-2015 18:48:38] ERROR: failed to retrieve TCP_INFO for socket: Protocol not available (92)

Comments (10)

Josh Wilsdon commented on 2015-07-20T16:25:56.000-0400 (edited 2018-05-07T13:50:26.888-0400):

The error message here:

[20-Jul-2015 18:48:38] ERROR: failed to retrieve TCP_INFO for socket: Protocol not available (92)

is the actual content of the log, so sdc-docker is working correctly here. I've reproduced this myself with:

[joshw@sledgehammer ~]$ docker run -it tutum/hello-world
Unable to find image 'tutum/hello-world:latest' locally
latest: Pulling from tutum/hello-world (5f509591-d423-41ac-a4c3-6fd589a4aee2)
c833a1892a15: Already exists.
5e314245be14: Already exists.
da052ae129c2: Already exists.
a1dd7097a8e8: Already exists.
2e26b130ff1d: Already exists.
01fb4017dfb8: Already exists.
b93eb18c953b: Already exists.
8989fc3642f5: Already exists.
Status: Downloaded newer image for docker.io/tutum/hello-world:latest
[20-Jul-2015 20:02:52] ERROR: failed to retrieve TCP_INFO for socket: Protocol not available (92)

and this VM is happily running. The data in the log on the CN is indeed:

[root@5DVP842 (us-east-3b) ~]# cat /zones/0a6e729d-9036-47a9-a9db-7d289df052ab/logs/stdio.log
{"log":"[20-Jul-2015 20:02:52] ERROR: failed to retrieve TCP_INFO for socket: Protocol not available (92)\r\n","stream":"stdout","time":"2015-07-20T20:02:52.31660000Z"}
[root@5DVP842 (us-east-3b) ~]#

so we're correctly returning the data here. The remaining question is what this image is doing that it's trying to TCP_INFO here and that'll either be a bug with the image or the OS/LX here so I'm moving this to the OS project for now.

Former user commented on 2015-07-21T15:38:35.000-0400:

patrick.mooney:
	ElijahZ: What app is running inside that tutum container?
	The fact that it demands TCP_INFO is unfortunate
jwilsdon:
	pmooney this is what it runs:
	        "Cmd": [
            "/bin/sh",
            "-c",
            "php-fpm -d variables_order=\"EGPCS\" && exec nginx -g \"daemon off;\""
        ],
elijah.zupancic:
	I'm not sure, give me a second
patrick.mooney:
	ah
	probably nginx, then
	I assume the '-d' param on php-fpm will silence its output

Former user commented on 2015-07-21T21:33:53.000-0400:

You can find the Dockerfile definition here: https://registry.hub.docker.com/u/tutum/hello-world/dockerfile/

Hopping on the instance. I have invoked only php-fpm in daemon mode:

entrypoint-divergence  docker run -it tutum/hello-world sh
/ # php-fpm -d variables_order="EGPCS"
[22-Jul-2015 01:30:00] ERROR: failed to retrieve TCP_INFO for socket: Protocol not available (92)

So, it looks like php-fpm is the culprit.

Using my remedial understanding of dtrace:

/ # /native/usr/sbin/dtrace -n 'syscall:::entry/pid == $target/{@[probefunc] = count();}' -c "php-fpm"
dtrace: description 'syscall:::entry' matched 236 probes
[22-Jul-2015 01:32:43] ERROR: failed to retrieve TCP_INFO for socket: Protocol not available (92)
dtrace: pid 23986 has exited

  access                                                            1
  forksys                                                           1
  getcwd                                                            1
  getpid                                                            1
  getrlimit                                                         1
  memcntl                                                           1
  mmapobj                                                           1
  open                                                              1
  pollsys                                                           1
  rexit                                                             1
  schedctl                                                          1
  sigpending                                                        1
  getuid                                                            2
  lseek                                                             2
  sysconfig                                                         2
  zone                                                              2
  munmap                                                            4
  lwp_cond_broadcast                                                5
  setcontext                                                        5
  fstat                                                             6
  lstat                                                             6
  mprotect                                                          6
  modctl                                                            7
  fcntl                                                             8
  stat                                                              8
  close                                                            16
  mmap                                                             18
  lwp_sigmask                                                      22
  uucopy                                                           28
  brandsys                                                         71
  sigaction                                                        81

Former user commented on 2015-07-21T21:47:41.000-0400 (edited 2015-07-21T21:48:07.000-0400):

See attachment for strace output of php-fpm.

Former user commented on 2015-07-22T12:07:49.000-0400:

@Former user Can you confirm if the container running php-fpm is (or is not) functioning properly. From the above notes, the container appears to stay up. I want to differentiate between an error indicating broken service and a printed (but harmless) warning.

Former user commented on 2015-07-22T13:15:59.000-0400:

Just confirmed this doesn't impact hello world functionality. I do wonder, what effects it is having on php-fpm, but it is not a priority.

Former user commented on 2015-07-22T13:26:59.000-0400:

It appears that php uses TCP_INFO to query the queue length on the listening socket. From my cursory glance, it seems to handle failure gracefully.

Former user commented on 2019-09-03T16:21:44.418-0400 (edited 2019-09-03T16:22:18.021-0400):

I took a look at what it would take to implement this a while back. The Linux header file frustratingly leaves the struct undocumented, so I put together a commented version:

typedef struct lx_tcp_info {
	/* Current state in TCP state machine */
	uint8_t		tcpi_state;
	/* Congestion avoidance state */
	uint8_t		tcpi_ca_state;
	/* Number of unrecovered RTO timeouts */
	uint8_t		tcpi_retransmits;
	/* Unanswered 0 window probes */
	uint8_t		tcpi_probes;
	/* Current exponential backoff for RTO */
	uint8_t		tcpi_backoff;
	/* Enabled TCP options */
	uint8_t		tcpi_options;
#define	LX_TCPI_OPT_TIMESTAMPS	0x01	/* Negotiated TCP Timestamps */
#define	LX_TCPI_OPT_SACK	0x02	/* Negotiated SACK */
#define	LX_TCPI_OPT_WSCALE	0x04	/* Negotiated Window Scaling */
#define	LX_TCPI_OPT_ECN		0x08	/* Negotiated ECN */
#define	LX_TCPI_OPT_ECN_SEEN	0x10	/* Received at least 1 packet w/ ECT */
#define	LX_TCPI_OPT_SYN_DATA	0x20	/* Sent or received SYN-ACK for SYN */

	uint8_t
		tcpi_snd_wscale : 4,	/* Send window scale shift */
		tcpi_rcv_wscale : 4;	/* Receive window scale shift */

	/* Whether the application is sending less than the send window */
	uint8_t		tcpi_delivery_rate_app_limited : 1;

	/* Retransmission timeout (usecs) */
	uint32_t	tcpi_rto;
	/* Predicted soft clock tick for delivering delayed ACK */
	uint32_t	tcpi_ato;
	/* Maximum Segment Size, sent (RFC 4898 tcpEStatsStackMSSSent) */
	uint32_t	tcpi_snd_mss;
	/* Maximum Segment Size, received (RFC 4898 tcpEStatsStackMSSRcvd) */
	uint32_t	tcpi_rcv_mss;

	/* Sent but unacknowledged bytes */
	uint32_t	tcpi_unacked;
	/* # of SACKed packets; without SACK, # of recvd dup packets */
	uint32_t	tcpi_sacked;
	/* Estimated # of packets lost */
	uint32_t	tcpi_lost;
	/* Total # of rexmitted segments */
	uint32_t	tcpi_retrans;
	/* # of packets to highest SACKed sequence (deprecated on Linux) */
	uint32_t	tcpi_fackets;

	/* Time (msecs) since last sent data */
	uint32_t	tcpi_last_data_sent;
	/* Time (msecs) since last sent ACK (not filled in on Linux) */
	uint32_t	tcpi_last_ack_sent;
	/* Time (msecs) since last recv data */
	uint32_t	tcpi_last_data_recv;
	/* Time (msecs) since last recv ACK */
	uint32_t	tcpi_last_ack_recv;

	/* Last PMTU seen by socket */
	uint32_t	tcpi_pmtu;
	/* Slow start threshold (recv) */
	uint32_t	tcpi_rcv_ssthresh;
	/* Smoothed RTT (usecs) */
	uint32_t	tcpi_rtt;
	/* RTT variance (usecs) */
	uint32_t	tcpi_rttvar;
	/* Slow start threshold (send) */
	uint32_t	tcpi_snd_ssthresh;
	/* Send congestion window */
	uint32_t	tcpi_snd_cwnd;
	/* Advertised MSS */
	uint32_t	tcpi_advmss;
	/* ? */
	uint32_t	tcpi_reordering;

	/* ? */
	uint32_t	tcpi_rcv_rtt;
	/* Advertised recv window */
	uint32_t	tcpi_rcv_space;

	/* Total # of rexmitted segments for connection */
	uint32_t	tcpi_total_retrans;

	/* ? */
	uint64_t	tcpi_pacing_rate;
	/* ? */
	uint64_t	tcpi_max_pacing_rate;
	/* RFC 4898 tcpEStatsAppHCThruOctetsAcked */
	uint64_t	tcpi_bytes_acked;
	/* RFC 4898 tcpEStatsAppHCThruOctetsReceived */
	uint64_t	tcpi_bytes_received;
	/* RFC 4898 tcpEStatsPerfSegsOut */
	uint32_t	tcpi_segs_out;
	/* RFC 4898 tcpEStatsPerfSegsIn */
	uint32_t	tcpi_segs_in;
	/* Current # of unsent bytes */
	uint32_t	tcpi_notsent_bytes;
	/* Minimum observed RTT */
	uint32_t	tcpi_min_rtt;
	/* RFC 4898 tcpEStatsDataSegsIn */
	uint32_t	tcpi_data_segs_in;
	/* RFC 4898 tcpEStatsDataSegsOut */
	uint32_t	tcpi_data_segs_out;
	/* ? */
	uint64_t	tcpi_delivery_rate;
	/* Time (usecs) busy sending data */
	uint64_t	tcpi_busy_time;
	/* Time (usecs) limited by received window */
	uint64_t	tcpi_rwnd_limited;
	/* Time (usecs) limited by send buffer */
	uint64_t	tcpi_sndbuf_limited;

	/* Total # of data packets delivered to peer, including rexmits */
	uint32_t	tcpi_delivered;
	/* Same as the above field, but only counts ECE-marked packets */
	uint32_t	tcpi_delivered_ce;

	/* RFC 4898 tcpEStatsPerfHCDataOctetsOut */
	uint64_t	tcpi_bytes_sent;
	/* Total # of bytes rexmitted (RFC 4898 tcpEStatsPerfOctetsRetrans) */
	uint64_t	tcpi_bytes_retrans;
	/* RFC 4898 tcpEStatsStackDSACKDups */
	uint32_t	tcpi_dsack_dups;
	/* # of reordering events seen */
	uint32_t	tcpi_reord_seen;
} lx_tcp_info_t;

Note that when TCP_INFO is used on a listening socket, tcpi_unacked is the current backlog length, and tcpi_sacked is the maximum allowed backlog length.

I also took a look at what fields different software looks at:

Dan McDonald commented on 2024-01-16T10:01:01.725-0500:

Community member Carlos Neirac took this over. His testing notes from the github PR:

I have a sample C program to test some scenarios:
the results match in TCP states between native Linux and an LX branded zone, but the values for other struct members for tcp_info in this patch need more work in converting the values to Linux expected metrics.

LX Branded Zone

test_connect[SUCCESS] TCP state: LX_TCP_ESTABLISHED
test_connect[SUCCESS] tcpi_rto  1000
test_connect[SUCCESS] tcpi_last_data_recv 13125255
test_connect[SUCCESS] tcpi_rtt  4000000
test_connect[SUCCESS] tcpi_rttvar 500000
test_connect[SUCCESS] tcpi_snd_ssthresh 32975
test_connect[SUCCESS] tcpi_snd_cwnd  16360
test_connect[SUCCESS] tcpi_snd_mss  8180
test_connect[SUCCESS] tcpi_unacked 1289948609
test_connect[SUCCESS] tcpi_sacked 0
test_connect[SUCCESS] tcpi_pmtu  8252
test_connect[SUCCESS] tcpi_total_retrans 1289948609
test_connect[SUCCESS] tcpi_unacked  1289948609
test_connect[SUCCESS] tcpi_sacked  0
test_connect[SUCCESS] tcpi_rcv_space 5
test_connect[SUCCESS]:SOCK_STREAM NOT_CONNECTED TCP state: LX_TCP_CLOSE
test_connect[SUCCESS] tcpi_rto  1000
test_connect[SUCCESS] tcpi_last_data_recv 0
test_connect[SUCCESS] tcpi_rtt  4000000
test_connect[SUCCESS] tcpi_rttvar 500000
test_connect[SUCCESS] tcpi_snd_ssthresh 0
test_connect[SUCCESS] tcpi_snd_cwnd  0
test_connect[SUCCESS] tcpi_snd_mss  536
test_connect[SUCCESS] tcpi_unacked 0
test_connect[SUCCESS] tcpi_sacked 0
test_connect[SUCCESS] tcpi_pmtu  0
test_connect[SUCCESS] tcpi_total_retrans 0
test_connect[SUCCESS] tcpi_unacked  0
test_connect[SUCCESS] tcpi_sacked  0
test_connect[SUCCESS] tcpi_rcv_space 14
test_connect[FAIL] SOCK_STREAM CONNECTED: connect() failed: Address family not supported by protocol
test_connect[SUCCESS]:SOCK_STREAM CONNECTED TCP state: LX_TCP_ESTABLISHED
test_connect[SUCCESS] tcpi_rto  1000
test_connect[SUCCESS] tcpi_last_data_recv 13125256
test_connect[SUCCESS] tcpi_rtt  4000000
test_connect[SUCCESS] tcpi_rttvar 500000
test_connect[SUCCESS] tcpi_snd_ssthresh 32975
test_connect[SUCCESS] tcpi_snd_cwnd  16360
test_connect[SUCCESS] tcpi_snd_mss  8180
test_connect[SUCCESS] tcpi_unacked -253238741
test_connect[SUCCESS] tcpi_sacked 0
test_connect[SUCCESS] tcpi_pmtu  8232
test_connect[SUCCESS] tcpi_total_retrans -253238741
test_connect[SUCCESS] tcpi_unacked  -253238741
test_connect[SUCCESS] tcpi_sacked  0
test_connect[SUCCESS] tcpi_rcv_space 5
test_bound[SUCCESS] BOUND tcp state: LX_TCP_CLOSE
test_bound[SUCCESS] tcpi_rto  1000
test_bound[SUCCESS] tcpi_last_data_recv 0
test_bound[SUCCESS] tcpi_rtt  4000000
test_bound[SUCCESS] tcpi_rttvar 500000
test_bound[SUCCESS] tcpi_snd_ssthresh 0
test_bound[SUCCESS] tcpi_snd_cwnd  0
test_bound[SUCCESS] tcpi_snd_mss  536
test_bound[SUCCESS] tcpi_unacked 0
test_bound[SUCCESS] tcpi_sacked 0
test_bound[SUCCESS] tcpi_pmtu  0
test_bound[SUCCESS] tcpi_total_retrans 0
test_bound[SUCCESS] tcpi_unacked  0
test_bound[SUCCESS] tcpi_sacked  0
test_bound[SUCCESS] tcpi_rcv_space 14
test_bound[SUCCESS] LISTEN tcp state: LX_TCP_LISTEN
test_bound[SUCCESS] tcpi_rto  1000
test_bound[SUCCESS] tcpi_last_data_recv 0
test_bound[SUCCESS] tcpi_rtt  4000000
test_bound[SUCCESS] tcpi_rttvar 500000
test_bound[SUCCESS] tcpi_snd_ssthresh 0
test_bound[SUCCESS] tcpi_snd_cwnd  0
test_bound[SUCCESS] tcpi_snd_mss  536
test_bound[SUCCESS] tcpi_unacked 0
test_bound[SUCCESS] tcpi_sacked 0
test_bound[SUCCESS] tcpi_pmtu  0
test_bound[SUCCESS] tcpi_total_retrans 0
test_bound[SUCCESS] tcpi_unacked  0
test_bound[SUCCESS] tcpi_sacked  0
test_bound[SUCCESS] tcpi_rcv_space 14

Native Linux

test_connect[SUCCESS] TCP state: LX_TCP_ESTABLISHED
test_connect[SUCCESS] tcpi_rto  200000
test_connect[SUCCESS] tcpi_last_data_recv 0
test_connect[SUCCESS] tcpi_rtt  17
test_connect[SUCCESS] tcpi_rttvar 8
test_connect[SUCCESS] tcpi_snd_ssthresh 2147483647
test_connect[SUCCESS] tcpi_snd_cwnd  10
test_connect[SUCCESS] tcpi_snd_mss  32732
test_connect[SUCCESS] tcpi_unacked 0
test_connect[SUCCESS] tcpi_sacked 0
test_connect[SUCCESS] tcpi_pmtu  65536
test_connect[SUCCESS] tcpi_total_retrans 0
test_connect[SUCCESS] tcpi_unacked  0
test_connect[SUCCESS] tcpi_sacked  0
test_connect[SUCCESS] tcpi_rcv_space 65476
test_connect[SUCCESS]:SOCK_STREAM NOT_CONNECTED TCP state: LX_TCP_CLOSE
test_connect[SUCCESS] tcpi_rto  1000000
test_connect[SUCCESS] tcpi_last_data_recv 366854276
test_connect[SUCCESS] tcpi_rtt  0
test_connect[SUCCESS] tcpi_rttvar 250000
test_connect[SUCCESS] tcpi_snd_ssthresh 2147483647
test_connect[SUCCESS] tcpi_snd_cwnd  10
test_connect[SUCCESS] tcpi_snd_mss  536
test_connect[SUCCESS] tcpi_unacked 0
test_connect[SUCCESS] tcpi_sacked 0
test_connect[SUCCESS] tcpi_pmtu  0
test_connect[SUCCESS] tcpi_total_retrans 0
test_connect[SUCCESS] tcpi_unacked  0
test_connect[SUCCESS] tcpi_sacked  0
test_connect[SUCCESS] tcpi_rcv_space 0
test_connect[FAIL] SOCK_STREAM CONNECTED: connect() failed: Address family not supported by protocol
test_connect[SUCCESS]:SOCK_STREAM CONNECTED TCP state: LX_TCP_ESTABLISHED
test_connect[SUCCESS] tcpi_rto  200000
test_connect[SUCCESS] tcpi_last_data_recv 0
test_connect[SUCCESS] tcpi_rtt  12
test_connect[SUCCESS] tcpi_rttvar 6
test_connect[SUCCESS] tcpi_snd_ssthresh 2147483647
test_connect[SUCCESS] tcpi_snd_cwnd  10
test_connect[SUCCESS] tcpi_snd_mss  32741
test_connect[SUCCESS] tcpi_unacked 0
test_connect[SUCCESS] tcpi_sacked 0
test_connect[SUCCESS] tcpi_pmtu  65535
test_connect[SUCCESS] tcpi_total_retrans 0
test_connect[SUCCESS] tcpi_unacked  0
test_connect[SUCCESS] tcpi_sacked  0
test_connect[SUCCESS] tcpi_rcv_space 65495
test_bound[SUCCESS] BOUND tcp state: LX_TCP_CLOSE
test_bound[SUCCESS] tcpi_rto  1000000
test_bound[SUCCESS] tcpi_last_data_recv 366854276
test_bound[SUCCESS] tcpi_rtt  0
test_bound[SUCCESS] tcpi_rttvar 250000
test_bound[SUCCESS] tcpi_snd_ssthresh 2147483647
test_bound[SUCCESS] tcpi_snd_cwnd  10
test_bound[SUCCESS] tcpi_snd_mss  536
test_bound[SUCCESS] tcpi_unacked 0
test_bound[SUCCESS] tcpi_sacked 0
test_bound[SUCCESS] tcpi_pmtu  0
test_bound[SUCCESS] tcpi_total_retrans 0
test_bound[SUCCESS] tcpi_unacked  0
test_bound[SUCCESS] tcpi_sacked  0
test_bound[SUCCESS] tcpi_rcv_space 0
test_bound[SUCCESS] LISTEN tcp state: LX_TCP_LISTEN
test_bound[SUCCESS] tcpi_rto  0
test_bound[SUCCESS] tcpi_last_data_recv 0
test_bound[SUCCESS] tcpi_rtt  0
test_bound[SUCCESS] tcpi_rttvar 0
test_bound[SUCCESS] tcpi_snd_ssthresh 0
test_bound[SUCCESS] tcpi_snd_cwnd  10
test_bound[SUCCESS] tcpi_snd_mss  0
test_bound[SUCCESS] tcpi_unacked 0
test_bound[SUCCESS] tcpi_sacked 10
test_bound[SUCCESS] tcpi_pmtu  0
test_bound[SUCCESS] tcpi_total_retrans 0
test_bound[SUCCESS] tcpi_unacked  0
test_bound[SUCCESS] tcpi_sacked  10
test_bound[SUCCESS] tcpi_rcv_space 0
#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <netdb.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>




int test_connect (int argc, char *argv[], int test);
int test_bound (int argc, char *argv[], int test);
char *state2string (int state);

int
main (int argc, char *argv[])
{
  test_connect (argc, argv, 1);
  test_connect (argc, argv, 2);
  test_connect (argc, argv, 3);
  test_bound (argc, argv, 1);
  test_bound (argc, argv, 2);

}


int
test_bound (int argc, char *argv[], int test)
{

  int listenfd = 0, connfd = 0;
  struct sockaddr_in serv_addr;

  listenfd = socket (AF_INET, SOCK_STREAM, 0);
  memset (&serv_addr, '0', sizeof (serv_addr));

  serv_addr.sin_family = AF_INET;
  serv_addr.sin_addr.s_addr = htonl (INADDR_ANY);
  serv_addr.sin_port = htons (5000);

  bind (listenfd, (struct sockaddr *) &serv_addr, sizeof (serv_addr));
  char name[40] = "";
  int namelen = sizeof (struct tcp_info);
  struct tcp_info tcpi;

  switch (test)
    {

    case 1:
      if (getsockopt (listenfd, IPPROTO_TCP, TCP_INFO, &tcpi, &namelen) == -1)
	{
	  perror ("test_bound[FAIL] BOUND getsockopt(TCP_INFO) failed");
	  return (1);
	}
      printf ("test_bound[SUCCESS] BOUND tcp state: %s\n",
	      state2string (tcpi.tcpi_state));
      break;
    case 2:
      listen (listenfd, 10);

      if (getsockopt (listenfd, IPPROTO_TCP, TCP_INFO, &tcpi, &namelen) == -1)
	{
	  perror ("test_bound[FAIL] LISTEN: getsockopt(TCP_INFO) failed");
	  return (1);
	}
      printf ("test_bound[SUCCESS] LISTEN tcp state: %s\n",
	      state2string (tcpi.tcpi_state));
      break;
    }

  printf ("test_bound[SUCCESS] tcpi_rto  %d\n", tcpi.tcpi_rto);
  printf ("test_bound[SUCCESS] tcpi_last_data_recv %d\n",
	  tcpi.tcpi_last_data_recv);
  printf ("test_bound[SUCCESS] tcpi_rtt  %d\n", tcpi.tcpi_rtt);
  printf ("test_bound[SUCCESS] tcpi_rttvar %d\n", tcpi.tcpi_rttvar);
  printf ("test_bound[SUCCESS] tcpi_snd_ssthresh %d\n",
	  tcpi.tcpi_snd_ssthresh);
  printf ("test_bound[SUCCESS] tcpi_snd_cwnd  %d\n", tcpi.tcpi_snd_cwnd);
  printf ("test_bound[SUCCESS] tcpi_snd_mss  %d\n", tcpi.tcpi_snd_mss);
  printf ("test_bound[SUCCESS] tcpi_unacked %d\n", tcpi.tcpi_unacked);
  printf ("test_bound[SUCCESS] tcpi_sacked %d\n", tcpi.tcpi_sacked);
  printf ("test_bound[SUCCESS] tcpi_pmtu  %d\n", tcpi.tcpi_pmtu);
  printf ("test_bound[SUCCESS] tcpi_total_retrans %d\n",
	  tcpi.tcpi_total_retrans);
  printf ("test_bound[SUCCESS] tcpi_unacked  %d\n", tcpi.tcpi_unacked);
  printf ("test_bound[SUCCESS] tcpi_sacked  %d\n", tcpi.tcpi_sacked);
  printf ("test_bound[SUCCESS] tcpi_rcv_space %d\n", tcpi.tcpi_rcv_space);

}

int
test_connect (int argc, char *argv[], int test)
{
  struct addrinfo hints, *gair, *p;
  int fd, rv, rlen;

  if (argc != 3)
    {
      fprintf (stderr, "%s <host> <port>\n", argv[0]);
      return (1);
    }

  memset (&hints, 0, sizeof (hints));
  hints.ai_family = PF_UNSPEC;
  hints.ai_socktype = SOCK_STREAM;

  if ((rv = getaddrinfo (argv[1], argv[2], &hints, &gair)) != 0)
    {
      fprintf (stderr, "test_connect[FAIL]: getaddrinfo() failed: %s\n",
	       gai_strerror (rv));
      return (1);
    }

  for (p = gair; p != NULL; p = p->ai_next)
    {
      switch (test)
	{
	case 1:
	  if ((fd =
	       socket (p->ai_family, p->ai_socktype, p->ai_protocol)) == -1)
	    {
	      perror ("test_connect[FAIL]: socket() failed");
	      continue;
	    }
	  if (connect (fd, p->ai_addr, p->ai_addrlen) == -1)
	    {
	      close (fd);
	      perror ("test_connect[FAIL]: connect() failed");
	      continue;
	    }
	  break;

	case 2:
	  if ((fd = socket (AF_INET, SOCK_STREAM, 0)) == -1)
	    {
	      perror
		("test_connect[FAIL] SOCK_STREAM NOT_CONNECTED: socket() failed");
	      continue;
	    }
	  break;

	case 3:
	  if ((fd = socket (AF_INET, SOCK_STREAM, 0)) == -1)
	    {
	      perror
		("test_connect[FAIL] SOCK_STREAM CONNECTED: socket() failed");
	      continue;
	    }
	  if (connect (fd, p->ai_addr, p->ai_addrlen) == -1)
	    {
	      close (fd);
	      perror
		("test_connect[FAIL] SOCK_STREAM CONNECTED: connect() failed");
	      continue;
	    }
	  break;
	}
      break;
    }

  if (p == NULL)
    {
      fprintf (stderr, "test_connect[FAIL]:  failed to connect to server\n");
      return (1);
    }

  freeaddrinfo (gair);

  int namelen = sizeof (struct tcp_info);
  struct tcp_info tcpi;
  if (getsockopt (fd, IPPROTO_TCP, TCP_INFO, &tcpi, &namelen) == -1)
    {
      switch (test)
	{
	case 1:
	  perror ("test_connect[FAIL]:getsockopt(TCP_INFO) failed");
	  break;

	case 2:
	  perror
	    ("test_connect[FAIL] SOCK_STREAM NOT_CONNECTED :getsockopt(TCP_INFO) failed");
	  break;

	case 3:
	  perror
	    ("test_connect[FAIL] SOCK_STREAM CONNECTED :getsockopt(TCP_INFO) failed");
	  break;

	default:
	  perror
	    ("test_connect[FAIL] SOCK_STREAM :getsockopt(TCP_INFO) failed");
	}
      return -1;
    }

  switch (test)
    {
    default:
    case 1:
      printf ("test_connect[SUCCESS] TCP state: %s\n",
	      state2string (tcpi.tcpi_state));
      break;

    case 2:
      printf
	("test_connect[SUCCESS]:SOCK_STREAM NOT_CONNECTED TCP state: %s\n",
	 state2string (tcpi.tcpi_state));
      break;

    case 3:
      printf
	("test_connect[SUCCESS]:SOCK_STREAM CONNECTED TCP state: %s\n",
	 state2string (tcpi.tcpi_state));
    }

  printf ("test_connect[SUCCESS] tcpi_rto  %d\n", tcpi.tcpi_rto);
  printf ("test_connect[SUCCESS] tcpi_last_data_recv %d\n",
	  tcpi.tcpi_last_data_recv);
  printf ("test_connect[SUCCESS] tcpi_rtt  %d\n", tcpi.tcpi_rtt);
  printf ("test_connect[SUCCESS] tcpi_rttvar %d\n", tcpi.tcpi_rttvar);
  printf ("test_connect[SUCCESS] tcpi_snd_ssthresh %d\n",
	  tcpi.tcpi_snd_ssthresh);
  printf ("test_connect[SUCCESS] tcpi_snd_cwnd  %d\n", tcpi.tcpi_snd_cwnd);
  printf ("test_connect[SUCCESS] tcpi_snd_mss  %d\n", tcpi.tcpi_snd_mss);
  printf ("test_connect[SUCCESS] tcpi_unacked %d\n", tcpi.tcpi_unacked);
  printf ("test_connect[SUCCESS] tcpi_sacked %d\n", tcpi.tcpi_sacked);
  printf ("test_connect[SUCCESS] tcpi_pmtu  %d\n", tcpi.tcpi_pmtu);
  printf ("test_connect[SUCCESS] tcpi_total_retrans %d\n",
	  tcpi.tcpi_total_retrans);
  printf ("test_connect[SUCCESS] tcpi_unacked  %d\n", tcpi.tcpi_unacked);
  printf ("test_connect[SUCCESS] tcpi_sacked  %d\n", tcpi.tcpi_sacked);
  printf ("test_connect[SUCCESS] tcpi_rcv_space %d\n", tcpi.tcpi_rcv_space);
}


/* TCP states */
#define	TCPS_CLOSED		-6
#define	TCPS_IDLE		-5	/* idle (opened, but not bound) */
#define	TCPS_BOUND		-4	/* bound, ready to connect or accept */
#define	TCPS_LISTEN		-3	/* listening for connection */
#define	TCPS_SYN_SENT		-2	/* active, have sent syn */
#define	TCPS_SYN_RCVD		-1	/* have received syn (and sent ours) */
/* states < TCPS_ESTABLISHED are those where connections not established */
#define	TCPS_ESTABLISHED	0	/* established */
#define	TCPS_CLOSE_WAIT		1	/* rcvd fin, waiting for close */
/* states > TCPS_CLOSE_WAIT are those where user has closed */
#define	TCPS_FIN_WAIT_1		2	/* have closed and sent fin */
#define	TCPS_CLOSING		3	/* closed, xchd FIN, await FIN ACK */
#define	TCPS_LAST_ACK		4	/* had fin and close; await FIN ACK */
/* states > TCPS_CLOSE_WAIT && < TCPS_FIN_WAIT_2 await ACK of FIN */
#define	TCPS_FIN_WAIT_2		5	/* have closed, fin is acked */
#define	TCPS_TIME_WAIT		6	/* in 2*msl quiet wait after close */



#define LX_TCP_ESTABLISHED  1
#define LX_TCP_SYN_SENT  2
#define LX_TCP_SYN_RECV  3
#define LX_TCP_FIN_WAIT1  4
#define LX_TCP_FIN_WAIT2  5
#define LX_TCP_TIME_WAIT  6
#define LX_TCP_CLOSE  7
#define LX_TCP_CLOSE_WAIT  8
#define LX_TCP_LAST_ACK  9
#define LX_TCP_LISTEN  10
#define LX_TCP_CLOSING  11
#define LX_TCP_NEW_SYN_RECV  12

char *
state2string (int state)
{
  static char st[32];
  switch (state)
    {
    case LX_TCP_ESTABLISHED:
      sprintf (st, "LX_TCP_ESTABLISHED");
      break;
    case LX_TCP_SYN_SENT:
      sprintf (st, "LX_TCP_SYN_SENT");
      break;
    case LX_TCP_SYN_RECV:
      sprintf (st, "LX_TCP_SYN_RECV");
      break;
    case LX_TCP_FIN_WAIT1:
      sprintf (st, "LX_TCP_FIN_WAIT1");
      break;
    case LX_TCP_FIN_WAIT2:
      sprintf (st, "LX_TCP_FIN_WAIT2");
      break;
    case LX_TCP_TIME_WAIT:
      sprintf (st, "LX_TCP_TIME_WAIT");
      break;
    case LX_TCP_CLOSE:
      sprintf (st, "LX_TCP_CLOSE");
      break;

    case LX_TCP_CLOSE_WAIT:
      sprintf (st, "LX_TCP_CLOSE_WAIT");
      break;

    case LX_TCP_LAST_ACK:
      sprintf (st, "LX_TCP_LAST_ACK");
      break;

    case LX_TCP_LISTEN:
      sprintf (st, "LX_TCP_LISTEN");
      break;

    case LX_TCP_CLOSING:
      sprintf (st, "LX_TCP_CLOSING");
      break;
    case LX_TCP_NEW_SYN_RECV:
      sprintf (st, "LX_TCP_NEW_SYN_RECV");
      break;
    }
  return st;
}

Dan McDonald commented on 2024-01-24T13:47:01.645-0500:

Latest test results from Carlos are below. NOTE that the tcpi_unacked is assigned by illumos’s tcp_suna which is initialized with the randomized ISS in tcp_iss_init() function.

Here are the tests I have done after the latest changes :

Linux Branded Zone

t```est_connect[SUCCESS] TCP state: LX_TCP_ESTABLISHED
test_connect[SUCCESS] tcpi_rto 1125000
test_connect[SUCCESS] tcpi_last_data_recv 0
test_connect[SUCCESS] tcpi_rtt 4000000
test_connect[SUCCESS] tcpi_rttvar 500000
test_connect[SUCCESS] tcpi_snd_ssthresh 0
test_connect[SUCCESS] tcpi_snd_cwnd 16360
test_connect[SUCCESS] tcpi_snd_mss 8180
test_connect[SUCCESS] tcpi_unacked -74323462
test_connect[SUCCESS] tcpi_sacked 0
test_connect[SUCCESS] tcpi_pmtu 0
test_connect[SUCCESS] tcpi_total_retrans 0
test_connect[SUCCESS] tcpi_unacked -74323462
test_connect[SUCCESS] tcpi_sacked 0
test_connect[SUCCESS] tcpi_rcv_space 1056768
test_connect[SUCCESS]:SOCK_STREAM NOT_CONNECTED TCP state: LX_TCP_ESTABLISHED
test_connect[SUCCESS] tcpi_rto 1125000
test_connect[SUCCESS] tcpi_last_data_recv 0
test_connect[SUCCESS] tcpi_rtt 4000000
test_connect[SUCCESS] tcpi_rttvar 500000
test_connect[SUCCESS] tcpi_snd_ssthresh 0
test_connect[SUCCESS] tcpi_snd_cwnd 0
test_connect[SUCCESS] tcpi_snd_mss 536
test_connect[SUCCESS] tcpi_unacked 0
test_connect[SUCCESS] tcpi_sacked 0
test_connect[SUCCESS] tcpi_pmtu 0
test_connect[SUCCESS] tcpi_total_retrans 0
test_connect[SUCCESS] tcpi_unacked 0
test_connect[SUCCESS] tcpi_sacked 0
test_connect[SUCCESS] tcpi_rcv_space 1048576
test_connect[SUCCESS]:SOCK_STREAM CONNECTED TCP state: LX_TCP_ESTABLISHED
test_connect[SUCCESS] tcpi_rto 1125000
test_connect[SUCCESS] tcpi_last_data_recv 0
test_connect[SUCCESS] tcpi_rtt 4000000
test_connect[SUCCESS] tcpi_rttvar 500000
test_connect[SUCCESS] tcpi_snd_ssthresh 0
test_connect[SUCCESS] tcpi_snd_cwnd 16360
test_connect[SUCCESS] tcpi_snd_mss 8180
test_connect[SUCCESS] tcpi_unacked -1767823445
test_connect[SUCCESS] tcpi_sacked 0
test_connect[SUCCESS] tcpi_pmtu 0
test_connect[SUCCESS] tcpi_total_retrans 0
test_connect[SUCCESS] tcpi_unacked -1767823445
test_connect[SUCCESS] tcpi_sacked 0
test_connect[SUCCESS] tcpi_rcv_space 1056768
test_bound[SUCCESS] BOUND tcp state: LX_TCP_LISTEN
test_bound[SUCCESS] tcpi_rto 1125000
test_bound[SUCCESS] tcpi_last_data_recv 0
test_bound[SUCCESS] tcpi_rtt 4000000
test_bound[SUCCESS] tcpi_rttvar 500000
test_bound[SUCCESS] tcpi_snd_ssthresh 0
test_bound[SUCCESS] tcpi_snd_cwnd 0
test_bound[SUCCESS] tcpi_snd_mss 536
test_bound[SUCCESS] tcpi_unacked 0
test_bound[SUCCESS] tcpi_sacked 0
test_bound[SUCCESS] tcpi_pmtu 0
test_bound[SUCCESS] tcpi_total_retrans 0
test_bound[SUCCESS] tcpi_unacked 0
test_bound[SUCCESS] tcpi_sacked 0
test_bound[SUCCESS] tcpi_rcv_space 1048576
test_bound[SUCCESS] LISTEN tcp state: LX_TCP_LISTEN
test_bound[SUCCESS] tcpi_rto 1125000
test_bound[SUCCESS] tcpi_last_data_recv 0
test_bound[SUCCESS] tcpi_rtt 4000000
test_bound[SUCCESS] tcpi_rttvar 500000
test_bound[SUCCESS] tcpi_snd_ssthresh 0
test_bound[SUCCESS] tcpi_snd_cwnd 0
test_bound[SUCCESS] tcpi_snd_mss 536
test_bound[SUCCESS] tcpi_unacked 0
test_bound[SUCCESS] tcpi_sacked 0
test_bound[SUCCESS] tcpi_pmtu 0
test_bound[SUCCESS] tcpi_total_retrans 0
test_bound[SUCCESS] tcpi_unacked 0
test_bound[SUCCESS] tcpi_sacked 0
test_bound[SUCCESS] tcpi_rcv_space 1048576

Linux Native VM

test_connect[SUCCESS] TCP state: LX_TCP_ESTABLISHED
test_connect[SUCCESS] tcpi_rto 204000
test_connect[SUCCESS] tcpi_last_data_recv 0
test_connect[SUCCESS] tcpi_rtt 50
test_connect[SUCCESS] tcpi_rttvar 25
test_connect[SUCCESS] tcpi_snd_ssthresh 2147483647
test_connect[SUCCESS] tcpi_snd_cwnd 10
test_connect[SUCCESS] tcpi_snd_mss 32732
test_connect[SUCCESS] tcpi_unacked 0
test_connect[SUCCESS] tcpi_sacked 0
test_connect[SUCCESS] tcpi_pmtu 65536
test_connect[SUCCESS] tcpi_total_retrans 0
test_connect[SUCCESS] tcpi_unacked 0
test_connect[SUCCESS] tcpi_sacked 0
test_connect[SUCCESS] tcpi_rcv_space 65476
test_connect[SUCCESS]:SOCK_STREAM NOT_CONNECTED TCP state: LX_TCP_CLOSE
test_connect[SUCCESS] tcpi_rto 1000000
test_connect[SUCCESS] tcpi_last_data_recv 16758308
test_connect[SUCCESS] tcpi_rtt 0
test_connect[SUCCESS] tcpi_rttvar 250000
test_connect[SUCCESS] tcpi_snd_ssthresh 2147483647
test_connect[SUCCESS] tcpi_snd_cwnd 10
test_connect[SUCCESS] tcpi_snd_mss 536
test_connect[SUCCESS] tcpi_unacked 0
test_connect[SUCCESS] tcpi_sacked 0
test_connect[SUCCESS] tcpi_pmtu 0
test_connect[SUCCESS] tcpi_total_retrans 0
test_connect[SUCCESS] tcpi_unacked 0
test_connect[SUCCESS] tcpi_sacked 0
test_connect[SUCCESS] tcpi_rcv_space 0
test_connect[SUCCESS]:SOCK_STREAM CONNECTED TCP state: LX_TCP_ESTABLISHED
test_connect[SUCCESS] tcpi_rto 204000
test_connect[SUCCESS] tcpi_last_data_recv 0
test_connect[SUCCESS] tcpi_rtt 40
test_connect[SUCCESS] tcpi_rttvar 20
test_connect[SUCCESS] tcpi_snd_ssthresh 2147483647
test_connect[SUCCESS] tcpi_snd_cwnd 10
test_connect[SUCCESS] tcpi_snd_mss 32741
test_connect[SUCCESS] tcpi_unacked 0
test_connect[SUCCESS] tcpi_sacked 0
test_connect[SUCCESS] tcpi_pmtu 65535
test_connect[SUCCESS] tcpi_total_retrans 0
test_connect[SUCCESS] tcpi_unacked 0
test_connect[SUCCESS] tcpi_sacked 0
test_connect[SUCCESS] tcpi_rcv_space 65495
test_bound[SUCCESS] BOUND tcp state: LX_TCP_CLOSE
test_bound[SUCCESS] tcpi_rto 1000000
test_bound[SUCCESS] tcpi_last_data_recv 16758308
test_bound[SUCCESS] tcpi_rtt 0
test_bound[SUCCESS] tcpi_rttvar 250000
test_bound[SUCCESS] tcpi_snd_ssthresh 2147483647
test_bound[SUCCESS] tcpi_snd_cwnd 10
test_bound[SUCCESS] tcpi_snd_mss 536
test_bound[SUCCESS] tcpi_unacked 0
test_bound[SUCCESS] tcpi_sacked 0
test_bound[SUCCESS] tcpi_pmtu 0
test_bound[SUCCESS] tcpi_total_retrans 0
test_bound[SUCCESS] tcpi_unacked 0
test_bound[SUCCESS] tcpi_sacked 0
test_bound[SUCCESS] tcpi_rcv_space 0
test_bound[SUCCESS] LISTEN tcp state: LX_TCP_LISTEN
test_bound[SUCCESS] tcpi_rto 0
test_bound[SUCCESS] tcpi_last_data_recv 0
test_bound[SUCCESS] tcpi_rtt 0
test_bound[SUCCESS] tcpi_rttvar 0
test_bound[SUCCESS] tcpi_snd_ssthresh 0
test_bound[SUCCESS] tcpi_snd_cwnd 10
test_bound[SUCCESS] tcpi_snd_mss 0
test_bound[SUCCESS] tcpi_unacked 0
test_bound[SUCCESS] tcpi_sacked 10
test_bound[SUCCESS] tcpi_pmtu 0
test_bound[SUCCESS] tcpi_total_retrans 0
test_bound[SUCCESS] tcpi_unacked 0
test_bound[SUCCESS] tcpi_sacked 10
test_bound[SUCCESS] tcpi_rcv_space 0

There are still differences for example in tcpi_unacked we return a negative value, Linux returns 0.

Related Links