OS-7755: add sendmmsg() and recvmmsg() to lx

Details

Issue Type:Improvement
Priority:4 - Normal
Status:Resolved
Created at:2019-04-18T12:40:29.771Z
Updated at:2019-09-06T18:56:25.104Z

People

Created by:Former user
Reported by:Former user

Resolution

Fixed: A fix for this issue is checked into the tree and tested.
(Resolution Date: 2019-05-08T23:54:59.194Z)

Fix Versions

2019-05-09 Rural Juror (Release Date: 2019-05-09)

Labels

lxbrand

Description

OmniOS has implemented sendmmsg() and recvmmsg() for lx. We should consider bringing that into SmartOS.

https://github.com/omniosorg/illumos-omnios/commit/8ae09771ae90a2a95cc58ccc89ae0c0ebdfd6af8

Comments

Comment by Former user
Created at 2019-05-02T19:55:18.768Z
Updated at 2019-05-02T19:55:35.199Z

Transcribing test notes from andyf:

In addition to the following tests, I also confirmed that this patch causes DNS resolution in an Ubuntu 18 lx zone to work without the `single-request` option in `/etc/resolv.conf`.
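
For context, glibc's stub resolver can issue the A and AAAA queries for a name in parallel on one socket, using sendmmsg() where it is available; the `single-request` option makes it perform them sequentially instead. A minimal sketch of a lookup that would exercise that path, assuming a name that goes to DNS; `count_addrs` is a hypothetical helper and is not part of the fix:

```c
#define _GNU_SOURCE
#include <assert.h>
#include <netdb.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/*
 * Hypothetical helper: resolve `name` for both address families and
 * return the number of results, or -1 on failure. AF_UNSPEC asks for
 * IPv4 and IPv6 results together, which is the case where the resolver
 * may pair the two queries.
 */
static int
count_addrs(const char *name)
{
        struct addrinfo hints, *res, *p;
        int n = 0, rc;

        assert(name != NULL);

        memset(&hints, 0, sizeof (hints));
        hints.ai_family = AF_UNSPEC;
        hints.ai_socktype = SOCK_DGRAM;

        rc = getaddrinfo(name, NULL, &hints, &res);
        if (rc != 0) {
                fprintf(stderr, "getaddrinfo: %s\n", gai_strerror(rc));
                return (-1);
        }
        for (p = res; p != NULL; p = p->ai_next)
                n++;
        freeaddrinfo(res);
        return (n);
}
```

Calling `count_addrs()` with a hostname of your choice from inside the zone, with and without `single-request` in `/etc/resolv.conf`, is one way to repeat the DNS check described above.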

The following tests were performed in the same Ubuntu 18 zone.

sendmmsg

For testing sendmmsg, I used the following short program.

#define _GNU_SOURCE
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <errno.h>

int
main(int argc, char **argv)
{
        struct sockaddr_in dst0, dst1;
        struct mmsghdr hdr[2];
        struct iovec iov[2];

        dst0.sin_addr.s_addr = inet_addr("172.27.10.254");
        dst0.sin_family = AF_INET;
        dst0.sin_port = htons(6666);

        iov[0].iov_base = "test";
        iov[0].iov_len = 4;
        iov[1].iov_base = "SET";
        iov[1].iov_len = 3;

        /* Message 0: both iovecs ("test" + "SET", 7 bytes) to port 6666. */
        memset(&hdr[0], '\0', sizeof(struct mmsghdr));
        hdr[0].msg_hdr.msg_name = &dst0;
        hdr[0].msg_hdr.msg_namelen = sizeof(dst0);
        hdr[0].msg_hdr.msg_iovlen = 2;
        hdr[0].msg_hdr.msg_iov = iov;

        dst1.sin_addr.s_addr = inet_addr("172.27.10.254");
        dst1.sin_family = AF_INET;
        dst1.sin_port = htons(7777);

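        /*
         * Message 1: first iovec only ("test", 4 bytes) to port 7777. When
         * the program is run with any argument, the destination is left
         * unset so that the underlying send fails.
         */
        memset(&hdr[1], '\0', sizeof(struct mmsghdr));
        if (argc == 1) {
                hdr[1].msg_hdr.msg_name = &dst1;
                hdr[1].msg_hdr.msg_namelen = sizeof(dst1);
        }
        hdr[1].msg_hdr.msg_iovlen = 1;
        hdr[1].msg_hdr.msg_iov = iov;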

        int fd = socket(AF_INET, SOCK_DGRAM, IPPROTO_UDP);
        printf("FD: %d\n", fd);
        int i = sendmmsg(fd, hdr, 2, 0);

        printf("Return: %d\n", i);
        printf("Errno: %d (%s)\n", errno, strerror(errno));
        printf("Sent0: %d\n", hdr[0].msg_len);
        printf("Sent1: %d\n", hdr[1].msg_len);

        return 0;
}

Running the program with any argument omits the destination address from the second message, so sendmmsg() returns early after sending only the first.
This was run as both a 64-bit and a 32-bit binary while snooping for traffic on a different host.

64-bit, address on both messages

lx# file sendmmsg
sendmmsg: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/l, for GNU/Linux 3.2.0, BuildID[sha1]=b7ce2862a4c5031ed031895e17a9c7615260f3c5, not stripped

lx# ./sendmmsg
FD: 3
Return: 2
Errno: 0 (Success)
Sent0: 7
Sent1: 4

build# snoop -x 0 -d build_10 udp \( port 6666 or port 7777 \)
Using device build_10 (promiscuous mode)
      lx -> 172.27.10.254 UDP D=6666 S=51192 LEN=15

           0: 0208 201d 0293 0208 20b8 7c3f 0800 4500    .. ..... .|?..E.
          16: 0023 007b 0000 ff11 4db6 ac1b 0a64 ac1b    .#.{....M....d..
          32: 0afe c7f8 1a0a 000f 2115 7465 7374 5345    ........!.testSE
          48: 54                                         T

      lx -> 172.27.10.254 UDP D=7777 S=51192 LEN=12

           0: 0208 201d 0293 0208 20b8 7c3f 0800 4500    .. ..... .|?..E.
          16: 0020 007c 0000 ff11 4db8 ac1b 0a64 ac1b    . .|....M....d..
          32: 0afe c7f8 1e61 000c c409 7465 7374         .....a....test

64-bit, second address missing

For this run, the second message has no destination address, so sendmmsg() returns early after sending one message, without reporting an error.
DTrace shows that the underlying lx_sendmsg() returns errno 96 (EDESTADDRREQ in illumos) for the second message; sendmmsg() does not propagate this and returns 1.

> #define EDESTADDRREQ    96      /* Destination address required */

lx# ./sendmmsg f
FD: 3
Return: 1
Errno: 0 (Success)
Sent0: 7
Sent1: 0

build# snoop -x 0 -d build_10 udp \( port 6666 or port 7777 \)
Using device build_10 (promiscuous mode)
      sparse -> 172.27.10.254 UDP D=6666 S=36101 LEN=15

           0: 0208 201d 0293 0208 20b8 7c3f 0800 4500    .. ..... .|?..E.
          16: 0023 0080 0000 ff11 4db1 ac1b 0a64 ac1b    .#......M....d..
          32: 0afe 8d05 1a0a 000f 5c08 7465 7374 5345    ........\.testSE
          48: 54                                         T

bloody# dtrace -n 'fbt:lx_brand:lx_sendmsg:return{trace(arg1)}'
dtrace: description 'fbt:lx_brand:lx_sendmsg:return' matched 1 probe
CPU     ID                    FUNCTION:NAME
 11  71004                lx_sendmsg:return                 7
 11  71004                lx_sendmsg:return                96
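
The short return above matches the documented Linux contract: sendmmsg() reports how many messages were sent, and an error surfaces only when the first message of a call fails. A minimal sketch of how a caller might retry to see that error, assuming Linux semantics; `send_all` and `demo_missing_addr` are hypothetical names, and the loopback destination is an assumption rather than the address used in the test:

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper: send all `vlen` messages, re-issuing sendmmsg()
 * from the first unsent message after a short return. Returns 0 when
 * everything was sent, or -1 with errno describing the message that
 * failed. *sent always holds the number of messages actually sent.
 */
static int
send_all(int fd, struct mmsghdr *msgs, unsigned int vlen, unsigned int *sent)
{
        assert(msgs != NULL && sent != NULL);

        *sent = 0;
        while (*sent < vlen) {
                int n = sendmmsg(fd, msgs + *sent, vlen - *sent, 0);

                if (n <= 0)
                        return (-1);
                *sent += (unsigned int)n;
        }
        return (0);
}

/*
 * Hypothetical demo mirroring the test above: message 0 has a
 * destination (loopback, an assumption), message 1 has none.
 */
static int
demo_missing_addr(unsigned int *sent)
{
        struct sockaddr_in dst;
        struct mmsghdr hdr[2];
        struct iovec iov = { .iov_base = "test", .iov_len = 4 };
        int fd, rc, saved;

        *sent = 0;
        memset(&dst, 0, sizeof (dst));
        dst.sin_family = AF_INET;
        dst.sin_port = htons(6666);
        dst.sin_addr.s_addr = htonl(INADDR_LOOPBACK);

        memset(hdr, 0, sizeof (hdr));
        hdr[0].msg_hdr.msg_name = &dst;
        hdr[0].msg_hdr.msg_namelen = sizeof (dst);
        hdr[0].msg_hdr.msg_iov = &iov;
        hdr[0].msg_hdr.msg_iovlen = 1;
        hdr[1].msg_hdr.msg_iov = &iov;  /* msg_name left NULL on purpose */
        hdr[1].msg_hdr.msg_iovlen = 1;

        fd = socket(AF_INET, SOCK_DGRAM, IPPROTO_UDP);
        if (fd == -1)
                return (-1);
        rc = send_all(fd, hdr, 2, sent);
        saved = errno;          /* keep errno across close() */
        (void) close(fd);
        errno = saved;
        return (rc);
}
```

With a wrapper like this, the second sendmmsg() call starts at the message without a destination, so the caller should see the failure as -1 with errno set rather than as a bare short count.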

32-bit, address on both messages

lx# file sendmmsg
sendmmsg: ELF 32-bit LSB shared object, Intel 80386, version 1 (SYSV), dynamically linked, interpreter /lib/ld-, for GNU/Linux 3.2.0, BuildID[sha1]=a9d6fab6851c6554486b1d13c711c2f18b75c7f0, not stripped
lx# ./sendmmsg
FD: 3
Return: 2
Errno: 0 (Success)
Sent0: 7
Sent1: 4

build# snoop -x 0 -d build_10 udp \( port 6666 or port 7777 \)
Using device build_10 (promiscuous mode)
      sparse -> 172.27.10.254 UDP D=6666 S=50640 LEN=15

           0: 0208 201d 0293 0208 20b8 7c3f 0800 4500    .. ..... .|?..E.
          16: 0023 0084 0000 ff11 4dad ac1b 0a64 ac1b    .#......M....d..
          32: 0afe c5d0 1a0a 000f 233d 7465 7374 5345    ........#=testSE
          48: 54                                         T

      sparse -> 172.27.10.254 UDP D=7777 S=50640 LEN=12

           0: 0208 201d 0293 0208 20b8 7c3f 0800 4500    .. ..... .|?..E.
          16: 0020 0085 0000 ff11 4daf ac1b 0a64 ac1b    . ......M....d..
          32: 0afe c5d0 1e61 000c c631 7465 7374         .....a...1test

32-bit, second address missing

lx# ./sendmmsg f
FD: 3
Return: 1
Errno: 0 (Success)
Sent0: 7
Sent1: 0

build# snoop -x 0 -d build_10 udp \( port 6666 or port 7777 \)
Using device build_10 (promiscuous mode)
      sparse -> 172.27.10.254 UDP D=6666 S=52632 LEN=15

           0: 0208 201d 0293 0208 20b8 7c3f 0800 4500    .. ..... .|?..E.
          16: 0023 0086 0000 ff11 4dab ac1b 0a64 ac1b    .#......M....d..
          32: 0afe cd98 1a0a 000f 1b75 7465 7374 5345    .........utestSE
          48: 54                                         T

recvmmsg

For recvmmsg, I used the test program from the Linux man page:

#define _GNU_SOURCE
#include <netinet/ip.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>

#define VLEN 10
#define BUFSIZE 200
#define TIMEOUT 1

int
main(void)
{
        int sockfd, retval, i;
        struct sockaddr_in addr;
        struct mmsghdr msgs[VLEN];
        struct iovec iovecs[VLEN];
        char bufs[VLEN][BUFSIZE+1];
        struct timespec timeout;

        sockfd = socket(AF_INET, SOCK_DGRAM, 0);
        if (sockfd == -1) {
                perror("socket()");
                exit(EXIT_FAILURE);
        }

        addr.sin_family = AF_INET;
        addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
        addr.sin_port = htons(1234);
        if (bind(sockfd, (struct sockaddr *) &addr, sizeof(addr)) == -1) {
                perror("bind()");
                exit(EXIT_FAILURE);
        }

        memset(msgs, 0, sizeof(msgs));
        for (i = 0; i < VLEN; i++) {
                iovecs[i].iov_base         = bufs[i];
                iovecs[i].iov_len          = BUFSIZE;
                msgs[i].msg_hdr.msg_iov    = &iovecs[i];
                msgs[i].msg_hdr.msg_iovlen = 1;
        }

        timeout.tv_sec = TIMEOUT;
        timeout.tv_nsec = 0;

        retval = recvmmsg(sockfd, msgs, VLEN, 0, &timeout);
        if (retval == -1) {
                perror("recvmmsg()");
                exit(EXIT_FAILURE);
        }

        printf("%d messages received\n", retval);
        for (i = 0; i < retval; i++) {
                printf("LEN: %d\n", msgs[i].msg_len);
                bufs[i][msgs[i].msg_len] = 0;
                printf("%d %s", i+1, bufs[i]);
        }
        exit(EXIT_SUCCESS);
}

This was run together with a simple bash loop that sends a packet every 0.25 seconds:

while true; do echo $RANDOM > /dev/udp/127.0.0.1/1234; sleep 0.25; done

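This checks that the timeout has at least basic functionality: packets arrive continuously, so recvmmsg() returns only because the 1-second timeout expires, having collected about four messages rather than the full VLEN of 10.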
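
One caveat when relying on this timeout: the Linux man page notes that it is checked only after each datagram is received, so a call can stall if traffic stops before vlen messages have arrived. A minimal sketch of one way to avoid stalling between messages, assuming MSG_WAITFORONE semantics are acceptable to the caller; `recv_batch` and `demo_loopback` are hypothetical names, and this flag was not covered by the tests in this ticket:

```c
#define _GNU_SOURCE
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <time.h>
#include <unistd.h>

#define VLEN 10
#define BUFSIZE 200

/*
 * Hypothetical helper: wait for the first datagram, then collect only
 * what is already queued (MSG_WAITFORONE turns on MSG_DONTWAIT after
 * the first message). Returns the message count, or -1 on error.
 */
static int
recv_batch(int fd, struct mmsghdr *msgs, struct iovec *iovs,
    char bufs[][BUFSIZE + 1], unsigned int vlen)
{
        struct timespec timeout = { .tv_sec = 1, .tv_nsec = 0 };
        unsigned int i;

        assert(vlen <= VLEN);
        memset(msgs, 0, vlen * sizeof (*msgs));
        for (i = 0; i < vlen; i++) {
                iovs[i].iov_base = bufs[i];
                iovs[i].iov_len = BUFSIZE;
                msgs[i].msg_hdr.msg_iov = &iovs[i];
                msgs[i].msg_hdr.msg_iovlen = 1;
        }
        return (recvmmsg(fd, msgs, vlen, MSG_WAITFORONE, &timeout));
}

/*
 * Hypothetical demo: queue three datagrams on a loopback socket bound
 * to an ephemeral port, then receive them in one call.
 */
static int
demo_loopback(void)
{
        struct sockaddr_in addr;
        socklen_t len = sizeof (addr);
        struct mmsghdr msgs[VLEN];
        struct iovec iovs[VLEN];
        char bufs[VLEN][BUFSIZE + 1];
        int rx, tx, i, n = -1;

        rx = socket(AF_INET, SOCK_DGRAM, 0);
        tx = socket(AF_INET, SOCK_DGRAM, 0);
        if (rx == -1 || tx == -1)
                goto out;

        memset(&addr, 0, sizeof (addr));
        addr.sin_family = AF_INET;
        addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
        addr.sin_port = 0;      /* let the kernel pick a port */
        if (bind(rx, (struct sockaddr *)&addr, sizeof (addr)) == -1 ||
            getsockname(rx, (struct sockaddr *)&addr, &len) == -1)
                goto out;

        for (i = 0; i < 3; i++) {
                if (sendto(tx, "ping\n", 5, 0, (struct sockaddr *)&addr,
                    sizeof (addr)) != 5)
                        goto out;
        }
        n = recv_batch(rx, msgs, iovs, bufs, VLEN);
out:
        if (rx != -1)
                (void) close(rx);
        if (tx != -1)
                (void) close(tx);
        return (n);
}
```

In the demo, the datagrams are queued before the call, so `recv_batch()` is expected to return what is available instead of waiting for more. Note that the wait for the first datagram is still unbounded in this sketch.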

64-bit

lx# file recvmmsg
recvmmsg: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/l, for GNU/Linux 3.2.0, BuildID[sha1]=ab4a2eb269df622f90b526da6ff75e3c3c8e1fec, not stripped
lx# ./recvmmsg
4 messages received
LEN: 6
1 22613
LEN: 4
2 172
LEN: 6
3 28355
LEN: 5
4 4129
lx# ./recvmmsg
4 messages received
LEN: 6
1 31869
LEN: 6
2 16173
LEN: 6
3 20932
LEN: 6
4 30046

32-bit

lx# file recvmmsg
recvmmsg: ELF 32-bit LSB shared object, Intel 80386, version 1 (SYSV), dynamically linked, interpreter /lib/ld-, for GNU/Linux 3.2.0, BuildID[sha1]=fa8ca219b2dce85a6b1656774cad94814a5930ba, not stripped
lx# ./recvmmsg
4 messages received
LEN: 5
1 8403
LEN: 6
2 18001
LEN: 6
3 16254
LEN: 5
4 7752

Comment by Jira Bot
Created at 2019-05-08T23:54:15.968Z

illumos-joyent commit 7847a0def2f92ff8670c7d008f9caae2b2585752 (branch master, by Andy Fiddaman)

OS-7755 add sendmmsg() and recvmmsg() to lx
Reviewed by: Jerry Jelinek <jerry.jelinek@joyent.com>
Reviewed by: Patrick Mooney <patrick.mooney@joyent.com>
Approved by: Mike Gerdts <mike.gerdts@joyent.com>