OS-6135: comm page should fallback to syscall after excessive migration

Details

Issue Type:Improvement
Priority:4 - Normal
Status:Resolved
Created at:2017-05-18T18:35:10.000Z
Updated at:2019-07-05T14:35:41.858Z

People

Created by:Former user
Reported by:Former user
Assigned to:Former user

Resolution

Fixed: A fix for this issue is checked into the tree and tested.
(Resolution Date: 2019-07-05T14:35:41.843Z)

Fix Versions

2019-07-18 Werewolf Bar Mitzvah (Release Date: 2019-07-18)

Description

On platforms that have unsynchronized TSCs and lack support for the RDTSCP instruction, determining the TSC offset is a multi-step process. The current CPU ID is queried (via the lsl/GDT method), the TSC is read, then finally the CPU ID is re-checked in case a migration occurred. This is meant to ensure that the offset applied to the TSC reading corresponds to the CPU from which it was taken. If the CPU IDs don't match, the logic is repeated until it is successful.

While this ID-checking loop is rather tight, it can become a problem in certain circumstances. If the system is heavily loaded, migrations may occur rapidly. Any process "stuck" in this loop will represent an additional source of CPU load, potentially exacerbating the problem. It would be valuable to have bail-out logic to limit the loop iterations, perform the time reading via syscall if it cannot be completed by the userspace code in a timely manner.

Comments

Comment by Former user
Created at 2019-06-25T21:29:31.049Z

To test this, I simulated CPU migrations during __cp_tsc_read by utilizing writable uregs[] in dtrace. Two separate scripts, one each for i386 and amd64, were drafted up for this task:

check.d:

pid$target::__cp_tsc_read:entry {
        self->dupe = 1
}

/* Force TSC_MFENCE-style readings */
pid$target::__cp_tsc_read:11,
pid$target::__cp_tsc_read:4c
{
        uregs[R_EAX] = 3
}

pid$target::__cp_tsc_read:45
{
        /* CPU ID in %eax */
        if (self->dupe < 10) {
                this->cpu = uregs[R_EAX];
                printf("pretending to migrate %d -> %d\n", this->cpu, this->cpu + self->dupe);
                uregs[R_EAX] = this->cpu + self->dupe;
                self->dupe++;
        }
}
pid$target::__cp_tsc_read:86
{
        /* CPU ID in %ecx */
        if (self->dupe < 10) {
                this->cpu = uregs[R_ECX];
                printf("pretending to migrate %d -> %d\n", this->cpu, this->cpu + self->dupe);
                uregs[R_ECX] = this->cpu + self->dupe;
                self->dupe++;
        }
}

pid$target::__cp_tsc_read:return {
        trace(arg1);
}

check64.d:

pid$target::__cp_tsc_read:entry {
        self->dupe = 1
}

/* Force TSC_MFENCE-style readings */
pid$target::__cp_tsc_read:7
{
        uregs[R_ESI] = 3
}

pid$target::__cp_tsc_read:3f,
pid$target::__cp_tsc_read:85
{
        /* CPU ID in %edx */
        if (self->dupe < 10) {
                this->cpu = uregs[R_EDX];
                printf("pretending to migrate %d -> %d\n", this->cpu, this->cpu + self->dupe);
                uregs[R_EDX] = this->cpu + self->dupe;
                self->dupe++;
        }
}

pid$target::__cp_tsc_read:return {
        trace(arg1);
}

The corresponding dtrace script was run against a binary (built for the proper bit-ness) of this simple test program:

#include <stdio.h>
#include <time.h>


int
main(int argc, char *argv[])
{
        struct timespec ts;

        clock_gettime(CLOCK_MONOTONIC, &ts);
        printf("mono: %lu %lu\n", ts.tv_sec, ts.tv_nsec);
        clock_gettime(CLOCK_REALTIME, &ts);
        printf("real: %lu %lu\n", ts.tv_sec, ts.tv_nsec);
        return (0);
}

The results showed the repeated attempts to get a TSC and CPU-ID reading, terminating once the limit had been reached:

[root@6cda7a83-0243-4027-e3f4-e1fa5553209f ~/commpage]# dtrace -ws check.d -c ./run32
dtrace: script 'check.d' matched 11 probes
dtrace: allowing destructive actions
mono: 6696 903929625
real: 1561498076 418969887
dtrace: pid 9582 has exited
CPU     ID                    FUNCTION:NAME
  0  89632                 __cp_tsc_read:45 pretending to migrate 0 -> 1

  0  89633                 __cp_tsc_read:86 pretending to migrate 0 -> 2

  0  89633                 __cp_tsc_read:86 pretending to migrate 0 -> 3

  0  89633                 __cp_tsc_read:86 pretending to migrate 0 -> 4

  0  89633                 __cp_tsc_read:86 pretending to migrate 0 -> 5

  0  89633                 __cp_tsc_read:86 pretending to migrate 0 -> 6

  0  89634             __cp_tsc_read:return                 0
  0  89632                 __cp_tsc_read:45 pretending to migrate 0 -> 1

  0  89633                 __cp_tsc_read:86 pretending to migrate 0 -> 2

  0  89633                 __cp_tsc_read:86 pretending to migrate 0 -> 3

  0  89633                 __cp_tsc_read:86 pretending to migrate 0 -> 4

  0  89633                 __cp_tsc_read:86 pretending to migrate 0 -> 5

  0  89633                 __cp_tsc_read:86 pretending to migrate 0 -> 6

  0  89634             __cp_tsc_read:return                 0

[root@6cda7a83-0243-4027-e3f4-e1fa5553209f ~/commpage]# dtrace -ws check64.d -c ./run64
dtrace: script 'check64.d' matched 10 probes
dtrace: allowing destructive actions
mono: 6700 271958594
real: 1561498079 786812533
dtrace: pid 9584 has exited
CPU     ID                    FUNCTION:NAME
 10  89631                 __cp_tsc_read:3f pretending to migrate 10 -> 11

 10  89632                 __cp_tsc_read:85 pretending to migrate 10 -> 12

 10  89632                 __cp_tsc_read:85 pretending to migrate 10 -> 13

 10  89632                 __cp_tsc_read:85 pretending to migrate 10 -> 14

 10  89632                 __cp_tsc_read:85 pretending to migrate 10 -> 15

 10  89632                 __cp_tsc_read:85 pretending to migrate 10 -> 16

 10  89633             __cp_tsc_read:return                 0
 10  89631                 __cp_tsc_read:3f pretending to migrate 10 -> 11

 10  89632                 __cp_tsc_read:85 pretending to migrate 10 -> 12

 10  89632                 __cp_tsc_read:85 pretending to migrate 10 -> 13

 10  89632                 __cp_tsc_read:85 pretending to migrate 10 -> 14

 10  89632                 __cp_tsc_read:85 pretending to migrate 10 -> 15

 10  89632                 __cp_tsc_read:85 pretending to migrate 10 -> 16

 10  89633             __cp_tsc_read:return                 0

(This was performed on an Ivy Bridge system which the TSC reset errata applies to, resulting in offset TSCs and thus potential exercise of such logic)


Comment by Former user
Created at 2019-07-03T14:20:34.948Z

Per @accountid:62431b8f258562006fa2866a's suggestion, I set up an additional test to check that under normal conditions (without excessive migration, simulated or otherwise) that the new logic goes through the reading loop only the once. It succeeded on both 32-bit and 64-bit:

dtrace: script 'check_fine.d' matched 6 probes
dtrace: allowing destructive actions
mono: 366 383668954
real: 1562163502 51244099
dtrace: pid 9954 has exited
CPU     ID                    FUNCTION:NAME
 10  94913              __cp_tsc_read:entry
 10  94916                 __cp_tsc_read:45 reading on 10

 10  94917                 __cp_tsc_read:86 reading on 10

 10  94918             __cp_tsc_read:return        3660775750
 10  94913              __cp_tsc_read:entry
 10  94916                 __cp_tsc_read:45 reading on 10

 10  94917                 __cp_tsc_read:86 reading on 10

 10  94918             __cp_tsc_read:return        3661455338
dtrace: script 'check64_fine.d' matched 5 probes
dtrace: allowing destructive actions
mono: 256 918387873
real: 1562163392 588601207
dtrace: pid 9781 has exited
CPU     ID                    FUNCTION:NAME
 10  94912              __cp_tsc_read:entry
 10  94914                 __cp_tsc_read:3f reading on cpu 10

 10  94915                 __cp_tsc_read:85 reading on cpu 10

 10  94916             __cp_tsc_read:return      719363575581
 10  94912              __cp_tsc_read:entry
 10  94914                 __cp_tsc_read:3f reading on cpu 10

 10  94915                 __cp_tsc_read:85 reading on cpu 10

 10  94916             __cp_tsc_read:return      719364160501

Comment by Jira Bot
Created at 2019-07-05T14:33:34.143Z

illumos-joyent commit ea9d7b5c804b8f84f0f7f73d8747c06291d0614d (branch master, by Patrick Mooney)

OS-6135 comm page should fallback to syscall after excessive migration
Reviewed by: Jerry Jelinek <jerry.jelinek@joyent.com>
Reviewed by: Ryan Zezeski <rpz@joyent.com>
Reviewed by: Robert Mustacchi <rm@joyent.com>
Approved by: Robert Mustacchi <rm@joyent.com>