Issue Type: | Improvement |
---|---|
Priority: | 4 - Normal |
Status: | Resolved |
Created at: | 2017-05-18T18:35:10.000Z |
Updated at: | 2019-07-05T14:35:41.858Z |
Created by: | Former user |
---|---|
Reported by: | Former user |
Assigned to: | Former user |
Fixed: A fix for this issue is checked into the tree and tested.
(Resolution Date: 2019-07-05T14:35:41.843Z)
2019-07-18 Werewolf Bar Mitzvah (Release Date: 2019-07-18)
On platforms that have unsynchronized TSCs and lack support for the RDTSCP instruction, determining the TSC offset is a multi-step process. The current CPU ID is queried (via the lsl/GDT method), the TSC is read, then finally the CPU ID is re-checked in case a migration occurred. This is meant to ensure that the offset applied to the TSC reading corresponds to the CPU from which it was taken. If the CPU IDs don't match, the logic is repeated until it is successful.
While this ID-checking loop is rather tight, it can become a problem in certain circumstances. If the system is heavily loaded, migrations may occur rapidly. Any process "stuck" in this loop will represent an additional source of CPU load, potentially exacerbating the problem. It would be valuable to have bail-out logic to limit the loop iterations, perform the time reading via syscall if it cannot be completed by the userspace code in a timely manner.
To test this, I simulated CPU migrations during __cp_tsc_read
by utilizing writable uregs[]
in dtrace. Two separate scripts, one each for i386 and amd64, were drafted up for this task:
check.d:
pid$target::__cp_tsc_read:entry { self->dupe = 1 } /* Force TSC_MFENCE-style readings */ pid$target::__cp_tsc_read:11, pid$target::__cp_tsc_read:4c { uregs[R_EAX] = 3 } pid$target::__cp_tsc_read:45 { /* CPU ID in %eax */ if (self->dupe < 10) { this->cpu = uregs[R_EAX]; printf("pretending to migrate %d -> %d\n", this->cpu, this->cpu + self->dupe); uregs[R_EAX] = this->cpu + self->dupe; self->dupe++; } } pid$target::__cp_tsc_read:86 { /* CPU ID in %ecx */ if (self->dupe < 10) { this->cpu = uregs[R_ECX]; printf("pretending to migrate %d -> %d\n", this->cpu, this->cpu + self->dupe); uregs[R_ECX] = this->cpu + self->dupe; self->dupe++; } } pid$target::__cp_tsc_read:return { trace(arg1); }
check64.d:
pid$target::__cp_tsc_read:entry { self->dupe = 1 } /* Force TSC_MFENCE-style readings */ pid$target::__cp_tsc_read:7 { uregs[R_ESI] = 3 } pid$target::__cp_tsc_read:3f, pid$target::__cp_tsc_read:85 { /* CPU ID in %edx */ if (self->dupe < 10) { this->cpu = uregs[R_EDX]; printf("pretending to migrate %d -> %d\n", this->cpu, this->cpu + self->dupe); uregs[R_EDX] = this->cpu + self->dupe; self->dupe++; } } pid$target::__cp_tsc_read:return { trace(arg1); }
The corresponding dtrace script was run against a binary (built for the proper bit-ness) of this simple test program:
#include <stdio.h> #include <time.h> int main(int argc, char *argv[]) { struct timespec ts; clock_gettime(CLOCK_MONOTONIC, &ts); printf("mono: %lu %lu\n", ts.tv_sec, ts.tv_nsec); clock_gettime(CLOCK_REALTIME, &ts); printf("real: %lu %lu\n", ts.tv_sec, ts.tv_nsec); return (0); }
The results showed the repeated attempts to get a TSC and CPU-ID reading, terminating once the limit had been reached:
[root@6cda7a83-0243-4027-e3f4-e1fa5553209f ~/commpage]# dtrace -ws check.d -c ./run32 dtrace: script 'check.d' matched 11 probes dtrace: allowing destructive actions mono: 6696 903929625 real: 1561498076 418969887 dtrace: pid 9582 has exited CPU ID FUNCTION:NAME 0 89632 __cp_tsc_read:45 pretending to migrate 0 -> 1 0 89633 __cp_tsc_read:86 pretending to migrate 0 -> 2 0 89633 __cp_tsc_read:86 pretending to migrate 0 -> 3 0 89633 __cp_tsc_read:86 pretending to migrate 0 -> 4 0 89633 __cp_tsc_read:86 pretending to migrate 0 -> 5 0 89633 __cp_tsc_read:86 pretending to migrate 0 -> 6 0 89634 __cp_tsc_read:return 0 0 89632 __cp_tsc_read:45 pretending to migrate 0 -> 1 0 89633 __cp_tsc_read:86 pretending to migrate 0 -> 2 0 89633 __cp_tsc_read:86 pretending to migrate 0 -> 3 0 89633 __cp_tsc_read:86 pretending to migrate 0 -> 4 0 89633 __cp_tsc_read:86 pretending to migrate 0 -> 5 0 89633 __cp_tsc_read:86 pretending to migrate 0 -> 6 0 89634 __cp_tsc_read:return 0 [root@6cda7a83-0243-4027-e3f4-e1fa5553209f ~/commpage]# dtrace -ws check64.d -c ./run64 dtrace: script 'check64.d' matched 10 probes dtrace: allowing destructive actions mono: 6700 271958594 real: 1561498079 786812533 dtrace: pid 9584 has exited CPU ID FUNCTION:NAME 10 89631 __cp_tsc_read:3f pretending to migrate 10 -> 11 10 89632 __cp_tsc_read:85 pretending to migrate 10 -> 12 10 89632 __cp_tsc_read:85 pretending to migrate 10 -> 13 10 89632 __cp_tsc_read:85 pretending to migrate 10 -> 14 10 89632 __cp_tsc_read:85 pretending to migrate 10 -> 15 10 89632 __cp_tsc_read:85 pretending to migrate 10 -> 16 10 89633 __cp_tsc_read:return 0 10 89631 __cp_tsc_read:3f pretending to migrate 10 -> 11 10 89632 __cp_tsc_read:85 pretending to migrate 10 -> 12 10 89632 __cp_tsc_read:85 pretending to migrate 10 -> 13 10 89632 __cp_tsc_read:85 pretending to migrate 10 -> 14 10 89632 __cp_tsc_read:85 pretending to migrate 10 -> 15 10 89632 __cp_tsc_read:85 pretending to migrate 10 -> 16 10 89633 __cp_tsc_read:return 0
(This was performed on an Ivy Bridge system which the TSC reset errata applies to, resulting in offset TSCs and thus potential exercise of such logic)
Per @accountid:62431b8f258562006fa2866a's suggestion, I set up an additional test to check that under normal conditions (without excessive migration, simulated or otherwise) that the new logic goes through the reading loop only the once. It succeeded on both 32-bit and 64-bit:
dtrace: script 'check_fine.d' matched 6 probes dtrace: allowing destructive actions mono: 366 383668954 real: 1562163502 51244099 dtrace: pid 9954 has exited CPU ID FUNCTION:NAME 10 94913 __cp_tsc_read:entry 10 94916 __cp_tsc_read:45 reading on 10 10 94917 __cp_tsc_read:86 reading on 10 10 94918 __cp_tsc_read:return 3660775750 10 94913 __cp_tsc_read:entry 10 94916 __cp_tsc_read:45 reading on 10 10 94917 __cp_tsc_read:86 reading on 10 10 94918 __cp_tsc_read:return 3661455338
dtrace: script 'check64_fine.d' matched 5 probes dtrace: allowing destructive actions mono: 256 918387873 real: 1562163392 588601207 dtrace: pid 9781 has exited CPU ID FUNCTION:NAME 10 94912 __cp_tsc_read:entry 10 94914 __cp_tsc_read:3f reading on cpu 10 10 94915 __cp_tsc_read:85 reading on cpu 10 10 94916 __cp_tsc_read:return 719363575581 10 94912 __cp_tsc_read:entry 10 94914 __cp_tsc_read:3f reading on cpu 10 10 94915 __cp_tsc_read:85 reading on cpu 10 10 94916 __cp_tsc_read:return 719364160501
illumos-joyent commit ea9d7b5c804b8f84f0f7f73d8747c06291d0614d (branch master, by Patrick Mooney)
OS-6135 comm page should fallback to syscall after excessive migration
Reviewed by: Jerry Jelinek <jerry.jelinek@joyent.com>
Reviewed by: Ryan Zezeski <rpz@joyent.com>
Reviewed by: Robert Mustacchi <rm@joyent.com>
Approved by: Robert Mustacchi <rm@joyent.com>