OS-5309: TSC sync detection should be NUMA friendly

Details

Issue Type: Bug
Priority: 4 - Normal
Status: Resolved
Created at: 2016-04-06T22:01:41.000Z
Updated at: 2016-04-14T21:40:09.000Z

People

Created by: Patrick Mooney [X]
Reported by: Patrick Mooney [X]
Assigned to: Patrick Mooney [X]

Resolution

Fixed: A fix for this issue is checked into the tree and tested.
(Resolution Date: 2016-04-14T21:40:09.000Z)

Fix Versions

2016-04-28 Nellie (Release Date: 2016-04-28)

Description

On i86pc, certain precautions must be taken before using the TSC as a source for gethrtime. It's possible that the TSC will be skewed between different CPUs in a system. To compensate, the kernel attempts to measure any potential skew.

The algorithm used to make this measurement is sensitive to memory write latency between each pair of cores under consideration. The current implementation activates the delta-sensitive gethrtime routines if the largest TSC delta is greater than the shortest write latency. While this was adequate on older SMP systems with a shared memory bus, it poses a challenge in the face of NUMA. On such systems, a large detected TSC skew is often accompanied by a long memory write time (for example, when the compared core is off-socket), so comparing it against the shortest write time seen anywhere in the system can overstate the skew.
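A minimal sketch of that global comparison, for illustration only. The struct, field, and function names below are hypothetical and are not the kernel's; the two sample rows are CPUs 12 and 22 from the raw timings posted in the comments, on the assumption that tTSC corresponds to the measured delta and tWRITE to the write time.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-CPU measurement; names are illustrative only. */
typedef struct tsc_sample {
	int64_t ts_delta;	/* measured TSC delta against the reference CPU */
	int64_t ts_write_time;	/* measured write latency to that CPU */
} tsc_sample_t;

/*
 * Global criterion as described above: the largest delta seen on any CPU
 * is compared against the shortest write time seen on any CPU.
 */
bool
tsc_needs_delta_global(const tsc_sample_t *samples, size_t n)
{
	int64_t max_delta = 0;
	int64_t min_write = INT64_MAX;

	for (size_t i = 0; i < n; i++) {
		int64_t d = samples[i].ts_delta;

		if (d < 0)
			d = -d;
		if (d > max_delta)
			max_delta = d;
		if (samples[i].ts_write_time < min_write)
			min_write = samples[i].ts_write_time;
	}
	return (n > 0 && max_delta > min_write);
}

/* CPUs 12 and 22 from the timings in the comments (assumed mapping). */
const tsc_sample_t tsc_example_samples[] = {
	{ .ts_delta = 16, .ts_write_time = 80 },
	{ .ts_delta = 150, .ts_write_time = 276 },
};
```

With both rows, the largest delta (150) is compared against the shortest write time (80), so the check fires even though each CPU's own delta is well below its own write time.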

In order to make an accurate assessment of TSC skew on NUMA systems, the global tsc_max_delta and write_time_min should be eschewed in favor of thresholds which are set per-CPU. This will allow systems with synchronized TSCs to avoid the cost of the "delta variant" gethrtime routines.
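A per-CPU check could be sketched as follows. This is not the committed fix: it assumes the per-CPU threshold is simply that CPU's own measured write time, and every name in it is hypothetical. The two sample rows are CPUs 12 and 22 from the raw timings posted in the comments, assuming tTSC is the measured delta and tWRITE the write time.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-CPU measurement; names are illustrative only. */
typedef struct cpu_tsc_sample {
	int64_t cts_delta;	/* measured TSC delta against the reference CPU */
	int64_t cts_write_time;	/* measured write latency to this CPU */
} cpu_tsc_sample_t;

/*
 * Per-CPU criterion (assumed): a CPU counts as skewed only when its delta
 * exceeds the write time measured for that same CPU.
 */
bool
cpu_tsc_skewed(const cpu_tsc_sample_t *s)
{
	int64_t d = s->cts_delta;

	if (d < 0)
		d = -d;
	return (d > s->cts_write_time);
}

/* Count the CPUs that would need a delta correction. */
size_t
cpu_tsc_count_skewed(const cpu_tsc_sample_t *samples, size_t n)
{
	size_t skewed = 0;

	for (size_t i = 0; i < n; i++) {
		if (cpu_tsc_skewed(&samples[i]))
			skewed++;
	}
	return (skewed);
}

/* CPUs 12 and 22 from the timings in the comments (assumed mapping). */
const cpu_tsc_sample_t cpu_tsc_example_samples[] = {
	{ .cts_delta = 16, .cts_write_time = 80 },
	{ .cts_delta = 150, .cts_write_time = 276 },
};
```

Under this assumption neither sample row is flagged, since 16 is below 80 and 150 is below 276, so a system reporting only such rows would stay on the plain gethrtime routines.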

Comments

Comment by Patrick Mooney [X]
Created at 2016-04-06T23:00:08.000Z
Updated at 2016-04-07T21:14:37.000Z

One additional data point regarding TSC sync on a multi-socket AMD system:

AMD Opteron(tm) Processor 4234 - 2 physical 12 cores
tsc_sync_tick_delta: [ 0, 0x11, 0x104, 0xbb, 0xcd, 0xe9, 0xe7, 0xde, 0xd7, 0xba, 0xef, 0xed, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ... ]

nsec_scale: 294a3db

The Bulldozer architecture groups cores into two-core modules which share the L1 instruction cache and the L2 cache. That seems like a reasonable explanation for why the delta is so low for core 1.


Comment by Patrick Mooney [X]
Created at 2016-04-07T19:47:03.000Z

Raw TSC timings taken from my test machine:

CPU     tWRITE  tTSC    tTSC-tWRITE
1       156     76      -80
2       156     80      -76
3       152     98      -54
4       152     74      -78
5       156     74      -82
6       280     120     -160
7       280     142     -138
8       280     132     -148
9       276     120     -156
10      276     116     -160
11      280     136     -144
12      80      16      -64
13      156     64      -92
14      160     80      -80
15      160     50      -110
16      152     66      -86
17      156     70      -86
18      280     134     -146
19      276     128     -148
20      276     144     -132
21      276     144     -132
22      276     150     -126
23      276     138     -138

Comment by Bot Bot [X]
Created at 2016-04-14T21:37:37.000Z

illumos-joyent commit 46243df (branch master, by Patrick Mooney)

OS-5309 TSC sync detection should be NUMA friendly
Reviewed by: Jerry Jelinek <jerry.jelinek@joyent.com>
Reviewed by: Robert Mustacchi <rm@joyent.com>