OS-6837: bhyve must use separate ipi vector for PIR

Details

Issue Type: Bug
Priority: 4 - Normal
Status: Resolved
Created at: 2018-03-26T20:21:53.487Z
Updated at: 2018-03-30T04:43:49.833Z

People

Created by: Patrick Mooney [X]
Reported by: Patrick Mooney [X]
Assigned to: Patrick Mooney [X]

Resolution

Fixed: A fix for this issue is checked into the tree and tested.
(Resolution Date: 2018-03-30T04:43:49.821Z)

Fix Versions

2018-04-12 Promised Land (Release Date: 2018-04-12)

Labels

bhyve

Description

When the APICv/PIR code in bhyve was initially enabled, I just recycled the poke_cpu vector for doing PIR notifications. To ensure that poke_cpu actually kicks the CPU out of VMX context when we need it, a separate IPI vector must be registered for PIR. That way, PIR IPIs will trigger the VMX-internal interrupt processing while poke_cpu IPIs will kick the CPU out of the VMX context to process the "external interrupt".
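
The distinction can be illustrated with a small C model. This is only a sketch of the behaviour described above, not the bhyve or PSM code: the type and function names (ipi_config_t, ipi_outcome_in_guest, poke_forces_exit) and any vector numbers used with it are hypothetical, and it assumes the CPU is running guest code with posted-interrupt processing enabled, so that only an IPI on the registered notification vector is absorbed without a VM exit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical outcomes for an IPI arriving while a CPU runs guest code. */
typedef enum {
	IPI_POSTED_IN_GUEST,	/* VMX-internal PIR processing, no VM exit */
	IPI_VM_EXIT		/* external interrupt, CPU leaves VMX context */
} ipi_outcome_t;

/* Hypothetical vector assignment; not the real PSM data structures. */
typedef struct {
	uint8_t pir_notify_vector;	/* vector registered for PIR */
	uint8_t poke_vector;		/* vector used for poke_cpu IPIs */
} ipi_config_t;

/*
 * Model of the delivery decision: an IPI on the PIR notification vector
 * is consumed inside the guest; any other vector forces a VM exit.
 */
ipi_outcome_t
ipi_outcome_in_guest(const ipi_config_t *cfg, uint8_t vector)
{
	assert(cfg != NULL);
	if (vector == cfg->pir_notify_vector)
		return (IPI_POSTED_IN_GUEST);
	return (IPI_VM_EXIT);
}

/* True when a poke IPI would actually kick the CPU out of the guest. */
bool
poke_forces_exit(const ipi_config_t *cfg)
{
	return (ipi_outcome_in_guest(cfg, cfg->poke_vector) == IPI_VM_EXIT);
}
```

In this model, a configuration where both fields hold the same vector makes poke_forces_exit() return false, which is the problem this issue describes. Once the two vectors differ, a poke forces the exit and a PIR notification stays inside the guest.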

Comments

Comment by Patrick Mooney [X]
Created at 2018-03-29T21:50:42.565Z

I tested this on a Westmere machine (pcplusmp) and an Ivy Bridge machine (apix, but no x2apic support from the BIOS). In both cases, I saw the PIR interrupt allocated properly by the respective PSM logic. On the Ivy Bridge node, where PIR is supported by VMX, I traced poke_cpu and apic_send_pir_ipi calls while inducing work inside and outside bhyve instances. The traces clearly showed the different mechanisms calling the appropriate IPI function (poke for thread wake-up, pir_ipi for APICv notifications).


Comment by Bryan Cantrill [X]
Created at 2018-03-30T00:59:37.231Z

@tomas.celaya may be seeing this issue manifest itself as periods of inactivity (for some benchmarks, he sees performance plunge temporarily). Here is a D script to explore this:

#pragma D option aggsortkey
#pragma D option quiet

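/* Look up the vector table entry for the vector being dispatched (arg0). */
apix_dispatch_by_vector:entry
{
	this->vecp = `apixs[cpu]->x_vectbl[arg0];
	this->avp = this->vecp->v_autovect;
}

/* Count, per CPU, dispatches whose vector has no handler registered. */
apix_dispatch_by_vector:entry
/this->avp == NULL || this->avp->av_vector == NULL/
{
	@[cpu] = count();
}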

tick-1sec
{
	printf("time=%d\n", walltimestamp / 1000000000);
	printa("cpu=%d pokes=%@d\n", @);
	clear(@);
}

Comment by Patrick Mooney [X]
Created at 2018-03-30T01:28:18.990Z

Robert gave me access to a node with proper x2apic support and I ran the same test there, tracing the PIR and poke_cpu IPIs. The mechanisms appeared to be functioning as desired.


Comment by Jira Bot
Created at 2018-03-30T01:34:52.158Z

illumos-joyent commit b409c9a314a881a4aa643e85cd5494af77fa4488 (branch master, by Patrick Mooney)

OS-6837 bhyve must use separate ipi vector for PIR
Reviewed by: Bryan Cantrill <bryan@joyent.com>
Reviewed by: John Levon <john.levon@joyent.com>
Reviewed by: Robert Mustacchi <rm@joyent.com>
Approved by: Robert Mustacchi <rm@joyent.com>