OS-6864: bhyve should guard against going off-cpu

Details

Issue Type:Improvement
Priority:4 - Normal
Status:Resolved
Created at:2018-04-02T20:48:47.509Z
Updated at:2018-05-03T19:08:51.928Z

People

Created by:Patrick Mooney [X]
Reported by:Patrick Mooney [X]
Assigned to:Patrick Mooney [X]

Resolution

Fixed: A fix for this issue is checked into the tree and tested.
(Resolution Date: 2018-04-05T23:54:22.548Z)

Fix Versions

2018-04-12 Promised Land (Release Date: 2018-04-12)

Labels

bhyve

Description

During portions of its execution, bhyve's computing context is constrained: a VMX context may be loaded, or MSRs may hold guest values which would be invalid for the host. In those portions, bhyve protects its thread from going off-cpu with critical_enter, which is effectively a direct map to kpreempt_disable. While this prevents interrupting threads from doing meaningful work on the CPU until bhyve has cleared that "critical section", it does not guard against the possibility that bhyve might inadvertently sleep on a synchronization resource. We have seen examples of this in tickets like OS-6860, where rare adverse circumstances result in such sleeping behavior. When a thread voluntarily sleeps in that manner, kpreempt_disable is effectively ignored, since it was the thread itself which decided to give up the CPU.

While many of the circumstances which lead to this situation have been addressed, it would be nice to have some guard rails to protect against such problems in the future. A more comprehensive set of savectx/restorectx handlers should be integrated in order to do so.
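
The difference between the two cases can be sketched with a small user-space model. This is only an illustration and is not the code from the fix: every name in it (toy_thread_t, toy_sleep, toy_try_preempt, and so on) is hypothetical, and it does not use the illumos kernel interfaces. It assumes that a disabled-preemption count blocks involuntary switches only, and that save/restore handlers run on every switch, voluntary or not.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for per-thread state; NOT the illumos kthread_t. */
typedef struct toy_thread {
	int preempt_disabled;    /* nesting depth, in the spirit of kpreempt_disable */
	bool in_guest_section;   /* thread is inside the constrained section */
	bool guest_state_loaded; /* guest values are currently live on the CPU */
	bool leaked;             /* guest state was still live after going off-cpu */
	int saves;               /* times the save handler unloaded guest state */
	int restores;            /* times the restore handler reloaded it */
	void (*save)(struct toy_thread *);    /* run when going off-cpu */
	void (*restore)(struct toy_thread *); /* run when coming back on-cpu */
} toy_thread_t;

/* Save handler: swap guest values out so the host never runs with them. */
static void
toy_guest_save(toy_thread_t *t)
{
	if (t->guest_state_loaded) {
		t->guest_state_loaded = false;
		t->saves++;
	}
}

/* Restore handler: reload guest values if the thread still needs them. */
static void
toy_guest_restore(toy_thread_t *t)
{
	if (t->in_guest_section && !t->guest_state_loaded) {
		t->guest_state_loaded = true;
		t->restores++;
	}
}

/* Going off-cpu: run the save handler, then record any leftover guest state. */
static void
toy_switch_off(toy_thread_t *t)
{
	if (t->save != NULL)
		t->save(t);
	if (t->guest_state_loaded)
		t->leaked = true;
}

/* Coming back on-cpu: run the restore handler, if one is installed. */
static void
toy_switch_on(toy_thread_t *t)
{
	if (t->restore != NULL)
		t->restore(t);
}

/* Involuntary switch: refused while preemption is disabled. */
static bool
toy_try_preempt(toy_thread_t *t)
{
	assert(t->preempt_disabled >= 0);
	if (t->preempt_disabled > 0)
		return (false);
	toy_switch_off(t);
	toy_switch_on(t);
	return (true);
}

/* Voluntary sleep: the preemption count is never consulted. */
static void
toy_sleep(toy_thread_t *t)
{
	toy_switch_off(t);
	toy_switch_on(t);
}
```

In this model, a thread with preempt_disabled set and no handlers installed refuses toy_try_preempt, but a call to toy_sleep still takes it off-cpu with guest state loaded and sets leaked. Installing toy_guest_save and toy_guest_restore as the save and restore handlers makes the same toy_sleep call unload the guest state on the way out and reload it on the way back.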

Comments

Comment by Jira Bot
Created at 2018-04-05T23:50:50.761Z

illumos-joyent commit 6a175f35f25ea47a4b116ad2dd1a0600fdf5a2bc (branch master, by Patrick Mooney)

OS-6864 bhyve should guard against going off-cpu
OS-6865 bhyve could be lazy about FPU state
Reviewed by: Jerry Jelinek <jerry.jelinek@joyent.com>
Reviewed by: John Levon <john.levon@joyent.com>
Approved by: John Levon <john.levon@joyent.com>