|Priority:||4 - Normal|
|Created by:||Patrick Mooney [X]|
|Reported by:||Patrick Mooney [X]|
|Assigned to:||Patrick Mooney [X]|
Fixed: A fix for this issue is checked into the tree and tested.
(Resolution Date: 2018-04-05T23:54:22.548Z)
2018-04-12 Promised Land (Release Date: 2018-04-12)
During portions of its execution when the computing context is constrained (for example, when a VMX context is loaded, or when MSRs are loaded with guest values which would be invalid for the host), bhyve protects its thread from going off-cpu with critical_enter, which is effectively a direct map to kpreempt_disable. While this prevents interrupting threads from doing meaningful work on the CPU until bhyve has cleared that "critical section", it does not guard against the possibility that bhyve might inadvertently sleep on a synchronization resource. We have seen examples of this in tickets like OS-6860, where rare adverse circumstances result in such sleeping behavior. When a thread voluntarily sleeps in that manner, kpreempt_disable is effectively ignored (since it was the thread itself which decided to give up the CPU).
While many of the circumstances which lead to this situation have been addressed, it would be nice to have some guard rails to protect against such problems in the future. A more comprehensive set of restorectx handlers should be integrated in order to do so.
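
To make the failure mode and the proposed guard rail concrete, here is a minimal userland sketch in C. It is not bhyve or illumos kernel code: every name in it (sim_thread_t, sim_critical_enter, sim_savectx, sim_restorectx, sim_sleep and so on) is hypothetical, and it only models the behavior described above under the assumption that context handlers run whenever the thread leaves or regains the CPU.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a vCPU thread; not an actual bhyve structure. */
typedef struct sim_thread {
	int  preempt_depth;       /* models kpreempt_disable nesting */
	bool guest_state_loaded;  /* guest MSRs / VMX context live on the CPU */
	bool handlers_installed;  /* save/restore context handlers present */
	bool reload_pending;      /* guest state must be reloaded on return */
	int  save_calls;
	int  restore_calls;
} sim_thread_t;

/* Models critical_enter, which the ticket describes as kpreempt_disable. */
void sim_critical_enter(sim_thread_t *t) { t->preempt_depth++; }

void sim_critical_exit(sim_thread_t *t)
{
	assert(t->preempt_depth > 0);
	t->preempt_depth--;
}

/* Handler run as the thread goes off-cpu: put host state back. */
void sim_savectx(sim_thread_t *t)
{
	if (t->handlers_installed && t->guest_state_loaded) {
		t->guest_state_loaded = false;
		t->reload_pending = true;
		t->save_calls++;
	}
}

/* Handler run as the thread comes back on-cpu: reload guest state. */
void sim_restorectx(sim_thread_t *t)
{
	if (t->handlers_installed && t->reload_pending) {
		t->guest_state_loaded = true;
		t->reload_pending = false;
		t->restore_calls++;
	}
}

/* Returns true if other threads would have run with guest state loaded. */
bool sim_go_off_cpu(sim_thread_t *t)
{
	sim_savectx(t);
	bool exposed = t->guest_state_loaded;
	sim_restorectx(t);
	return exposed;
}

/* Involuntary preemption honors the preempt depth. */
bool sim_try_preempt(sim_thread_t *t)
{
	if (t->preempt_depth > 0)
		return false;
	(void) sim_go_off_cpu(t);
	return true;
}

/* A voluntary sleep ignores the preempt depth entirely. */
bool sim_sleep(sim_thread_t *t)
{
	return sim_go_off_cpu(t);
}
```

In this model, a thread inside the critical section cannot be preempted, yet sim_sleep still takes it off-cpu. Without handlers the guest state stays loaded for whoever runs next. With handlers installed the state is swapped out on the way off and reloaded on the way back, which is the kind of protection the ticket asks for.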
illumos-joyent commit 6a175f35f25ea47a4b116ad2dd1a0600fdf5a2bc (branch master, by Patrick Mooney)
OS-6864 bhyve should guard against going off-cpu
OS-6865 bhyve could be lazy about FPU state
Reviewed by: Jerry Jelinek <firstname.lastname@example.org>
Reviewed by: John Levon <email@example.com>
Approved by: John Levon <firstname.lastname@example.org>