OS-7622: bhyve vioapic writes can deadlock instance

Details

Issue Type:Bug
Priority:4 - Normal
Status:Resolved
Created at:2019-02-27T15:22:31.242Z
Updated at:2019-06-20T18:35:54.594Z

People

Created by:Patrick Mooney [X]
Reported by:Patrick Mooney [X]
Assigned to:Patrick Mooney [X]

Resolution

Fixed: A fix for this issue is checked into the tree and tested.
(Resolution Date: 2019-06-20T18:35:54.583Z)

Fix Versions

2019-07-04 Verdukianism (Release Date: 2019-07-04)

Related Issues

Related Links

Labels

bhyve

Description

During testing for OS-7447, I found that writes to the vioapic are capable of deadlocking with certain other operations on the VM. Say that a multi-vCPU instance is running and a vCPU other than 0 writes to the vioapic redirection table. Assuming certain fields in the register are changed, a call to vm_smp_rendezvous will be made in order to update the trigger mode registers on all the vCPUs. The vioapic write, including the call to vm_smp_rendenzvous is all made in the context of that vCPU. Now say that another thread attempts one of the ioctls which requires locking all the vCPUs. It will do so, starting at vCPU 0 to VM_MAXCPU-1. If it is able to lock any of the vCPUs before the rendezvous begins, a deadlock will occur. This is because it won't be able to lock the vCPU handling the vioapic write, which is waiting for the rendezvous to finish. That rendezvous cannot finish, since any of the vCPUs locked by the other ioctl will never be released until they are all locked, the ioctl completes its work, and they are released.

This was captured live, albeit on a PI with the OS-7447 change applied:

stack pointer for thread fffffe5af6d7abc0 (bhyve/44 [vcpu 3]): fffffe007bc161a0
[ fffffe007bc161a0 _resume_from_idle+0x12b() ]
  fffffe007bc161d0 swtch+0x141()
  fffffe007bc16270 cv_timedwait_sig_hires+0x39d(fffffe5960964120, fffffe5960964118, 3b9aca00, f4240, 0)
  fffffe007bc162a0 cv_reltimedwait_sig+0x4f(fffffe5960964120, fffffe5960964118, 3e8, 4)
  fffffe007bc16300 vm_handle_rendezvous+0x9b(fffffe5960964000, 3)
  fffffe007bc16360 vm_smp_rendezvous+0x10b()
  fffffe007bc16410 vioapic_write+0x153(fffffe5b032c67c0, 3, 1c, 1a036)
  fffffe007bc16490 vioapic_mmio_rw+0xe6(fffffe5b032c67c0, 3, fec00010, fffffe007bc164b8, 4, 0)
  fffffe007bc164f0 vioapic_mmio_write+0x4e(fffffe5960964000, 3, fec00010, 1a036, 4, fffffe007bc1671f)
  fffffe007bc16580 emulate_mov+0xb2(fffffe5960964000, 3, fec00010, fffffe5960964600, fffffffff8458e10, fffffffff8458e70, fffffe007bc1671f)
  fffffe007bc165d0 vmm_emulate_instruction+0x6c(fffffe5960964000, 3, fec00010, fffffe5960964600, fffffe59609645e8, fffffffff8458e10, fffffffff8458e70, fffffe007bc1671f)
  fffffe007bc16670 vm_handle_inst_emul+0x18c(fffffe5960964000, 3, fffffe007bc1671f)
  fffffe007bc16760 vm_run+0x3f7(fffffe5960964000, fffffe007bc16ae0)
  fffffe007bc16c20 vmmdev_do_ioctl+0xca6(fffffe5afc395040, c0907601, fffffc7fe7e01e00, 202003, fffffe5afe105640, fffffe007bc16ea8)
  fffffe007bc16cc0 vmm_ioctl+0x132(13100000002, c0907601, fffffc7fe7e01e00, 202003, fffffe5afe105640, fffffe007bc16ea8)
  fffffe007bc16d00 cdev_ioctl+0x39(13100000002, c0907601, fffffc7fe7e01e00, 202003, fffffe5afe105640, fffffe007bc16ea8)
  fffffe007bc16d50 spec_ioctl+0x60(fffffe5afc87d680, c0907601, fffffc7fe7e01e00, 202003, fffffe5afe105640, fffffe007bc16ea8, 0)
  fffffe007bc16de0 fop_ioctl+0x55(fffffe5afc87d680, c0907601, fffffc7fe7e01e00, 202003, fffffe5afe105640, fffffe007bc16ea8, 0)
  fffffe007bc16f00 ioctl+0x9b(3, c0907601, fffffc7fe7e01e00)
  fffffe007bc16f10 sys_syscall+0x19f()
> ::pgrep bhyvectl
S    PID   PPID   PGID    SID    UID      FLAGS             ADDR NAME
R  17074   6395   6395   6334      0 0x4a004000 fffffe5af7f2a070 bhyvectl
> fffffe5af7f2a070::walk thread | ::findstack -v
stack pointer for thread fffffe5af7b884a0 (bhyvectl/1): fffffe00817125b0
[ fffffe00817125b0 _resume_from_idle+0x12b() ]
  fffffe00817125e0 swtch+0x141()
  fffffe0081712620 cv_wait+0x70(fffffe596096435e, fffffe5960964350)
  fffffe0081712690 vcpu_set_state_locked+0x8e(fffffe5960964000, 1, 1, 1)
  fffffe00817126f0 vcpu_set_state+0x5e(fffffe5960964000, 1, 1, 1)
  fffffe0081712710 vcpu_lock_one+0x23(fffffe5afc395040, 1)
  fffffe0081712760 vmm_write_lock+0x36(fffffe5afc395040)
  fffffe0081712c20 vmmdev_do_ioctl+0x186(fffffe5afc395040, 20007701, 0, 202003, fffffe5af80763f0, fffffe0081712ea8)
  fffffe0081712cc0 vmm_ioctl+0x132(13100000002, 20007701, 0, 202003, fffffe5af80763f0, fffffe0081712ea8)
  fffffe0081712d00 cdev_ioctl+0x39(13100000002, 20007701, 0, 202003, fffffe5af80763f0, fffffe0081712ea8)
  fffffe0081712d50 spec_ioctl+0x60(fffffe5af8088280, 20007701, 0, 202003, fffffe5af80763f0, fffffe0081712ea8, 0)
  fffffe0081712de0 fop_ioctl+0x55(fffffe5af8088280, 20007701, 0, 202003, fffffe5af80763f0, fffffe0081712ea8, 0)
  fffffe0081712f00 ioctl+0x9b(3, 20007701, 0)
  fffffe0081712f10 sys_syscall+0x19f()

Without that change, it's still possible, where vmm_write_lock would instead be a vcpu_lock_all call.

To fix this, the means by which the vioapic updates TMRs should probably be examined. It might make sense to eliminate the vm_rendezvous facility entirely.

Comments

Comment by Patrick Mooney [X]
Created at 2019-03-07T22:11:27.340Z

For the purposes of code review, I'd like to capture some of my design decisions about how to go about fixing this here. Since the bhyve porting work began, the vm_smp_rendezvous facility, used by the vIOAPIC, has been dangerous from the point of deadlock. The vIOAPIC operation must wait until the rendezvous is completed successfully on all active CPUs before it's able to safely return a correct result. That rendezvous relies on vCPU threads proceeding in a timely manner into the rendezvous logic. If for some reason they're not able to, either because they're blocked on another resource, or the thread driving them is dead/inoperable, then the vIOAPIC operation which initiated the rendezvous is essentially blocked forever. I believe it's possible to "flip the script" on how trigger-mode registers in the vCPU LAPICs are updated in order to avoid such danger.

First let's examine this specific vIOAPIC operation in detail. When a redirection table entry is written to, it may result in assertion/deassertion of trigger-mode registers on some vLAPICs. Those TMR bits determine if the vioapic is notified when as part of EOI processing for level-triggered interrupts. For proper operation to occur, the vLAPICs in question must have updated their TMRs prior to receiving any new interrupts or processing EOIs prior to when the vIOAPIC write completes. The existing rendezvous logic, when it is not ensnared by deadlock conditions, does successfully ensure this by resetting TMRs and then asserting any necessary bits on all the vCPUs prior to the return from vioapic_write. It's actions go beyond the true requirements, however. The TMRs must be updated before the vCPU goes to process a new interrupt or EOI, but if the vCPU is idle (either due to HLT, a userspace exit, or other conditions), then recording the designated TMR changes to be applied later is totally adequate. As long as those designated changes take effect before the vCPU enters guest context, the result should be the same.

With all this in mind, I propose that the vm_smp_rendezvous system be ripped out. It's place, some new calls to temporarily exclude a vCPU from running (vcpu_block_run/vcpu_unblock_run will provide the safety necessary to do TMR designation. When a vCPU performs a write to the vIOAPIC redirection tables, the we can use vlapic_calcdest to determine which vLAPICs need to be updated. Looping over those vCPUs, we can force them out of running context (which is very cheap/fast for idle vCPUs), mark the designated TMR changes in their vLAPIC structure, and then unblock them from running. Whenever they reenter running context, they can apply those TMR changes without any worry of contention.


Comment by Patrick Mooney [X]
Created at 2019-05-21T19:27:21.067Z

I wrote a little dtrace script for tracing events of interest while testing this:

#!/usr/sbin/dtrace -Cs

/* Run with -DNEWSTYLE=1 when testing proposed patch */

vioapic_write:entry
{
        self->w = timestamp;
}
vioapic_write:return
/self->w/
{
        @ = quantize(timestamp - self->w);
        self->w = 0;
}

#ifdef NEWSTYLE
vioapic_update_tmrs:entry
{
        self->t = timestamp
}
vcpu_block_run:entry
{
        self->b = timestamp
}
vcpu_block_run:return
/self->b/
{
        trace(timestamp - self->b);
        self->b = 0
}

:vmm:vlapic_*:entry /self->t/
{}

vioapic_update_tmrs:return /self->t/
{
        trace(timestamp - self->t); self->t = 0
}

#else
vioapic_update_tmr:entry
{
        self->t = timestamp;
}
vioapic_update_tmr:return
/self->t/
{
        trace(timestamp - self->t);
        self->t = 0
}
#endif

Testing with stock bits, we see the relative high expense of vioapic writes during the boot-up of a Linux guest:

CPU     ID                    FUNCTION:NAME
  4  74268        vioapic_update_tmr:return             25519
  4  74268        vioapic_update_tmr:return             22667
  4  74268        vioapic_update_tmr:return             22890
  4  74268        vioapic_update_tmr:return             22839
  4  74268        vioapic_update_tmr:return             22647
  4  74268        vioapic_update_tmr:return             22573
  4  74268        vioapic_update_tmr:return             22580
  4  74268        vioapic_update_tmr:return             22753
  4  74268        vioapic_update_tmr:return             22570
  4  74268        vioapic_update_tmr:return             22581
  4  74268        vioapic_update_tmr:return             22569
  4  74268        vioapic_update_tmr:return             22728
  4  74268        vioapic_update_tmr:return             24993
  4  74268        vioapic_update_tmr:return             24862
  4  74268        vioapic_update_tmr:return             22565
  4  74268        vioapic_update_tmr:return             22681
  4  74268        vioapic_update_tmr:return             22608
  4  74268        vioapic_update_tmr:return             22730
  4  74268        vioapic_update_tmr:return             24744
  4  74268        vioapic_update_tmr:return             22715
  4  74268        vioapic_update_tmr:return             22523
  4  74268        vioapic_update_tmr:return             22708
  4  74268        vioapic_update_tmr:return             22824
  4  74268        vioapic_update_tmr:return             22810
  4  74268        vioapic_update_tmr:return             22564
  4  74268        vioapic_update_tmr:return             22675
  4  74268        vioapic_update_tmr:return             22619
  4  74268        vioapic_update_tmr:return             22721
  4  74268        vioapic_update_tmr:return             22581
  4  74268        vioapic_update_tmr:return             22582
  4  74268        vioapic_update_tmr:return             22646
  4  74268        vioapic_update_tmr:return             22584
  4  74268        vioapic_update_tmr:return             22733
  4  74268        vioapic_update_tmr:return             22835
  4  74268        vioapic_update_tmr:return             22599
  4  74268        vioapic_update_tmr:return             22592
  4  74268        vioapic_update_tmr:return             22569
  4  74268        vioapic_update_tmr:return             22616
  4  74268        vioapic_update_tmr:return             22746
  4  74268        vioapic_update_tmr:return             22740
  4  74268        vioapic_update_tmr:return             23452
  4  74268        vioapic_update_tmr:return             27501
  4  74268        vioapic_update_tmr:return             22816
  4  74268        vioapic_update_tmr:return             23045
  4  74268        vioapic_update_tmr:return             22872
  4  74268        vioapic_update_tmr:return             22749
  4  74268        vioapic_update_tmr:return             22755
  4  74268        vioapic_update_tmr:return             27791
 16  74268        vioapic_update_tmr:return             31523
  7  74268        vioapic_update_tmr:return             64098
 31  74268        vioapic_update_tmr:return             28783
 28  74268        vioapic_update_tmr:return             28829
 39  74268        vioapic_update_tmr:return             29395
 34  74268        vioapic_update_tmr:return             28869
 26  74268        vioapic_update_tmr:return             34178
 28  74268        vioapic_update_tmr:return             27148
 39  74268        vioapic_update_tmr:return             27716
 34  74268        vioapic_update_tmr:return             27766
 26  74268        vioapic_update_tmr:return             32319
 28  74268        vioapic_update_tmr:return             27196
 34  74268        vioapic_update_tmr:return             61270
 39  74268        vioapic_update_tmr:return             27713
 26  74268        vioapic_update_tmr:return             31067


^C


           value  ------------- Distribution ------------- count
             512 |                                         0
            1024 |@@@                                      82
            2048 |@@@@@@@@@@@@@@@@@@@@@@@@@                757
            4096 |@@@@@@@@                                 252
            8192 |@@                                       58
           16384 |@@                                       49
           32768 |                                         0
           65536 |                                         0
          131072 |                                         4
          262144 |                                         0

The story is similar for a Windows guest:

CPU     ID                    FUNCTION:NAME
 11  74268        vioapic_update_tmr:return             26462
 11  74268        vioapic_update_tmr:return             22830
 11  74268        vioapic_update_tmr:return             22712
 11  74268        vioapic_update_tmr:return             22824
 11  74268        vioapic_update_tmr:return             22670
 11  74268        vioapic_update_tmr:return             22694
 11  74268        vioapic_update_tmr:return             22589
 11  74268        vioapic_update_tmr:return             22721
 11  74268        vioapic_update_tmr:return             22640
 11  74268        vioapic_update_tmr:return             23007
 11  74268        vioapic_update_tmr:return             22580
 11  74268        vioapic_update_tmr:return             22742
 11  74268        vioapic_update_tmr:return             22688
 11  74268        vioapic_update_tmr:return             22710
 11  74268        vioapic_update_tmr:return             22586
 11  74268        vioapic_update_tmr:return             22770
 11  74268        vioapic_update_tmr:return             22648
 11  74268        vioapic_update_tmr:return             24998
 11  74268        vioapic_update_tmr:return             22724
 11  74268        vioapic_update_tmr:return             24299
 11  74268        vioapic_update_tmr:return             22610
 11  74268        vioapic_update_tmr:return             22661
 11  74268        vioapic_update_tmr:return             22598
 11  74268        vioapic_update_tmr:return             22823
 11  74268        vioapic_update_tmr:return             22612
 11  74268        vioapic_update_tmr:return             22665
 11  74268        vioapic_update_tmr:return             22582
 11  74268        vioapic_update_tmr:return             22835
 11  74268        vioapic_update_tmr:return             22680
 11  74268        vioapic_update_tmr:return             22827
 11  74268        vioapic_update_tmr:return             22609
 11  74268        vioapic_update_tmr:return             22655
 11  74268        vioapic_update_tmr:return             22593
 11  74268        vioapic_update_tmr:return             22789
 11  74268        vioapic_update_tmr:return             22698
 11  74268        vioapic_update_tmr:return             22674
 11  74268        vioapic_update_tmr:return             22596
 11  74268        vioapic_update_tmr:return             22665
 11  74268        vioapic_update_tmr:return             22594
 11  74268        vioapic_update_tmr:return             22821
 11  74268        vioapic_update_tmr:return             22683
 11  74268        vioapic_update_tmr:return             24330
 11  74268        vioapic_update_tmr:return             22682
 11  74268        vioapic_update_tmr:return             22778
 11  74268        vioapic_update_tmr:return             22682
 11  74268        vioapic_update_tmr:return             22773
 11  74268        vioapic_update_tmr:return             22575
 11  74268        vioapic_update_tmr:return             22658
 11  74268        vioapic_update_tmr:return             22611
 11  74268        vioapic_update_tmr:return             22882
 11  74268        vioapic_update_tmr:return             22594
 11  74268        vioapic_update_tmr:return             22678
 11  74268        vioapic_update_tmr:return             22596
 11  74268        vioapic_update_tmr:return             22699
 11  74268        vioapic_update_tmr:return             22580
 11  74268        vioapic_update_tmr:return             22685
 11  74268        vioapic_update_tmr:return             22765
 11  74268        vioapic_update_tmr:return             22685
 11  74268        vioapic_update_tmr:return             22630
 11  74268        vioapic_update_tmr:return             22838
 11  74268        vioapic_update_tmr:return             22600
 11  74268        vioapic_update_tmr:return             22681
 11  74268        vioapic_update_tmr:return             22690
 11  74268        vioapic_update_tmr:return             24177
 11  74268        vioapic_update_tmr:return             26403
  1  74268        vioapic_update_tmr:return             32289
 38  74268        vioapic_update_tmr:return             66301
  3  74268        vioapic_update_tmr:return             28722
  3  74268        vioapic_update_tmr:return             26969
 38  74268        vioapic_update_tmr:return             67634
  3  74268        vioapic_update_tmr:return             24550
 38  74268        vioapic_update_tmr:return             62271
 38  74268        vioapic_update_tmr:return             62027
  3  74268        vioapic_update_tmr:return             24125
^C


           value  ------------- Distribution ------------- count
             512 |                                         0
            1024 |@                                        1
            2048 |@                                        1
            4096 |                                         0
            8192 |@@                                       4
           16384 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@       63
           32768 |@                                        1
           65536 |@@                                       3
          131072 |@                                        2
          262144 |                                         0

With the patch applied (and using the NEWSTYLE trace) the story is different.

Linux (patched):

dtrace: script 'vioapic_tmr.d' matched 51 probes
CPU     ID                    FUNCTION:NAME
 11  83351       vioapic_update_tmrs:return              2624
 11  83351       vioapic_update_tmrs:return              1790
 11  83351       vioapic_update_tmrs:return              1743
 11  83351       vioapic_update_tmrs:return              1728
 11  83351       vioapic_update_tmrs:return              1654
 11  83351       vioapic_update_tmrs:return              1624
 11  83351       vioapic_update_tmrs:return              1594
 11  83351       vioapic_update_tmrs:return              1587
 11  83351       vioapic_update_tmrs:return              2065
 11  83351       vioapic_update_tmrs:return              1896
 11  83351       vioapic_update_tmrs:return              4156
 11  83351       vioapic_update_tmrs:return              1636
 11  83351       vioapic_update_tmrs:return              1600
 11  83351       vioapic_update_tmrs:return              1673
 11  83351       vioapic_update_tmrs:return              1599
 11  83351       vioapic_update_tmrs:return              1596
 11  83351       vioapic_update_tmrs:return              1589
 11  83351       vioapic_update_tmrs:return              1585
 11  83351       vioapic_update_tmrs:return              1595
 11  83351       vioapic_update_tmrs:return              1761
 11  83351       vioapic_update_tmrs:return              1689
 11  83351       vioapic_update_tmrs:return              1634
 11  83351       vioapic_update_tmrs:return              1658
 11  83351       vioapic_update_tmrs:return              1597
 11  83351       vioapic_update_tmrs:return              1645
 11  83351       vioapic_update_tmrs:return              1619
 11  83351       vioapic_update_tmrs:return              1589
 11  83351       vioapic_update_tmrs:return              1589
 11  83351       vioapic_update_tmrs:return              1572
 11  83351       vioapic_update_tmrs:return              1591
 11  83351       vioapic_update_tmrs:return              1623
 11  83351       vioapic_update_tmrs:return              1792
 11  83351       vioapic_update_tmrs:return              1634
 11  83351       vioapic_update_tmrs:return              1627
 11  83351       vioapic_update_tmrs:return              1804
 11  83351       vioapic_update_tmrs:return              1813
 11  83351       vioapic_update_tmrs:return              1811
 11  83351       vioapic_update_tmrs:return              1928
 11  83351       vioapic_update_tmrs:return              1819
 11  83351       vioapic_update_tmrs:return              1606
 11  83842            vlapic_calcdest:entry
 11  84220             vlapic_tmr_set:entry
 11  83351       vioapic_update_tmrs:return              7297
 11  83351       vioapic_update_tmrs:return              4035
 11  83351       vioapic_update_tmrs:return              1892
 11  83351       vioapic_update_tmrs:return              1861
 11  83351       vioapic_update_tmrs:return              1843
 11  83351       vioapic_update_tmrs:return              1820
 11  83351       vioapic_update_tmrs:return              1763
 11  83842            vlapic_calcdest:entry
 11  84220             vlapic_tmr_set:entry
 11  83351       vioapic_update_tmrs:return              8966
 17  83842            vlapic_calcdest:entry
 17  84220             vlapic_tmr_set:entry
 17  83351       vioapic_update_tmrs:return              8892
 17  83842            vlapic_calcdest:entry
 17  84220             vlapic_tmr_set:entry
 17  83351       vioapic_update_tmrs:return              6338
 17  83842            vlapic_calcdest:entry
 17  84220             vlapic_tmr_set:entry
 17  83351       vioapic_update_tmrs:return              6286
^C


           value  ------------- Distribution ------------- count
             512 |                                         0
            1024 |@                                        22
            2048 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@        998
            4096 |@@@@@                                    144
            8192 |@                                        39
           16384 |                                         2
           32768 |                                         0
           65536 |                                         1
          131072 |                                         0

Windows (patched):

CPU     ID                    FUNCTION:NAME
 23  83351       vioapic_update_tmrs:return              4086
 23  83351       vioapic_update_tmrs:return              1603
 23  83351       vioapic_update_tmrs:return              1681
 23  83351       vioapic_update_tmrs:return              1673
 23  83351       vioapic_update_tmrs:return              1648
 23  83351       vioapic_update_tmrs:return              1695
 23  83351       vioapic_update_tmrs:return              1715
 23  83351       vioapic_update_tmrs:return              1583
 23  83351       vioapic_update_tmrs:return              1622
 23  83351       vioapic_update_tmrs:return              1684
 23  83351       vioapic_update_tmrs:return              1586
 23  83351       vioapic_update_tmrs:return              1580
 23  83351       vioapic_update_tmrs:return              1664
 23  83351       vioapic_update_tmrs:return              1585
 23  83351       vioapic_update_tmrs:return              1585
 23  83351       vioapic_update_tmrs:return              1575
 23  83351       vioapic_update_tmrs:return              1594
 23  83351       vioapic_update_tmrs:return              1702
 23  83351       vioapic_update_tmrs:return              1757
 23  83351       vioapic_update_tmrs:return              3747
 23  83351       vioapic_update_tmrs:return              1674
 23  83351       vioapic_update_tmrs:return              1770
 23  83351       vioapic_update_tmrs:return              1669
 23  83351       vioapic_update_tmrs:return              1718
 23  83351       vioapic_update_tmrs:return              1669
 23  83351       vioapic_update_tmrs:return              1675
 23  83351       vioapic_update_tmrs:return              1759
 23  83351       vioapic_update_tmrs:return              1723
 23  83351       vioapic_update_tmrs:return              1668
 23  83351       vioapic_update_tmrs:return              1709
 23  83351       vioapic_update_tmrs:return              1679
 23  83351       vioapic_update_tmrs:return              1753
 23  83351       vioapic_update_tmrs:return              1678
 23  83351       vioapic_update_tmrs:return              1669
 23  83351       vioapic_update_tmrs:return              1723
 23  83351       vioapic_update_tmrs:return              1760
 23  83351       vioapic_update_tmrs:return              1676
 23  83351       vioapic_update_tmrs:return              1677
 23  83351       vioapic_update_tmrs:return              1694
 23  83351       vioapic_update_tmrs:return              1758
 23  83351       vioapic_update_tmrs:return              1717
 23  83351       vioapic_update_tmrs:return              1756
 23  83351       vioapic_update_tmrs:return              1670
 23  83351       vioapic_update_tmrs:return              1667
 23  83351       vioapic_update_tmrs:return              1688
 23  83351       vioapic_update_tmrs:return              1579
 23  83351       vioapic_update_tmrs:return              1721
 23  83351       vioapic_update_tmrs:return              1780
 23  83351       vioapic_update_tmrs:return              1674
 23  83351       vioapic_update_tmrs:return              1723
 23  83351       vioapic_update_tmrs:return              1689
 23  83351       vioapic_update_tmrs:return              1684
 23  83351       vioapic_update_tmrs:return              1694
 23  83351       vioapic_update_tmrs:return              1724
 23  83351       vioapic_update_tmrs:return              1663
 23  83351       vioapic_update_tmrs:return              1682
 23  83351       vioapic_update_tmrs:return              1752
 23  83351       vioapic_update_tmrs:return              1685
 23  83351       vioapic_update_tmrs:return              3687
 23  83351       vioapic_update_tmrs:return              1677
 23  83351       vioapic_update_tmrs:return              1696
 23  83351       vioapic_update_tmrs:return              1678
 23  83351       vioapic_update_tmrs:return              1670
 23  83351       vioapic_update_tmrs:return              1682
 23  83842            vlapic_calcdest:entry
 23  84220             vlapic_tmr_set:entry
 23  83351       vioapic_update_tmrs:return              8022
 23  83351       vioapic_update_tmrs:return              5507
 23  83351       vioapic_update_tmrs:return              2516
 13  83351       vioapic_update_tmrs:return              2515
 13  83351       vioapic_update_tmrs:return              1600
^C


           value  ------------- Distribution ------------- count
            1024 |                                         0
            2048 |@                                        2
            4096 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@        62
            8192 |@@@@@                                    9
           16384 |@                                        2
           32768 |                                         0

Comment by Patrick Mooney [X]
Created at 2019-05-21T19:32:57.768Z

Repeated reboots of the Linux guest on a patched host show the behavior when TMR updates in the vIOAPIC require vCPUs to be kicked out of VMX context and temporarily blocked from running so the operation can proceed:

 19  83351       vioapic_update_tmrs:return              1818
 19  83351       vioapic_update_tmrs:return              1545
 19  83842            vlapic_calcdest:entry
 19  84220             vlapic_tmr_set:entry
 19  83351       vioapic_update_tmrs:return              9359
 19  83351       vioapic_update_tmrs:return              1634
 19  83351       vioapic_update_tmrs:return              1533
 19  83351       vioapic_update_tmrs:return              1547
 19  83351       vioapic_update_tmrs:return              1735
 19  83351       vioapic_update_tmrs:return              1641
 19  83351       vioapic_update_tmrs:return              1742
 13  83842            vlapic_calcdest:entry
 13  84220             vlapic_tmr_set:entry
 13  83351       vioapic_update_tmrs:return              8258
 27  83842            vlapic_calcdest:entry
 27  84515            vcpu_block_run:return             22115
 27  84220             vlapic_tmr_set:entry
 27  83351       vioapic_update_tmrs:return             39114
 27  83842            vlapic_calcdest:entry
 27  84515            vcpu_block_run:return             18539
 27  84220             vlapic_tmr_set:entry
 27  83351       vioapic_update_tmrs:return             31708
 27  83842            vlapic_calcdest:entry
 27  84515            vcpu_block_run:return             18126
 27  84220             vlapic_tmr_set:entry
 27  83351       vioapic_update_tmrs:return             31326
^C


           value  ------------- Distribution ------------- count
             512 |                                         0
            1024 |@@                                       73
            2048 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@            1224
            4096 |@@@@@@@                                  293
            8192 |@@                                       71
           16384 |                                         8
           32768 |                                         4
           65536 |                                         0

The performance is still on-par with or better than the old logic.


Comment by Patrick Mooney [X]
Created at 2019-05-22T18:23:20.445Z

Absent a bhyve/KVM test suite, I've not been able to find a device setup to exercise level-triggered interrupts and the TMRs themselves.


Comment by Patrick Mooney [X]
Created at 2019-05-23T21:03:37.120Z

Posted for upstream CR: https://reviews.freebsd.org/D20389


Comment by Patrick Mooney [X]
Created at 2019-06-05T20:26:21.915Z

Using this wad combined with that in OS-7447, I repeated the wrlock-cycle test. A VM was subjected to repeated wrlock-cycle actions while undergoing multiple reboots in an attempt to reproduce the former race. It completed this without issue.


Comment by Patrick Mooney [X]
Created at 2019-06-11T21:04:06.307Z

Using a Linux guest, finding a consumer of the TMRs is hard to come by. One such user is acpi:

  9:          0          0          0          0   IO-APIC-fasteoi   acpi

We can see it being programmed during machine start-up:

 10  83600        vioapic_update_tmrs:entry 00010000 0001a039

 10  84092            vlapic_calcdest:entry
 10  84470             vlapic_tmr_set:entry
 10  83601       vioapic_update_tmrs:return              7000
 10  83805                vmx_set_tmr:entry 0000000000000000000000000000000000000000000000000200000000000000

When initiating a power button press, the EOI processing on the vIOAPIC is triggered:

 33  84514        vioapic_process_eoi:entry vcpu:0 vec:57

This mirrors the same EOI activity we see in a pre-patch system.


Comment by Jira Bot
Created at 2019-06-20T18:26:16.041Z

illumos-joyent commit ec6f18e946abf1cebedfd0d7ca47dd5a5d183ff8 (branch master, by Patrick Mooney)

OS-7622 bhyve vioapic writes can deadlock instance
Reviewed by: Hans Rosenfeld <hans.rosenfeld@joyent.com>
Reviewed by: John Baldwin <jhb@FreeBSD.org>
Reviewed by: Jerry Jelinek <jerry.jelinek@joyent.com>
Approved by: Robert Mustacchi <rm@joyent.com>