OS-8021: bhyve uefi omnios guests fail to boot on SmartOS

Details

Issue Type:Bug
Priority:4 - Normal
Status:Open
Created at:2019-10-09T15:59:26.045Z
Updated at:2020-08-19T08:24:48.515Z

People

Created by:Former user
Reported by:Former user

Labels

bhyve

Description

Trying to boot the current stable omniosce-r151030.iso image on current SmartOS bits (~= joyent_20191009T102556Z) fails, and we are dumped to the UEFI shell:

[NOTICE: Zone booting up]
Boot Failed. EFI DVD/CDROM
Boot Failed. EFI Misc Device
..Boot Failed. EFI Network
UEFI Interactive Shell v2.1
EDK II
UEFI v2.40 (BHYVE, 0x00010000)
Mapping table
     BLK0: Alias(s):BLK1:;BLK2:
          PciRoot(0x0)/Pci(0x4,0x0)
Press ESC in 4 seconds to skip startup.nsh or any other key to continue.
Shell>

The vm json.conf we were using was:

{
  "autoboot": false,
  "bootrom": "uefi",
  "alias": "deepnest",
  "hostname": "deepnest",
  "brand": "bhyve",
  "bhyve_extra_opts": "-s 3,ahci-cd,/omniosce-r151030.iso",
  "resolvers": [
    "10.0.1.2"
  ],
  "ram": 1024,
  "cpu_cap": 200,
  "vcpus": 2,
  "nics": [
    {
      "nic_tag": "admin",
      "ip": "dhcp",
      "model": "virtio",
      "primary": true
    }
  ],
  "disks": [
    {
      "boot": true,
      "model": "virtio",
      "size": 20480
    }
  ]
}

We found that when dropping in the UEFI firmware from the current FreeNAS stable bits,

timf@linn.local uname -a
FreeBSD linn.local 11.2-STABLE FreeBSD 11.2-STABLE #0 r325575+5920981193f(HEAD): Mon Sep 16 23:00:13 UTC 2019     root@nemesis:/freenas-releng/freenas/_BE/objs/freenas-releng/freenas/_BE/os/sys/FreeNAS.amd64  amd64

/usr/local/share/uefi-firmware/BHYVE_UEFI.fd

the system was able to boot just fine. That is, by doing:

`vmadm update <uuid> bootrom=/BHYVE_UEFI.fd`

and dropping a copy of the firmware into /zones/<uuid>/root, booting proceeded as normal and we were able to install the guest.

A copy of that firmware is at

https://us-east.manta.joyent.com/timf/public/BHYVE_UEFI.fd
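
The workaround described above can be sketched as a small shell function. This is only an illustration of the two steps mentioned (copy the firmware into the zone root, then point bootrom at it). The function name, the DRY_RUN switch and the argument order are made up for this sketch; the vmadm invocation and the /zones/<uuid>/root location are the ones quoted in this ticket.

```shell
#!/bin/sh
# Sketch of the firmware-swap workaround from this ticket.
# Hypothetical helper names; DRY_RUN defaults to 1 so nothing is changed
# unless the caller explicitly sets DRY_RUN=0.

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# $1 = VM uuid, $2 = path to the replacement firmware on the host
apply_firmware_workaround() {
    uuid="$1"
    firmware="$2"
    zoneroot="/zones/${uuid}/root"

    # Drop a copy of the firmware into the zone root, so that the guest's
    # bhyve process can see it as /BHYVE_UEFI.fd.
    run cp "$firmware" "${zoneroot}/BHYVE_UEFI.fd" || return 1

    # Point the VM at the replacement firmware.
    run vmadm update "$uuid" bootrom=/BHYVE_UEFI.fd
}
```

With the default dry-run setting, calling `apply_firmware_workaround <uuid> ./BHYVE_UEFI.fd` only prints the two commands; setting DRY_RUN=0 runs them for real.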

Comments

Comment by Former user
Created at 2019-10-09T16:00:38.277Z

When we get dumped to the UEFI shell, we can see a CD device as expected:

Shell> devices
     T   D
     Y C I
     P F A
CTRL E G G #P #D #C  Device Name
==== = = = == == === =========================================================
  2F R - -  0  1   7 PciRoot(0x0)
  4D D - -  3  0   0 Primary Console Input Device
  4E D - -  2  0   0 Primary Console Output Device
  4F D - -  1  0   0 Primary Standard Error Device
  79 D - -  1  0   0 PciRoot(0x0)/Pci(0x0,0x0)
  7A B - -  1  4   1 Sata Controller
  7B D - -  1  3   0 PciRoot(0x0)/Pci(0x4,0x0)
  7C B - -  1  2   1 Virtio Network Device
  7D B - -  1  4   1 PciRoot(0x0)/Pci(0x1E,0x0)
  7E D - -  1  0   0 PciRoot(0x0)/Pci(0x1E,0x1)
  7F B - -  1  2   4 PciRoot(0x0)/Pci(0x1F,0x0)
  80 B - -  1  1   1 PciRoot(0x0)/Pci(0x1F,0x0)/Serial(0x0)
  81 B - -  1  1   1 PciRoot(0x0)/Pci(0x1F,0x0)/Serial(0x1)
  82 B - -  1  3   1 PS/2 Keyboard Device
  83 B - -  1  2   1 PS/2 Mouse Device
  84 B - -  1  1   1 PciRoot(0x0)/Pci(0x1F,0x0)/Serial(0x0)/Uart(115200,8,N,1)
  85 B - -  1  5   3 VT-100+ Serial Console
  86 B - -  1  3   1 SCSI Disk Device
  87 D - -  1  1   0 PciRoot(0x0)/Pci(0x3,0x0)/Sata(0x0,0x0,0x0)/CDROM(0x0)
  88 B - -  1  3   5 Virtio Network Device
  89 D - -  1  0   0 PciRoot(0x0)/Pci(0x6,0x0)/MAC(227DEE44AE31,0x1)/VenHw(D79DF6B0-EF44-43BD-9797-43E93BCF5FA8)
  8A B - -  1  1   2 MNP (MAC=22-7D-EE-44-AE-31, ProtocolType=0x806, VlanId=0)
  8B D - -  1  1   0 MNP (Not started)
  8C D - -  1  0   0 PciRoot(0x0)/Pci(0x6,0x0)/MAC(227DEE44AE31,0x1)/VenHw(D8944553-C4DD-41F4-9B30-E1397CFB267B)
  8D B - -  1  1  10 MNP (MAC=22-7D-EE-44-AE-31, ProtocolType=0x800, VlanId=0)
  8E B - -  1  1   1 IPv4 (SrcIP=0.0.0.0)
  8F B - -  1  1   6 IPv4 (SrcIP=0.0.0.0)
  90 B - -  1  1   1 IPv4 (SrcIP=0.0.0.0)
  91 B - -  2  1   1 UDPv4 (SrcPort=68, DestPort=67)
  92 B - -  1  1   1 IPv4 (Not started)
  93 B - -  2  1   1 UDPv4 (Not started)
  94 D - -  1  1   0 PXE Controller
  95 D - -  1  1   0 PXE Controller
  96 D - -  2  1   0 PXE Controller
  97 B - -  1  1   1 IPv4 (Not started)
  98 B - -  2  1   1 UDPv4 (Not started)
  99 D - -  2  1   0 PXE Controller
  9A B - -  2  1   1 IPv4 (SrcIP=10.0.0.83)
  9B D - -  2  1   0 PXE Controller
  9C B - -  1  1   1 IPv4 (Not started)
  9D D - -  2  1   0 PXE Controller
  9F D - -  2  1   0 TCPv4 (Not started)
  A0 B - -  1  1   1 IPv4 (Not started)
  A1 B - -  1  1   1 PciRoot(0x0)/Pci(0x1F,0x0)/Serial(0x1)/Uart(115200,8,N,1)
  A2 D - -  1  0   0 PC-ANSI Serial Console
  A3 B - -  2  1   1 IPv4 (SrcIP=10.0.0.83)
  A4 D - -  2  1   0 UDPv4 (SrcPort=68, DestPort=67)
  A5 B - -  1  1   3 ARP Controller

Comment by Former user
Created at 2019-10-09T16:04:05.248Z

We're able to boot the same omnios iso using KVM just fine, and are also able to boot current SmartOS iso images without problems.


Comment by Former user
Created at 2019-10-10T14:34:28.103Z

Does the omniOS image boot with the CSM bootrom we ship? (Just for a datapoint)

Maybe it's time to look at properly building and updating uefi-edk2 from illumos-extra again? Perhaps Mike or Hans can remember the history there.


Comment by Former user
Created at 2019-10-10T14:42:30.353Z

[excerpted from a mail I sent Mike yesterday evening]

I found some weird stuff happened after I made it through the install with that new firmware.

Having removed bhyve_extra_opts, I halted the machine and rebooted, only to find that we again dropped to the UEFI shell, this time with it complaining:

Boot Failed. EFI Misc Device
Boot Failed. EFI Misc Device 1

My reading of that is it didn't seem to recognise that my two disks were actually disks.

I tried switching back to the stock joyent bootrom=uefi, thinking perhaps we might have had hacks in there to make that work, but that didn't help. I then tried switching the disks to 'ahci' rather than 'virtio', wondering if there was some weird device detection code that wasn't working, and walked through the installer again to install fresh bits. That didn't make a difference either, unfortunately.

However, I did find this time that, after tweaking the post-install settings to set up a serial console, switching to bootrom=bios (which is effectively UEFI-CSM, I think) and switching the disks back to 'virtio' let me boot just fine, with a usable 'vmadm console'.

So I think there are a few things going on:

1. our UEFI firmware isn't recognising devices properly
2. after we install and get bootable bits on disk, we need to switch to the uefi-csm rom
3. there's something weird with the bhyve vnc server, making the above hard to debug. I eventually found a 3rd party mac vncviewer which, after a few experimental settings, I managed to get to not corrupt the console session.


Comment by Former user
Created at 2020-08-19T08:24:48.515Z

I just tried omniosce-r151034l.iso with bootrom=uefi and it boots.

Without uefi I can't interact properly with the loader (keys stop working after one press)