The Home Lab (Part 2)

In Part 1, I talked about how I built up my home lab over the years. I won't repeat that story here, but here's a quick recap of the gear:

  • Networking
    • Ubiquiti UniFi Dream Machine Pro (Routing, Firewalling)
    • Ubiquiti US-16-XG for switch aggregation
    • Ubiquiti US-48-500W for PoE and wall-jacks around the house
    • Ubiquiti USW-Pro-Aggregation for 10G server connections
    • Ubiquiti USW-Pro-24 for IPMI links (mounted backwards in the rack for shorter cable runs)
    • Ubiquiti UAP-IW-HDs for WiFi around the house
    • (There's some other Ubiquiti networking gear around, but it's not super relevant to this discussion)
  • Servers
    • 3 x HPE DL360 G9
    • 3 x ASRock Rack 1U4LW-X470
    • 1 x ASRock Rack 1U2LW-X470
  • Storage
    • 1 x QNAP TS-1232PXU-RP-4G

So far, I'm pretty happy with all of this gear. It's all rack-mounted in my basement and, after replacing the fans in the ASRocks with Noctuas, it doesn't make much noise unless I'm rebooting something.

My first attempt at KubeVirt was Harvester. However, I found it too opinionated about storage and networking, and it was too difficult to get it to use my NAS for storage while I brought the cluster up and then provisioned a Ceph cluster.

Thus, after a bunch of reading and digging, I ended up settling on OKD with KubeVirt.

This brings with it a bunch of challenges, given that I'm looking to do this in a bare metal environment:

  • Operating Systems
    • OKD requires Fedora CoreOS (FCOS). FCOS isn't a traditional Linux distro that you install and can then hand-hack or automate with Chef/Puppet/Ansible/Salt
    • On bare metal, FCOS is really designed to be installed via PXE
    • Configuring FCOS requires building Ignition Configs
  • Networking
    • I can't PXE with 802.3ad LACP bonding configured on the USW-Pro-Aggregation. Additionally, the time needed for the UniFi controller to reconfigure the ports can be longer than the timeout of the network tests done by the OS
  • Storage
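
To give a feel for what "building Ignition Configs" means, here is a minimal sketch in Python that emits one as JSON. It is not the config OKD itself generates. It only assumes the public Ignition spec (version 3.3.0 here), the default FCOS "core" user, and a hypothetical hostname and placeholder SSH key that I made up for illustration.

```python
import json
from urllib.parse import quote


def build_ignition(ssh_key: str, hostname: str) -> dict:
    """Return a minimal Ignition config (spec 3.3.0) as a plain dict."""
    return {
        "ignition": {"version": "3.3.0"},
        # Authorize an SSH key for the default FCOS user.
        "passwd": {
            "users": [{"name": "core", "sshAuthorizedKeys": [ssh_key]}]
        },
        # Write /etc/hostname; file contents are passed as a data URL.
        "storage": {
            "files": [
                {
                    "path": "/etc/hostname",
                    "mode": 420,  # decimal form of octal 0644
                    "overwrite": True,
                    "contents": {"source": "data:," + quote(hostname + "\n")},
                }
            ]
        },
    }


# Hypothetical values for illustration only.
config = build_ignition("ssh-ed25519 PLACEHOLDER-KEY me@example", "node1")
print(json.dumps(config, indent=2))
```

The resulting JSON is what gets served to the machine at first boot. In practice many people write Butane YAML and transpile it instead of hand-building the JSON, but the structure is the same.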

Alright, this is a lot to take on, but I'm not scared by some coding and configuring! So, let's start by taking my 1U2LW-X470 and getting FCOS on it. It only has one NVMe SSD, so it's quick and easy to rebuild without damaging the already running oVirt environment.

The first challenge I encountered was that PXE with the Intel X710 dual-port 10GbE NIC and the EFI of the 1U2LW-X470 weren't playing nicely together. I updated the firmware of the card and it actually got worse: it managed to lock up the Ubiquiti switch once, and I had to disconnect power to recover it. I replaced the Intel X710 with a Solarflare SF329-9021 (it's old, I know, but it's cheap and reliable) and PXE booting finally showed up as an option in the UEFI boot menu.

Next step: I had never set up a PXE environment before, so I'll document that in Part 3.