My NixOS Tower

Window Smoke is a run-of-the-mill Endpoint managed within Arroyo Systems Management, with a small wrinkle: it has an Nvidia GTX 1060 graphics card attached to it, and a 2 TB SATA SSD with Windows 10. For the longest time, this machine has primarily booted to Windows and served as a gaming machine.

Well, I want to put the CPU cycles to use in more ways, especially now that I am building the Git edge-version of Emacs within Arroyo Emacs for both amd64 and aarch64 every time I bump my Nix Version Pins. I have also had some annoying reliability issues cropping up with Virtuous Cassette, so having a backup environment will feel alright.

This page isn't an Arroyo Module per se, but a NixOS morph role which is included into Window Smoke, my desktop tower.

My desktop setup is very much inspired by my good friend kalessin's: a long USB cable, a long display cable, and a long power cord connect a monitor and USB hub, attached to a sit/stand desk on casters, to my desktop tower sitting in the corner of the room. The desk is spacious and easy to move while still having a full-size display and a powerful machine attached to it. This works especially well because my monitor, a Dell U3818DW wide-screen display, has a USB-C input with alt-mode DisplayPort support: I can plug Virtuous Cassette into the display and it swaps the display and the attached USB hub over to the laptop. This plug-and-play setup has served me really well for the last few years.

El 🅱lan

This is disabled right now because vfio-pci was not robust enough. See Okay maybe I was wrong about 6.x

How much farther could I take it though?

The last time I visited kalessin he showed me something that I thought was quite remarkable: his desktop would virtualize Windows 10 with his graphics card directly forwarded to the VM, so that his entire Steam library would run without having to consider any Proton shenanigans1. I thought that was pretty impressive, and it got my brain going in this direction. His machine is an AMD Threadripper with an AMD GPU and the host runs Arch, but in theory this should work well enough with Intel, Nvidia, and NixOS, right? And if it is good, maybe I'll finally get around to refreshing my desktop with a bit more RAM and a bit better GPU… as long as I don't have to deal with the Nvidia X11 drivers or Nouveau ever again.

skinparam linetype polyline
package "Tower" {
        [Intel iGPU]
        [GTX 1060]

        cloud "NixOS" {
                [libvirtd windows10 domain] --> "gamer mode"
                [plasma desktop] --> "goblin mode"
        }

        [Motherboard USB Controller] -> [plasma desktop]
        [Intel iGPU] -> [plasma desktop]
        [PCIe USB Controller] -> [libvirtd windows10 domain]
        [GTX 1060] -> [libvirtd windows10 domain]
        [1tb NVMe] -> [plasma desktop]
        [2tb SATA SSD] -> [libvirtd windows10 domain]
}

package "Desk" {
        node "Dell U3818DW" {
                interface HDMI2
                interface "USB in 1"
                interface DP
                interface "USB in 2"
                interface "Type-C DP"

                "USB out" --> "USB in 1" : when HDMI2 active
                "USB out" --> "USB in 2" : when DP active
        }

        "Type-C DP" --> [laptop] : forwards USB out when active

        "USB out" <-- [keyboard, mouse, etc] 
        HDMI2 --> [Intel iGPU]
        DP --> [GTX 1060]
        "USB in 1" --> [Motherboard USB Controller]
        "USB in 2" --> [PCIe USB Controller]

}

' hidden lines for node ordering/positioning
[keyboard, mouse, etc] -[hidden]d- [Motherboard USB Controller]
[keyboard, mouse, etc] -[hidden]d- "USB out"
[Type-C DP] -[hidden]l- [laptop]
[GTX 1060] -[hidden]d- [libvirtd windows10 domain]
[PCIe USB Controller] -[hidden]d- [libvirtd windows10 domain]
[Motherboard USB Controller] -[hidden]d- [plasma desktop]
[Intel iGPU] -[hidden]d- [plasma desktop]
[1tb NVMe] -[hidden]d- [plasma desktop]
[2tb SATA SSD] -[hidden]d- [plasma desktop]
' [HDMI2] -[hidden]l- [USB out]
[Intel iGPU] -[hidden]r- [1tb NVMe]

attachment:desktop-diagram.png

The four cables between the Tower and Desk are cable-tied together with a power cable which drives the monitor, peripherals, etc. With proper nudging, I can move my Windows 10 installation into QEMU-KVM and have access to gamer mode and goblin mode side by side, rather than dual-booting and dealing with Nvidia X11.

Setting up VFIO PCI Passthrough for virtualizing GPU-accelerated Windows

This is disabled right now because vfio-pci was not robust enough. See Okay maybe I was wrong about 6.x

I started by reading the Arch Linux docs as well as Gentoo's on setting this up.

Most of the work here came in enabling VT-d and VT-x in the BIOS, and collating the devices properly. VT-x is the general "Intel hardware virtualization" support which lets qemu-kvm work at all: a bunch of CPU instructions and support for entering code that thinks it is running in ring 0 when it is not. It's really clear whether this works or not, and whether it's enabled or not. VT-d is the Intel-specific name for a virtualization-aware "IOMMU", which lets the OS re-map device memory into main-memory addresses and into the VMs' address space. Your motherboard has to support it too, but if it does, your VM can have PCI devices attached to it. My Intel i7-7700k and Asus 270AR motherboard both claimed to support VT-d, but it was a bear to discover that it lives under a different firmware configuration menu than the VT-x feature does.2 I spent quite some time trying to debug this after mistaking the generic "Intel virtualization" option to mean only VT-x: looking in ARK to check whether my motherboard chipset and CPU supported VT-d, spelunking the menus, eating dinner, getting lost in the woods, etc etc.

But with that enabled, it should be possible to see which IOMMU groups your hardware is sorted into, and hopefully the groupings make sense. qemu-kvm can only forward an entire IOMMU group, so you may have to move some cards around in the chassis if they group oddly. Here we see that IOMMU Group 1 has my Nvidia card in it, along with the PCIe controller it's plugged into.

nix-shell -p pciutils
for g in $(find /sys/kernel/iommu_groups/* -maxdepth 0 -type d | sort -V);
do
    echo "IOMMU Group ${g##*/}:";
    for d in $g/devices/*;
    do
        echo -e "\t$(lspci -k -nns ${d##*/})";
    done;
done | grep -A15 "IOMMU Group 1:"

IOMMU Group 1:
	00:01.0 PCI bridge [0604]: Intel Corporation 6th-10th Gen Core Processor PCIe Controller (x16) [8086:1901] (rev 05)
	Subsystem: ASUSTeK Computer Inc. Device [1043:872f]
	Kernel driver in use: pcieport
	01:00.0 VGA compatible controller [0300]: NVIDIA Corporation GP106 [GeForce GTX 1060 6GB] [10de:1c03] (rev a1)
	Subsystem: eVga.com. Corp. Device [3842:6267]
	Kernel driver in use: vfio-pci
	Kernel modules: nvidiafb, nouveau
	01:00.1 Audio device [0403]: NVIDIA Corporation GP106 High Definition Audio Controller [10de:10f1] (rev a1)
	Subsystem: eVga.com. Corp. Device [3842:6267]
	Kernel driver in use: vfio-pci
	Kernel modules: snd_hda_intel
IOMMU Group 2:
	00:02.0 VGA compatible controller [0300]: Intel Corporation HD Graphics 630 [8086:5912] (rev 04)
	DeviceName: Onboard IGD
	Subsystem: ASUSTeK Computer Inc. Device [1043:872f]

You can see that I have already instructed the system to use vfio-pci as the kernel driver for these devices; that configuration is shown below.

Start out by making sure this nvidia thing has no chance of loading:

boot.blacklistedKernelModules = [
  "nouveau"
];

With this terminal output, the following NixOS configuration can be created. It will 1) tell the Linux kernel to refuse to load the nouveau graphics driver for my X11 session, 2) enable the IOMMU in the kernel, 3) configure the "stub" vfio-pci module to attach to the devices I want forwarded to the VM, and 4) modify my initrd to load vfio-pci and attach it to my Nvidia card early in the boot process, before roam:UDEV can auto-configure anything. I only needed to do this last part for the graphics card; the ethernet controller and USB controller do not need this treatment. You'll notice these IDs in the IOMMU Groups script-let above, or in the output of lspci -k -nn.

boot.kernelParams = [
  "intel_iommu=on"
  # nvidia gtx 1060, realtek pcie ethernet controller, USB
  "vfio-pci.ids=10de:1c03,10de:10f1,10ec:8168,1b21:2142"
];
boot.initrd.availableKernelModules = [ "vfio-pci" ];
boot.initrd.preDeviceCommands = ''
  echo "Enabling vfio-pci"
  # USB controller, GPU, GPU audio, Eth
  DEVS="0000:82:00.0 0000:83:00.0 0000:83:00.1 0000:02:00.0"
  for DEV in $DEVS; do
    echo "vfio-pci" > /sys/bus/pci/devices/$DEV/driver_override
  done
  modprobe -i vfio-pci
'';
boot.kernelPackages = config.boot.zfs.package.latestCompatibleLinuxPackages;

But make sure to see the heading below where I have to fix this. It's enabled by default on Window Smoke; if I want to disable all the VFIO stuff, I just have to set config.boot.enableVFIO = false in the host configuration.
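
For the record, that toggle is a one-liner in the host expression. A sketch of what it looks like, not a verbatim copy of the Window Smoke configuration:

# Window Smoke's host configuration; set to false to drop all of the
# passthrough machinery from this machine.
boot.enableVFIO = true;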

I make the VFIO stuff optional, and then configure libvirtd in a pretty bog-standard way: instruct NixOS to start the service, and install virt-manager to set up the VM by hand… I would like to make this declarative some day.

{ config, lib, pkgs, ... }:

let
  dump_iommu = pkgs.writeScriptBin "dump_iommu" ''
    for g in $(find /sys/kernel/iommu_groups/* -maxdepth 0 -type d | sort -V);
    do
      echo "IOMMU Group ''${g##*/}:";
      for d in $g/devices/*;
      do
        echo -e "\t$(${pkgs.pciutils}/bin/lspci -k -nns ''${d##*/})";
      done;
    done
  '';
in
{
  options = {
    boot.enableVFIO = lib.mkEnableOption "vfio-pci passthrough";
  };
  config = lib.mkIf config.boot.enableVFIO 
    {
      <<nouveau>>
      virtualisation.libvirtd = {
        enable = true;
        onBoot = "start";
        onShutdown = "shutdown";
      };
      environment.systemPackages = with pkgs; [
        virt-manager
        dump_iommu
      ];
      <<vfio>>
    };
}

This is essentially what is happening in Alexander Bakker's Notes on PCI Passthrough on NixOS using QEMU and VFIO blog post/literate configuration. I'll have to consider setting up Looking Glass and Scream soon, though it's not a big concern once I have the USB routing done.
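
If I do get to Looking Glass and Scream, the host half would probably start as little more than pulling in the clients and pre-creating the shared-memory file Looking Glass reads the guest's framebuffer from. This is a sketch under assumptions: the looking-glass-client and scream attribute names in nixpkgs, and the ownership and size of the shm file for this host, are guesses I haven't tested yet.

environment.systemPackages = with pkgs; [
  looking-glass-client  # views the guest framebuffer shared over IVSHMEM
  scream                # receives the guest's audio on the host
];
# Looking Glass conventionally wants a pre-created shm file; owner/group here
# are guesses for this host (libvirt's qemu user), not tested values.
systemd.tmpfiles.rules = [
  "f /dev/shm/looking-glass 0660 rrix qemu-libvirtd -"
];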

Fixing vfio-pci after 6.0.4 et al broke it

Sooo, the patch which landed in 6.0.4 and broke vfio-pci got backported to 5.15, so I'm going to just revert it and let loose the dogs of war, though I am not looking forward to compiling my desktop's kernel for the rest of existence. At least NixOS makes this easy…

This is included in the optional boot.enableVFIO-gated section of the role configuration, along with the VFIO configuration itself.

boot.kernelPatches = [
#   {
#     name = "revert-vfio-breakage";
#     patch = /home/rrix/org/cce/data/20/230530T001233.513103/v3-3-3-vfio-pci-Remove-console-drivers.patch;
#   }
];

This patch reverts torvalds/linux@d173780620792c725506b0f3c5ec52c7fbac1db0:

diff --git a/drivers/vfio/pci/vfio_pci_core.c b/drivers/vfio/pci/vfio_pci_core.c
index a0d69ddaf90d..756d049bd9cf 100644
--- a/drivers/vfio/pci/vfio_pci_core.c
+++ b/drivers/vfio/pci/vfio_pci_core.c
@@ -10,7 +10,6 @@
 
 #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
 
-#include <linux/aperture.h>
 #include <linux/device.h>
 #include <linux/eventfd.h>
 #include <linux/file.h>
@@ -1793,10 +1794,6 @@ static int vfio_pci_vga_init(struct vfio_pci_core_device *vdev)
 	if (!vfio_pci_is_vga(pdev))
 		return 0;
 
-	ret = aperture_remove_conflicting_pci_devices(pdev, vdev->vdev.ops->name);
-	if (ret)
-		return ret;
-
 	ret = vga_client_register(pdev, vfio_pci_set_decode);
 	if (ret)
 		return ret;

Running Windows 10 in libvirtd

Instruct it to use /dev/sda or something in /dev/disk/by-uuid as the "system image", but if you ever delete the domain, be careful not to let libvirtd try to rm /dev/sda… it offered to; I'm not sure it actually would have, but that is spooky action.

UEFI boot support via OVMF is installed by default by the virtualisation.libvirtd module, but the virt-manager GUI will BIOS-boot by default. Change the boot mode in the "Overview" configuration during the install/setup phase; it's not editable once the VM is created in libvirtd.
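
For completeness, the NixOS side of that default looks roughly like this; the exact option path has moved between releases (it was virtualisation.libvirtd.qemuOvmf on older ones), so treat it as a sketch:

virtualisation.libvirtd.qemu.ovmf.enable = true;  # ship OVMF so domains can UEFI boot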

Then attach PCI devices for the Nvidia card, a USB 3 PCIe card, and an Ethernet PCIe card. The virtualized Windows will have its own USB ports exposed on the back of my case, it won't need to rely on bridged networking since it will be connected directly to my switch, and the desktop will be displayed in front of me through the GPU rather than the SPICE virtual display.

Forwarding USB devices one by one is fine during setup and fiddling around, but the USB controller will allow me to run another USB cable to my display and use the display as a KVM+USB switch between the host OS and the guest OS. I am waiting on the cables and controller to be delivered; it's been snowy this month.

NEXT record video of boot + swapover + art of rally @ 60FPS + swap back

NEXT Investigate declarative libvirtd domains
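
One low-effort shape this could take: dump the hand-built domain with virsh dumpxml windows10 into the repository, and have a oneshot unit (re)define it after libvirtd starts. A sketch only; the XML path and service name here are made up, and this isn't wired into the role yet.

systemd.services.define-windows10-domain = {
  description = "Define the windows10 libvirt domain from checked-in XML";
  wantedBy = [ "multi-user.target" ];
  after = [ "libvirtd.service" ];
  requires = [ "libvirtd.service" ];
  serviceConfig.Type = "oneshot";
  # `virsh define` replaces any existing definition, so re-running is harmless
  script = ''
    ${pkgs.libvirt}/bin/virsh define /etc/nixos/domains/windows10.xml
  '';
};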

NEXT what else should only be installed here?

  • CUDA toolkit
  • kind of want to play with stable diffusion on local GPU

1

That said, since I have been using the [[id:20220810T163745.293071][SteamDeck]] I am much less concerned with Proton compatibility in general… Worth not having to consider it all the time, though, and having a Windows machine at my fingertips.

2

I should write down which menu since it was a PITA to find, but woops, I'm typing on this machine right now!