Intro
This is an ambitious project that aims to run three operating systems with GPUs simultaneously on a small-form-factor ITX machine.
Each running OS will have its own GPU that it completely owns.
The host OS will run on a Ryzen APU using its iGPU,
while the two guest OSes will each get their own Quadro GPU.
This is difficult as f*ck for a number of reasons:
- An ITX board only has a single PCIe slot, so to run three GPUs you need a motherboard that supports x8/x8 PCIe bifurcation for an iGPU + 2 dGPU setup. There are technologies like SR-IOV that let one GPU serve multiple VMs, but look at the price and give up. Intel's GVT-g might seem perfect, except you only get a weak iGPU to split.
- An ITX case is too small for most modern GPUs. For an SFF PC, you are likely limited to two single-slot GPUs, and there aren't many choices: either water-cooled GPUs or thin Quadro cards. Either will fit the case without blocking the other GPU's cooling.
WHY TORTURE YOURSELF?
I want a small, portable machine that can run multiple agents.
However, many applications do not support multiple running instances in the same OS.
To add to the pain, a lot of them need a full GPU to run smoothly, or even to start at all.
Not to mention that some devs decide to support only NVIDIA GPUs,
so getting multiple mini PC builds (which would be smaller overall, even stacked) will not work…
Update:
There is something called vGPU that removes the need to house multiple GPUs in the machine.
It does not require bifurcation or isolated IOMMU group support on the motherboard,
which are the major reasons this endeavour becomes a torture.
It also works easily with an ITX motherboard, as most Tesla/Quadro cards are smaller.
Of course, it also comes at great cost to acquire those cards.
Preparation
Hardware
This is my setup, but you can choose something similar as long as it meets the noted requirements.
- Louqe Ghost S1
- ASRock B550M-ITX/ac, supports x8/x8 bifurcation
- Ryzen 5700G, which is an APU
- 64GB RAM, more RAM is always good for VMs
- x8/x8 bifurcation adapter
- 2 × Quadro K2000, single slot, with a fan designed for stacked GPUs
- HDMI-to-USB capture card
The HDMI capture card is very useful:
- You don't need extra monitors to debug issues.
- The guest will think it has a real screen and enable display output, just like an HDMI dummy plug.
- The host can see what the guest sees, which a dummy plug cannot offer and VNC does not always provide.
Pitfalls
Here are a few key pitfalls to keep in mind.
- Hopefully you have not yet plugged everything in before you start the next section.
- The motherboard will take some time to boot after a hardware change. Be patient, and be prepared to force reset if it exceeds 3 minutes without any sign of booting.
- Use the HDMI port on your motherboard if you are using the APU's display output.
- Install Ubuntu >= 22.04 for the best KVM support. You will hit weird incompatibility issues on older versions.
- Choose to boot from UEFI when booting the installation medium, and make sure you install in UEFI mode. I recommend Startup Disk Creator to create the USB on Ubuntu; it is so easy that you simply should not waste time on the specific config issues of other methods. UEFI is highly recommended here because the dual-Quadro setup only works with UEFI. It is weird, because Quadro+GeForce or dual GeForce boots fine without UEFI.
- XOrg is the usual recommendation if you are going NVIDIA. Note that I am using Wayland on the host without seeing issues. In case you want to switch, you can choose between XOrg and Wayland at your login screen.
- When using OBS to read the guest screen from the HDMI capture card, choose Video Capture Device (V4L2) and set the format to BGR3/YU12/YV12. YUYV is laggy and Motion-JPEG does not update the screen. No, Cheese does not work.
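Before fighting with OBS settings, it can help to see which pixel formats the capture card actually advertises. A minimal sketch, assuming `v4l-utils` is installed and the card enumerates as `/dev/video0` (yours may differ):

```shell
#!/bin/sh
# Pull the FourCC codes (e.g. YUYV, MJPG) out of `v4l2-ctl --list-formats`
# output, so you know which formats OBS can request from the card.
probe_formats() {
  grep -Eo "'[A-Z0-9]{4}'" | tr -d "'" | sort -u
}

# /dev/video0 is an assumption; check `v4l2-ctl --list-devices` first.
if command -v v4l2-ctl >/dev/null 2>&1; then
  v4l2-ctl -d /dev/video0 --list-formats | probe_formats
fi
```

If BGR3/YU12/YV12 do not appear in the list, the card simply cannot offer them and you will be stuck with whatever it does expose.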
Configuration
Note that I do not remember which display port to plug in for each step, so if something does not work, just reboot and try a different port until you find the right one. Note that a display port is only activated if
- a working monitor is plugged in when the computer boots, and
- the monitor has selected that connection as its signal source when the computer boots.
Plugging in an HDMI dummy at boot and then switching to a monitor afterwards will not always work, because a different display mode is activated for the plug, so you will not see anything meaningful when you switch to the monitor.
Now:
1. Keep one and only one dGPU installed; do not have the second dGPU plugged in. Connect the display to the dGPU's port.
2. Install the OS with just the APU and one dGPU.
3. Reboot and verify the installation. You should see llvmpipe in Settings => About => Graphics.
4. Go to the BIOS and enable multi-GPU with the iGPU so that both iGPU and dGPU are active. Remember to allocate enough VRAM to the iGPU.
5. Connect the display to the motherboard's port.
6. Reboot and verify the installation. You should see AMD XXX / llvmpipe in Settings => About => Graphics. Now you are simultaneously using the iGPU and dGPU, sadly only for the host OS.
7. Go to the BIOS and enable x8/x8 bifurcation.
8. Power off completely, then plug in the second dGPU.
9. Reboot and verify the installation. You should see AMD XXX / llvmpipe / llvmpipe in Settings => About => Graphics. Now you are simultaneously using the iGPU and two dGPUs, sadly still only for the host OS.
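Besides the Settings panel, you can also verify from a terminal that all the GPUs enumerate on the PCI bus after each reboot. A small sketch (the grep pattern is an assumption about how lspci classifies your particular cards):

```shell
#!/bin/sh
# Count the display-class devices lspci reports; after enabling
# bifurcation and installing the second card you should see 3
# (iGPU + two dGPUs).
count_gpus() {
  grep -Eic 'vga compatible|3d controller|display controller'
}

if command -v lspci >/dev/null 2>&1; then
  lspci -nn | count_gpus
fi
```

If the count drops after a BIOS change, the missing card most likely did not enumerate, which matches the slow-boot behaviour noted in the pitfalls.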
Now just follow the guide at https://mathiashueber.com/passthrough-windows-11-vm-ubuntu-22-04/
You will end up with a VM that has these:
Overview
OS information
Performance
CPUs
Memory
Boot Options
VirtIO Disk 1
SATA CDROM 1
NIC :xx:xx:xx
Mouse
Keyboard
Sound ich6
Serial 1
PCI xxxx:xx:xx.x
PCI xxxx:xx:xx.x
Controller USB 0
Controller SATA 0
Controller PCIe 0
Controller PCI 3
Controller VirtIO Serial 0
USB Redirector 1
USB Redirector 2
Consider this a checklist for what it should look like; YMMV.
ACS Override Patch (AOP)
My motherboard does not separate devices into their own IOMMU groups, which prevents the second GPU from being passed to a VM. AOP is a kernel patch that breaks devices out into separate IOMMU groups. It is not enabled or built in for most common kernels due to security concerns: it forces devices that share the same hardware controller to appear in different groups, so one group can be passed to a VM while the others stay on the host or another VM. But then the VM can actually write into other devices on the same hardware controller. This is a kind of jailbreak where the VM becomes able to write into arbitrary memory of another system.

The potential risk is serious if the VM is compromised, but of course so few people use AOP that malicious parties probably have no interest in targeting it. Some motherboards come with better IOMMU grouping while still supporting bifurcation, but there is little information about which ones, so it is a gamble on a limited budget. The brute-force way is just to use AOP.
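To see how your board actually groups devices (and whether you need AOP at all), you can walk sysfs directly. This is a sketch of the same idea as the IOMMU group bash script in the mathiashueber guide, wrapped in a function so the base path can be overridden:

```shell
#!/bin/sh
# Print every device under /sys/kernel/iommu_groups together with its
# group number. With AOP active and the boot option set, the two GPUs
# should land in different groups.
list_iommu_groups() {
  base="${1:-/sys/kernel/iommu_groups}"
  for dev in "$base"/*/devices/*; do
    [ -e "$dev" ] || continue
    group="${dev#"$base"/}"
    group="${group%%/*}"
    printf 'IOMMU Group %s: %s\n' "$group" "${dev##*/}"
  done
}

list_iommu_groups
```

If the directory is empty, IOMMU is not enabled at all; check the BIOS setting and the `amd_iommu=on` boot option first.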
For Arch, the zen kernel already ships AOP, waiting only for a boot option to enable it. For Ubuntu, get the almost-equivalent Liquorix kernel.
# For older version of ubuntu, you can consider queuecumber's kernel
# at https://queuecumber.gitlab.io/linux-acs-override/
# It is now discontinued in favor of zen/liquorix so it is not available for newer kernels
# Note: xanmod might work but it seems to break with nvidia driver.
sudo add-apt-repository ppa:damentz/liquorix
# Ubuntu 22.04 jammy is supported
# Ignore the warning coming from debian repo,
# or you may remove that repo link to get rid of the warning
sudo apt-get install -y linux-image-liquorix-amd64 linux-headers-liquorix-amd64
# To make it easier to check the booted kernel and set the desired default kernel,
# get grub-customizer and move the kernel labeled liquorix to the first option
# For Ubuntu 20.04, no need to add the repo; for Ubuntu 22.04, it is needed
sudo add-apt-repository ppa:danielrichter2007/grub-customizer
sudo apt install grub-customizer
sudo grub-customizer
# For any kernel, add the boot option pcie_acs_override=downstream,multifunction
# Your boot option should look like: (note that other options are optional)
# amd_iommu=on iommu=pt pcie_acs_override=downstream,multifunction quiet splash
# You may use grub-customizer General settings tab to change this, then **save** it
sudo reboot
# You should now see a lot of groups with the IOMMU Group bash script in mathiashueber guide
# Now just isolate the target GPU group and add it to VM
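Isolating the GPU group typically means binding those devices to vfio-pci at boot so the host driver never claims them. A sketch of the modprobe configuration, where the `10de:xxxx` IDs are placeholders you must replace with the vendor:device pairs `lspci -nn` prints for your passthrough GPU (and its HDMI audio function, which sits in the same group):

```shell
# /etc/modprobe.d/vfio.conf -- example only, IDs are placeholders
options vfio-pci ids=10de:xxxx,10de:xxxx
# Make sure vfio-pci claims the card before the nvidia driver can
softdep nvidia pre: vfio-pci

# Then rebuild the initramfs and reboot:
#   sudo update-initramfs -u && sudo reboot
```

After rebooting, `lspci -nnk` should show `vfio-pci` as the kernel driver in use for the isolated GPU, and virt-manager can attach it as the PCI host devices shown in the checklist above.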