1. Why PCI Pass-through?
The primary goal is to use the Tesla P4 at 100% capacity within a Virtual Machine for high-speed document parsing/inference without the complexity of vGPU.
- Zero Licensing: PCI Pass-through is native and free, bypassing the need for NVIDIA License System (NLS) servers.
- Lean Host: The Host remains a “carrier,” avoiding proprietary drivers that could break during kernel updates.
2. Host Configuration
We isolate the GPU hardware so that the Guest VM can claim it exclusively.
Identify the Hardware
# Locate the card's address and Vendor ID
lspci -nnk | grep -i nvidia
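For reference, the Tesla P4 shows up roughly like this (illustrative output; the slot address depends on your motherboard):
82:00.0 3D controller [0302]: NVIDIA Corporation GP104GL [Tesla P4] [10de:1bb3]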
- Address: 82:00.0 (the physical PCI slot on the motherboard).
- Vendor ID: 10de:1bb3 (the “DNA” of the Tesla P4).
Kernel & GRUB Setup
We use the 6.8 HWE kernel for modern VFIO stability.
sudo apt install --install-recommends linux-generic-hwe-22.04 -y
Edit /etc/default/grub and add the isolation parameters:
GRUB_CMDLINE_LINUX_DEFAULT="intel_iommu=on iommu=pt vfio-pci.ids=10de:1bb3"
- intel_iommu=on: Enables hardware memory isolation.
- iommu=pt: Direct “pass-through” mode for better performance.
- vfio-pci.ids: Locks the card to the VFIO driver at boot.
Apply changes:
sudo update-grub
sudo reboot
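A quick sanity check after the reboot (using the 82:00.0 address found earlier): the kernel should report the IOMMU as enabled, and the card should now be claimed by vfio-pci rather than nouveau or nvidia.
# Confirm the IOMMU is active
sudo dmesg | grep -i -e DMAR -e IOMMU
# Confirm the card is bound to VFIO ("Kernel driver in use: vfio-pci")
lspci -nnk -s 82:00.0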
3. The “Intelligent” Guest Image (.qcow2)
This image automatically activates the GPU only if it is attached to the VM.
Install Guest Drivers
sudo apt install -y nvidia-driver-535-server nvidia-utils-535-server nvidia-cuda-toolkit
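If the GPU is already attached while you prepare the image, you can confirm the driver stack sees it (on a GPU-less build VM this command will simply fail, which is expected):
# Should list the Tesla P4 and the 535 driver branch
nvidia-smi --query-gpu=name,driver_version --format=csv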
The Auto-Activation Tripwire
We use two Systemd files to manage the driver.
File 1: /etc/systemd/system/gpu-activate.service. This unit unmasks and starts the NVIDIA persistence daemon, then switches the card into persistence mode.
[Unit]
Description=Initialize Tesla P4 for Inference
[Service]
Type=oneshot
ExecStartPre=/usr/bin/systemctl unmask nvidia-persistenced.service
ExecStart=/usr/bin/systemctl start nvidia-persistenced.service
ExecStartPost=/usr/bin/nvidia-smi -pm 1
RemainAfterExit=yes
File 2: /etc/systemd/system/gpu-activate.path. This “watcher” triggers the service only when the GPU device file appears.
[Unit]
Description=Watch for NVIDIA GPU Device Creation
[Path]
PathExists=/dev/nvidia0
[Install]
WantedBy=multi-user.target
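By default a .path unit activates the service that shares its name, which is why this pair needs no extra wiring. If you prefer to make the link explicit, you can name the target service in the [Path] section:
[Path]
PathExists=/dev/nvidia0
Unit=gpu-activate.service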
Final Setup in Guest:
# Ensure drivers don't run on non-GPU VMs
sudo systemctl mask nvidia-persistenced
# Enable only the watcher
sudo systemctl enable gpu-activate.path
# Keep the service itself disabled (the path unit will trigger it)
sudo systemctl disable gpu-activate.service
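Before moving on, a quick check that the tripwire is wired the way we expect:
# The watcher is enabled; the service and daemon stay dormant until a GPU appears
systemctl is-enabled gpu-activate.path      # expected: enabled
systemctl is-enabled gpu-activate.service   # expected: disabled
systemctl is-enabled nvidia-persistenced    # expected: masked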
4. Cleaning the Base Image
Before converting to a template, we “seal” the VM to prevent identity leaks.
# Clear machine-ids to avoid DHCP/Network conflicts
sudo truncate -s 0 /etc/machine-id
sudo rm /var/lib/dbus/machine-id
# Clear logs and shell history
history -c
sudo rm -rf /var/log/*.log
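Optionally, confirm the seal before shutting the VM down: /etc/machine-id should still exist but be empty, and the D-Bus copy should be gone.
cat /etc/machine-id          # prints nothing
ls /var/lib/dbus/machine-id  # "No such file or directory"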
5. Notes on Compression
If virt-sparsify fails with a libguestfs error, it is likely because the kernel images in /boot are not readable by non-root users on Ubuntu, which prevents libguestfs from building its appliance.
# Grant read permissions to the host kernel
sudo chmod 0644 /boot/vmlinuz*
sudo chmod 0644 /boot/initrd.img-6.8.0-90-generic
# Compress the final image using the direct backend
export LIBGUESTFS_BACKEND=direct
virt-sparsify --compress /source/ubuntu22g.qcow2 /destination/baseg.qcow2
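To see what the compression bought you, compare the source and destination images afterwards (same example paths as above):
# Virtual size vs. actual disk usage of the template
qemu-img info /destination/baseg.qcow2
du -h /source/ubuntu22g.qcow2 /destination/baseg.qcow2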
6. Libvirt XML Configuration
Attach the device to your VM via virsh edit:
<hostdev mode='subsystem' type='pci' managed='yes'>
  <driver name='vfio'/>
  <source>
    <address domain='0x0000' bus='0x82' slot='0x00' function='0x0'/>
  </source>
</hostdev>
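After saving the XML and starting the VM, a minimal end-to-end check (vm-name below is only a placeholder for your domain name): the hostdev should appear in the live definition, and inside the guest the tripwire should have fired.
# On the host: confirm the PCI device is part of the running domain
virsh dumpxml vm-name | grep -A4 "<hostdev"
# In the guest: the watcher should have started the persistence daemon
systemctl status gpu-activate.service nvidia-persistenced
nvidia-smi -L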