
NVIDIA GPU passthrough to an Ubuntu VM on Proxmox

Pass an RTX 3090 through to a dedicated Ubuntu 24.04 VM using VFIO. Covers IOMMU verification, driver blacklisting, VM creation strategy (install OS first, attach GPU after), and a snapshot discipline tailored to PCI-passthrough VMs.

  1. Step 1

    Prerequisites

    This guide assumes you've already finished Proxmox VE 9 host setup for AI workloads. That guide covers the underlying setup this one builds on. If you're picking up mid-stack, scan its 'Overview' step to confirm your environment matches.

  2. Step 2

    Overview

    GPU passthrough (via VFIO) assigns the RTX 3090 exclusively to the Ollama VM. The host cannot use the GPU — it is owned entirely by the VM, which sees it as a native PCI device and runs full NVIDIA drivers.

  3. Step 3

    Step 1: Verify IOMMU Groups

    List every IOMMU group and filter for the NVIDIA devices. The expected output for this build (Z370 chipset, where the PCIe bridge shares Group 1 with the GPU) is shown after the command.

    Note: The PCIe bridge (00:01.0) shares the IOMMU group with the GPU on Z370 boards. Attempting to pass through the bridge causes an error. Pass only the GPU (01:00.0) and audio (01:00.1) devices; this works correctly on this hardware.

    for d in /sys/kernel/iommu_groups/*/devices/*; do
      n=${d#*/iommu_groups/*}; n=${n%%/*}
      printf 'Group %s: ' "$n"
      lspci -nns "${d##*/}"
    done | sort -V | grep -A2 -B2 -i nvidia
    # Expected output (abbreviated):
    # Group 1: 00:01.0 PCI bridge   Intel 6th-10th Gen PCIe Controller [8086:1901]
    # Group 1: 01:00.0 VGA          NVIDIA GA102 [GeForce RTX 3090] [10de:2204]
    # Group 1: 01:00.1 Audio        NVIDIA GA102 HD Audio [10de:1aef]
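
To see at a glance which devices share a group with the GPU, the listing can be filtered by group number. The following is a minimal sketch; `group_members` is a hypothetical helper name, and it assumes input lines in the `Group <n>: <addr> ...` form produced by the loop above.

```shell
# Hypothetical helper: print the PCI addresses that belong to one IOMMU group.
# Reads lines of the form "Group <n>: <addr> ..." on stdin.
group_members() {
  awk -v g="$1" '$1 == "Group" && $2 == (g ":") { print $3 }'
}

# Example (on the Proxmox host), piping the full listing into the helper:
#   <loop from above> | sort -V | group_members 1
```

On this build the result should list the bridge (00:01.0) plus the two NVIDIA functions, which matches the note about passing only 01:00.0 and 01:00.1.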
  4. Step 4

    Step 2: Blacklist GPU Drivers on Host

    Prevent the host from claiming the GPU before vfio-pci can bind it:

    echo "blacklist nouveau
    blacklist nvidia
    blacklist nvidiafb
    blacklist nvidia_drm
    blacklist nova_core" > /etc/modprobe.d/blacklist-gpu.conf
    # nova_core is a new in-kernel NVIDIA driver in Linux 6.x
    # It must be blacklisted or it may claim the GPU before vfio-pci
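
A quick way to confirm the file contains every entry is to check it module by module. This is a sketch only; `missing_blacklist` is a hypothetical helper name, and it assumes one `blacklist <module>` entry per line as written above.

```shell
# Hypothetical helper: print each module that is NOT blacklisted in a modprobe config file.
# Usage: missing_blacklist <conf-file> <module>...
missing_blacklist() {
  conf=$1
  shift
  for mod in "$@"; do
    # Match the whole line so "nvidia" does not also match "nvidiafb"
    grep -Eq "^[[:space:]]*blacklist[[:space:]]+${mod}[[:space:]]*$" "$conf" || echo "$mod"
  done
}

# Example:
#   missing_blacklist /etc/modprobe.d/blacklist-gpu.conf nouveau nvidia nvidiafb nvidia_drm nova_core
```

No output means every listed module has a blacklist line.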
  5. Step 5

    Step 3: Bind GPU to vfio-pci

    Tell the kernel that the RTX 3090 and its HDMI audio device belong to the VFIO driver. The ids= values come from the lspci output captured earlier. The modules-load.d file ensures the VFIO modules load at boot.

    # Use the vendor:device IDs from lspci output
    echo "options vfio-pci ids=10de:2204,10de:1aef" > /etc/modprobe.d/vfio.conf
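    # vfio_virqfd is not listed: it was merged into the vfio module in kernel 6.2,
    # so it no longer exists as a separate module on Proxmox VE 8 and later
    echo "vfio
    vfio_iommu_type1
    vfio_pci" > /etc/modules-load.d/vfio.conf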
    update-initramfs -u -k all
    reboot
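
Rather than copying the vendor:device IDs by hand, the `options` line can be derived from `lspci -nn` output. This is a sketch under the assumption that the IDs appear as `[vvvv:dddd]` in lowercase hex, as lspci prints them; `vfio_ids_line` is a hypothetical helper name.

```shell
# Hypothetical helper: build the "options vfio-pci ids=..." line from `lspci -nn` output.
# Only bracketed vendor:device pairs match; class codes like [0300] are skipped.
vfio_ids_line() {
  ids=$(grep -oE '\[[0-9a-f]{4}:[0-9a-f]{4}\]' | tr -d '[]' | paste -sd, -)
  printf 'options vfio-pci ids=%s\n' "$ids"
}

# Example (on the Proxmox host), review the line before writing it to the file:
#   lspci -nn -s 01:00 | vfio_ids_line
```

Printing the line first and redirecting it into /etc/modprobe.d/vfio.conf only after checking it keeps a typo from silently binding the wrong device.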
  6. Step 6

    Step 4: Verify vfio-pci Claimed the GPU

    After the reboot, check that both GPU functions are bound to vfio-pci:

    lspci -nnk -d 10de:2204
    lspci -nnk -d 10de:1aef
    # Both should show: Kernel driver in use: vfio-pci
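
The same check can be scripted so it fails loudly instead of relying on reading the output. A minimal sketch follows; `driver_in_use` is a hypothetical helper name that only parses the `Kernel driver in use:` line of `lspci -nnk` output.

```shell
# Hypothetical helper: print the driver named on "Kernel driver in use:" lines.
# Reads `lspci -nnk` output on stdin; prints nothing if no driver is bound.
driver_in_use() {
  sed -n 's/^[[:space:]]*Kernel driver in use:[[:space:]]*//p'
}

# Example (on the Proxmox host):
#   for id in 10de:2204 10de:1aef; do
#     [ "$(lspci -nnk -d "$id" | driver_in_use)" = "vfio-pci" ] || echo "$id is not bound to vfio-pci"
#   done
```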
  7. Step 7

    Step 5: Add GPU to VM

    After creating the VM (see Ollama VM Deployment), add the GPU and audio devices:

    Note: x-vga=1 tells QEMU to treat this as the primary display adapter. This causes the virtual console (noVNC/SPICE) to go blank since display output goes to the physical GPU. Use serial console (--serial0 socket --vga serial0) for VM access during and after install.

    qm set 100 --hostpci0 01:00.0,pcie=1,x-vga=1
    qm set 100 --hostpci1 01:00.1,pcie=1
    qm config 100 | grep hostpci
    # Should show exactly two entries — no duplicates
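
The "no duplicates" check can also be automated. The sketch below assumes `qm config` prints one `hostpciN: <addr>,<options>` line per device; `duplicate_hostpci` is a hypothetical helper name. It compares addresses as text, so the same device written once as 01:00.0 and once as 0000:01:00.0 would not be flagged.

```shell
# Hypothetical helper: print any PCI address attached by more than one hostpciN entry.
# Reads `qm config <vmid>` output on stdin; prints nothing when there are no duplicates.
duplicate_hostpci() {
  awk -F'[ ,]' '/^hostpci[0-9]+:/ { print $2 }' | sort | uniq -d
}

# Example (on the Proxmox host):
#   qm config 100 | duplicate_hostpci
```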
  8. Step 8

    Strategy: Install OS First, Add GPU After

    Recommended approach: install Ubuntu without the GPU attached. With x-vga=1, the virtual console goes dark because display output is handed to the physical RTX 3090. Installing over the serial console without the GPU avoids this entirely; the GPU is then added to the working VM afterward.

  9. Step 9

    Download Ubuntu ISO

    Pull the Ubuntu 24.04 Live Server ISO into Proxmox's ISO directory. Proxmox lists whatever ISO files it finds in this directory, so no separate import step is needed.

    cd /var/lib/vz/template/iso
    wget https://releases.ubuntu.com/24.04.4/ubuntu-24.04.4-live-server-amd64.iso
    # Proxmox auto-detects ISOs in this directory — no import needed
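
Before booting from the ISO it is worth confirming the download is intact. The sketch below assumes a `SHA256SUMS` file from the same release directory has been saved next to the ISO; `verify_iso` is a hypothetical helper name, and it relies on the coreutils `sha256sum -c` check mode.

```shell
# Hypothetical helper: verify one ISO against a SHA256SUMS file in the same directory.
# Succeeds only if the file has an entry in SHA256SUMS and its checksum matches.
verify_iso() {
  (
    cd "$(dirname "$1")" &&
    grep -F -- "$(basename "$1")" SHA256SUMS | sha256sum -c -
  )
}

# Example:
#   verify_iso /var/lib/vz/template/iso/<iso-file-name>
```

The subshell keeps the `cd` from changing the caller's working directory.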
  10. Step 10

    Create the VM (No GPU)

    Create the VM via qm create. The q35 machine type and OVMF (UEFI) firmware are required for PCIe passthrough. The serial console keeps the install screen accessible later — once the GPU is attached, the virtual display will go blank because x-vga hands output to the physical card.

    qm create 100 \
      --name ollama \
      --machine q35 \
      --bios ovmf \
      --efidisk0 local-lvm:1 \
      --cdrom local:iso/ubuntu-24.04.4-live-server-amd64.iso \
      --ostype l26 \
      --cpu host \
      --cores 8 \
      --memory 16384 \
      --net0 virtio,bridge=vmbr0 \
      --scsihw virtio-scsi-pci \
      --scsi0 local-lvm:64 \
      --serial0 socket \
      --vga serial0 \
      --boot "order=ide2;scsi0"
    # Key flags:
    # --machine q35       Required for PCIe passthrough
    # --bios ovmf         UEFI required for GPU passthrough
    # --cpu host          Passes through CPU flags needed by NVIDIA drivers
    # --serial0 + vga     Enables serial console (virtual display works without GPU)
  11. Step 11

    Take Base Snapshot

    Snapshot before installing anything. If the install goes wrong or you want to start over, rolling back takes seconds instead of recreating the VM from scratch.

    qm snapshot 100 base --description "Fresh VM, pre-install, no GPU"
    # Snapshot before any install = safe rollback point
  12. Step 12

    Install Ubuntu

    Installer settings:

    • Install type: Ubuntu Server (not minimized)
    • Network: DHCP — note the assigned IP
    • Storage: Entire disk, default LVM layout
    • Server name: ollama
    • Username: ollama
    • SSH: ✓ Install OpenSSH server
    • Featured snaps: skip all

    Note: The Ubuntu 24.04 installer hangs at 'EFI stub: Loaded initrd' on some q35/OVMF configurations. Fix: press e at the boot menu, add nomodeset console=ttyS0,115200n8 to the linux line, then press Ctrl+X to boot. This is a known issue with certain UEFI/initrd combinations.

    When the installer shows 'Installation complete! Reboot now', eject the ISO first:

    qm start 100
    qm terminal 100
    # Ubuntu installer appears in terminal
    # On Proxmox host (separate SSH session)
    qm set 100 --ide2 none,media=cdrom
    # Then reboot in the installer
  13. Step 13

    Configure GRUB for Serial Console

    SSH into the VM after install, then make the serial console permanent:

    sudo nano /etc/default/grub
    # Set these values in the file:
    GRUB_CMDLINE_LINUX_DEFAULT="nomodeset console=tty0 console=ttyS0,115200n8"
    GRUB_TERMINAL="serial console"
    GRUB_SERIAL_COMMAND="serial --speed=115200 --unit=0 --word=8 --parity=no --stop=1"
    # Save the file, then regenerate the GRUB config:
    sudo update-grub
  14. Step 14

    Install QEMU Guest Agent

    Install the QEMU guest agent inside the VM so Proxmox can issue clean shutdowns, gather IP info, and quiesce disk writes during snapshots. Then enable the agent flag on the VM config.

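    # Inside the VM
    sudo apt-get install -y qemu-guest-agent
    # The service starts on its own once the guest-agent virtio device is present
    # On the Proxmox host: enable the agent device
    qm set 100 --agent enabled=1
    # The device is only added on a full stop/start, not on a reboot from inside the VM
    qm shutdown 100 && qm start 100
    # Verify once the VM has booted
    qm agent 100 ping  # no error = working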
  15. Step 15

    Take Post-Install Snapshot

    Snapshot the working pre-GPU baseline. If GPU passthrough breaks the VM later, this lets you roll back to a confirmed-good Ubuntu install rather than redoing the OS install.

    qm snapshot 100 post-install --description "Ubuntu installed, SSH working, pre GPU driver"
  16. Step 16

    Add GPU Passthrough

    Shut down the VM, attach the GPU (PCI ID 01:00.0) and its HDMI audio function (01:00.1), then bring it back up. The first SSH after restart should show the RTX 3090 in lspci.

    qm shutdown 100
    qm set 100 --hostpci0 01:00.0,pcie=1,x-vga=1
    qm set 100 --hostpci1 01:00.1,pcie=1
    qm start 100
    ssh ollama@192.168.20.110
    lspci | grep -i nvidia
    # Should show RTX 3090 and HD Audio
  17. Step 17

    Install NVIDIA Drivers

    Install the recommended proprietary NVIDIA driver via ubuntu-drivers. After reboot, nvidia-smi should show the RTX 3090 with the full 24576 MiB of VRAM visible.

    sudo apt-get update
    sudo apt-get install -y ubuntu-drivers-common
    sudo ubuntu-drivers install
    # Auto-detects RTX 3090 and installs recommended driver
    sudo reboot
    # Verify after reboot
    nvidia-smi
    # Expected: Driver 595.58.03, CUDA 13.2, 24576MiB VRAM
  18. Step 18

    Maintaining NVIDIA Drivers Across Kernel Updates

    Ubuntu's unattended-upgrades can update the kernel between reboots. If the disk is full when the kernel-specific NVIDIA module package (linux-modules-nvidia-*) is installed, the package fails silently: the driver appears installed but the modules are absent on the next boot.

    Prevention: keep at least 2 GB free on the root partition. Monitor with df -h /. A full root disk is the most common cause of silent module install failures on Ubuntu.

    Verify that the modules are loaded after every kernel update or unexpected reboot. If any check fails, reinstall the kernel module package and rebuild the initramfs:

    # Verify NVIDIA modules are loaded
    lsmod | grep nvidia          # should list nvidia, nvidia_uvm, nvidia_drm, etc.
    ls /dev/nvidia*              # should show /dev/nvidia0, /dev/nvidiactl, /dev/nvidia-uvm
    nvidia-smi                   # full driver + GPU info
    
    # --- Recovery if modules are missing ---
    
    # 1. Reinstall the kernel-specific NVIDIA module package
    sudo apt-get install -y linux-modules-nvidia-595-open-generic
    
    # 2. Load modules manually (no reboot needed to verify)
    #    -a loads every listed module; without it, the extra names are parsed as parameters
    sudo modprobe -a nvidia nvidia_uvm nvidia_drm
    
    # 3. Rebuild the initramfs so modules load at next boot
    sudo update-initramfs -u -k $(uname -r)
    
    # 4. Restart Ollama
    sudo systemctl restart ollama
    nvidia-smi  # confirm driver is back
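
The 2 GB free-space rule can be turned into a check that runs from cron or a login script. This is a minimal sketch; `has_free_gib` is a hypothetical helper name, and it treats the threshold as GiB (1024-based), which is slightly stricter than 2 GB.

```shell
# Hypothetical helper: succeed if the available space (in KiB) is at least <min-GiB>.
# Usage: has_free_gib <available-KiB> <min-GiB>
has_free_gib() {
  avail_kib=$1
  min_gib=$2
  [ "$avail_kib" -ge $((min_gib * 1024 * 1024)) ]
}

# Example (inside the VM): column 4 of `df -Pk` is the available space in KiB
#   has_free_gib "$(df -Pk / | awk 'NR==2 { print $4 }')" 2 || echo "Root partition has less than 2 GiB free"
```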
  19. Step 19

    Take GPU Snapshot

    Snapshot the working GPU passthrough configuration. PCI passthrough requires the VM to be shut down for snapshotting, so this is a deliberate stop point worth taking.

    # Must be done with VM shut down (PCI passthrough limitation)
    qm shutdown 100
    qm snapshot 100 passthrough-ok --description "GPU passthrough working, NVIDIA 595.58.03"
    qm start 100
  20. Step 20

    Snapshots vs Backups

                             Snapshot                              Backup
    Speed                    Seconds                               Minutes
    Location                 Same disk as VM                       backup-hdd (separate disk)
    Survives disk failure    No                                    Yes
    Use for                  Quick rollback before risky changes   Disaster recovery
  21. Step 21

    Snapshot Limitation with GPU Passthrough

    VMs with PCI passthrough devices cannot be snapshotted while running. The VM must be shut down first. This is a QEMU limitation, not a Proxmox one.

    qm shutdown 100
    qm snapshot 100 <name> --description '<description>'
    qm start 100
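
Because the shutdown, snapshot, and start always go together, they can be wrapped in one function. This is a sketch, not part of Proxmox; `snapshot_offline` is a hypothetical name, and it uses only the `qm shutdown`, `qm snapshot`, and `qm start` commands shown above.

```shell
# Hypothetical wrapper: shut the VM down, snapshot it, and start it again.
# Usage: snapshot_offline <vmid> <snapshot-name> <description>
snapshot_offline() {
  qm shutdown "$1" &&
    qm snapshot "$1" "$2" --description "$3"
  rc=$?
  # Start the VM again even if the snapshot failed, then report the first failure
  qm start "$1" || rc=$?
  return "$rc"
}

# Example (on the Proxmox host):
#   snapshot_offline 100 pre-kernel-upgrade "Rollback point for kernel upgrade"
```

Restarting the VM regardless of the snapshot result keeps a failed snapshot from leaving the VM powered off.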
  22. Step 22

    Snapshot Limitation with Directory Storage

    Raw-format disks on directory-based storage (like model-storage) do not support snapshots. Exclude them:

    qm set 100 --scsi1 model-storage:100/vm-100-disk-0.raw,size=500G,backup=0,snapshot=0
  23. Step 23

    Naming Convention

    Snapshot name     When to take it                        Description
    base              After VM creation, before OS install   Fresh VM, pre-install
    post-install      After OS install + SSH verified        Ubuntu installed, working SSH
    passthrough-ok    After GPU driver verified in VM        NVIDIA driver + CUDA working
    ollama-ok         After Ollama running + GPU inference   Full stack working
    pre-<change>      Before any risky operation             Rollback point for <change>
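
If snapshots are taken from scripts, the convention can be enforced before calling `qm snapshot`. The sketch below simply encodes the five names listed above; `valid_snapshot_name` is a hypothetical helper name.

```shell
# Hypothetical check: succeed if a snapshot name follows the naming convention above.
valid_snapshot_name() {
  case $1 in
    base|post-install|passthrough-ok|ollama-ok) return 0 ;;
    pre-?*) return 0 ;;   # pre-<change>, with a non-empty change label
    *) return 1 ;;
  esac
}

# Example:
#   valid_snapshot_name "$name" || echo "Unexpected snapshot name: $name"
```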
  24. Step 24

    Key Commands

    Day-to-day snapshot operations on VM 100. Snapshot operations require the VM to be shut down whenever PCI passthrough devices are attached.

    qm listsnapshot 100              # list all snapshots
    qm snapshot 100 <name> --description '<desc>'
    qm rollback 100 <name>           # restore to snapshot
    qm delsnapshot 100 <name>        # delete snapshot
  25. Step 25

    Next: continue building the stack

    With this layer in place, the next guide in the series is Ollama on a dedicated NVMe with performance tuning. Repurpose a second NVMe as Ollama model storage, install Ollama, and tune it for a 24 GB RTX 3090 — KV cache quantization, FlashAttention, model context windows, and a curated model stack with routing guidance.
