This document describes the process for creating a reusable, deterministic Ubuntu 24.04 QCOW2 base image optimized for GPU-accelerated AI workloads. The resulting image is intended to be cloned and customized by automation/orchestration tooling.
1. Base OS Installation
- Operating System: Ubuntu Server 24.04 LTS
- Kernel: GA kernel (6.8.0) — avoid HWE
- Disk Size: 50 GB (future-proof; base image only)
Why avoid the HWE kernel in the base image
- GA kernel (6.8.0) provides maximum stability and predictable behavior.
- Reduces DKMS rebuild failures during NVIDIA driver installation.
- Avoids kernel churn across clones.
2. System Update
apt update -y
apt upgrade -y
3. SSH Configuration (Remote Root Access)
Update SSH daemon configuration
Edit /etc/ssh/sshd_config and ensure the following are enabled:
PermitRootLogin yes
PubkeyAuthentication yes
AuthorizedKeysFile .ssh/authorized_keys .ssh/authorized_keys2
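The sshd_config edits above can also be scripted so the image build is repeatable. The following is a minimal sketch, not a definitive implementation: `set_sshd_option` is a hypothetical helper name, and it assumes GNU sed and option values that contain no `|` or `&` characters.

```shell
#!/bin/bash
# Hypothetical helper: set "KEY VALUE" in an sshd_config-style file.
# An existing line for KEY (active or commented out with '#') is rewritten;
# if no such line exists, the setting is appended to the end of the file.
set_sshd_option() {
    local file="$1" key="$2" value="$3"
    if grep -Eq "^[[:space:]]*#?[[:space:]]*${key}[[:space:]]" "$file"; then
        sed -i -E "s|^[[:space:]]*#?[[:space:]]*${key}[[:space:]].*|${key} ${value}|" "$file"
    else
        printf '%s %s\n' "$key" "$value" >> "$file"
    fi
}

# Example usage (run as root on the image being built):
#   set_sshd_option /etc/ssh/sshd_config PermitRootLogin yes
#   set_sshd_option /etc/ssh/sshd_config PubkeyAuthentication yes
#   set_sshd_option /etc/ssh/sshd_config AuthorizedKeysFile ".ssh/authorized_keys .ssh/authorized_keys2"
```

Running `sshd -t` afterwards checks the resulting file for syntax errors before the daemon is restarted.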
Update SSH client configuration
Edit /etc/ssh/ssh_config:
StrictHostKeyChecking no
Set root password
passwd
Generate SSH key (optional, for automation)
ssh-keygen
4. Clean Login Noise (MOTD)
Disable MOTD messages
Edit /etc/pam.d/sshd and comment out:
#session optional pam_motd.so motd=/run/motd.dynamic
#session optional pam_motd.so noupdate
#session optional pam_mail.so standard noenv
Disable MOTD news
Edit /etc/default/motd-news:
ENABLED=0
5. Remove Snap and snapd (and plymouth)
snap list
snap remove lxd
snap remove core20
snap remove snapd
apt purge --remove snapd
rm -rf /root/snap/
apt remove --purge plymouth
6. Disable Swap (AI-friendly, deterministic memory behavior)
systemctl list-units | grep swap
systemctl stop swap.img.swap swap.target
systemctl disable swap.img.swap swap.target
systemctl mask swap.img.swap swap.target
swapoff -a
rm -f /swap.img
Remove swap entries from /etc/fstab.
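The fstab cleanup can be scripted as well. This sketch comments the swap entries out rather than deleting them, which is a deliberate variation on the step above so the change stays easy to review. `disable_swap_entries` is a hypothetical helper name, and GNU sed is assumed.

```shell
#!/bin/bash
# Comment out every active swap entry in an fstab-style file.
# A line is treated as a swap entry when its third field (the filesystem
# type) is exactly "swap". Lines that are already comments are left alone.
disable_swap_entries() {
    local fstab="${1:-/etc/fstab}"
    sed -i -E \
        '/^[[:space:]]*[^#[:space:]][^[:space:]]*[[:space:]]+[^[:space:]]+[[:space:]]+swap([[:space:]]|$)/ s/^/#/' \
        "$fstab"
}

# Example usage (as root):
#   disable_swap_entries /etc/fstab
```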
8. Cloud-Init Networking Cleanup
rm -f /etc/cloud/cloud.cfg.d/90-installer-network.cfg
Disable cloud-init networking entirely:
echo "network: {config: disabled}" > /etc/cloud/cloud.cfg.d/99-disable-network-config.cfg
9. Limit Systemd Journal Size
sed -i 's/#SystemMaxFileSize=.*/SystemMaxFileSize=512M/' /etc/systemd/journald.conf
10. systemd-resolved Configuration
Edit /etc/systemd/resolved.conf:
DNS=<Your DNS Server IP>
FallbackDNS=8.8.8.8
Domains=<Your domain>
DNSStubListener=no
Apply:
ln -fs /run/systemd/resolve/resolv.conf /etc/resolv.conf
systemctl restart systemd-resolved
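As an alternative to editing resolved.conf in place, the same settings can be written to a drop-in file under /etc/systemd/resolved.conf.d/, which keeps the packaged file untouched. This is a sketch under assumptions: the helper name `write_resolved_dropin` and the file name 10-base-image.conf are illustrative, and the DNS server and domain are placeholders that must be supplied by the caller.

```shell
#!/bin/bash
# Write a systemd-resolved drop-in with the settings used in this guide.
#   $1 = target directory (normally /etc/systemd/resolved.conf.d)
#   $2 = DNS server IP (placeholder, site-specific)
#   $3 = search domain (placeholder, site-specific)
write_resolved_dropin() {
    local dir="$1" dns="$2" domain="$3"
    mkdir -p "$dir"
    cat > "$dir/10-base-image.conf" <<EOF
[Resolve]
DNS=${dns}
FallbackDNS=8.8.8.8
Domains=${domain}
DNSStubListener=no
EOF
}

# Example usage (as root), followed by the symlink and restart steps:
#   write_resolved_dropin /etc/systemd/resolved.conf.d <Your DNS Server IP> <Your domain>
```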
11. Install Base Utilities
apt install -y \
net-tools rsyslog bc fio iperf3 gnupg2 \
software-properties-common lvm2 nfs-common jq
12. Disable Unwanted Timers
systemctl list-units | grep timer
Disable:
systemctl stop apt-daily-upgrade.timer apt-daily.timer fwupd-refresh.timer \
motd-news.timer update-notifier-download.timer update-notifier-motd.timer
systemctl disable apt-daily-upgrade.timer apt-daily.timer fwupd-refresh.timer \
motd-news.timer update-notifier-download.timer update-notifier-motd.timer
systemctl mask apt-daily-upgrade.timer apt-daily.timer fwupd-refresh.timer \
motd-news.timer update-notifier-download.timer update-notifier-motd.timer
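The three stop/disable/mask commands above repeat the same unit list. A loop keeps that list in one place. The sketch below is illustrative only: `quiesce_units` and the `DRY_RUN` variable are hypothetical names, and `systemctl disable --now` is used to stop and disable in a single call.

```shell
#!/bin/bash
# Timer units taken from the list above.
TIMERS=(
    apt-daily-upgrade.timer apt-daily.timer fwupd-refresh.timer
    motd-news.timer update-notifier-download.timer update-notifier-motd.timer
)

# Stop, disable, and mask each unit passed as an argument.
# With DRY_RUN set to a non-empty value, the commands are printed, not run.
quiesce_units() {
    local unit
    for unit in "$@"; do
        ${DRY_RUN:+echo} systemctl disable --now "$unit"
        ${DRY_RUN:+echo} systemctl mask "$unit"
    done
}

# Example usage:
#   DRY_RUN=1 quiesce_units "${TIMERS[@]}"   # preview the commands
#   quiesce_units "${TIMERS[@]}"             # apply them (as root)
```

The same function can be reused for the service units in the next section.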
13. Disable Unattended Services
systemctl stop unattended-upgrades.service apparmor ufw ubuntu-advantage-tools
systemctl disable unattended-upgrades.service apparmor ufw ubuntu-advantage-tools
systemctl mask unattended-upgrades.service apparmor ufw ubuntu-advantage-tools
14. GRUB updates
Edit /etc/default/grub and update the following lines
GRUB_CMDLINE_LINUX_DEFAULT="net.ifnames=0 biosdevname=0 cpufreq.default_governor=performance"
GRUB_TERMINAL="console serial"
GRUB_SERIAL_COMMAND="serial --speed=115200 --unit=0 --word=8 --parity=no --stop=1"
net.ifnames=0 and biosdevname=0: Disables predictable naming to force the classic eth0 convention. This ensures that manually injected Netplan configurations “hit” the correct interface without needing to probe for hardware-specific names (like enp0s3) during orchestration.
cpufreq.default_governor=performance: Eliminates CPU frequency scaling latency. The VM operates at maximum clock speed immediately upon boot, which is critical for consistent AI workload performance.
GRUB_CMDLINE_LINUX="": Kept empty to ensure the kernel parameters remain modular and easily overridable via the default string.
GRUB_TERMINAL="console serial": Dual-routes the bootloader output to both virtual VGA and the serial port. This allows orchestration logs to be captured via virsh console even if the VM is headless.
GRUB_SERIAL_COMMAND: Standardizes the serial interface at 115200 baud, ensuring that host-side monitoring scripts can reliably parse boot and kernel messages from the start.
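Once GRUB has been regenerated and the VM rebooted, it is worth confirming that the kernel actually received these parameters. A minimal sketch: `check_cmdline` and the `CMDLINE_FILE` override are hypothetical names introduced here, and the override exists only so the function can be tried against a sample file.

```shell
#!/bin/bash
# Report whether each expected parameter appears on the kernel command line.
# Reads /proc/cmdline unless CMDLINE_FILE points at another file.
# Returns non-zero if any parameter is missing.
check_cmdline() {
    local cmdline_file="${CMDLINE_FILE:-/proc/cmdline}" param missing=0
    local -a actual
    read -r -a actual < "$cmdline_file"
    for param in "$@"; do
        if printf '%s\n' "${actual[@]}" | grep -Fxq -- "$param"; then
            echo "ok: $param"
        else
            echo "MISSING: $param"
            missing=1
        fi
    done
    return "$missing"
}

# Example usage after reboot:
#   check_cmdline net.ifnames=0 biosdevname=0 cpufreq.default_governor=performance
```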
Disable Nouveau (Blacklist)
Even if the driver isn’t fully active, Nouveau can sometimes “touch” the hardware during boot, which interferes with the NVIDIA driver installation or VFIO binding later.
Create a blacklist file: sudo nano /etc/modprobe.d/blacklist-nouveau.conf
Add these lines:
blacklist nouveau
options nouveau modeset=0
Load VFIO Modules at Boot
To ensure the guest can properly handle the passed-through hardware, load the VFIO modules into the kernel early.
Open the modules file: sudo nano /etc/modules
Append these lines:
vfio
vfio_iommu_type1
vfio_pci
vfio_virqfd
Final Image Sync
Since you’ve modified modules and blacklists, you must rebuild the initramfs and update GRUB to ensure these settings are baked into the early boot process:
sudo update-initramfs -u
sudo update-grub
reboot
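After the reboot, the blacklist and module changes can be verified from the loaded-module list. The sketch below assumes the usual /proc/modules layout (module name in the first column). `module_loaded` is a hypothetical helper, and its optional second argument is there only to allow testing against a sample file.

```shell
#!/bin/bash
# Succeed when the named module appears in a /proc/modules-style listing.
#   $1 = module name (underscores, as shown by lsmod)
#   $2 = listing to read (defaults to /proc/modules)
module_loaded() {
    local name="$1" modules_file="${2:-/proc/modules}"
    awk -v m="$name" '$1 == m { found = 1 } END { exit !found }' "$modules_file"
}

# Example usage after reboot:
#   module_loaded nouveau  && echo "WARNING: nouveau is still loaded"
#   module_loaded vfio_pci || echo "WARNING: vfio_pci is not loaded"
```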
15. Kernel Headers and Toolchain
apt install -y linux-headers-$(uname -r) build-essential dkms gcc make libboost-program-options-dev cmake ninja-build
16. GPU Drivers
Driver selection
- Use the Ubuntu-recommended NVIDIA driver (via nvidia-detector)
- Driver version 580 chosen for:
- NVIDIA L4 compatibility
- Stability on kernel 6.8
- Forward CUDA compatibility
Install driver and utilities
apt install -y nvidia-driver-580-server nvidia-utils-580-server
CUDA runtime is intentionally not installed system-wide.
17. Python Runtime, IO utility
Based on trial-and-error with dependencies (specifically flash-attn/vllm), CUDA Toolkit 12.8 was identified as an ideal candidate.
# 1. Download the Keyring
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2404/x86_64/cuda-keyring_1.1-1_all.deb
# 2. Install it
dpkg -i cuda-keyring_1.1-1_all.deb
# 3. Update apt cache
apt update -y
apt upgrade -y
apt install -y cuda-toolkit-12-8
Add the following to /etc/bash.bashrc, then log out and log back in once so the change takes effect:
export CUDA_HOME=/usr/local/cuda-12.8
# Update PATH: Put CUDA first to override system defaults
export PATH=${CUDA_HOME}/bin:${PATH}
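Because /etc/bash.bashrc is read by every interactive shell, including nested ones, a plain prepend adds the CUDA directory to PATH again each time. The variant below guards against that. It is a sketch, and `path_prepend` is a hypothetical helper name, not part of the original setup.

```shell
#!/bin/bash
# Prepend a directory to PATH only if it is not already present.
path_prepend() {
    case ":${PATH}:" in
        *":$1:"*) ;;                 # already on PATH, nothing to do
        *) PATH="$1:${PATH}" ;;
    esac
}

export CUDA_HOME=/usr/local/cuda-12.8
path_prepend "${CUDA_HOME}/bin"
export PATH
```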
apt install -y python3 python3-pip poppler-utils libaio-dev
18. Add Environment Variables, Create Virtual Env
Create directories used by Hugging Face
mkdir -p /var/lib/huggingface/{hub,transformers}
chmod -R 755 /var/lib/huggingface
Add the following to /etc/environment
HF_HOME=/var/lib/huggingface
TRANSFORMERS_CACHE=/var/lib/huggingface/transformers
HF_HUB_CACHE=/var/lib/huggingface/hub
TOKENIZERS_PARALLELISM=false
Create venv
# 1. Install the venv tool
apt install -y python3-venv
# 2. Create the environment
python3 -m venv /opt/ai-env
# Add activation to the end of the root bash profile
echo 'source /opt/ai-env/bin/activate' >> ~/.bashrc
# Reload the profile for the current session
source ~/.bashrc
19. Install llama server
cd /opt
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
cmake -B build-cuda -DGGML_CUDA=ON -DCMAKE_CUDA_ARCHITECTURES="89" -DLLAMA_BUILD_EXAMPLES=OFF -DLLAMA_BUILD_TESTS=OFF -DLLAMA_BUILD_SERVER=ON
cmake --build build-cuda --config Release -j $(nproc)
cmake --install build-cuda
sudo ldconfig
cd ~
rm -rf /opt/llama.cpp
20. Disk Resize on First Boot
Create /usr/local/bin/resizedisk:
#!/bin/bash
growpart /dev/vda 2
partx --update /dev/vda2
resize2fs /dev/vda2
systemctl stop guestfs-firstboot.service
systemctl disable guestfs-firstboot.service
rm -f /root/*.log
Set permissions:
chmod +x /usr/local/bin/resizedisk
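The script above hardcodes /dev/vda and partition 2. If clones may land on a different disk type, the disk and partition number can be derived from the root device instead. The following is a sketch under assumptions: `split_partition` is a hypothetical helper, it covers only the common virtio/SCSI and nvme/mmcblk/loop naming patterns, and it assumes the root filesystem sits directly on a partition (no LVM).

```shell
#!/bin/bash
# Split a partition device path into "<disk> <partition-number>".
# Handles /dev/vda2-style names and /dev/nvme0n1p2-style names.
# Returns non-zero for anything it does not recognize.
split_partition() {
    local part="$1"
    local re_p='^(/dev/(nvme|mmcblk|loop)[0-9a-z]*[0-9])p([0-9]+)$'
    local re_plain='^(/dev/[a-z]+)([0-9]+)$'
    if [[ "$part" =~ $re_p ]]; then
        echo "${BASH_REMATCH[1]} ${BASH_REMATCH[3]}"
    elif [[ "$part" =~ $re_plain ]]; then
        echo "${BASH_REMATCH[1]} ${BASH_REMATCH[2]}"
    else
        return 1
    fi
}

# Example usage inside the resize script:
#   root_part=$(findmnt -no SOURCE /)
#   read -r disk num < <(split_partition "$root_part") || exit 1
#   growpart "$disk" "$num"
#   resize2fs "$root_part"
```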
21. GPU Temperature Monitoring Script
The script writes the current GPU temperature to /opt/nvidia/gputemp.txt on the host. A script on the host reads this value periodically and adjusts fan speeds to keep the temperature within limits. Note that passwordless SSH to the host must be enabled before the crontab entry that runs this script is added.
Create /opt/nvidia/updatetemp.sh
#!/bin/bash
# Define variables
REMOTE_HOST="serverxxxx"
REMOTE_FILE="/opt/nvidia/gputemp.txt"
# Get the temperature
# nounits removes the 'C' so you just get the number (easier for parsing later)
TEMP=$(nvidia-smi --query-gpu=temperature.gpu --format=csv,noheader,nounits | head -n1)
# Send to remote server
# The quotes around 'cat > ...' ensure the redirection happens on the REMOTE server, not locally.
echo "$TEMP" | ssh "$REMOTE_HOST" "cat > $REMOTE_FILE"
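If the driver is not loaded or the GPU is unavailable, nvidia-smi may print an error or a non-numeric value such as [N/A], and the script above would forward that to the host unchanged. A small guard avoids this. It is a sketch: `is_valid_temp` is a hypothetical helper, and the 0 to 120 °C range is an assumed sanity bound, not a value taken from NVIDIA documentation.

```shell
#!/bin/bash
# Succeed only for a plain non-negative integer within an assumed
# plausible Celsius range (0-120). Anything else is rejected.
is_valid_temp() {
    [[ "$1" =~ ^[0-9]+$ ]] && (( 10#$1 >= 0 && 10#$1 <= 120 ))
}

# Example usage in updatetemp.sh, replacing the final line:
#   if is_valid_temp "$TEMP"; then
#       echo "$TEMP" | ssh "$REMOTE_HOST" "cat > $REMOTE_FILE"
#   fi
```

Skipping the update leaves the previous reading on the host, so the host-side fan script may also want to check the file's age.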
22. Filesystem Optimization and Compaction
e4defrag /
fstrim -av
dd if=/dev/zero of=/zero.fill bs=1M status=progress
rm -f /zero.fill
fstrim -av
23. Final Cleanup and Shutdown
truncate -s 0 /etc/machine-id
(Essential for DHCP: many servers use this ID rather than the MAC address to assign IPs.)
history -c
shutdown -h now
24. Export Base QCOW2 Image
Compress and finalize the base image:
virt-sparsify --compress \
/var/lib/libvirt/images/ubuntu24g.qcow2 \
/root/kvm-local/ubuntu24/baseg.qcow2