This document describes the process for creating a reusable, deterministic Ubuntu 22.04 QCOW2 base image optimized for GPU-accelerated AI workloads. The resulting image is intended to be cloned and customized by automation/orchestration tooling.
1. Base OS Installation
- Operating System: Ubuntu Server 22.04 LTS
- Kernel: GA kernel (5.15) — avoid HWE
- Disk Size: 50 GB (future-proof; base image only)
Why avoid the HWE kernel in the base image
- GA kernel (5.15) provides maximum stability and predictable behavior.
- Reduces DKMS rebuild failures during NVIDIA driver installation.
- Avoids kernel churn across clones.
- HWE (6.x) can be selectively enabled later if required, but should not be baked into the golden image.
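A build or clone pipeline can check that an image is still on the GA series before installing drivers. The following is a minimal sketch, not part of the original procedure: the function name `is_ga_kernel` is hypothetical, and it assumes that matching the `5.15.` prefix of the kernel release string is a sufficient test.

```shell
#!/bin/bash
# Hypothetical helper: succeed only if the given kernel release
# (default: the running kernel) belongs to the 5.15 GA series.
is_ga_kernel() {
    local release="${1:-$(uname -r)}"
    case "$release" in
        5.15.*) return 0 ;;
        *)      return 1 ;;
    esac
}

# Warn (without failing) when the running kernel is not 5.15.x.
if ! is_ga_kernel; then
    echo "warning: running kernel $(uname -r) is not in the 5.15 GA series" >&2
fi
```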
2. System Update
apt update -y
apt upgrade -y
3. SSH Configuration (Remote Root Access)
Update SSH daemon configuration
Edit /etc/ssh/sshd_config and ensure the following are enabled:
PermitRootLogin yes
PubkeyAuthentication yes
AuthorizedKeysFile .ssh/authorized_keys .ssh/authorized_keys2
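For unattended image builds, the same settings can be applied without opening an editor. This is a sketch under stated assumptions: `set_sshd_option` is a hypothetical helper, it relies on GNU sed, and it assumes each option occupies a single line in the file.

```shell
#!/bin/bash
# Hypothetical helper: set "Key value" in an sshd-style config file.
# An existing line for the key (active or commented out) is rewritten;
# if none exists, the setting is appended.
set_sshd_option() {
    local file="$1" key="$2" value="$3"
    if grep -Eq "^[#[:space:]]*${key}[[:space:]]" "$file"; then
        sed -i -E "s|^[#[:space:]]*${key}[[:space:]].*|${key} ${value}|" "$file"
    else
        printf '%s %s\n' "$key" "$value" >> "$file"
    fi
}

# Usage (as root), followed by a syntax check and reload:
#   set_sshd_option /etc/ssh/sshd_config PermitRootLogin yes
#   set_sshd_option /etc/ssh/sshd_config PubkeyAuthentication yes
#   sshd -t && systemctl reload ssh
```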
Update SSH client configuration
Edit /etc/ssh/ssh_config:
StrictHostKeyChecking no
Set root password
passwd
Generate SSH key (optional, for automation)
ssh-keygen
4. Clean Login Noise (MOTD)
Disable MOTD messages
Edit /etc/pam.d/sshd and comment out:
#session optional pam_motd.so motd=/run/motd.dynamic
#session optional pam_motd.so noupdate
#session optional pam_mail.so standard noenv
Disable MOTD news
Edit /etc/default/motd-news:
ENABLED=0
5. Remove Snap and snapd
snap list
snap remove lxd
snap remove core20
snap remove snapd
apt purge -y --autoremove snapd
rm -rf /root/snap/
6. Disable Swap (AI-friendly, deterministic memory behavior)
systemctl list-units | grep swap
systemctl stop swap.img.swap swap.target
systemctl disable swap.img.swap swap.target
systemctl mask swap.img.swap swap.target
swapoff -a
rm -f /swap.img
Remove swap entries from /etc/fstab.
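The fstab edit can also be scripted. The sketch below comments out swap entries instead of deleting them, so the change is easy to review. `disable_fstab_swap` is a hypothetical helper; it assumes GNU sed and treats a line as a swap entry when its third field is `swap`.

```shell
#!/bin/bash
# Hypothetical helper: comment out every active swap entry in an
# fstab-style file. Lines already commented out are left untouched.
disable_fstab_swap() {
    local file="$1"
    sed -i -E 's|^([^#[:space:]][^[:space:]]*[[:space:]]+[^[:space:]]+[[:space:]]+swap[[:space:]].*)$|# \1|' "$file"
}

# Usage (as root); swapon --show should then print nothing:
#   disable_fstab_swap /etc/fstab
#   swapon --show
```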
7. Deterministic Network Interface Naming
Edit /etc/default/grub:
GRUB_CMDLINE_LINUX="net.ifnames=0 biosdevname=0"
Apply and reboot:
update-grub
reboot
8. Cloud-Init Networking Cleanup
rm -f /etc/cloud/cloud.cfg.d/90-installer-network.cfg
Disable cloud-init networking entirely:
echo "network: {config: disabled}" > /etc/cloud/cloud.cfg.d/99-disable-network-config.cfg
9. Limit Systemd Journal Size
sed -i 's/#SystemMaxFileSize=.*/SystemMaxFileSize=512M/' /etc/systemd/journald.conf
10. systemd-resolved Configuration
Edit /etc/systemd/resolved.conf:
DNS=<Your DNS Server IP>
FallbackDNS=8.8.8.8
Domains=<Your domain>
DNSStubListener=no
Apply:
ln -fs /run/systemd/resolve/resolv.conf /etc/resolv.conf
systemctl restart systemd-resolved
11. Install Base Utilities
apt install -y \
net-tools rsyslog bc fio iperf3 gnupg2 \
software-properties-common lvm2 nfs-common jq
12. Disable Unwanted Timers
systemctl list-units | grep timer
Disable:
systemctl stop apt-daily-upgrade.timer apt-daily.timer fwupd-refresh.timer \
motd-news.timer update-notifier-download.timer update-notifier-motd.timer
systemctl disable apt-daily-upgrade.timer apt-daily.timer fwupd-refresh.timer \
motd-news.timer update-notifier-download.timer update-notifier-motd.timer
systemctl mask apt-daily-upgrade.timer apt-daily.timer fwupd-refresh.timer \
motd-news.timer update-notifier-download.timer update-notifier-motd.timer
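The stop/disable/mask sequence repeats for every unit list in this document, so it can be wrapped in a small function. This is an illustrative sketch: `quiesce_units` and the `SYSTEMCTL` override are hypothetical names introduced here to allow a dry run.

```shell
#!/bin/bash
# Hypothetical helper: stop, disable, and mask each unit passed as an argument.
# Set SYSTEMCTL=echo for a dry run that only prints the commands.
quiesce_units() {
    local ctl="${SYSTEMCTL:-systemctl}" action
    for action in stop disable mask; do
        $ctl "$action" "$@" || echo "warning: '$action' failed for some units" >&2
    done
}

# Usage (as root):
#   quiesce_units apt-daily-upgrade.timer apt-daily.timer fwupd-refresh.timer \
#       motd-news.timer update-notifier-download.timer update-notifier-motd.timer
```

Running it first with `SYSTEMCTL=echo` shows exactly which commands would be issued, which is useful when the unit list differs between image versions.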
13. Disable Unwanted Services
systemctl stop unattended-upgrades.service apparmor ufw ubuntu-advantage.service
systemctl disable unattended-upgrades.service apparmor ufw ubuntu-advantage.service
systemctl mask unattended-upgrades.service apparmor ufw ubuntu-advantage.service
14. Kernel Headers and Toolchain
apt install -y linux-headers-$(uname -r) build-essential dkms gcc make
15. GPU Drivers
Driver selection
- Use the Ubuntu-recommended NVIDIA driver (via nvidia-detector).
- Driver version 580 chosen for:
  - NVIDIA L4 compatibility
  - Stability on kernel 5.15
  - Forward CUDA compatibility
Install driver and utilities
apt install -y nvidia-driver-580-server nvidia-utils-580-server
CUDA runtime is intentionally not installed system-wide.
16. Python Runtime
apt install -y python3 python3-pip
17. Add Environment Variables
Create directories used by Hugging Face
mkdir -p /var/lib/huggingface/{hub,transformers}
chmod -R 755 /var/lib/huggingface
Add the following to /etc/environment
HF_HOME=/var/lib/huggingface
TRANSFORMERS_CACHE=/var/lib/huggingface/transformers
HF_HUB_CACHE=/var/lib/huggingface/hub
TOKENIZERS_PARALLELISM=false
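/etc/environment is read as plain KEY=VALUE lines, so rerunning this step by appending can leave repeated entries. The sketch below writes each variable exactly once. `set_env_entry` is a hypothetical helper; it assumes GNU sed and keys that contain no regex metacharacters.

```shell
#!/bin/bash
# Hypothetical helper: set KEY=VALUE in an environment file exactly once.
set_env_entry() {
    local file="$1" key="$2" value="$3"
    touch "$file"
    # Drop any existing assignment of this key, then append the new one.
    sed -i "/^${key}=/d" "$file"
    printf '%s=%s\n' "$key" "$value" >> "$file"
}

# Usage (as root):
#   set_env_entry /etc/environment HF_HOME /var/lib/huggingface
#   set_env_entry /etc/environment HF_HUB_CACHE /var/lib/huggingface/hub
#   set_env_entry /etc/environment TOKENIZERS_PARALLELISM false
```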
18. Core Deep Learning Framework
PyTorch (CUDA 12.4 runtime via wheels)
pip install torch torchvision torchaudio \
--index-url https://download.pytorch.org/whl/cu124
Rationale:
- PyTorch wheels ship the CUDA runtime internally.
- Avoids system-level CUDA toolkit dependency.
- Decouples framework upgrades from the OS image.
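Because the CUDA runtime comes from the wheel and not from the OS, it is worth confirming after installation that the wheel and the system driver agree. A minimal sketch follows; `check_torch_cuda` is a hypothetical name, and the return code 2 for "torch not importable" is a convention chosen here.

```shell
#!/bin/bash
# Hypothetical check: print the PyTorch version, the CUDA version the wheel
# was built against, and whether a GPU is usable through the installed driver.
check_torch_cuda() {
    if ! python3 -c 'import torch' 2>/dev/null; then
        echo "torch is not importable" >&2
        return 2
    fi
    python3 -c 'import torch; print(torch.__version__, torch.version.cuda, torch.cuda.is_available())'
}

# Usage:
#   check_torch_cuda
```

On a correctly built image with a GPU attached, the last field printed should be `True`; `False` usually points at a driver problem, since the wheel itself carries the CUDA runtime.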
19. Scientific Python Stack
pip install numpy scipy pandas
20. Hugging Face Ecosystem (Minimal Core)
pip install transformers tokenizers safetensors
21. GPU-Aware Utilities
pip install accelerate
22. Disk Resize on First Boot
Create /usr/local/bin/resizedisk:
#!/bin/bash
growpart /dev/vda 2
partx --update /dev/vda2
resize2fs /dev/vda2
systemctl stop guestfs-firstboot.service
systemctl disable guestfs-firstboot.service
rm -f /root/*.log
Set permissions:
chmod +x /usr/local/bin/resizedisk
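The script above hardcodes /dev/vda and partition 2. If clones may boot from a different disk name, the disk and partition number can be derived from the mounted root device. This is a sketch under assumptions: `split_partition` is a hypothetical helper, and it assumes the root filesystem sits directly on a partition (no LVM or device-mapper layer).

```shell
#!/bin/bash
# Hypothetical helper: split a partition path such as /dev/vda2 or
# /dev/nvme0n1p2 into "<disk> <partition-number>".
split_partition() {
    local part="$1" num disk
    num=$(printf '%s' "$part" | sed -E 's/^.*[^0-9]([0-9]+)$/\1/')
    disk="${part%"$num"}"
    # NVMe and MMC style names put a "p" between disk and partition number.
    case "$disk" in
        *[0-9]p) disk="${disk%p}" ;;
    esac
    printf '%s %s\n' "$disk" "$num"
}

# Usage on first boot (as root):
#   root_part=$(findmnt -no SOURCE /)
#   set -- $(split_partition "$root_part")
#   growpart "$1" "$2" && resize2fs "$root_part"
```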
23. Filesystem Optimization and Compaction
e4defrag /
fstrim -av
dd if=/dev/zero of=/zero.fill bs=1M status=progress
rm -f /zero.fill
fstrim -av
24. Final Cleanup and Shutdown
history -c
shutdown -h now
25. Export Base QCOW2 Image
Compress and finalize the base image:
virt-sparsify --compress \
/var/lib/libvirt/images/ubuntu22.qcow2 \
/root/kvm-local/ubuntu22/base.qcow2
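Since the base image is meant to be cloned, one common approach is a thin QCOW2 overlay that uses the base as a read-only backing file. The sketch below is an assumption about how clones might be created, not the original tooling; `make_clone` is a hypothetical helper and the clone path and size in the usage comment are placeholders.

```shell
#!/bin/bash
# Hypothetical helper: create a thin clone backed by the base image.
# An optional third argument sets a larger virtual size for the clone.
make_clone() {
    local base="$1" clone="$2" size="${3:-}"
    # -b sets the backing file; -F declares the backing file's format.
    if [ -n "$size" ]; then
        qemu-img create -f qcow2 -F qcow2 -b "$base" "$clone" "$size"
    else
        qemu-img create -f qcow2 -F qcow2 -b "$base" "$clone"
    fi
}

# Usage:
#   make_clone /root/kvm-local/ubuntu22/base.qcow2 /var/lib/libvirt/images/vm01.qcow2 100G
#   qemu-img info /var/lib/libvirt/images/vm01.qcow2
```

The base image must not be modified after overlays exist, because every clone reads unchanged blocks from it.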