vLLM Service

The objective is to start the vLLM service based on Qwen3-VL-8B-FP8; it will be used initially for AI-assisted purchase order processing. Rationale for model selection: the server has a single NVIDIA L4 GPU.

Download the model:

hf auth login
hf download Qwen/Qwen3-VL-8B-Instruct-FP8 --local-dir /opt/models/Qwen3-VL-8B-FP8

Create a systemd unit file to start the service (/etc/systemd/system/vllm.service):

[Unit]
Description=vLLM Qwen …
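A unit file along these lines could serve the downloaded model; the ExecStart path, port, and flags are assumptions for illustration, not the exact file used here.

```ini
# Sketch of /etc/systemd/system/vllm.service -- paths and flags are
# assumptions; adjust to the actual vllm install location and environment.
[Unit]
Description=vLLM Qwen3-VL-8B-FP8 OpenAI-compatible server
After=network-online.target
Wants=network-online.target

[Service]
# Assumes vllm is installed in a venv at /opt/vllm (hypothetical path)
ExecStart=/opt/vllm/bin/vllm serve /opt/models/Qwen3-VL-8B-FP8 \
    --host 0.0.0.0 --port 8000
Restart=on-failure
RestartSec=5

[Install]
WantedBy=multi-user.target
```

After writing the file, reload systemd and enable the service with systemctl daemon-reload followed by systemctl enable --now vllm.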

Custom Cloud Image for AI Workloads

This document describes the process for creating a reusable, deterministic Ubuntu 24.04 QCOW2 base image optimized for GPU-accelerated AI workloads. The resulting image is intended to be cloned and customized by automation/orchestration tooling.

1. Base OS Installation
   Why avoid the HWE kernel in the base image
2. System Update
3. SSH Configuration (Remote Root Access)
…
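For the SSH step above, a drop-in fragment like the following is one common way to allow key-only root access during the image-building stage; the file path and directive choices are assumptions, not taken from the full article.

```ini
# Sketch of /etc/ssh/sshd_config.d/90-root-access.conf (hypothetical path).
# Permits root login with an SSH key only; password logins stay disabled.
PermitRootLogin prohibit-password
PasswordAuthentication no
PubkeyAuthentication yes
```

Restarting the ssh service applies the change; the key itself goes in /root/.ssh/authorized_keys.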

Installing PostgreSQL 16

PostgreSQL 16 is the ideal choice for this environment because it delivers major improvements in performance, parallel query execution, index efficiency, and write-ahead logging throughput, all of which directly benefit Git metadata workloads and AI/ML-related query patterns. PG16 also includes enhanced vacuum performance, faster sorting, and better handling of high-concurrency workloads, making it extremely well-suited for …
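The features mentioned above are largely controlled through postgresql.conf; the following fragment shows the relevant knobs with placeholder values, offered as an illustration rather than tuned advice for this environment.

```ini
# Illustrative postgresql.conf fragment -- values are placeholders.
max_parallel_workers_per_gather = 4    # parallel query execution
max_parallel_maintenance_workers = 2   # parallel index builds / vacuum
autovacuum_vacuum_cost_limit = 1000    # more aggressive autovacuum
wal_compression = zstd                 # WAL compression (available since PG15)
```

Changes to these settings take effect after a reload (SELECT pg_reload_conf();) except where the documentation marks a restart as required.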