This blog provides a detailed overview of the physical infrastructure supporting the lab and production-grade deployment environment. The goal of this platform is to create a high-performance, high-availability foundation for virtualized workloads, container-based services, storage clusters, and management applications. The hardware has been selected and organized to deliver predictable performance, strong isolation, and reliable scaling.
1. Dedicated Management Server
A single Dell PowerEdge R630 is allocated exclusively for core data-center management functions. This server hosts all essential internal management services, ensuring they remain isolated from compute and storage workloads.
Key responsibilities of the management server:
- Internal Ubuntu package mirror
- Management database (for orchestration, tracking, and operational metadata)
- Web and FTP services
- Private Docker/OCI registry
- Backup and archival workflows
- Various supporting services required for cluster bootstrap and ongoing operations
This server is equipped with a hardware RAID-10 array built from 8 × 1 TB SSDs (roughly 4 TB usable after mirroring), providing both high throughput and redundancy for critical management data.
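As a rough illustration of how cluster nodes consume these management services, the sketch below renders an APT sources file pointed at the internal mirror and a Docker daemon config pointed at the private registry. The hostname `mgmt.lab.internal`, the registry port, and the Ubuntu release are placeholder assumptions, not the environment's real values.

```python
#!/usr/bin/env python3
"""Minimal sketch: point a cluster node at the management server's
internal APT mirror and private Docker/OCI registry.

The hostname, registry port, and Ubuntu release below are illustrative
assumptions, not values taken from the actual environment."""

import json
from pathlib import Path

MGMT_HOST = "mgmt.lab.internal"   # hypothetical management-server hostname
UBUNTU_RELEASE = "jammy"          # assumed Ubuntu release

def apt_sources() -> str:
    # Point the standard pockets at the internal mirror instead of archive.ubuntu.com.
    pockets = ["", "-updates", "-security"]
    lines = [
        f"deb http://{MGMT_HOST}/ubuntu {UBUNTU_RELEASE}{p} main restricted universe multiverse"
        for p in pockets
    ]
    return "\n".join(lines) + "\n"

def docker_daemon_config() -> str:
    # Only needed if the private registry is not fronted by TLS.
    return json.dumps({"insecure-registries": [f"{MGMT_HOST}:5000"]}, indent=2) + "\n"

if __name__ == "__main__":
    Path("sources.list.internal").write_text(apt_sources())
    Path("daemon.json.internal").write_text(docker_daemon_config())
    print(apt_sources())
    print(docker_daemon_config())
```

In practice, files like these would be pushed out by whatever configuration-management tooling bootstraps the cluster nodes.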
2. Compute, Virtualization & Storage Cluster
The core of the environment consists of four high-performance servers, designed to host:
- A Ceph storage cluster
- KVM hypervisors
- An RKE2-based Kubernetes cluster
Each server plays a dual role—delivering compute capacity for virtual machines and Kubernetes workloads while simultaneously participating in a distributed Ceph storage pool.
Two dedicated storage tiers are defined:
- NVMe pool for high-IOPS workloads
- SSD pool for capacity and balanced throughput
All VM disk requirements are fulfilled using Ceph RBD (RADOS Block Device), enabling high availability, live migration flexibility, and unified storage management across the cluster.
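As a minimal sketch of that workflow, the snippet below provisions a thin RBD image for a new VM disk using the Ceph Python bindings (python3-rados / python3-rbd). The pool names `rbd-nvme` and `rbd-ssd` and the image-naming convention are assumptions for illustration, not the cluster's actual names.

```python
#!/usr/bin/env python3
"""Minimal sketch: provision an RBD image for a new VM disk using the
Ceph Python bindings (python3-rados / python3-rbd).

The pool names and image-naming convention are illustrative assumptions."""

import rados
import rbd

POOLS = {"nvme": "rbd-nvme", "ssd": "rbd-ssd"}   # assumed tier -> pool mapping

def create_vm_disk(vm_name: str, size_gib: int, tier: str = "ssd") -> str:
    """Create a thin-provisioned RBD image and return '<pool>/<image>'."""
    image_name = f"{vm_name}-disk0"
    cluster = rados.Rados(conffile="/etc/ceph/ceph.conf")
    cluster.connect()
    try:
        ioctx = cluster.open_ioctx(POOLS[tier])
        try:
            # RBD images are thin-provisioned; size is given in bytes.
            rbd.RBD().create(ioctx, image_name, size_gib * 1024**3)
        finally:
            ioctx.close()
    finally:
        cluster.shutdown()
    return f"{POOLS[tier]}/{image_name}"

if __name__ == "__main__":
    print(create_vm_disk("web01", size_gib=40, tier="nvme"))
```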
3. Network Architecture
Every server in the cluster is equipped with the following network interfaces:
- 4 × 10 GbE NICs
- 2 × 40 GbE NICs (Mellanox)
The network has been segmented with clear traffic isolation to avoid contention and ensure low-latency performance across storage, compute, and management layers.
Network Interface Allocation
| Interface | Purpose |
|---|---|
| 40G NIC #1 | Dedicated Ceph cluster backend network |
| 40G NIC #2 | Application database cluster synchronization traffic |
| 10G NIC #1 | Ceph north-bound access (client access to storage) |
| 10G NIC #2 | Application DB north-bound access |
| 10G NIC #3 | VM management and hypervisor control plane |
| 10G NIC #4 | Worker-node and Kubernetes overlay traffic |
This separation ensures deterministic performance even under load, with each subsystem receiving its own physical bandwidth and path.
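To make the split concrete, the sketch below renders that allocation into a netplan-style fragment with one ethernet stanza per role. The interface names, subnets, and static addressing scheme are purely illustrative assumptions; only the one-role-per-NIC layout mirrors the table above, and the jumbo-frame MTU is covered in the next section.

```python
#!/usr/bin/env python3
"""Minimal sketch: render the NIC allocation table into a netplan-style
fragment, one ethernet stanza per role.

Interface names, subnets, and addressing are illustrative assumptions;
only the role-per-NIC split mirrors the allocation table."""

# role -> (assumed interface name, assumed subnet, MTU)
ALLOCATION = {
    "ceph-cluster": ("enp94s0f0", "10.10.40.0/24", 9000),  # 40G NIC #1
    "db-sync":      ("enp94s0f1", "10.10.41.0/24", 9000),  # 40G NIC #2
    "ceph-public":  ("eno1",      "10.10.10.0/24", 9000),  # 10G NIC #1
    "db-access":    ("eno2",      "10.10.11.0/24", 9000),  # 10G NIC #2
    "vm-mgmt":      ("eno3",      "10.10.12.0/24", 9000),  # 10G NIC #3
    "k8s-overlay":  ("eno4",      "10.10.13.0/24", 9000),  # 10G NIC #4
}

def render_netplan(host_octet: int) -> str:
    """Build a netplan 'ethernets' document with a static address per role."""
    lines = ["network:", "  version: 2", "  ethernets:"]
    for role, (iface, subnet, mtu) in ALLOCATION.items():
        base = subnet.rsplit(".", 1)[0]        # e.g. "10.10.40"
        prefix = subnet.split("/")[1]          # e.g. "24"
        lines += [
            f"    {iface}:   # {role}",
            f"      mtu: {mtu}",
            f"      addresses: [{base}.{host_octet}/{prefix}]",
        ]
    return "\n".join(lines) + "\n"

if __name__ == "__main__":
    print(render_netplan(host_octet=11))
```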
4. Switching & Connectivity
A single Arista 7050QX-30S switch forms the backbone of the environment, providing both 40G and 10G ports to carry high-speed storage traffic as well as standard data-plane traffic.
Key connectivity features:
- 40G to 4×10G breakout AOC cables for attaching 10G server ports directly to the Arista switch
- 40G QSFP Active Optical Cables for direct connections between Mellanox 40G NICs and the switch
- All interfaces configured with MTU 9000 (jumbo frames) to optimize Ceph, VM migration, and Kubernetes overlay performance; a quick end-to-end validation is sketched below
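A simple way to confirm that the jumbo-frame configuration holds end-to-end is a do-not-fragment ping sized to the largest payload a 9000-byte MTU can carry (9000 bytes minus a 20-byte IP header and an 8-byte ICMP header leaves 8972 bytes). The sketch below wraps the standard Linux iputils `ping`; the peer addresses are hypothetical.

```python
#!/usr/bin/env python3
"""Minimal sketch: verify that jumbo frames (MTU 9000) pass end-to-end
by sending a do-not-fragment ICMP payload of 8972 bytes
(9000 - 20-byte IP header - 8-byte ICMP header).

Uses the Linux iputils ping flags -M do (forbid fragmentation) and
-s (payload size). The peer addresses are illustrative assumptions."""

import subprocess

PEERS = {
    "ceph-backend": "10.10.40.12",   # hypothetical peer on the 40G Ceph network
    "ceph-public":  "10.10.10.12",   # hypothetical peer on the 10G client network
}

def jumbo_ok(address: str, payload: int = 8972) -> bool:
    """Return True if a non-fragmented ping of `payload` bytes succeeds."""
    result = subprocess.run(
        ["ping", "-c", "3", "-M", "do", "-s", str(payload), address],
        capture_output=True,
        text=True,
    )
    return result.returncode == 0

if __name__ == "__main__":
    for name, addr in PEERS.items():
        status = "OK" if jumbo_ok(addr) else "FAILED (check switch/NIC MTU)"
        print(f"{name:13s} {addr:15s} jumbo frames: {status}")
```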
5. Firewall & Gateway
A Ubiquiti UDM Pro is deployed as the edge firewall and primary gateway for the environment. It provides:
- WAN routing
- Firewall segmentation
- VPN access
- Traffic monitoring
- Internal VLAN gateway services
The UDM Pro uplinks to the Arista switch via a 10G SFP+ link, providing high-throughput north-south connectivity as well as inter-VLAN (east-west) routing for the internal networks.
6. Virtualization Platform
KVM is the hypervisor of choice for all virtual machine workloads.
A custom-built Hypervisor Management Solution orchestrates:
- VM creation and lifecycle
- vCPU and memory allocation
- Image handling
- vGPU support (where applicable)
- RBD-backed storage assignments
- Automated configuration for networks and bridge mappings
This approach provides full control of the virtualization environment without relying on heavy external platforms.
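As a sketch of what one such orchestration step looks like at the libvirt level, the snippet below defines and boots a KVM guest whose disk is an RBD image, using the libvirt Python bindings (python3-libvirt). The monitor hostname, Ceph auth secret UUID, bridge name, pool/image name, and VM sizing are placeholders rather than values from the real management solution.

```python
#!/usr/bin/env python3
"""Minimal sketch: define and start a KVM guest backed by an RBD image,
via the libvirt Python bindings (python3-libvirt).

The monitor hostname, Ceph secret UUID, bridge name, pool/image name,
and VM sizing are placeholders, not values from the real environment."""

import libvirt

DOMAIN_XML = """
<domain type='kvm'>
  <name>web01</name>
  <memory unit='GiB'>4</memory>
  <vcpu>2</vcpu>
  <os><type arch='x86_64' machine='q35'>hvm</type></os>
  <devices>
    <disk type='network' device='disk'>
      <driver name='qemu' type='raw' cache='writeback'/>
      <source protocol='rbd' name='rbd-nvme/web01-disk0'>
        <host name='ceph-mon1.lab.internal' port='6789'/>
      </source>
      <auth username='libvirt'>
        <secret type='ceph' uuid='00000000-0000-0000-0000-000000000000'/>
      </auth>
      <target dev='vda' bus='virtio'/>
    </disk>
    <interface type='bridge'>
      <source bridge='br-vm-mgmt'/>
      <model type='virtio'/>
    </interface>
  </devices>
</domain>
"""

def define_and_start(xml: str) -> None:
    conn = libvirt.open("qemu:///system")   # connect to the local KVM hypervisor
    try:
        dom = conn.defineXML(xml)           # persist the domain definition
        dom.create()                        # boot the guest
        print(f"started {dom.name()} (id {dom.ID()})")
    finally:
        conn.close()

if __name__ == "__main__":
    define_and_start(DOMAIN_XML)
```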
7. Storage Layer: Ceph Cluster
The Ceph cluster spans the four compute nodes, forming a distributed storage backend with:
- NVMe OSDs for high-performance, low-latency operations
- SSD OSDs for general storage pools
All storage for virtual machines, container workloads, and application services is exposed as Ceph RBD block devices, enabling:
- Redundant and self-healing storage
- Transparent failover
- Uniform storage consumption across compute nodes
- Simplified scaling by adding OSDs or nodes
This architecture ensures resilience and performance for both VM and Kubernetes workloads.
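As a sketch of how the two tiers could be enforced at the pool level, the snippet below creates one CRUSH rule per device class (nvme, ssd) and one replicated pool bound to each rule, by wrapping the standard `ceph` CLI. Pool names, PG counts, and replica size are illustrative assumptions.

```python
#!/usr/bin/env python3
"""Minimal sketch: create one replicated pool per device class so RBD
images land on the intended tier. Wraps the standard `ceph` CLI; pool
names, PG counts, and replica size are illustrative assumptions."""

import subprocess

POOLS = {
    # pool name -> (CRUSH device class, pg_num)
    "rbd-nvme": ("nvme", 128),
    "rbd-ssd":  ("ssd", 256),
}

def ceph(*args: str) -> None:
    """Run a ceph CLI command and raise if it exits non-zero."""
    subprocess.run(["ceph", *args], check=True)

def create_tiered_pools(replica_size: int = 3) -> None:
    for pool, (device_class, pg_num) in POOLS.items():
        rule = f"replicated-{device_class}"
        # CRUSH rule limited to OSDs of one device class, host failure domain.
        ceph("osd", "crush", "rule", "create-replicated",
             rule, "default", "host", device_class)
        # Replicated pool bound to that rule.
        ceph("osd", "pool", "create", pool, str(pg_num), str(pg_num),
             "replicated", rule)
        ceph("osd", "pool", "set", pool, "size", str(replica_size))
        # Tag the pool so Ceph knows it is used for RBD.
        ceph("osd", "pool", "application", "enable", pool, "rbd")

if __name__ == "__main__":
    create_tiered_pools()
```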
Conclusion
This hardware platform provides a robust foundation for infrastructure automation, virtual machine orchestration, container platform deployment, and distributed storage operations.
By combining high-speed networking, redundant storage, compute density, and strong isolation between management and workload traffic, the environment is designed to scale predictably and operate efficiently under demanding usage.
A clean separation of roles—management, compute, storage, networking, and control—ensures the long-term maintainability and reliability of the entire system.