
Ceph + KVM : 1. Planning and preparing for Ceph Storage

Posted on September 22, 2024 (updated April 7, 2025) by sandeep

Next: Installing Ceph

I have been working on developing microservices for an application deployed on K8S. More time and effort goes into building resiliency and handling failures than into building application features. One of the key problem areas has been storage, which is the backbone of any application.

Some research suggested that a robust distributed storage system that is scalable, highly available, and easy to integrate with K8S-based applications would abstract away the storage-related challenges we had encountered. Further searching identified Ceph Storage as the prime candidate. Deployed on decent hardware, it would provide a production-grade platform for on-premise deployments at SMEs.

Key objectives:

  • The storage cluster should satisfy the applications’ IOPS requirements.
  • Time-tested CSI driver support for the storage is a must.
  • The CSI driver’s over-the-wire encryption support will be an added advantage.
  • The hardware requirements should not be high or prohibitive for SMEs planning to opt for on-premise deployments.
  • It should be open-source with a decent number of production deployments.
  • Active community support and decent documentation are required.
    • Initial deployment complexities are not an issue as they would be one-time and can be documented.
  • Version upgrades and security patch applications should be possible.
    • It would be an added advantage if they were simple and time-tested.
  • It should support dynamic volume expansion.
  • Though pod/StatefulSet migration is a K8S feature, it should be automatic and consistent in case of server or worker node failures.

The decision is to use Ceph RBD rather than CephFS; I am new to Ceph and am starting with this choice. All worker node VMs will be backed by block storage in Ceph RBD, and the Ceph CSI driver (RBD) will provision block volumes for PV requirements.
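
For reference, here is a minimal sketch of the kind of StorageClass the Ceph CSI RBD driver would use once the cluster and ceph-csi are deployed (covered in later posts). The clusterID, pool, and secret names below are placeholders, not values from this setup:

cat <<'EOF' | kubectl apply -f -
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: csi-rbd-sc
provisioner: rbd.csi.ceph.com
parameters:
  clusterID: <ceph-cluster-fsid>   # placeholder
  pool: <rbd-pool-name>            # placeholder
  imageFeatures: layering
  csi.storage.k8s.io/provisioner-secret-name: csi-rbd-secret
  csi.storage.k8s.io/provisioner-secret-namespace: ceph-csi
  csi.storage.k8s.io/controller-expand-secret-name: csi-rbd-secret
  csi.storage.k8s.io/controller-expand-secret-namespace: ceph-csi
  csi.storage.k8s.io/node-stage-secret-name: csi-rbd-secret
  csi.storage.k8s.io/node-stage-secret-namespace: ceph-csi
reclaimPolicy: Delete
allowVolumeExpansion: true
EOF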

Available hardware: 4 x Dell R630 with the following configuration

  • Intel X710 daughter board – 4 x 10G SFP+
  • Intel X520 – DA2 – Adapter – 2 x 10G SFP+ (Slot 2)
  • Dual M.2 NVME PCIe 3 Adapter (x4 Bifurcation) – Slots 1 & 3
    • Samsung 970 EVO Plus NVMe – 4 units
  • 4 x 10K RPM 1.2 TB SAS drives in RAID 10 (PERC H730P Mini) for Boot / OS

MikroTik CRS326-24S+2Q+RM

  • Bridge Mode
  • 9000 MTU
    • Ensures maximum utilisation of the available network throughput (a jumbo-frame check is shown after this list)
    • Consistent 9.86 Gbps in iperf3 test results
  • 20 SFP+ ports connected to servers (5 ports each)
    • Direct Attach Cable
  • 1 SFP+ port connected to TP-Link Router (uplink)
    • Direct Attach Cable
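
To confirm that jumbo frames actually pass end to end between two servers, a do-not-fragment ping sized just under the 9000-byte MTU works as a quick check (cluster-network IPs used here as an example):

# 8972 = 9000 - 20 (IP header) - 8 (ICMP header); -M do sets the DF bit
ping -M do -s 8972 -c 3 10.0.5.4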

TP-Link ER8411

  • 10G SFP+ LAN Port connected to Cloud Router Switch
  • 10G SFP+ WAN Port connected to Gateway UTM device
    • The UTM device's gateway port is 1G RJ45, so a MikroTik S+RJ10 copper module is used

Deployment

2 x 10G NICs dedicated to Ceph storage, one for the public network and one for the cluster network

  • 10.0.4.0/24 – public network
  • 10.0.5.0/24 – cluster network

2 x 10G NICs dedicated to the Percona XtraDB cluster, one for application access and one for cluster sync

  • 10.0.2.0/24 – Service, application access
  • 10.0.3.0/24 – Cluster sync

1 x 10G NIC for server management and accessing VMs

  • 10.0.1.0/24 – Servers and VM Management / Access

Note: I initially attempted to use Ubuntu 24.04, as its default repositories included Ceph Squid release packages. However, after learning that Ceph Reef was the official stable release, and given my affinity for Debian, I decided to go with Debian 12 and Ceph Reef.

Install Debian 12.7 on the server(s)

Log in / SSH into the server with the user account configured during installation

Switch to the root user account

su -

Enable remote login for root user account

sed -i "s/#PermitRootLogin prohibit-password/PermitRootLogin yes/g" /etc/ssh/sshd_config
sed -i "s/#PubkeyAuthentication/PubkeyAuthentication/g" /etc/ssh/sshd_config
sed -i "s/#AuthorizedKeysFile/AuthorizedKeysFile/g" /etc/ssh/sshd_config
sed -i "s/# StrictHostKeyChecking ask/ StrictHostKeyChecking no/g" /etc/ssh/ssh_config
sed -i "s/session optional pam_motd.so/#session optional pam_motd.so/g" /etc/pam.d/sshd
sed -i "s/session optional pam_motd.so/#session optional pam_motd.so/g" /etc/pam.d/sshd
service ssh restart
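
As a quick sanity check, sshd -T prints the effective server configuration and should now show the updated values:

sshd -T | grep -Ei '^(permitrootlogin|pubkeyauthentication)'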

Remove CDROM from the apt sources list.

sed -i '/deb cdrom/d' /etc/apt/sources.list
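
Refresh the package lists so the following installs pull from the network repositories:

apt update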

Log out and log back in as the root user

Install required packages

apt -y install net-tools systemd-resolved fio iperf3 gnupg2 software-properties-common lvm2 nfs-common

Configure the DNS server IP – let systemd-resolved manage the DNS configuration.

ln -fs /run/systemd/resolve/resolv.conf /etc/resolv.conf
sed -i "s/^\#DNS.*/DNS=8.8.8.8/g" /etc/systemd/resolved.conf
systemctl restart systemd-resolved
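
Verify that systemd-resolved picked up the configured DNS server:

resolvectl status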

Disable the daily update timers and AppArmor.

systemctl stop apt-daily-upgrade.timer apt-daily.timer apparmor
systemctl disable apt-daily-upgrade.timer apt-daily.timer apparmor

Set the timezone, configure the NTP servers, and restart the time sync service

timedatectl set-timezone "Asia/Kolkata"
sed -i "s/#NTP=/NTP=time\.google\.com/g" /etc/systemd/timesyncd.conf
sed -i "s/#FallbackNTP=ntp.ubuntu.com/FallbackNTP=ntp\.ubuntu\.com/g" /etc/systemd/timesyncd.conf
systemctl stop systemd-timesyncd.service
systemctl start systemd-timesyncd.service
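
Confirm the timezone and that time synchronisation is active:

timedatectl status
timedatectl timesync-status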

Configure max file size of the journal

sed -i "s/#SystemMaxFileSize.*/SystemMaxFileSize=512M/g" /etc/systemd/journald.conf

Configure the maximum number of open files and processes

echo "* hard nofile 65536" >> /etc/security/limits.conf
echo "* soft nofile 65536" >> /etc/security/limits.conf
echo "* hard nproc 65536" >> /etc/security/limits.conf
echo "* soft nproc 65536" >> /etc/security/limits.conf

Disable IPv6 and enable huge pages

sed -i 's/GRUB_CMDLINE_LINUX=""/GRUB_CMDLINE_LINUX="ipv6.disable=1 default_hugepagesz=1G hugepagesz=1G hugepages=64 transparent_hugepage=never"/g' /etc/default/grub
update-grub
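
After the reboot a few steps below, the kernel command line and huge page allocation can be verified (a sample of the meminfo output appears at the end of this post):

cat /proc/cmdline
grep -i hugepages /proc/meminfo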

Configure the network interfaces – update /etc/network/interfaces (the sample provided here is from server1; change it as required)

source /etc/network/interfaces.d/*

auto lo
iface lo inet loopback

allow-hotplug eno1
iface eno1 inet static
address 10.0.1.1/16
gateway 10.0.0.1
mtu 9000

allow-hotplug eno2
iface eno2 inet static
address 10.0.2.1/24
mtu 9000

allow-hotplug eno3
iface eno3 inet static
address 10.0.3.1/24
mtu 9000

allow-hotplug eno4
iface eno4 inet static
address 10.0.4.1/24
mtu 9000

allow-hotplug enp129s0
iface enp129s0 inet static
address 10.0.5.1/24
mtu 9000
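
Once the server is back up after the reboot below, addressing and MTU on each interface can be confirmed with:

ip -br addr show
ip link show eno4 | grep mtu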

Reboot the server

Enable key-based, passwordless SSH between servers

echo "Host ceph1" > ~/.ssh/config
echo " Hostname ceph1" >> ~/.ssh/config
echo " User root" >> ~/.ssh/config
echo "Host ceph2" >> ~/.ssh/config
echo " Hostname ceph2" >> ~/.ssh/config
echo " User root" >> ~/.ssh/config
echo "Host ceph3" >> ~/.ssh/config
echo " Hostname ceph3" >> ~/.ssh/config
echo " User root" >> ~/.ssh/config
echo "Host ceph4" >> ~/.ssh/config
echo " Hostname ceph4" >> ~/.ssh/config
echo " User root" >> ~/.ssh/config

ssh-keygen -q -N "" -f ~/.ssh/id_rsa

ssh-keygen -f '/root/.ssh/known_hosts' -R 'ceph1'
ssh-keygen -f '/root/.ssh/known_hosts' -R 'ceph2'
ssh-keygen -f '/root/.ssh/known_hosts' -R 'ceph3'
ssh-keygen -f '/root/.ssh/known_hosts' -R 'ceph4'
ssh-copy-id ceph1
ssh-copy-id ceph2
ssh-copy-id ceph3
ssh-copy-id ceph4
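
A quick loop from any one server confirms that passwordless root SSH works to all nodes:

for h in ceph1 ceph2 ceph3 ceph4; do ssh "$h" hostname; done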

Basic iperf3 check between two servers (1 and 4)

root@server1:~# iperf3 -s -B 10.0.5.1
-----------------------------------------------------------
Server listening on 5201 (test #1)
-----------------------------------------------------------
Accepted connection from 10.0.5.4, port 41562
[ 5] local 10.0.5.1 port 5201 connected to 10.0.5.4 port 41578
[ ID] Interval Transfer Bitrate
[ 5] 0.00-1.00 sec 1.15 GBytes 9.88 Gbits/sec
[ 5] 1.00-2.00 sec 1.15 GBytes 9.90 Gbits/sec
[ 5] 2.00-3.00 sec 1.15 GBytes 9.90 Gbits/sec
[ 5] 3.00-4.00 sec 1.15 GBytes 9.90 Gbits/sec
[ 5] 4.00-5.00 sec 1.15 GBytes 9.90 Gbits/sec
[ 5] 5.00-6.00 sec 1.15 GBytes 9.90 Gbits/sec
[ 5] 6.00-7.00 sec 1.15 GBytes 9.90 Gbits/sec
[ 5] 7.00-8.00 sec 1.15 GBytes 9.90 Gbits/sec
[ 5] 8.00-9.00 sec 1.15 GBytes 9.90 Gbits/sec
[ 5] 9.00-10.00 sec 1.15 GBytes 9.90 Gbits/sec
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval Transfer Bitrate
[ 5] 0.00-10.00 sec 11.5 GBytes 9.90 Gbits/sec receiver
-----------------------------------------------------------
Server listening on 5201 (test #2)
-----------------------------------------------------------

root@server4:~# iperf3 -c 10.0.5.1
Connecting to host 10.0.5.1, port 5201
[ 5] local 10.0.5.4 port 41578 connected to 10.0.5.1 port 5201
[ ID] Interval Transfer Bitrate Retr Cwnd
[ 5] 0.00-1.00 sec 1.15 GBytes 9.91 Gbits/sec 0 1.44 MBytes
[ 5] 1.00-2.00 sec 1.15 GBytes 9.90 Gbits/sec 0 1.51 MBytes
[ 5] 2.00-3.00 sec 1.15 GBytes 9.89 Gbits/sec 0 1.51 MBytes
[ 5] 3.00-4.00 sec 1.15 GBytes 9.90 Gbits/sec 0 1.59 MBytes
[ 5] 4.00-5.00 sec 1.15 GBytes 9.90 Gbits/sec 0 1.59 MBytes
[ 5] 5.00-6.00 sec 1.15 GBytes 9.90 Gbits/sec 0 1.59 MBytes
[ 5] 6.00-7.00 sec 1.15 GBytes 9.90 Gbits/sec 0 1.59 MBytes
[ 5] 7.00-8.00 sec 1.15 GBytes 9.90 Gbits/sec 0 1.59 MBytes
[ 5] 8.00-9.00 sec 1.15 GBytes 9.90 Gbits/sec 0 1.59 MBytes
[ 5] 9.00-10.00 sec 1.15 GBytes 9.90 Gbits/sec 0 1.59 MBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval Transfer Bitrate Retr
[ 5] 0.00-10.00 sec 11.5 GBytes 9.90 Gbits/sec 0 sender
[ 5] 0.00-10.00 sec 11.5 GBytes 9.90 Gbits/sec receiver

iperf Done.
root@server4:~#
Check the huge pages allocation:

root@server1:~# cat /proc/meminfo | grep "HugePages"
AnonHugePages: 0 kB
ShmemHugePages: 0 kB
FileHugePages: 0 kB
HugePages_Total: 32000
HugePages_Free: 32000
HugePages_Rsvd: 0
HugePages_Surp: 0
root@server1:~#

Some notes on selecting NVMe drives:
  • Check for PLP (power loss protection)
  • Prefer TLC over QLC NAND
  • TBW (total bytes written) – the higher, the better
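
Once drives are installed, the model and endurance counters can be pulled with nvme-cli as a quick check (the device name below is an example):

apt -y install nvme-cli
nvme list
nvme smart-log /dev/nvme0 | grep -iE 'percentage_used|data_units_written'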
