I ran into some challenges in my tests when using RMQ version 4, which required downloading packages from specific repos.
I decided to go with RabbitMQ 3.10 and launched 3 Ubuntu 22.04 VMs, since the packages are maintained as part of the distro.
All the VMs have 3 NICs bridged to physical host NICs, each dedicated to a specific purpose. NIC 1 is for VM management (10.0.1.x), NIC 2 is for RabbitMQ access (10.0.2.x), and NIC 3 is for RabbitMQ cluster traffic (10.0.3.x). The DNS server is updated with host resolution for all IPs: for VM management it is dbx.domain.net, for RabbitMQ access it is rmqx.domain.net, and for cluster traffic it is rsynchx.domain.net.
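For reference, the name-to-IP mapping used in the rest of this post looks roughly like the sketch below. The rmqx addresses (.19-.21) come from the cluster output later on; the dbx and rsynchx last octets are only illustrative.
# DNS (or /etc/hosts) entries for node 1; nodes 2 and 3 follow the same pattern
10.0.1.19  db1.domain.net       db1       # VM management
10.0.2.19  rmq1.domain.net      rmq1      # RabbitMQ client access
10.0.3.19  rsynch1.domain.net   rsynch1   # RabbitMQ cluster traffic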
I decided to use only quorum queues.
Install rabbitmq-server on all nodes with a simple “apt install rabbitmq-server” command.
Create /etc/rabbitmq/rabbitmq.conf on all VMs (replace x with the relevant IP octet)
# Client traffic (AMQP)
listeners.tcp.default = 10.0.2.x:5672
# Management UI (optional)
management.listener.ip = 10.0.2.x
management.listener.port = 15672
# Clustering communication
cluster_formation.peer_discovery_backend = rabbit_peer_discovery_classic_config
# The node names must match the RabbitMQ node names (see NODENAME in rabbitmq-env.conf below),
# which use the DNS names configured for cluster traffic
cluster_formation.classic_config.nodes.1 = rabbit@rsynch1
cluster_formation.classic_config.nodes.2 = rabbit@rsynch2
cluster_formation.classic_config.nodes.3 = rabbit@rsynch3
default_queue_type = quorum
cluster_partition_handling = pause_minority
Update /etc/rabbitmq/rabbitmq-env.conf on all VMs (replace x with the relevant IP octet)
# Defaults to rabbit. This can be useful if you want to run more than one node
# per machine - RABBITMQ_NODENAME should be unique per erlang-node-and-machine
# combination. See the clustering on a single machine guide for details:
# http://www.rabbitmq.com/clustering.html#single-machine
# Replace node name with DNS name configured for cluster traffic rabbit@rsynch1 / rabbit@rsynch2 / rabbit@rsynch3
NODENAME=rabbit@rsynch1
# By default RabbitMQ will bind to all interfaces, on IPv4 and IPv6 if
# available. Set this if you only want to bind to one network interface or#
# address family.
NODE_IP_ADDRESS=10.0.3.x
# Inter-node communication (distribution) port. Defaults to 25672 (AMQP port + 20000).
RABBITMQ_DIST_PORT=25672
Configure the open file limit for the service. Either run “systemctl edit rabbitmq-server.service” or create /etc/systemd/system/rabbitmq-server.service.d/limits.conf by hand (followed by “systemctl daemon-reload”), and add the following override:
[Service]
LimitNOFILE=64000
Enable the rabbitmq_management plugin, configure the shared Erlang cookie, and restart rabbitmq-server (on all nodes). The cookie value must be identical on all three nodes.
rabbitmq-plugins enable rabbitmq_management
echo "SOMEALPHANUMERICCOOKIE" | sudo tee /var/lib/rabbitmq/.erlang.cookie
systemctl restart rabbitmq-server
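Since both the override and the cookie have to be picked up by the service, a quick sanity check after the restart doesn't hurt. A small sketch; the grep pattern is an assumption, as the exact status output varies between versions.
# the cookie must be identical on all nodes and readable only by the rabbitmq user
ls -l /var/lib/rabbitmq/.erlang.cookie
# confirm systemd applied the raised file-descriptor limit
systemctl show rabbitmq-server -p LimitNOFILE
# the broker's own view of the limit
rabbitmq-diagnostics status | grep -A 3 'File Descriptors'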
On node 1, create an admin user (dcuser) and delete the default ‘guest’ user
rabbitmqctl add_user dcuser somepassword
rabbitmqctl set_permissions -p / dcuser ".*" ".*" ".*"
rabbitmqctl set_user_tags dcuser administrator
rabbitmqctl delete_user guest
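A quick way to confirm the new credentials work is to hit the management API on node 1 directly (this assumes the management listener from rabbitmq.conf above and node 1's address of 10.0.2.19; use whatever password you actually set):
curl -s -u dcuser:somepassword http://10.0.2.19:15672/api/overview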
On the other two nodes, stop the application, join the cluster, and start the application. Note that the node name to join is the rsynch name, since that is what NODENAME is set to.
rabbitmqctl stop_app
rabbitmqctl join_cluster rabbit@rsynch1
rabbitmqctl start_app
Verify cluster status
root@rmq1:~# rabbitmqctl cluster_status
Cluster status of node rabbit@rsynch1 ...
Basics
Cluster name: rabbit@rmq1.domain.net
Total CPU cores available cluster-wide: 6
Disk Nodes
rabbit@rsynch1
rabbit@rsynch2
rabbit@rsynch3
Running Nodes
rabbit@rsynch1
rabbit@rsynch2
rabbit@rsynch3
Versions
rabbit@rsynch1: RabbitMQ 3.12.1 on Erlang 25.3.2.8
rabbit@rsynch2: RabbitMQ 3.12.1 on Erlang 25.3.2.8
rabbit@rsynch3: RabbitMQ 3.12.1 on Erlang 25.3.2.8
CPU Cores
Node: rabbit@rsynch1, available CPU cores: 2
Node: rabbit@rsynch2, available CPU cores: 2
Node: rabbit@rsynch3, available CPU cores: 2
Maintenance status
Node: rabbit@rsynch1, status: not under maintenance
Node: rabbit@rsynch2, status: not under maintenance
Node: rabbit@rsynch3, status: not under maintenance
Alarms
(none)
Network Partitions
(none)
Listeners
Node: rabbit@rsynch1, interface: 10.0.2.19, port: 15672, protocol: http, purpose: HTTP API
Node: rabbit@rsynch1, interface: [::], port: 25672, protocol: clustering, purpose: inter-node and CLI tool communication
Node: rabbit@rsynch1, interface: 10.0.2.19, port: 5672, protocol: amqp, purpose: AMQP 0-9-1 and AMQP 1.0
Node: rabbit@rsynch2, interface: 10.0.2.20, port: 15672, protocol: http, purpose: HTTP API
Node: rabbit@rsynch2, interface: [::], port: 25672, protocol: clustering, purpose: inter-node and CLI tool communication
Node: rabbit@rsynch2, interface: 10.0.2.20, port: 5672, protocol: amqp, purpose: AMQP 0-9-1 and AMQP 1.0
Node: rabbit@rsynch3, interface: 10.0.2.21, port: 15672, protocol: http, purpose: HTTP API
Node: rabbit@rsynch3, interface: [::], port: 25672, protocol: clustering, purpose: inter-node and CLI tool communication
Node: rabbit@rsynch3, interface: 10.0.2.21, port: 5672, protocol: amqp, purpose: AMQP 0-9-1 and AMQP 1.0
Feature flags
Flag: classic_mirrored_queue_version, state: enabled
Flag: classic_queue_type_delivery_support, state: enabled
Flag: direct_exchange_routing_v2, state: enabled
Flag: drop_unroutable_metric, state: enabled
Flag: empty_basic_get_metric, state: enabled
Flag: feature_flags_v2, state: enabled
Flag: implicit_default_bindings, state: enabled
Flag: listener_records_in_ets, state: enabled
Flag: maintenance_mode_status, state: enabled
Flag: quorum_queue, state: enabled
Flag: restart_streams, state: enabled
Flag: stream_queue, state: enabled
Flag: stream_sac_coordinator_unblock_group, state: enabled
Flag: stream_single_active_consumer, state: enabled
Flag: tracking_records_in_ets, state: enabled
Flag: user_limits, state: enabled
Flag: virtual_host_metadata, state: enabled
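With default_queue_type = quorum in place, newly declared queues should come up as quorum queues, and it is worth verifying that with a throwaway queue before pointing applications at the cluster. One way to do it from the shell is via rabbitmqadmin, which the management plugin serves from every node (python3 must be installed; the queue name qq-test is just an example):
# fetch the rabbitmqadmin CLI from the management plugin
curl -s -o rabbitmqadmin http://10.0.2.19:15672/cli/rabbitmqadmin && chmod +x rabbitmqadmin
# declare a durable test queue, explicitly requesting the quorum type
./rabbitmqadmin -H 10.0.2.19 -u dcuser -p somepassword declare queue name=qq-test durable=true arguments='{"x-queue-type":"quorum"}'
# confirm the type (and that it is visible cluster-wide), then clean up
rabbitmqctl list_queues name type
./rabbitmqadmin -H 10.0.2.19 -u dcuser -p somepassword delete queue name=qq-test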
Installing and configuring HAProxy
Note: The IP allocated as the VIP for RMQ is 10.0.2.22, and the node IPs are static: 10.0.2.19, 10.0.2.20 and 10.0.2.21. We could use DNS names for the backend configuration, but we are not doing so here.
Install haproxy on all nodes
apt install haproxy
Add the following at the end of /etc/haproxy/haproxy.cfg
# AMQP load balancer
frontend rabbitmq_amqp
    bind 10.0.2.22:5672
    default_backend rabbitmq_amqp_nodes

backend rabbitmq_amqp_nodes
    balance roundrobin
    option tcp-check
    server rmq1 10.0.2.19:5672 check
    server rmq2 10.0.2.20:5672 check
    server rmq3 10.0.2.21:5672 check

# RabbitMQ Management UI load balancer
frontend rabbitmq_ui
    bind 10.0.2.22:15672
    default_backend rabbitmq_ui_nodes

backend rabbitmq_ui_nodes
    balance roundrobin
    option httpchk GET /api/overview
    server rmq1 10.0.2.19:15672 check
    server rmq2 10.0.2.20:15672 check
    server rmq3 10.0.2.21:15672 check
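Before restarting, I would validate the file, and it is worth checking two assumptions against the defaults section of your haproxy.cfg (the stock Ubuntu one sets mode http): AMQP is a binary protocol, so the AMQP frontend and backend should run in mode tcp, and /api/overview requires authentication, so an unauthenticated httpchk will see a 401 and may mark every UI backend down; checking the UI root page avoids that. A hedged sketch:
# syntax-check the configuration before touching the running service
haproxy -c -f /etc/haproxy/haproxy.cfg
# possible additions, depending on what your defaults section already sets:
#   frontend rabbitmq_amqp / backend rabbitmq_amqp_nodes:
#       mode tcp                    # pass AMQP through as raw TCP
#   backend rabbitmq_ui_nodes:
#       option httpchk GET /        # the UI root returns 200 without credentials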
Restart the haproxy service. Note that the 10.0.2.22 VIP will not be bound to any node until keepalived is configured in the next step (below).
systemctl restart haproxy
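Depending on the kernel defaults, HAProxy may refuse to start on the nodes that do not currently hold the VIP, because it cannot bind 10.0.2.22. If you hit that, allowing non-local binds is the usual workaround (the sysctl file name below is just a suggestion):
echo 'net.ipv4.ip_nonlocal_bind = 1' > /etc/sysctl.d/99-nonlocal-bind.conf
sysctl -p /etc/sysctl.d/99-nonlocal-bind.conf
systemctl restart haproxy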
Installing and configuring keepalived
Install keepalived on all nodes.
apt install keepalived
The IP allocated for use as a VIP for RMQ is 10.0.2.22. The key configuration choices are:
unicast_peer: reliable, avoids multicast pitfalls
nopreempt on backups: avoids a race during failback
preempt_delay: delays takeover to ensure a clean failover
garp_master_delay: quick VIP takeover by the network
chk_haproxy: HAProxy process PID check
chk_peer_vip: if a node sees HAProxy running but also sees the VIP already active elsewhere, it will refuse to claim the VIP
Create /usr/local/bin/check_peer_vip.sh on all nodes with the following contents.
#!/bin/bash
# Ping the VIP via the correct interface
ping -c1 -W1 -I eth1 10.0.2.22 | grep '1 received' >/dev/null
Add execute permissions
chmod +x /usr/local/bin/check_peer_vip.sh
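The helper can be sanity-checked by hand; at this point nothing holds the VIP yet, so it should exit non-zero (keepalived only cares about the exit code):
/usr/local/bin/check_peer_vip.sh; echo "exit code: $?"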
Create the /etc/keepalived/keepalived.conf file on each of the VMs. A sample from node 1 is provided here; the unicast_peer IPs and the priority will differ on each VM.
vrrp_instance VI_RMQ {
    state MASTER
    interface eth1
    virtual_router_id 51
    priority 101
    advert_int 1
    preempt_delay 5
    garp_master_delay 1
    authentication {
        auth_type PASS
        auth_pass rabbitHA
    }
    virtual_ipaddress {
        10.0.2.22
    }
    unicast_peer {
        10.0.2.20
        10.0.2.21
    }
    track_script {
        chk_haproxy
        chk_peer_vip
    }
}

vrrp_script chk_haproxy {
    script "pidof haproxy"
    interval 2
    weight -20
}

vrrp_script chk_peer_vip {
    script "/usr/local/bin/check_peer_vip.sh"
    interval 2
    weight -10
}
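For completeness, the pieces that differ on the backup nodes look roughly like this; the priority values are my own convention, and node 3 simply gets the next lower one plus the remaining peer IPs:
# node 2 (rsynch2, 10.0.2.20) - only the lines that differ from node 1
vrrp_instance VI_RMQ {
    state BACKUP
    priority 100              # node 3 would use e.g. 99
    nopreempt                 # backups do not steal the VIP back during failback
    unicast_peer {
        10.0.2.19
        10.0.2.21
    }
    # interface, virtual_router_id, advert_int, authentication,
    # virtual_ipaddress and track_script are identical to node 1
}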
Restart keepalived and haproxy on all nodes.
systemctl restart keepalived
systemctl restart haproxy
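At this point the VIP should sit on exactly one node (the one with the highest priority). A quick way to confirm that, and to exercise a failover, is sketched below; eth1 is the RabbitMQ access NIC assumed throughout:
# the master should list 10.0.2.22 on eth1, the backups should not
ip addr show eth1 | grep 10.0.2.22
# simulate a failure on the current master and watch the VIP move to a backup
systemctl stop haproxy          # chk_haproxy now fails, so the priority drops by 20
ip addr show eth1 | grep 10.0.2.22
systemctl start haproxy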
Configuring logrotate
Create /etc/logrotate.d/rabbitmq with the following contents
/var/log/rabbitmq/*.log {
    daily
    rotate 14
    compress
    delaycompress
    missingok
    notifempty
    copytruncate
}
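Note that the Ubuntu package already ships its own logrotate rule for /var/log/rabbitmq/*.log (typically /etc/logrotate.d/rabbitmq-server); if both files match the same logs, logrotate will complain about duplicate entries, so you may prefer to fold these settings into the packaged file instead. Either way, a debug run confirms the configuration parses and shows what would be rotated, without touching anything:
logrotate -d /etc/logrotate.d/rabbitmq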
Edit /etc/logrotate.d/haproxy and update the rotation count from 7 to 14
/var/log/haproxy.log {
    daily
    rotate 14
    missingok
    notifempty
    compress
    delaycompress
    postrotate
        [ ! -x /usr/lib/rsyslog/rsyslog-rotate ] || /usr/lib/rsyslog/rsyslog-rotate
    endscript
}