HA LB Cluster on CentOS5 – Without actual heartbeat :P
Last month I wrote a howto for a highly available, load-balanced Piranha cluster using Red Hat’s cluster suite. Until then it was not obvious to me why anyone would use the Debian-style network load-balanced cluster in production, when the actual “heartbeat” package and service cause so much havoc on Red Hat machines. But my habit of doing classic things by hand kept me curious, and I found a flexible way of doing load-balanced clustering without running the heartbeat service at all. My reasons for being so against running it are numerous:
- Running heartbeat takes away the freedom to manage virtual IP addresses on the load balancer by hand.
- This in turn restricts expansion of the pools!
- The ldirectord daemon must be managed by heartbeat when it’s running.
- It wastes resources: after a mere restart of the heartbeat service it just sits there waiting and waiting...
- And above all, I don’t need a “second” load balancer for failover. All I need is one load balancer running ldirectord in a simple environment; it does most of heartbeat’s job anyway, dividing requests between the web servers and monitoring them.
Environment
Requirements: at least three systems, each with a minimum of one IP address (CentOS in my case). Packages ‘heartbeat’ and ‘heartbeat-ldirectord’ for load balancing, and ‘ipvsadm’ for the Linux IP Virtual Server. I know you’re wondering why ‘heartbeat’ is needed when we’re not actually going to run it. Indeed we’re not; it’s there only for dependency resolution, or rather as a service startup requirement (/etc/ha.d/shellfuncs is the file needed)! And I swear we won’t run it! Combined, these packages make up the Ultramonkey project, which describes the different topologies of a functional HA LB cluster, but that’s not our concern here (perhaps yours, if you think you’ve a bit of free time).
Virtual IP: 12.12.12.60
Load Balancer: 12.12.12.61 aka VM1.
Cluster Nodes/Real Servers:
Web Server1: 12.12.12.62 aka VM2
Web Server2: 12.12.12.63 aka VM3.
And we’ll be using the LVS-DR (direct routing) approach for clustering; it’s the most widely used and has fewer downsides.
Let’s start by configuring the web servers first.
Cluster Nodes Configurations
1. On both web servers, VM2 and VM3, Apache should be running and serving a common file (so that ldirectord has something to check).
# yum install httpd -y
# echo foo > /var/www/html/test.html
# service httpd start
# chkconfig httpd on
And to tell the two web servers apart during testing, create at least one unique file on each of them.
[root@VM2 ~]# echo "This is VM2" > /var/www/html/index.html
[root@VM3 ~]# echo "This is VM3" > /var/www/html/index.html
2. The virtual IP needs to be terminated on both web servers, so we’ll create a second network interface on each of them. Eventually all three servers will carry the same VIP, and that causes a problem with ARP, since ARP resolves IPs to MAC addresses. There are different solutions to this problem. Some suggest iptables or arptables_jf. Many recommend changing the default gateway route or hiding the network interface (by the way, don’t use iptables or change the default gateway for this; Red Hat discourages both methods as they cause a lot of overhead). But the most flexible approach I’ve found is:
a. create a loopback alias for the VIP, so that the address doesn’t talk to your network gateway/router directly.
b. instruct the Linux kernel (arp_announce = 2) to pick the best local address for the target as the source of its ARP requests, instead of whatever source address the outgoing packet carries, so the VIP is never announced.
c. instruct the Linux kernel (arp_ignore = 1) to answer ARP requests only when the requested IP is configured on the interface the request arrived on, so requests for the VIP arriving on eth0 go unanswered. Details here, if you’re really curious about it.
# vi /etc/sysconfig/network-scripts/ifcfg-lo:0
DEVICE=lo:0
IPADDR=12.12.12.60
NETMASK=255.255.255.255
ONBOOT=yes
NAME=loopback
#
# vi /etc/sysctl.conf
net.ipv4.conf.all.arp_ignore = 1
net.ipv4.conf.eth0.arp_ignore = 1
net.ipv4.conf.all.arp_announce = 2
net.ipv4.conf.eth0.arp_announce = 2
# sysctl -p
# ifup lo:0
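Since a single missing ARP line on one web server is enough to break the setup, it can help to check the file before moving on. The following is only a minimal sketch: the function name check_arp_settings is made up for illustration, and it assumes the interface is eth0 as in the listing above.

```shell
#!/bin/sh
# Hypothetical helper (not part of any package): report which of the four
# ARP settings used in this howto are missing from a sysctl.conf-style file.
# Prints one line per missing setting and returns non-zero if any is missing.
check_arp_settings() {
    conf_file=$1
    missing=0
    for want in \
        "net.ipv4.conf.all.arp_ignore=1" \
        "net.ipv4.conf.eth0.arp_ignore=1" \
        "net.ipv4.conf.all.arp_announce=2" \
        "net.ipv4.conf.eth0.arp_announce=2"
    do
        # Strip spaces and tabs so "key = value" and "key=value" both match.
        if ! tr -d ' \t' < "$conf_file" | grep -qxF "$want"; then
            echo "missing: $want"
            missing=1
        fi
    done
    return $missing
}
```

On each web server this would be called as check_arp_settings /etc/sysctl.conf; it only inspects the file, so sysctl -p is still needed to load the values.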
Load Balancer Configuration
We’ll be going through:
a. installing required packages
b. enabling IP forwarding,
# yum install heartbeat heartbeat-ldirectord ipvsadm -y
# chkconfig --add ldirectord
# chkconfig --del heartbeat
# sed -i 's/net.ipv4.ip_forward = 0/net.ipv4.ip_forward = 1/' /etc/sysctl.conf
# sysctl -p
c. configuring a secondary address on eth0 for the VIP, as it’s going to be exposed to the outside world or your local gateway, and
# vi /etc/sysconfig/network-scripts/ifcfg-eth0:0
DEVICE=eth0:0
BOOTPROTO=none
ONBOOT=yes
HWADDR=3a:5d:71:ad:67:47
NETMASK=255.255.255.0
IPADDR=12.12.12.60
GATEWAY=12.12.12.1
TYPE=Ethernet
d. then creating ldirectord.cf, the configuration file of our load balancer!
# vi /etc/ha.d/ldirectord.cf
checktimeout=10
checkinterval=2
autoreload=no
logfile="/var/log/ldirectord.log"
quiescent=no
virtual=12.12.12.60:80
        real=12.12.12.62:80 gate
        real=12.12.12.63:80 gate
        service=http
        request="test.html"
        receive="foo"
        scheduler=wrr
        protocol=tcp
        checktype=negotiate
# service ldirectord start
With ‘quiescent=no’, ldirectord removes a real server from the IPVS table when it gets no response for test.html from it within ten seconds, marking that real server as dead until it’s available again. Note the “gate” keyword on each ‘real’ line, which selects the LVS direct routing method. The other two methods are masq and ipip; their details, along with the other options available for this configuration file (particularly the scheduler parameter), can be found in ‘man ldirectord’.
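To debug a real server that keeps getting marked dead, the negotiate check can be imitated by hand. This is a rough sketch of the idea only, not ldirectord’s actual code: both function names are made up, and it assumes curl is installed on the machine you run it from.

```shell
#!/bin/sh
# Succeeds if the text on stdin contains the expected string,
# mirroring the request=/receive= pair in ldirectord.cf.
body_matches() {
    grep -qF -- "$1"
}

# Hypothetical wrapper: fetch http://<real-server>/<request> with a
# 10-second limit (like checktimeout=10) and compare the body.
check_real_server() {
    server=$1 request=$2 receive=$3
    if curl -s --max-time 10 "http://$server/$request" | body_matches "$receive"; then
        echo "$server alive"
    else
        echo "$server dead"
        return 1
    fi
}
```

For the setup above you would run check_real_server 12.12.12.62 test.html foo and the same for 12.12.12.63. If either reports dead, the fault is on the web server side, not in ldirectord.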
Testing
Use ‘ipvsadm’ to list the current state of the IPVS table that ldirectord maintains. Make sure that both real servers’ IPs are listed there with a non-zero weight (with this default setup, it should be 1). If not, check the ldirectord log file, run tcpdump on the load balancer, and look at the Apache logs on the real servers.
If everything works well, you’ll see the content change when browsing to http://12.12.12.60/ multiple times (from another system outside these cluster nodes). Then stop httpd on one web server and browse to the URL again; all requests should now be served by the other web server.
[root@VM1 ~]# ipvsadm -l
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
TCP  12.12.12.60:http wrr
  -> 12.12.12.63:http             Route   1      0          0
  -> 12.12.12.62:http             Route   1      0          0
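If you’d rather not eyeball the Weight column, the listing can be filtered for real servers with a zero weight. A small sketch, assuming the column layout shown above; the function name zero_weight_servers is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: read `ipvsadm -l` output on stdin and print every
# real server whose weight is 0. Returns non-zero if any such server exists.
zero_weight_servers() {
    # Real-server rows start with "->"; field 4 is the weight.
    # The numeric test skips the "-> RemoteAddress:Port ..." header row.
    awk '$1 == "->" && $4 ~ /^[0-9]+$/ && $4 == 0 { print $2; bad = 1 }
         END { exit bad + 0 }'
}
```

On the load balancer this would be run as ipvsadm -l | zero_weight_servers. It only catches servers listed with weight 0; a server missing from the table altogether still has to be spotted by comparing against ldirectord.cf.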
For more meaningful testing:
$ for i in $(seq 6); do curl http://12.12.12.60/index.html; done
This is VM3
This is VM2
This is VM3
This is VM2
This is VM3
This is VM2
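With more than a handful of requests it is easier to count the responses than to read them. A minimal sketch; tally_responses is a made-up name, and it relies only on sort, uniq and the unique index.html created on each web server earlier.

```shell
#!/bin/sh
# Count identical lines on stdin and list them, most frequent first.
tally_responses() {
    sort | uniq -c | sort -rn
}

# Usage, from a machine outside the cluster:
#   for i in $(seq 20); do curl -s http://12.12.12.60/index.html; done | tally_responses
```

With the wrr scheduler and equal weights, the two counts should come out roughly even; a server that never shows up points back at the ARP step on the web servers.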
I’ll be posting a couple of optimization techniques soon, when I get some more free time. Stay tuned and take care.
checkinterval=2
autoreload=no
logfile="/var/log/ldirectord.log"
quiescent=yes
virtual=12.12.12.60:80
real=12.12.12.62:80 gate
real=12.12.12.63:80 gate
service=http
request="index.html"
receive="hi"
scheduler=wlc
protocol=tcp
checktype=negotiate

January 8th, 2010 at 3:50 AM
This is a great job, ty for share.
January 8th, 2010 at 3:52 AM
Glad to help out!
February 20th, 2010 at 8:13 PM
I have done everything like you did but it always forwards to my second web server. When I run ipvsadm -l everything looks fine. I can access the first web server directly and it works great. But every time I use my virtual IP it just forwards to the second web server.
February 21st, 2010 at 3:16 AM
Grant, what’s the output of ipvsadm -l? Make sure that the test file test.html you’ve specified in ldirectord.cf is reachable on both web servers and contains the word “foo”.
I’ve found a typo in my post above in the step that enables IP forwarding. Edit /etc/sysctl.conf on the load balancer and make sure that the value of net.ipv4.ip_forward is 1 (not 0, as zero means disabled). Then use sysctl -p to reload the values. I would also double-check step #2 of the web server configuration whenever the load balancer forwards requests to only one web server.