HA LB Cluster on CentOS5 – Without actual heartbeat :P

Last month I wrote a howto for a highly available, load-balanced Piranha cluster using Red Hat’s cluster suite. Until then it was not obvious to me why anyone would use the Debian-style network load-balanced cluster in production, when the actual “heartbeat” package and service create so much havoc on Red Hat machines. But my habit of doing classic things by hand kept me curious, and I found a flexible way to do load-balanced clustering without needing the heartbeat service at all. My reasons for avoiding it are numerous:
- Running heartbeat takes away the freedom to manage virtual IP addresses on the load balancer by hand.
- That in turn restricts expansion of the pools!
- The ldirectord daemon must be managed by heartbeat whenever heartbeat is running.
- It wastes resources; even a plain restart of the heartbeat service just sits there waiting and waiting…
- And above all, I don’t need a “second” load balancer for failover. All I need is one load balancer running ldirectord in a simple environment; it does most of heartbeat’s job anyway, acting as both the divider and the monitor that distributes requests between the web servers.

Environment

Requirements: at least three systems, each with at least one IP address (CentOS in my case), and the packages ‘heartbeat’ and ‘heartbeat-ldirector’ for load balancing plus ‘ipvsadm’ for the Linux IP Virtual Server. I know you’re wondering why ‘heartbeat’ is on the list when we’re not going to run it. It is there only to resolve a dependency, or rather a service startup requirement (/etc/ha.d/shellfuncs is the file needed)! And I swear we won’t run it ;) Combined, these packages make up the Ultramonkey project, which describes the different topologies of a functional HA LB cluster, but that’s not our concern here :D (perhaps yours, if you have a bit of free time).

Virtual IP: 12.12.12.60
Load Balancer: 12.12.12.61 aka VM1.
Cluster Nodes/Real Servers:
Web Server1: 12.12.12.62 aka VM2
Web Server2: 12.12.12.63 aka VM3.

We’ll be using the LVS-DR (direct routing) approach for clustering; it’s the most widely used and has fewer downsides.
Let’s start by configuring the web servers first.

Cluster Node Configuration

1. On both web servers, VM2 and VM3, Apache should be running and serving a common file (so that ldirectord has something to check).

# yum install httpd -y
# echo foo > /var/www/html/test.html
# service httpd start
# chkconfig httpd on

To tell the two web servers apart during load testing, create at least one unique file on each of them.

[root@VM2 ~]# echo "This is VM2" > /var/www/html/index.html
[root@VM3 ~]# echo "This is VM3" > /var/www/html/index.html

2. The virtual IP needs to terminate on both web servers, so we’ll create a second network interface on each of them. Eventually all three servers will carry the same VIP on an interface, and that causes a problem with ARP, since ARP resolves IP addresses to MAC addresses. There are different solutions to this problem. Some suggest using iptables or arptables_jf. Many recommend changing the default gateway route or hiding the network interface (by the way, don’t use iptables or change the default gateway for this; Red Hat discourages both methods because they cause a lot of overhead). The most flexible approach I’ve found is:

a. create a loopback alias for the VIP, so it doesn’t communicate with your network gateway/router directly.
b. instruct the Linux kernel to always use the best local address for the target as the source address in ARP announcements (arp_announce = 2), instead of whatever source address the outgoing packet happens to carry.
c. instruct the Linux kernel to answer an ARP request only if the target IP address is configured on the interface the request arrived on (arp_ignore = 1). The details are in the kernel’s ip-sysctl documentation, if you’re really curious about it.

# vi /etc/sysconfig/network-scripts/ifcfg-lo:0
DEVICE=lo:0
IPADDR=12.12.12.60
NETMASK=255.255.255.255
ONBOOT=yes
NAME=loopback
#
# vi /etc/sysctl.conf
net.ipv4.conf.all.arp_ignore = 1
net.ipv4.conf.eth0.arp_ignore = 1
net.ipv4.conf.all.arp_announce = 2
net.ipv4.conf.eth0.arp_announce = 2
# sysctl -p
# ifup lo:0
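
As a quick sanity check of step 2, here is a minimal sketch that reads a sysctl.conf-style file and reports any of the four ARP keys that don’t carry the values used above. The function name check_arp_settings is my own invention, not part of any package, and it only parses the file; it does not look at the running kernel.

```shell
# Hypothetical helper (not part of any package): verify that a
# sysctl.conf-style file sets the ARP keys used in this howto.
# Usage: check_arp_settings /etc/sysctl.conf [interface]
check_arp_settings() {
    conf=$1
    iface=${2:-eth0}
    rc=0
    for pair in "all.arp_ignore=1" "$iface.arp_ignore=1" \
                "all.arp_announce=2" "$iface.arp_announce=2"; do
        key="net.ipv4.conf.${pair%%=*}"
        want=${pair#*=}
        # Take the last uncommented assignment of the key, since later
        # lines override earlier ones when the file is applied in order.
        got=$(awk -F= -v k="$key" '
            /^[ \t]*#/ { next }
            { name = $1; gsub(/[ \t]/, "", name) }
            name == k { val = $2; gsub(/[ \t]/, "", val) }
            END { print val }' "$conf")
        if [ "$got" != "$want" ]; then
            echo "MISMATCH $key: want $want, got '${got:-unset}'"
            rc=1
        fi
    done
    return $rc
}

# Example (on VM2 or VM3):
#   check_arp_settings /etc/sysctl.conf eth0 && echo "ARP settings look right"
```

It prints nothing and returns 0 when all four keys match, so it is easy to drop into a larger check script.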

Load Balancer Configuration

We’ll be going through:

a. installing required packages
b. enabling IP forwarding,

# yum install heartbeat heartbeat-ldirector ipvsadm -y
# chkconfig --add ldirectord
# chkconfig --del heartbeat
# sed -i 's/net.ipv4.ip_forward = 0/net.ipv4.ip_forward = 1/' /etc/sysctl.conf

# sysctl -p

c. configuring an eth0 alias for the VIP, as it is going to be exposed to the outside world or your local gateway, and

# vi /etc/sysconfig/network-scripts/ifcfg-eth0:0
DEVICE=eth0:0
BOOTPROTO=none
ONBOOT=yes
HWADDR=3a:5d:71:ad:67:47
NETMASK=255.255.255.0
IPADDR=12.12.12.60
GATEWAY=12.12.12.1
TYPE=Ethernet

d. finally, creating ldirectord.cf, the configuration file of our load balancer!

# vi /etc/ha.d/ldirectord.cf
checktimeout=10
checkinterval=2
autoreload=no
logfile="/var/log/ldirectord.log"
quiescent=no
virtual=12.12.12.60:80
 real=12.12.12.62:80 gate
 real=12.12.12.63:80 gate
 service=http
 request="test.html"
 receive="foo"
 scheduler=wrr
 protocol=tcp
 checktype=negotiate

# service ldirectord start
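
A side note on step b: the sed edit there depends on the exact spelling and spacing of the existing line in /etc/sysctl.conf, and it silently does nothing if the line is missing. Below is a slightly more defensive sketch; the function name set_sysctl_key is hypothetical, and it assumes plain "key = value" lines like the ones CentOS ships.

```shell
# Hypothetical helper: set KEY = VALUE in a sysctl.conf-style file.
# The first uncommented assignment of KEY is rewritten, any later
# duplicates are dropped, and the line is appended if KEY is absent.
# Usage: set_sysctl_key /etc/sysctl.conf net.ipv4.ip_forward 1
set_sysctl_key() {
    file=$1 key=$2 value=$3
    tmp=$(mktemp) || return 1
    awk -v k="$key" -v v="$value" '
        {
            line = $0
            name = line
            sub(/=.*/, "", name)        # keep only the part before "="
            gsub(/[ \t]/, "", name)     # ignore surrounding whitespace
            if (name == k) {
                if (!written) { print k " = " v; written = 1 }
                next
            }
            print line
        }
        END { if (!written) print k " = " v }
    ' "$file" > "$tmp" && cat "$tmp" > "$file"
    rc=$?
    rm -f "$tmp"
    return $rc
}

# Example (as root on the load balancer), followed by a reload:
#   set_sysctl_key /etc/sysctl.conf net.ipv4.ip_forward 1 && sysctl -p
```

Writing back with cat instead of mv keeps the original file’s owner and permissions, which seemed the safer choice for a system file.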

With ‘quiescent=no’, ldirectord removes from the ipvs table any real server that fails its check, i.e. one that doesn’t answer the request for test.html with content matching “foo” within ten seconds (checktimeout); that real server is treated as dead until it becomes available again. (With ‘quiescent=yes’ the server would instead stay in the table with its weight set to 0.) Note the “gate” switch in each ‘real’ line, which selects the LVS direct routing method. The other two methods are masq and ipip; their details, along with the other options available in this configuration file, particularly the scheduler parameters, can be found in ‘man ldirectord’.
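
To see by hand what the negotiate check is looking for, the sketch below fetches the request file from a real server with curl and searches the body for the receive string. This is only a rough imitation under my own assumptions; ldirectord performs the check internally, and the function names here are hypothetical.

```shell
# body_matches PATTERN
# Succeeds if the text on stdin contains a match for PATTERN.
body_matches() {
    grep -q -- "$1"
}

# check_real_server HOST PORT REQUEST RECEIVE [TIMEOUT]
# Fetches http://HOST:PORT/REQUEST and looks for RECEIVE in the body.
# A failed or timed-out fetch yields an empty body, so the check fails.
check_real_server() {
    host=$1 port=$2 request=$3 receive=$4 timeout=${5:-10}
    curl -s -f --max-time "$timeout" "http://$host:$port/$request" \
        | body_matches "$receive"
}

# Example with the values from ldirectord.cf above:
#   for rs in 12.12.12.62 12.12.12.63; do
#       if check_real_server "$rs" 80 test.html foo 10; then
#           echo "$rs up"
#       else
#           echo "$rs down"
#       fi
#   done
```

If this reports a server as down while Apache is running, the test file or its content is the first thing I would look at.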

Testing

Use ‘ipvsadm’ to list the current state of the virtual server table. Make sure that both real servers’ IPs are listed there and have a non-zero weight (with this default setup it should be 1). If not, check the ldirectord log file, run tcpdump on the load balancer and look at the Apache logs on the real servers.
If everything works, you’ll see the content change when browsing to http://12.12.12.60/ multiple times (from another system outside these cluster nodes). Then stop httpd on one web server and browse to the URL again; all requests should now be served by the other web server.

[root@VM1 ~]# ipvsadm -l
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
-> RemoteAddress:Port           Forward Weight ActiveConn InActConn
TCP  12.12.12.60:http wrr
-> 12.12.12.63:http             Route   1      0          0
-> 12.12.12.62:http             Route   1      0          0
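
If you’d rather not eyeball the weights, here is a small sketch that reads ipvsadm output on stdin and flags real servers with a weight of 0. The function name zero_weight_servers is hypothetical, and it assumes the column layout shown in the listing above (weight in the fourth column of each "->" line).

```shell
# Hypothetical helper: read `ipvsadm -l` (or `ipvsadm -ln`) output on
# stdin and print every real server whose weight is 0. Returns non-zero
# if any such server is found or if no real server is listed at all.
zero_weight_servers() {
    awk '
        $1 == "->" && $2 != "RemoteAddress:Port" {
            seen++
            if ($4 == 0) { print $2; bad++ }
        }
        END { if (seen == 0 || bad > 0) exit 1 }
    '
}

# Example (as root on the load balancer):
#   ipvsadm -ln | zero_weight_servers && echo "all real servers have weight"
```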

For a more meaningful test:

$ for i in $(seq 6); do curl http://12.12.12.60/index.html; done
 This is VM3
 This is VM2
 This is VM3
 This is VM2
 This is VM3
 This is VM2
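
With more than a handful of requests it is easier to count the answers than to read them. A minimal sketch, assuming each response is a single line as in the index.html files created earlier; the function name tally_responses is my own.

```shell
# Hypothetical helper: count identical response lines read from stdin
# and list them with the most frequent first.
tally_responses() {
    sort | uniq -c | sort -rn
}

# Example against the VIP (run from a client outside the cluster):
#   for i in $(seq 100); do
#       curl -s http://12.12.12.60/index.html
#   done | tally_responses
```

With both real servers at weight 1, the two counts should come out roughly equal.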

I’ll post a couple of optimization techniques soon, when I get some more free time. Stay tuned and take care :D

checktimeout=10
checkinterval=2
autoreload=no
logfile="/var/log/ldirectord.log"
quiescent=yes
virtual=12.12.12.60:80
 real=12.12.12.62:80 gate
 real=12.12.12.63:80 gate
 service=http
 request="index.html"
 receive="hi"
 scheduler=wlc
 protocol=tcp
 checktype=negotiate

4 Responses to “HA LB Cluster on CentOS5 – Without actual heartbeat :P”

  • JaR Says:

    This is a great job, ty for share.

  • Abbas Says:

    Glad to help out!

  • Grant Says:

    I have done everything like you did but it always forwards to my second web server. When I run ipvsadm -l everything looks fine. I can access the first web server directly and it works great. But every time I use my virtual IP it just forwards to the second web server.

  • Abbas Says:

    Grant, what’s the output of ipvsadm -l? Make sure that the test file test.html you’ve specified in ldirectord.cf is reachable on both web servers and contains the word “foo”.
    I’ve found a typo in my post above, in the step that enables IP forwarding. Edit /etc/sysctl.conf on the load balancer and make sure that the value of net.ipv4.ip_forward is 1 (not 0, as zero means disabled). Then use sysctl -p to reload the values. I would also double-check step 2 of the web server configuration if the load balancer forwards requests to only one web server.
