Last month I wrote a howto for a highly available, load-balanced Piranha cluster using Red Hat’s cluster suite. Until then it was not at all obvious to me why anyone would use the Debian-style network load-balanced cluster in production when the actual “heartbeat” package and service create so much havoc on Red Hat machines. But my habit of doing classic things by hand kept me asking questions, and I found a flexible way of doing load-balanced clustering without the heartbeat service at all. My reasons for being so much against it are numerous:
- Running heartbeat takes away the freedom to manage virtual IP addresses on the load balancer by hand.
- That in turn restricts expansion of the pools!
- The ldirectord daemon must be managed by heartbeat whenever heartbeat is running.
- It wastes resources: even a plain restart of the heartbeat service just sits there waiting and waiting…
- And above all, I don’t need a “second” load balancer for failover. All I need is one load balancer running ldirectord in a simple environment; it does most of heartbeat’s job anyway, acting as a divider and a monitor that distributes requests between the web servers.
I – Environment
Requirements: at least three systems, each with a minimum of one IP (CentOS in my case). Packages ‘heartbeat’ and ‘heartbeat-ldirectord’ for load balancing, and ‘ipvsadm’ for Linux IP Virtual Server. I know you’re wondering why ‘heartbeat’ is on the list when we’re not actually going to run it. Indeed, we’re not going to run it; it’s there only for dependency resolution, or rather as a service startup requirement (/etc/ha.d/shellfuncs is the file needed). And I swear we won’t run it! Combined, these packages make up the Ultramonkey project, which describes the different topologies of a functional HA LB cluster, but that’s not our concern here (perhaps yours, if you think you’ve a bit of free time).
Virtual IP: 10.10.10.60
Load Balancer: 10.10.10.61 aka VM1.
Cluster Nodes/Real Servers:
Web Server1: 10.10.10.62 aka VM2
Web Server2: 10.10.10.63 aka VM3.
And we’ll be using the LVS-DR (direct routing) approach for clustering; it’s the most widely used and has fewer downsides.
Let’s start by configuring the web servers first.
II – Cluster Node Configuration
1. On both web servers, VM2 and VM3, Apache should be running and serving a common file (so that ldirectord has something to check).
# yum install httpd -y
# echo foo > /var/www/html/test.html
# service httpd start
# chkconfig httpd on
And to tell the two web servers apart during load testing, create at least one unique file on each of them.
[root@VM2 ~]# echo "This is VM2" > /var/www/html/index.html
[root@VM3 ~]# echo "This is VM3" > /var/www/html/index.html
2. The virtual IP needs to be terminated on both web servers, so we’ll create a second network interface on each of them. Since all three servers end up carrying the same VIP, this causes a problem with ARP, which resolves IPs to MACs: more than one machine would answer for the same address. There are different solutions to this problem. Some suggest iptables or arptables_jf. Many recommend changing the default gateway route or hiding the network interface (by the way, don’t use iptables or change the default gateway for this; Red Hat discourages both methods as they cause a lot of overhead). But the most flexible approach I’ve found is:
a. create a loopback alias interface for the VIP, so it doesn’t communicate with your network gateway/router directly.
b. instruct the Linux kernel (arp_announce = 2) to always pick the best local address for the target as the source of its ARP requests, instead of taking it from the source address of the outgoing packet.
c. instruct the Linux kernel (arp_ignore = 1) to answer ARP requests only when the target IP is configured on the interface the request arrived on, so requests for the VIP arriving on eth0 go unanswered by the web servers. Details here, if you’re really curious about it.
# vi /etc/sysconfig/network-scripts/ifcfg-lo:0
DEVICE=lo:0
IPADDR=10.10.10.60
NETMASK=255.255.255.255
ONBOOT=yes
NAME=loopback

# vi /etc/sysctl.conf
net.ipv4.conf.all.arp_ignore = 1
net.ipv4.conf.eth0.arp_ignore = 1
net.ipv4.conf.all.arp_announce = 2
net.ipv4.conf.eth0.arp_announce = 2

# sysctl -p
# ifup lo:0
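Since a missing ARP setting tends to show up later as "only one web server gets traffic", it may be worth verifying the file before moving on. The following is a minimal sketch, not part of the original setup: the function name check_arp_sysctl is made up for illustration, and it assumes the interface is eth0 as in the configuration above.

```shell
# Hypothetical helper (not from the original post): report which of the four
# ARP settings used on the real servers are missing from a sysctl-style file.
check_arp_sysctl() (
    file="$1"
    missing=0
    for want in \
        "net.ipv4.conf.all.arp_ignore=1" \
        "net.ipv4.conf.eth0.arp_ignore=1" \
        "net.ipv4.conf.all.arp_announce=2" \
        "net.ipv4.conf.eth0.arp_announce=2"
    do
        # Drop spaces and tabs so "key = value" and "key=value" both match.
        if ! tr -d ' \t' < "$file" | grep -qxF "$want"; then
            echo "missing: $want"
            missing=1
        fi
    done
    # Succeed only when every setting was found.
    [ "$missing" -eq 0 ]
)
```

It could be run on each web server as: check_arp_sysctl /etc/sysctl.conf && echo "ARP settings present". It only inspects the file; sysctl -p is still needed to apply the values.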
III – Load Balancer Configuration
We’ll be going through:
a. installing required packages
b. enabling IP forwarding,
# yum install heartbeat heartbeat-ldirectord ipvsadm -y
# chkconfig --add ldirectord
# chkconfig --del heartbeat
# sed -i 's/net.ipv4.ip_forward = 0/net.ipv4.ip_forward = 1/' /etc/sysctl.conf
# grep 'ip_forward' /etc/sysctl.conf
(just make sure that the value of net.ipv4.ip_forward is 1; if not, edit it manually)
# sysctl -p
c. configuring a secondary address on eth0 (eth0:0) for the VIP, as it’s going to be exposed to the outside world or your local gateway, and
# vi /etc/sysconfig/network-scripts/ifcfg-eth0:0
DEVICE=eth0:0
BOOTPROTO=none
ONBOOT=yes
HWADDR=3a:5d:71:ad:67:47
NETMASK=255.255.255.0
IPADDR=10.10.10.60
GATEWAY=12.12.12.1
TYPE=Ethernet
d. then creating ldirectord.cf, the configuration file of our load balancer.
# vi /etc/ha.d/ldirectord.cf
checktimeout=10
checkinterval=2
autoreload=no
logfile="/var/log/ldirectord.log"
quiescent=no
virtual=10.10.10.60:80
        real=10.10.10.62:80 gate
        real=10.10.10.63:80 gate
        service=http
        request="test.html"
        receive="foo"
        scheduler=wrr
        protocol=tcp
        checktype=negotiate
# service ldirectord start
Note: Mind the indentation in front of every option after the ‘virtual’ line (at least four spaces or one tab). Yes, this is how ldirectord reads it. Each virtual IP’s pool needs this indentation for its configuration to be recognized.
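Lost indentation is easy to miss after copying and pasting the file. As a rough sanity check, here is a sketch that lists ‘real=’ lines that are not indented enough; the function name find_unindented_reals is hypothetical, and the "four spaces or one tab" threshold is taken from the reader feedback in the comments rather than verified against the ldirectord source.

```shell
# Hypothetical helper: print (with line numbers) every "real=" line that
# starts in column 1-4, i.e. is indented by fewer than four spaces.
# Lines indented with a tab or with four or more spaces are not reported.
find_unindented_reals() {
    grep -nE '^ {0,3}real[[:space:]]*=' "$1"
}
```

For example, find_unindented_reals /etc/ha.d/ldirectord.cf prints nothing and returns non-zero when every ‘real=’ line is indented. It only looks at ‘real=’ lines, since several other options can also appear unindented as global settings.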
With ‘quiescent=no’, ldirectord removes a real server from the IPVS table when it gets no valid response to its query for test.html within ten seconds (checktimeout), marking that real server as dead until it’s available again. Note the “gate” switch in each ‘real’ line, which selects the LVS Direct Routing method. The other two methods are masq and ipip; their details, along with the other options available for this configuration file (particularly the scheduler parameters), can be found in ‘man ldirectord’.
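To see by hand roughly what the negotiate check looks at, the request/receive pair can be imitated with curl. This is only an approximation under my own assumptions, not ldirectord’s actual code: the function names body_matches and negotiate_check are hypothetical, and the receive string is treated as a plain substring here.

```shell
# Succeeds when the text on stdin contains the expected string.
body_matches() {
    grep -qF -- "$1"
}

# Rough stand-in for the negotiate check: fetch the request page from one
# real server and look for the receive string, giving up after a timeout.
# Arguments: server (host:port), request page, receive string, timeout (s).
negotiate_check() (
    server="$1"
    request="$2"
    receive="$3"
    timeout="${4:-10}"
    curl -sf --max-time "$timeout" "http://${server}/${request}" | body_matches "$receive"
)
```

With the values from the configuration above, negotiate_check 10.10.10.62:80 test.html foo should succeed while httpd is up on VM2 and fail once it is stopped.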
IV – Testing
Use ‘ipvsadm’ to list the current state of the virtual server table. Make sure that both real servers’ IPs are listed there with a non-zero weight (with this default setup, it should be 1). If not, check the ldirectord log file, run tcpdump on the load balancer and look at the Apache logs on the real servers.
If everything works well, you’ll see the content change when browsing to http://10.10.10.60/ multiple times (from another system outside these cluster nodes). Then stop httpd on one web server and browse to the URL again; all requests should now be served by the other web server.
[root@VM1 ~]# ipvsadm -l
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
TCP  10.10.10.60:http wrr
  -> 10.10.10.63:http             Route   1      0          0
  -> 10.10.10.62:http             Route   1      0          0
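The weight check can also be scripted. A small sketch, assuming the column layout shown in the listing above (the function name zero_weight_reals is made up for illustration):

```shell
# Hypothetical helper: read ipvsadm listing output on stdin and print the
# address of every real server whose Weight column is 0.
# Real-server rows start with "->"; the header row is skipped because its
# fourth field is the word "Weight", not a number.
zero_weight_reals() {
    awk '$1 == "->" && $4 == "0" { print $2 }'
}
```

Used as ipvsadm -ln | zero_weight_reals on the load balancer; no output means every listed real server has a non-zero weight. It cannot detect a server that is missing from the table entirely.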
For a more meaningful test:
$ for i in $(seq 6); do curl http://10.10.10.60/index.html; done
This is VM3
This is VM2
This is VM3
This is VM2
This is VM3
This is VM2
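With more than a handful of requests, counting the answers by eye gets tedious. One possible way to tally them, as a sketch (tally_backends is a hypothetical name, and it assumes each web server returns a single distinctive line as set up earlier):

```shell
# Hypothetical helper: count identical lines on stdin and print them as
# "<count> <line>", one entry per distinct response.
tally_backends() {
    sort | uniq -c | awk '{ count = $1; $1 = ""; sub(/^ /, ""); print count, $0 }'
}
```

For instance: for i in $(seq 20); do curl -s http://10.10.10.60/index.html; done | tally_backends. With both web servers healthy and equal weights, the two counts should come out roughly even.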
I’ll be posting a couple of optimization techniques soon, when I get some more free time. Stay tuned and take care.



#1 by JaR on January 8, 2010 - 3:50 AM
This is a great job, ty for share.
#2 by Abbas on January 8, 2010 - 3:52 AM
Glad to help out!
#3 by Grant on February 20, 2010 - 8:13 PM
I have done everything like you did but it always forwards to my second web server. When I run ipvsadm -l everything looks fine. I can access the first web server directly and it works great. But every time I use my virtual IP it just forwards to the second web server.
#4 by Abbas on February 21, 2010 - 3:16 AM
Grant, what’s the output of ipvsadm -l? Make sure that the test file test.html you’ve specified in ldirectord.cf is reachable on both web servers and contains the word “foo”.
I’ve found a typo in my post above when enabling IP forwarding. Edit /etc/sysctl.conf on the load balancer and make sure that the value of net.ipv4.ip_forward is 1 (not 0, as zero means disabled). Then use sysctl -p to reload the values. I would also double-check step #2 of configuring the web servers if I found the load balancer forwarding requests to only one web server.
#5 by Mir on May 7, 2010 - 1:28 PM
Sorry if this is a stupid question, but how do you configure the web servers (VM2 and VM3) to contain the same data?
#6 by Abbas on May 8, 2010 - 5:13 AM
You can use ‘scp’ to copy the data between the virtual host directories on both real servers (VM2 and VM3). And if the content being served is dynamic enough to change rapidly, you can set up a cron job to ‘rsync’ data between the two web servers regularly or as soon as it changes.
And by the way, no question is stupid.
#7 by Tleej on May 10, 2010 - 11:15 PM
Hi,
Thanks for sharing this. I did the same for a DNS server, and get the same output as you do, but the ldirector host does not listen on port 53 whatever I do, same thing. I tried with http, same thing…
Any idea ?
Thanks
#8 by Abbas on May 10, 2010 - 11:23 PM
Hi Taleej,
That’s likely because the virtual IP hasn’t been terminated properly on real (web or dns) servers. You may double check step#2c of configuring real servers. If it still behaves the same, try checking and posting your ldirectord, apache and dns log files.
#9 by Noveck Gowandan on August 5, 2010 - 6:29 AM
Much thanks. Got it to work.
Step 2b – heartbeat-ldirector was not installing via yum. I simply installed heartbeat*, to determine the package name is heartbeat-ldirectord
Step 2b again – the sed was not working properly, i just used nano or vi to edit the actual sysctl.conf file to change the ipforwarding to 1.
Under the Loadbalancer config, part d.
the sections following virtual config was not loading when I attempted to restart the ldirectord.
Upon investigation, I realized all those subsettings need a minimum of 4 spaces or 1 tab.
Are there any tweaks or tuning tips for the loadbalancer to scale towards a site catering to 100,000 users?
Again, much thanks for this tutorial, it helped tremendously.
#10 by Abbas on August 25, 2010 - 10:19 AM
Not an early reply, sorry. I am not usually this late in replying, but the antispam plugin I am using flagged the comment, so I just took it out of there. Guess what: the first false positive ever on the blog.
Step 2b: yes, the heartbeat package itself is an installation dependency of heartbeat-ldirectord, so it’s definitely needed, although the heartbeat daemon is never used in this setup.
As for enabling IP forwarding, I think the sed command didn’t work properly because I had a typing error in there: I wrote 0 instead of 1 and 1 instead of 0. Or it could be that the spacing format of the default /etc/sysctl.conf has changed. I certainly doubt the former.
That’s another burning fact. Indeed, the ldirectord daemon relies on tabs to tell the real server configurations apart when it reads them.
At this point, the only tweak that would definitely help you distribute the load properly is choosing a proper ‘scheduler’ method in ldirectord.cf. Since you’ve come this far, I would encourage you to read this http://kb.linuxvirtualserver.org/wiki/Category:Job_Scheduling_Algorithms. I have been using this LVS-based LB clustering approach in a production environment; being an easy three-system setup, it is by far more stable than Red Hat’s Piranha (by the way, I am partial here, as I don’t like RHCS except for its GFS part, and I’ve never found it error-prone myself). So I really am fond of this setup’s features, and provided that you’ve enough resources and you’ve set it up accurately, it can serve twice as many users as the number you mentioned. Of course, the other part of it may include optimizing Apache/MySQL, depending on the type of applications your website runs.
Thanks for bringing these errors to my attention. I will add them to the post, and hopefully that will save people from running into the errors others have had!
#11 by Jan on September 2, 2010 - 2:59 AM
Enabling routing on the lb might just be the wrong thing to do. Therefore I think the given sed line is correct: It DISables routing. Reading the LVS documentation and various posts I found out the hard way (took me a couple of hours and a headaching protocol analysis) that routing is actually done by lvs itself.
I use DVR on my environment, so that http requests are directly answered by the http servers and not rerouted thru the lb. This works OK with or without routing enabled. But when it comes to HTTPS (or more generally SSL-based connections) these protocols will not work when routing is enabled due to duplicate arp acks send by the lb and the https servers.
Check http://www.austintek.com/LVS/LVS-HOWTO/HOWTO/LVS-HOWTO.arp_problem.html
Are you actually doing a loadbalancing with https?
#12 by Abbas on September 4, 2010 - 9:31 AM
@Jan,
This particular approach, with VIPs terminated on loopback interfaces, does need IP forwarding enabled to handle incoming requests properly. This is indeed called ‘routing’, but it should be kept in mind that it’s only ‘IP masquerading’ or, in other words, a form of NATing. However, the packet handling done by the LVS-DR method is different and shouldn’t be confused with the term ‘routing’. I would recommend that you check vendors’ sites directly, and not third-party posts, for good references. The LVS-DR page of linuxvirtualserver.org and RHEL’s LVS documentation suggest turning on IP forwarding for all sorts of LVS-DR topologies.
http://www.linuxvirtualserver.org/VS-DRouting.html
http://www.centos.org/docs/5/html/5.1/pdf/Virtual_Server_Administration.pdf
To confirm: no, I am not using this method for any HTTPS purposes. Indeed, HTTPS is a big pain in the field of clustering, not only in Linux’s LVS but also in Windows NLB. That is simply because HTTPS handles requests’ caches and cookies separately, and hence the applications on the cluster nodes you’re using may need special adjustments or synchronization to get them to work!