Load Balancing

Here are my notes on routing (distributing) work and ensuring high availability of servers.



  • Why Load Balance?
  • Troubles
  • Approaches
  • Mechanisms
  • Network LB
  • MS NLB Config
  • Scalr Clouds
  • Appliances
  • More Resources
  • Your comments???

    Why - Who Needs Load Balancing?

      Server load balancing scales "horizontally" to accommodate higher-volume work by distributing (balancing) client requests among several servers. This allows more servers (hosts) to be added to handle additional load.

      Even if a single server can handle current load, it is still a good idea to design and test your system for scalability.

      Having "fault resilience" means that an alternative server can take over when, inevitably, a server crashes or just needs maintenance work done.

      Having your applications on more than one machine allows operations personnel to work on one machine while the other is busy working. This is called ensuring uninterrupted, continuous availability of mission-critical applications.

      It's also a good idea to test your system for its ability to handle growth while it's still fresh in developers' minds, since load balancing and fail-over issues can be quite complex.



    The Trouble with Load Balancing

      A load balancer may not evenly distribute load among machines in the cluster. This can happen for several reasons:

      • Session stickiness. Once the IP of a client gets assigned to a particular server (at login), it stays with that server until the session ends. This is the case with SAP's SAPGUI for R/3, for example.

        Since each session imposes a different level of load, load balancers which allocate based on frequency of allocation (such as round-robin) may over-allocate sessions to a server which ended up with users and sessions that consume a higher-than-average amount of resources.
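To make the over-allocation concrete, here is a minimal Python sketch (the server names and per-session costs are invented): round-robin hands out sessions strictly by turn, so one server can end up with every heavy session.

```python
import itertools

# Hypothetical per-session load: a server's burden depends on what each
# user does after login, not on how many sessions it was handed.
session_costs = [1, 9, 1, 9, 1, 9]  # alternating light and heavy sessions

servers = {"A": 0, "B": 0}
# Round-robin: assign sessions by turn, ignoring actual load.
for cost, name in zip(session_costs, itertools.cycle(servers)):
    servers[name] += cost

print(servers)  # {'A': 3, 'B': 27} -- B got every heavy session
```

A frequency-based allocator sees three sessions per server and considers the cluster balanced, even though one server carries nine times the load.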



    Approaches for High Server Availability

      There are several basic types of load balancing:

      1. Round robin DNS (RRDNS) has DNS servers resolve a URL to multiple IP addresses (and thus multiple machines).

        The problem with "round robin" is that it blindly hits each server regardless of its ability to accept work. This scheme requires every server to be homogeneous (nearly identical).

        RRDNS is not recommended because it places load balancing outside the organization, with authoritative DNS servers which cache DNS A entries. Changes to those entries can take a long time to propagate throughout the internet.
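A small simulation of the rotation behavior (the A records are placeholder TEST-NET addresses): each resolution blindly hands out the next address in the list, whether or not that server is up or able to accept work.

```python
import itertools

# Hypothetical A records an authoritative server returns for one hostname.
a_records = ["192.0.2.10", "192.0.2.11", "192.0.2.12"]

# Round-robin DNS: each resolution hands out the next address in the list,
# regardless of whether that server is up or overloaded.
rotation = itertools.cycle(a_records)
answers = [next(rotation) for _ in range(5)]
print(answers)
```

In reality the rotation happens inside DNS servers, and cached answers mean a dead server keeps receiving traffic until the record's TTL expires everywhere.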

      2. Network-based load balancers offer more sophisticated allocation algorithms. Network Load Balancing operates at the NIC driver level to detect the failure of a server and reassigns client computer traffic among the remaining servers.

        Microsoft's NLB service can take up to eight seconds to redirect load. How To Set Up TCP/IP for Network Load Balancing in Windows Server 2003 notes that

        TCP/IP must be the only network protocol present on the cluster adapter. You must not add any other protocols (for example, Internetwork Packet Exchange [IPX]) to this adapter.

        Because traffic enters the network, network load balancing cannot withstand moderate DoS (Denial of Service) attacks as well as large-capacity front-end network appliances can, since those detect and filter out malicious traffic before it gets to the server.

      3. Dispatcher or proxy/switch-based load balancers resolve requests from clients directed to a single virtual IP address (VIP). The load balancer then uses a load balancing algorithm to distribute work to real IP addresses.

        Such "server-based" load balancing is done either by an application running on a regular server or an appliance.

        • Using regular PC machines (on standard operating systems such as Windows or UNIX) has the advantage that it can be replaced (after configuration) with another machine that is familiar to the staff. There is less hassle from dealing with another vendor.
        • Vendors of this approach include Resonate, Rainfinity, and Stonebeat.

        • Using dedicated appliance hardware has the advantage of speed, since they use specialized Application Specific Integrated Circuit (ASIC) chips. They "have internal-bandwidth backbones capable of handling a Gbps worth of traffic."[2,p31]

          Vendors of this approach include F5 and Radware.

          "Firewall load balancers tend to max out at around 70 to 80 Mbps"[2,p9,60]

          BIG-IP from F5 also compresses (for HTTP 1.1), encrypts SSL network traffic, and centralizes the SSL key store onto a single machine.

        However, an individual central dispatcher can become a single point of failure. So:

          A hot spare is a machine that's configured as a mirror image of the machine it replaces if that machine fails. It's also called a passive node if it sits unused until it's needed to support a failover.

          Both the active and standby load balancers periodically send out heartbeat messages, such as VRRP (Virtual Router Redundancy Protocol, RFC 2338) packets sent to a multicast IP address.

          Cisco has its HSRP (Hot Standby Routing Protocol) and Extreme Networks has its ESRP (Extreme Standby Router Protocol).

          This communication can occur over a serial cable between two machines.

          Multicast can span several subnets, but heartbeats must reach their final destination within a network latency of no more than 200 to 300 milliseconds.

          The passive node takes over (assumes "master" status) when it does not hear the "I'm alive" heartbeat from the other machine.

          Because failover can cause a VIP to be the destination address at different devices at different times, a VIP can be called "floating".

          Failback occurs when the failed server comes back online and takes load back from the failover node. This happens after a transfer of client state information.
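The heartbeat-timeout takeover described above can be sketched in Python as follows (the 3-second timeout is an assumed value; real products use their own intervals and protocols):

```python
import time

HEARTBEAT_TIMEOUT = 3.0  # seconds of silence before promotion (assumed value)

class StandbyNode:
    """Passive node that assumes "master" status when heartbeats stop."""

    def __init__(self):
        self.last_heartbeat = time.monotonic()
        self.is_master = False

    def on_heartbeat(self):
        # Called whenever an "I'm alive" message arrives from the active node.
        self.last_heartbeat = time.monotonic()

    def check(self, now=None):
        # Promote to master (take over the floating VIP) on prolonged silence.
        now = time.monotonic() if now is None else now
        if now - self.last_heartbeat > HEARTBEAT_TIMEOUT:
            self.is_master = True
        return self.is_master

node = StandbyNode()
node.on_heartbeat()
print(node.check())                           # fresh heartbeat: stays passive
print(node.check(now=time.monotonic() + 10))  # prolonged silence: promotes itself
```

Note the promotion is one-way here; failback, as the text notes, additionally requires transferring client state back to the recovered node.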

          To avoid "bridging loops" when each load balancer thinks the other is inactive, the OSI Layer 2 Spanning-Tree Protocol (STP) is used to set a cost for each port. This provides a way to block multiple paths by opening only the lowest-cost port (the highest-priority port). However, "STP is almost never used since it can take 10 seconds or more to react. Typically, a proprietary variation of a hot-standby protocol is used."[2,p59]

        Caution! Proxies forward both client requests to servers and server responses to clients, so unless a proxy can handle as much throughput as all the machines in its subnet combined, it can easily become a throughput bottleneck.

      Note: Although not necessary for production work, "Individual Pass-Through VIPs" are requested/defined so that, for performance measurement and troubleshooting, individual servers can be reached through the load balancer.



    Mechanisms

      There are several distinguishing features among load balancers.

    1. "NAT-based" IP firewall.

      "Flat-based" IP typology means the VIP and real IPs are on the same subnet (usually behind a firewall). FTP and streaming applications use this typology along with Return Path DSR.

      Many load balancers also have firewall capabilities (packet filtering, stateful inspection, intrusion detection, etc.)

      More advanced load balancers enable the VIP to be on a different (more private/secure) subnet than IPs on servers receiving traffic. In this case, the load balancer acts as a gateway between the two LANs.

      This is achieved with NAT (Network Address Translation, RFC 1631), which enables real servers to hide behind non-routed RFC 1918 addresses.

      But this only allows one pair of load balancer machines.

    2. Load allocation algorithms within daemons (or Windows services), which dynamically:

        1) listen for a "heartbeat"; or
        2) periodically send ICMP ECHO_REQUEST "pings" to measure RTT (Round Trip Time); or
        3) detect the status of bandwidth usage or CPU utilization on each server; or
        4) count the number of connections active on each server.
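As an illustration of approach 4 (connection counting), a minimal sketch with invented server names and counts: each new request goes to the server currently holding the fewest active connections.

```python
# Hypothetical active-connection counts per server.
active = {"web1": 12, "web2": 4, "web3": 9}

def pick_server(connections):
    """Least-connections allocation: choose the least-busy server."""
    return min(connections, key=connections.get)

target = pick_server(active)
active[target] += 1  # the chosen server now holds one more connection
print(target)        # web2, since it had the fewest connections
```

The same skeleton works for approaches 2 and 3 by substituting measured RTT or CPU utilization for the connection count as the key to minimize.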

      BEA WebLogic 8.1 offers server affinity, which reduces the number of RMI sockets.

      SAP R/3's login balancing (SMLG) clustering technology makes the decision of which server receives a user for login based on the number of users currently logged in and response time, (not how many work processes are used at any moment).

      Once a SAP user is logged on a particular app server, the user session is pinned to that server. Background jobs are also assigned to a server, but a new job might get assigned to another server.

    3. Failover: automatic repartitioning of client traffic among the remaining servers in the cluster so that users experience continuous Quality of Service (QoS).

      Within the BEA WebLogic 8.1 HttpClusterServlet Plug-in, sessions are attached to a particular server by the client's IP address. This means that a server must stay up until all sessions finish (which can take many minutes).

      A failover mechanism must have an awareness of the progress of all activities so that client sessions can continue to complete if processing stops.

      IP Multicasts are sent by each WebLogic server to broadcast their status. Each Weblogic server listens for the one-to-many messages to update its own JNDI tree.

      Cisco's LoadDirector uses a "Cookie Sticky" feature that redirects traffic to the same physical machine by examining cookies.
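Cookie-based stickiness can be sketched like this (a generic hash stand-in, not Cisco's actual algorithm; the server IPs are invented): the load balancer derives the target machine from the cookie value, so repeat requests land on the same host.

```python
import hashlib

servers = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]  # hypothetical real server IPs

def route_by_cookie(session_cookie: str) -> str:
    """Map a session cookie deterministically to one server."""
    digest = hashlib.sha256(session_cookie.encode()).digest()
    return servers[digest[0] % len(servers)]

# The same cookie always lands on the same physical machine,
# even across separate TCP connections.
first = route_by_cookie("JSESSIONID=abc123")
second = route_by_cookie("JSESSIONID=abc123")
print(first == second)
```

Unlike IP-based stickiness, this keeps working when many clients share one source IP behind a proxy or NAT.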

      "Virtualization" software (such as EMC VMware) automatically installs application software to meet demand.

    4. Global load balancing that routes requests among several data centers over the WAN (Wide Area Network). "With cross-country latency at around 60 ms or more" [2,p10]

    5. Intelligent response to denial of service flooding attacks.

    6. Direct Server Return of traffic back to clients

      The least sophisticated load balancers use "Route-path", where the load balancer acts as a Layer 3 router. But this allows 3 or more load balancers.

      More sophisticated load balancers use "Bridge-path", where the load balancer acts as a Layer 2 bridge between two LANs. However, this only works with flat-based nets.

      Most sophisticated of all is when real servers (especially streaming and FTP servers) bypass going back to the load balancer and send responses directly to clients by using Direct Server Return (DSR). This is desirable because "web traffic has a ratio of about 1:8, which is one packet out for every eight packets in."[2,p26]

      F5 calls this "nPath". Foundry calls it "SwitchBack".

      DSR works by configuring the IP alias on the server's loopback interface "lo" with the IP address of the VIP. This is done using the ifconfig command on Unix.

      The server still needs to bind to the real IP address as well, so the load balancer can perform health checks on the server.

      The default route path of the server needs to point to the router rather than the load balancer.
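Putting the three DSR steps together, a hypothetical Unix configuration might look like this (the VIP and router addresses are placeholders; substitute your own):

```shell
# Assumed VIP: 192.0.2.100. Alias it onto the loopback interface so the
# server accepts packets addressed to the VIP without advertising it via ARP:
ifconfig lo:0 192.0.2.100 netmask 255.255.255.255 up

# The real IP stays bound on the regular interface (e.g., eth0) so the
# load balancer's health checks still reach the server.

# Point the default route at the router (assumed 192.0.2.1), not the
# load balancer, so responses go directly back toward clients:
route add default gw 192.0.2.1
```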

      Some implementations achieve DSR using MAC Address Translation (MAT), where the load balancer forwards frames to the chosen real server by rewriting only the destination MAC address.

    7. "Sticky" sessions




    Load Balancing in the Cloud

      The internal processes (life-cycle) of a "redundant, self-curing, and self-scaling" computing facility are

      1. Authorizing those who need to perform various activities with rights for their role (who can do what when)
      2. Imaging OS and application software into server snapshot image templates used to create working instances
      3. Provisioning (condensing) instances (with unique URLs and OS-level settings)
      4. Configuring each instance (with unique application-level settings) [Puppet]
      5. Persisting data in and out of each instance (with or without encryption)
      6. Storing logs emanating from instances in a centralized location
      7. Monitoring health and other metrics by instance type
      8. Deciding when alerts are appropriate, when an instance should be registered or un-registered with its load balancer, when instances should be added or removed
      9. Switching among instances from a static IP address [Scalr]
      10. De-Provisioning (evaporating) instances
      11. Analyzing logs for trends (cost per transaction over time) and implications, to adjust the number and size of instances most appropriate for each instance type/purpose and time frame [capacity management, like vkernel]

    • Scalr (a $50/month open-source offering from Intridea, a Washington DC Ruby on Rails development firm) illustrates its approach for a polling controller this way:

      [Scalr event block diagram]


    MS NLB Services

      In 1998, Microsoft acquired the "Convoy Cluster" component of Windows NT 4.0 Enterprise Edition from Valence Research, then introduced it as the Network Load Balancing (NLB) service, a free add-on to the Advanced and Datacenter versions of Windows 2000 Server (known as the Enterprise Edition of Windows 2003).

      The acronym for Windows Load Balancing Service (WLBS) is also the name of the utility which verifies whether load-balanced hosts "converge":

        wlbs query
        wlbs stop
        wlbs start

      Unlike appliances which sit in front of servers in the cluster,
      MS-NLB works inside a server as a driver that intercepts traffic from the NIC card.

      NLB is "fully distributed" in that it is on all host servers serving a web site on a single subnet.

      Like other load balancers, clients make requests using a single primary virtual IP address (VIP) for the cluster.

      However, all incoming IP packets are received by the cluster adapter of each host, because NLB assigns the listening cluster adapter on each host the same MAC address, derived from the VIP.

      One of the hosts on the public subnet sends a MAC-layer (Layer 2) multicast to hosts on the local broadcast subnet of the cluster through another adapter. These are received by a separate dedicated IP adapter on each host.

      The NLB service runs on every host as an intermediate driver between the TCP/IP (HTTP & FTP) protocol and network adapter drivers within the Windows 2000 protocol stack.

      The service ignores unwanted packets and doesn't reroute packets to individual hosts.

      Because potentially any host on the cluster can respond to a client, a Web browser may receive various images within a single Web page from different hosts on a load-balanced cluster.

      The host that responds to a request is the host that knows it's the default host. This is determined by the host priority number, unique to every host. That number is in the range of 1 to 32 because that's the NLB product's limit on hosts per cluster.

      Each NLB server emits a heartbeat message to other hosts in the cluster, and listens for the heartbeat of other hosts.

      A host knows it's the default host if it doesn't hear another host with a higher priority. This allows any host to be taken offline (for preventive maintenance, upgrading, etc.) without disturbing cluster operations.
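The default-host selection described above can be sketched as picking the best (lowest) priority number among hosts still heard from (host names and priorities are invented; in NLB, priority 1 is the highest):

```python
# Unique host priority numbers, in the range 1 to 32.
priorities = {"hostA": 1, "hostB": 2, "hostC": 3}

# Hosts whose heartbeats are currently being heard.
alive = {"hostB", "hostC"}  # hostA is offline for maintenance

# A host is the default host when no alive host outranks it.
default_host = min(alive, key=priorities.get)
print(default_host)  # hostB takes over while hostA is offline
```

When hostA's heartbeat reappears, recomputing over the updated `alive` set hands default status back to it, which is the convergence behavior `wlbs query` reports.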

      Each host responds to requests through its own unique dedicated IP address so that other hosts know the source of the request.

      Traffic arriving through a cluster adapter is redirected to the dedicated IP address of a specific host. The dedicated IP address is always entered first on the list of IPs so that outgoing connections from the cluster host are sourced with this IP address instead of a virtual IP address.

      The load-balancing server uses its algorithm to choose the best available server for the client. To allow session state to be maintained in host memory, NLB directs all TCP connections from one client IP address to the same cluster host. This behavior is controlled by the client affinity parameter.
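A rough sketch of that client affinity (NLB's actual hashing is internal to the product; this stand-in just shows the property that one client IP always maps to the same cluster host):

```python
# Hypothetical cluster host names; NLB supports up to 32 hosts.
HOSTS = ["host1", "host2", "host3"]

def affinity_host(client_ip: str) -> str:
    """Map a client IP deterministically to one cluster host.

    A simple stand-in hash -- NLB's real algorithm is proprietary.
    """
    octets = [int(o) for o in client_ip.split(".")]
    return HOSTS[sum(octets) % len(HOSTS)]

a = affinity_host("203.0.113.7")
b = affinity_host("203.0.113.7")  # same client, later TCP connection
print(a == b)  # same host both times, so in-memory session state is found
```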

      Changing load percentages allows hosts with higher capacity to receive a larger fraction of the total client load. NLB doesn't take advantage of Win2K's performance information for load balancing.

      MS-NLB doesn't support delayed binding. Each server provides failover for every other server, thus the software provides load-balancing redundancy in an active-and-active implementation.


      MS NLB Configuration

      Netlibrary eBook $35 Windows 2000 & Windows Server 2003 Clustering & Load Balancing (McGraw-Hill Osborne: April 9, 2003) by Robert J. Shimonski

      $35 Windows 2000 Clustering and Load Balancing Handbook (Prentice Hall PTR: January 15, 2002) by Joseph M. Lamb

      V. Cardellini, M. Colajanni, and P. S. Yu, Dynamic Load Balancing on Web-server Systems, IEEE Internet Computing, 3(3):28--39, May/June 1999.

      NLB capabilities come built-in with high-end Windows servers.

      Use either the NIC properties page or the "Administrative Tools\Network Load Balancing Manager" tool.

        A. After right-clicking "My Network Places", select "Properties".

        1. In the "Networking Connections" dialog:

          1. Rename "Local Area Connection" to your name for the private interface for NLB heartbeat messages.
          2. Rename "Local Area Connection2" to your name for the public interface.

            Select the "Network Load Balancing" check box.

            In the "Network Load Balancing Properties" dialog box, click the "Cluster Parameters" tab.

            Enter the public IP address in the Cluster IP configuration area. (This should be the same for all nodes in the cluster.)

            Enter the Fully Qualified Domain Name (FQDN) and network address handling incoming client requests.

        2. To verify whether the hosts "converge":

            wlbs query

        B. To use the "Administrative Tools\Network Load Balanacing Manager" wizard:

        1. Add a new cluster. The cluster's IP is the one routing outside traffic (10.38.24.x); enter your mask and the DNS name. Select multicast. Click Next.
        2. Add additional NLB IPs (you can always do this later). Click Next.
        3. Set up port rules. Click Next.
        4. Enter the IP for the first host. Usually, this is the machine you are currently logged into. This can be either the "outside" or "inside/heartbeat" IP; I just stick with the heartbeat IP. Select the heartbeat IP from the list. Click Next.
        5. Select "1" as the priority. Click Finish.
        6. Add a host to the cluster. Enter the info for the second host. Click Finish.
        7. Wait a few minutes while the second host goes from "misconfigured" to "converged" status.
        8. Click the "refresh" button or run the command: nlb query

      Click the "Host Parameters" tab to supply the dedicated IP address and subnet mask for the private interface.

        On the first NLB node you’re configuring, leave the Priority (unique host identifier) value at the default of "1".

        On subsequent nodes, set the priority value to the next highest unused number.

      To stop an individual host:

        wlbs stop

      To start a host:

        wlbs start


    SAP ITS Load Balancing

      To activate SAP's integrated ITS service with load balancing on ABAP WAS 6.40:

      1. Set up the ICF Service Server Group.
      2. Invoke tcode SICF.
      3. Select the ICF Service.
      4. Under Service Option enter the Server Group.
      5. Maintain Load Balancer settings using transaction ICM.


    BEA WebLogic HttpClusterServlet Proxy Plug-in



    Load Balancing Hardware Appliances


    Load Balancing Products

      "Application traffic management" providers:

    • F5's (NASDAQ) BIG-IP is the most focused on enterprise data centers.
    • Kemp Technologies' $2,500 Load Balancer 1500 3-port product has 100 Mbps throughput and 100 TPS SSL, with ISO Layer 7 application content switching. Their $9,000 product has 800 Mbps throughput and 1000 TPS SSL.

    • Coyote Point's $10,000 "Equalizer Extreme" load balancers run within 1U form-factor Dell PowerEdge 1750 servers, which have Intel Hyper-Threading technology.

    • XRIO (in the UK) offers Neteyes® Cyclone routers that combine multiple connections and VPN tunnels.

    • Radware's routers prioritize traffic and control bandwidth usage, protect your network from malicious attack signatures, denial of service, and intrusions, and provide end-to-end monitoring of applications.

      The Apache Web server can be set up to intelligently redirect clients to a secondary server by altering the Web server configuration files so that the mod_rewrite module load balances client requests, resolving logical host names to physical hosts.
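One documented way to do this uses a `rnd` (random) RewriteMap together with mod_proxy; the file paths and back-end host names below are assumptions for illustration:

```apache
# httpd.conf -- requires mod_rewrite and mod_proxy to be loaded.
RewriteEngine On

# Map the logical name "backends" to one physical host, chosen at random
# per request from the map file:
RewriteMap lb rnd:/etc/httpd/backends.map

# Proxy every request to whichever back end the map selects:
RewriteRule ^/(.*)$ http://${lb:backends}/$1 [P,L]
```

The map file lists the candidate physical hosts separated by `|`:

```apache
# /etc/httpd/backends.map
backends web1.example.com|web2.example.com|web3.example.com
```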

    • Nortel

    • Foundry Networks ServerIron

    • Cisco CSS Series (formerly ArrowPoint)

    • 3Com Superstack 3 Server Load Balancer

    • PolyServe Database Utility enables SQL Server consolidation by virtualization of SQL Server machines.

    • Zeus ZXTM LB

      Reminder: Load balance testing requires "IP address spoofing" across an allocated range of IP addresses.

      Scripts conducting load balance tests also need to vary the value of cookies.
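A small sketch of generating varied virtual-client identities for such a test script (the IP range shown is an assumption — use the range actually allocated to you):

```python
import random

random.seed(7)  # reproducible test data

def make_virtual_client(i: int) -> dict:
    """Build one virtual client with a distinct spoofed IP and cookie."""
    return {
        "source_ip": f"10.1.2.{i % 254 + 1}",                 # spoofed source address
        "cookie": f"SESSIONID={random.getrandbits(32):08x}",  # varied cookie value
    }

clients = [make_virtual_client(i) for i in range(3)]
print(len({c["source_ip"] for c in clients}))  # distinct IPs exercise the balancer
```

Without distinct IPs and cookies, a sticky load balancer pins the whole test to one server and the results measure a single machine, not the cluster.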

    • Linux Virtual Server (LVS)




    The rest ©Copyright 1996-2011 Wilson Mar. All rights reserved.