Using NLB with ISA Server, Part 1: How Network Load Balancing Works
by Thomas W Shinder, M.D.
One of the least understood of all the Microsoft networking technologies is the Network Load Balancing (NLB) service included with Windows 2000 Advanced Server and Windows 2000 Datacenter Server (as well as the upcoming Windows 2003 Advanced and Datacenter servers). This is a real shame, because NLB isn’t that difficult to implement and can provide great added value to organizations that need increased uptime for network services. This is especially the case for mission-critical services such as Microsoft ISA Server firewall services that allow access to inbound VPN, published servers, and outbound access for internal clients.
Poor old NLB suffers from an old Microsoft problem – marginal documentation. Windows 2000 saw the quality of documentation improve by leaps and bounds, but unfortunately NLB got left out of this wave of improvement. That’s cool, because it gives guys like me something to do with my time. NLB provides the perfect opportunity to put the Microsoft .doc’s words to the old Shinder Rosetta stone.
NLB allows you to provide load balancing and fault tolerance for incoming connection requests. Load balancing is the process of dividing requests for services among multiple NLB array members. For example, you might want to evenly distribute inbound VPN requests among three ISA/VPN servers you have connected to the Internet. This distribution prevents any one VPN server from becoming overburdened with network traffic and encryption processing. Life is easy for the VPN clients because they can all connect to the same IP address and be automatically load balanced to the appropriate VPN server in the NLB array.
Fault tolerance is the second powerful NLB feature. You might not be interested so much in splitting the traffic among multiple servers as you are in high availability. You’re more interested in making sure the service is always available. For example, you have about 200 external users that must have VPN services available to them. These users do not tax your ISA/VPN server resources. You just need to make sure that VPN services are always available to these users. You can use NLB to make sure that another ISA/VPN automatically takes over if one of the ISA/VPN servers fails (at least until all members of the NLB cluster fail).
You can also configure NLB to provide both fault tolerance and load balancing. So in the case of our VPN servers, you can configure NLB to evenly distribute the requests among multiple servers and automatically fail over in the event that one of the servers in the NLB array becomes unavailable. This is the best of all possible worlds because the impact on each NLB array server is minimized and client connections are automatically redistributed when a server becomes unavailable.
The remainder of this article will go over some of the basics of NLB. My goal is that you’ll be able to dive into the Microsoft documentation on NLB and not have to spend two or three days trying to figure out what the heck they’re saying. You’ll be able to use this information to provide context for your further reading. In part 2 of this article, I’ll go over the details of unicast and multicast modes, and in part 3 of this three-part article, I’ll finish up with a description of how to configure the NLB interfaces.
How Network Load Balancing Works
Network Load Balancing works at the network layer (layer 3), the transport layer (layer 4) and the data link layer (layer 2). The NLB service uses layer 2 awareness to deliver packets to all servers in the NLB array and uses its layer 3 and 4 awareness to handle load balancing based on source and destination IP addresses and UDP/TCP port numbers. All frames are delivered to each host in the NLB array. The NLB driver sits on top of the network interface drivers and it makes decisions regarding which packets should be passed up the network stack and which packets should be dropped at layer two. Some people refer to this as "spraying" the frames onto each array member. The NLB service then decides which frames will "stick".
NLB only works for TCP, UDP, ICMP and GRE messages, and you can only configure customized load balancing rules for TCP and UDP communications. Other protocols will fail over but will not be load balanced via NLB port rules (we’ll talk more about port rules in parts 2 and 3 of this series).
All members of the NLB array use the same rules regarding how to assign packets to individual array members. All NLB array members must have the same set of rules. If they don’t, they go through a process called "convergence". Convergence allows the members of the NLB array to reconfigure the array properties so that incoming packets are distributed correctly. Convergence also takes place when members are removed from or added to the NLB array.
NLB arrays consist of between 2 and 32 servers (inclusive). All members of the array listen on one or more common IP addresses. These common IP addresses are referred to as the Virtual IP addresses or "VIPs". Array members can optionally have what are called "dedicated" IP addresses. The dedicated IP addresses are specific to each array member and are not duplicated within the array or subject to NLB port rules. Connection requests to a VIP will be subject to NLB’s load balancing and fault tolerance configuration. Connection requests to a dedicated IP address go to a specific server and are not shared by the servers in the array.
Look at the figure below. All the ISA Servers have the IP address 220.127.116.11 bound to the external interface. NLB allows you to do this; if NLB weren’t installed on each of the interfaces, you would receive a TCP/IP error because the ARP that takes place when each machine starts up would detect a duplicate IP address and prevent the duplicate addresses from being bound to the other servers. Each server in the array also has a dedicated IP address bound to the external interface. Notice that the dedicated IP addresses are different on each server. Hosts connecting to the dedicated IP address connect to a specific server. NLB will not load balance or fail over connections to the dedicated IP address.
The dedicated IP address is of vital importance to our ISA Servers, since it’s the IP address we need to use as the source address for outgoing packets.
In the figure below, an external client attempts to connect to a published Web server on the internal network by sending a request to http://18.104.22.168. Because this is an NLB virtual IP address, the NLB software will intercept the request and load balance the request based on any port rules that apply to it. If there is a port rule configured to load balance incoming TCP 80 connection requests, the NLB algorithm will assign the request to one of the hosts based on the rule configuration. If there is no port rule, the connection will be assigned to the NLB array member with the highest host priority (we’ll talk about host priority numbers in part 3 of this article).
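To make that decision a bit more concrete, here is a rough Python sketch of it. Everything in it is illustrative: the `PortRule` class and the `pick_handler` function are hypothetical names, CRC32 is only a stand-in for whatever hash NLB really uses, and the sketch assumes that the lowest host priority number is the highest priority. The client address comes from a documentation-only range.

```python
import zlib
from dataclasses import dataclass


@dataclass(frozen=True)
class PortRule:
    """Hypothetical stand-in for an NLB port rule covering a TCP or UDP port range."""
    protocol: str      # "tcp" or "udp"
    start_port: int
    end_port: int

    def matches(self, protocol: str, dest_port: int) -> bool:
        return protocol == self.protocol and self.start_port <= dest_port <= self.end_port


def pick_handler(host_priorities, rules, protocol, dest_port, client_ip, client_port):
    """Return the host priority number of the array member that takes the request."""
    hosts = sorted(host_priorities)
    for rule in rules:
        if rule.matches(protocol, dest_port):
            # A port rule applies, so the request is spread across the hosts.
            # CRC32 is an illustrative hash, not the one NLB uses.
            key = f"{client_ip}:{client_port}".encode()
            return hosts[zlib.crc32(key) % len(hosts)]
    # No port rule applies, so the default host takes the traffic.
    # Assumption: the lowest priority number means the highest priority.
    return hosts[0]


rules = [PortRule("tcp", 80, 80)]
print(pick_handler([1, 2, 3], rules, "tcp", 80, "192.0.2.10", 40000))   # one of 1, 2, 3
print(pick_handler([1, 2, 3], rules, "tcp", 443, "192.0.2.10", 40000))  # 1 (default host)
```

The point of the sketch is only the branch: traffic covered by a port rule is distributed, and everything else lands on a single default host.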
NLB Heartbeat Messages and Convergence
All members of an NLB array must have a consistent configuration. The port rules must be the same on each server and you must configure host IDs (also known as host priority numbers) so that there is no duplication. If there is any inconsistency in the NLB configuration among the array members, the array will enter a convergence state.
NLB array members assess the array configuration state by sending out heartbeat messages every second (this default interval can be changed in the Registry). If you do a packet trace, you’ll be able to see these heartbeat messages. If you use Network Monitor for your trace, you’ll see a bunch of unknown type frames with an EtherType value of 0x886F.
Convergence also takes place when a new member is added to the array. When a new member is added, the array configuration changes so that requests can be assigned to the new array member based on port rules and the host ID numbers of the new and current members. I’ll talk more about host ID numbers later in this article.
Existing TCP connections remain intact when a new member is added to an array because NLB can track the connection state of an ongoing session. UDP "sessions" may be broken when the new array member is added because UDP is a "stateless" protocol that doesn’t include flags indicating the current state of the connection (such as the TCP FIN flag indicating the end of a connection). This is why you’ll see L2TP/IPSec connections lost when a new array member comes online. The UDP 1701 L2TP control channel may be reassigned to another server, which breaks the L2TP/IPSec connection. Because of this and for other reasons, L2TP/IPSec VPN connections are not supported by the Windows 2000 NLB service.
You’ll also see convergence take place when an array member drops away from the NLB array. Once again, the TCP connections continue unaffected, but UDP connections may be remapped to another NLB array member. This is generally not a problem because UDP connections are not session oriented. Convergence will take place only after the downed array member misses 5 heartbeat messages (this is configurable in the Registry as well).
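Here is a small Python sketch of the bookkeeping this implies: remember when each member last sent a heartbeat, and treat a member as gone once five heartbeat intervals pass without one. The `HeartbeatMonitor` class and its methods are hypothetical names for illustration, and treating "misses 5 heartbeats" as "silent for five intervals" is my simplification, not the actual NLB driver logic.

```python
HEARTBEAT_INTERVAL = 1.0   # seconds between heartbeats (the default mentioned above)
MISSED_LIMIT = 5           # missed heartbeats before a member is considered down


class HeartbeatMonitor:
    """Hypothetical tracker of the last heartbeat seen from each array member."""

    def __init__(self, host_ids, now=0.0):
        self.last_seen = {host_id: now for host_id in host_ids}

    def record(self, host_id, now):
        """Note a heartbeat. Returns True if it came from a member we did not know."""
        joined = host_id not in self.last_seen
        self.last_seen[host_id] = now
        return joined          # True means membership changed, so converge

    def expire(self, now):
        """Remove members that have been silent too long and return their IDs."""
        deadline = HEARTBEAT_INTERVAL * MISSED_LIMIT
        down = [h for h, seen in self.last_seen.items() if now - seen >= deadline]
        for host_id in down:
            del self.last_seen[host_id]
        return down            # a non-empty list means membership changed, so converge


monitor = HeartbeatMonitor([1, 2, 3])
monitor.record(1, 4.0)
monitor.record(2, 4.0)
print(monitor.expire(5.0))     # [3], because host 3 has been silent for five intervals
print(monitor.record(4, 6.0))  # True, because host 4 is a new member
```

Both events, a member timing out and a new member appearing, are the triggers for convergence described in this section.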
The heartbeat frame looks like what you see below in an unmodified Network Monitor trace. Note that you can use tools included in the Windows 2000 Resource Kit to fully decode NLB heartbeat messages. The protocol is listed as unknown because there is no parser for heartbeat messages in the stock Network Monitor. (Note that the array was configured in unicast mode at this time.)
29 8.572326 0201AC100001 *BROADCAST ETHERNET ETYPE = 0x886F : Protocol = Unknown
Frame: Base frame properties
Frame: Time of capture = 1/28/2003 13:36:56.756
Frame: Time delta from previous physical frame: 701008 microseconds
Frame: Frame number: 29
Frame: Total frame length: 1510 bytes
Frame: Capture frame length: 1510 bytes
Frame: Frame data: Number of data bytes remaining = 1510 (0x05E6)
ETHERNET: ETYPE = 0x886F : Protocol = Unknown
ETHERNET: Destination address : FFFFFFFFFFFF
ETHERNET: .......1 = Group address
ETHERNET: ......1. = Locally administered address
ETHERNET: Source address : 0201AC100001 (other address is 0202AC10001)
ETHERNET: .......0 = No routing information present
ETHERNET: ......1. = Locally administered address
ETHERNET: Frame Length : 1510 (0x05E6)
ETHERNET: Ethernet Type : 0x886F
ETHERNET: Ethernet Data: Number of data bytes remaining = 1496 (0x05D8)
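If you want to pick heartbeat frames out of a raw capture yourself, the Ethernet header is all you need. The following Python sketch reads the standard 14-byte Ethernet II header and checks for the 0x886F EtherType shown in the trace. The function names are my own, and the sample frame is rebuilt from the addresses in the trace with an all-zero payload as a placeholder; it does not reproduce real heartbeat contents.

```python
import struct

NLB_HEARTBEAT_ETHERTYPE = 0x886F  # EtherType value shown in the trace above


def parse_ethernet_header(frame: bytes):
    """Split the 14-byte Ethernet II header into destination, source and EtherType."""
    if len(frame) < 14:
        raise ValueError("frame too short for an Ethernet header")
    dest, src, ethertype = struct.unpack("!6s6sH", frame[:14])
    return dest.hex().upper(), src.hex().upper(), ethertype


def is_nlb_heartbeat(frame: bytes) -> bool:
    """True when the frame carries the NLB heartbeat EtherType."""
    return parse_ethernet_header(frame)[2] == NLB_HEARTBEAT_ETHERTYPE


# Header fields taken from the trace; the 1496 zero bytes are placeholder data.
sample = bytes.fromhex("FFFFFFFFFFFF" "0201AC100001" "886F") + bytes(1496)
print(len(sample))                    # 1510
print(parse_ethernet_header(sample))  # ('FFFFFFFFFFFF', '0201AC100001', 34927)
print(is_nlb_heartbeat(sample))       # True
```

Note that 34927 is simply 0x886F in decimal.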
Load Balancing Algorithm
The NLB load balancing algorithm makes assignments to NLB array members based on the number of incoming packets per unit time. NLB does not assess the CPU or memory load on any of the NLB array members when it makes decisions on how to load balance incoming connections. You need to keep this in mind when assigning load proportions to members in the array, because the NLB algorithm does not take these factors into account. Load proportions are configured on a per port rule basis and we’ll cover the details later in this article.
The basic NLB algorithm uses the source IP address and source port information on an incoming request and performs a hash on the result. The resultant hash value is associated with a host ID number. Each member of the array is assigned a host ID number as well. The packet is forwarded to the host with the host ID number that is the same as the host ID number assigned to the packet’s hash value. However, port rules and affinity settings can adjust how the algorithm assigns incoming packets and determine the "stickiness" of the association between a client and an NLB array member.
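The idea is easier to see in a few lines of Python. In this sketch every host runs the same function on every frame it receives and keeps only the frames that hash to its own host ID. The function names are hypothetical, the client address is from a documentation-only range, and CRC32 stands in for NLB’s real hash, which I don’t describe here.

```python
import zlib


def owner_host_id(client_ip: str, client_port: int, host_ids) -> int:
    """Map a client's source IP address and source port to one host ID in the array."""
    key = f"{client_ip}:{client_port}".encode()
    ordered = sorted(host_ids)
    # CRC32 is an illustrative stand-in for the hash NLB actually uses.
    return ordered[zlib.crc32(key) % len(ordered)]


def should_accept(my_host_id, client_ip, client_port, host_ids) -> bool:
    """Each host calls this for every frame it sees and keeps only its own."""
    return owner_host_id(client_ip, client_port, host_ids) == my_host_id


hosts = [1, 2, 3]
accepting = [h for h in hosts if should_accept(h, "192.0.2.10", 40000, hosts)]
print(accepting)  # a list holding exactly one host ID
```

Because every member computes the same answer from the same inputs, no member has to ask another who owns a packet. That only works while all members agree on the list of host IDs, which is what convergence is for.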
Affinity defines the relationship between the source host and the destination array member. There are three types of affinity: None, Single and Class C.
With Affinity set to None, the incoming packet can be sent to any host in the cluster using the default NLB algorithm described above. Both the source port number and the IP address of the external client request are used to determine which member of the NLB array receives the packets. Different array members can handle requests from the same client IP address as long as the source port number in the request is different. Setting the Affinity to None is useful in non-firewall related NLB clusters that have been Web Published. When you use Web Publishing Rules, the source IP address is always the same, so load balancing takes place based on the source port number of the incoming request to the published NLB array. We would not use this option when configuring ISA Server to use NLB but would use this type of affinity if we published an NLB Web server farm on the internal network.
With Single Affinity, only the IP address is used to determine how client requests are load balanced among cluster members. The same source IP address is always sent to the same server in the cluster. This allows all connections from the same IP address on the external network to connect to the same server in the NLB array. This is especially important in cases where you need to maintain session state between a client and a specific array member, such as when the client makes a VPN connection. It wouldn’t work very well to use None Affinity with PPTP VPN connections, since you need both the GRE and TCP streams assigned to the same server. If they were assigned to different servers in the array, the VPN link could never be established. The same situation exists with SSL connections. The SSL session needs to be "linked" to a single ISA Server, since multiple connection requests are sent through the established SSL tunnel. Like the VPN server example, it wouldn’t work to create an SSL link with one ISA Server and have connection requests load balanced among all servers in the array.
Class C Affinity is similar to Single Affinity, except that instead of using the entire IP address, it only uses the W, X and Y octets of the IP address. This allows any IP address with the same high order 24 bits to connect to the same cluster member each time a connection request is created. This is useful when the clients are located behind a proxy array that contains several proxy servers with external IP addresses in the same class C network ID. For example, IP addresses 22.214.171.124 and 126.96.36.199 would be assigned to the same NLB array member because the first 24 bits of the IP addresses are the same.
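One way to picture the three affinity settings is as three different choices of what gets fed into the hash. The Python sketch below shows only that choice. The `affinity_key` function and the setting names "none", "single" and "class_c" are my own labels for illustration, and the client addresses are from a documentation-only range.

```python
import ipaddress


def affinity_key(client_ip: str, client_port: int, affinity: str):
    """Return the value that would be hashed under each affinity setting."""
    ip = ipaddress.IPv4Address(client_ip)
    if affinity == "none":
        return (str(ip), client_port)   # source IP address and source port
    if affinity == "single":
        return (str(ip),)               # source IP address only
    if affinity == "class_c":
        # Keep only the high order 24 bits (the W, X and Y octets).
        network = ipaddress.ip_network(f"{ip}/24", strict=False)
        return (str(network.network_address),)
    raise ValueError(f"unknown affinity: {affinity}")


print(affinity_key("192.0.2.10", 40000, "none"))      # ('192.0.2.10', 40000)
print(affinity_key("192.0.2.10", 40000, "single"))    # ('192.0.2.10',)
print(affinity_key("192.0.2.200", 41000, "class_c"))  # ('192.0.2.0',)
```

Two requests that produce the same key always land on the same array member, so the key tells you how "sticky" each setting is: per connection for None, per client address for Single, and per 24-bit network for Class C.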
In this article I went over some basic concepts on what the Windows 2000 Network Load Balancing service does and how it works. You should now have a better idea of how incoming connections are assigned to members of the array and what happens when an array member is added or removed. With the knowledge gained in this article, you’ll be well prepared for part 2, where I’ll discuss the mind-numbing topic of multicast versus unicast mode. In part 3, I’ll discuss the NLB configuration parameters. After we finish the introduction to NLB, I’ll post articles on using NLB for inbound VPN access, using NLB with Server Publishing Rules, using NLB with Web Publishing Rules, and using NLB for outbound access for SecureNAT, Firewall and Web Proxy clients, and finally an article on how to use NLB together with CARP to create a powerful fault-tolerant, load-balanced and very high-performance caching array. I’ll cover all this information in my Dallas ISA Server seminars and provide cool demos.
I hope you enjoyed this article and found something in it that you can apply to your own network. If you have any questions on anything I discussed in this article, head on over to http://forums.isaserver.org/ultimatebb.cgi?ubb=get_topic;f=2;t=007686 and post a message. I’ll be informed of your post and will answer your questions ASAP. Thanks! –Tom