I learned something new yesterday. It kind of flipped me out, but now it almost makes sense.
You can try this to confirm.
- From a client, ping the IP address of your NLB cluster.
- From the same client, run arp -a from the command prompt.
You should see something like this (I will assume 192.168.2.11 for the NLB cluster IP address):
  Internet Address      Physical Address      Type
  192.168.2.11          02-bf-c0-a8-02-0b     dynamic
It will list other addresses and their MACs as well, but we are only interested in the NLB address. 02-bf-c0-a8-02-0b breaks down into nice little components like so:
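- The first byte is the type of NLB configuration: 01=IGMP multicast, 02=Unicast, 03=Multicast.
- The second byte, bf, is of unknown origin, but it is the same for unicast and multicast clusters. (IGMP multicast is the exception: its cluster MAC takes the form 01-00-5e-7f followed by the last two bytes of the cluster IP.)
- The next four bytes are the cluster IP address in hex, i.e. c0=192, a8=168, 02=2, 0b=11, and thus the IP of 192.168.2.11.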
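To make the breakdown concrete, here is a minimal Python sketch that builds the cluster MAC from the IP and decodes it back again. The function names nlb_cluster_mac and ip_from_nlb_mac are my own invention for illustration, not part of any NLB tooling, and I have left IGMP mode out and only handled the two bf-style modes.

```python
import ipaddress

# First byte of the cluster MAC for the two bf-style modes.
MODE_PREFIX = {"unicast": 0x02, "multicast": 0x03}

def nlb_cluster_mac(cluster_ip: str, mode: str = "unicast") -> str:
    """Build the cluster MAC: mode byte, bf, then the four IP bytes."""
    octets = ipaddress.IPv4Address(cluster_ip).packed
    parts = [MODE_PREFIX[mode], 0xBF, *octets]
    return "-".join(f"{b:02x}" for b in parts)

def ip_from_nlb_mac(mac: str) -> str:
    """Recover the cluster IP from the last four bytes of the MAC."""
    parts = [int(p, 16) for p in mac.split("-")]
    return str(ipaddress.IPv4Address(bytes(parts[2:6])))

print(nlb_cluster_mac("192.168.2.11"))       # 02-bf-c0-a8-02-0b
print(ip_from_nlb_mac("02-bf-c0-a8-02-0b"))  # 192.168.2.11
```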
OK, I already knew all of this. It is the following that was new to me.
It is the second byte, bf, that is interesting to me. I can't find anything that tells me why bf is used, but it is always what a client gets back when it ARPs for the NLB IP address. Why I find it interesting is that, in unicast mode, it is not used at all as the source address when the NLB nodes send gratuitous ARPs or return traffic. When an NLB node sends traffic, it spoofs the source MAC as above, except that it replaces bf with its priority number. For example, if the NLB cluster node were configured with three as its (unique) priority number, it would identify itself to the switch as MAC address 02-03-c0-a8-02-0b. This allows the switch to happily enter that MAC address in its table and keep a one-to-one mapping of MAC addresses to ports.
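As a quick sketch of that substitution, the snippet below builds the source MAC each node would present to the switch. The helper name nlb_source_mac is hypothetical, and the 1 to 32 range check reflects the host priority range NLB allows.

```python
import ipaddress

def nlb_source_mac(cluster_ip: str, priority: int) -> str:
    """Cluster MAC with bf swapped for the node's priority number."""
    if not 1 <= priority <= 32:
        raise ValueError("NLB host priority must be between 1 and 32")
    octets = ipaddress.IPv4Address(cluster_ip).packed
    return "-".join(f"{b:02x}" for b in (0x02, priority, *octets))

# Three nodes in the same cluster each present a different source MAC.
for priority in (1, 2, 3):
    print(priority, nlb_source_mac("192.168.2.11", priority))
```

The node with priority three comes out as 02-03-c0-a8-02-0b, matching the example above.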
So, when a client tries to connect to the IP address of the NLB cluster and ARPs for it, the MAC address it gets back is the bf one. When the client then sends frames to that MAC, the switch can't find any port that maps to it, because no node ever sends with it as a source, and thus floods the frame out all ports. Using the priority number as the source stops the switch from ever learning the actual MAC address of the NLB cluster, while still giving it a bit of sanity/reality so that it is happy.
So, to summarize: each client connecting to the NLB cluster uses the bf MAC address as the destination, which causes the switches to flood all ports with the traffic. Each NLB node sends data using its priority number instead of bf, to stop the switch from learning the bf MAC address and trying to map it to a single port.
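If you want to see the effect without a lab, here is a toy learning switch in Python. It is only a sketch of generic MAC learning, not of any real switch: the LearningSwitch class, the port numbers, and the client MAC 00-11-22-33-44-55 are all made up for illustration.

```python
class LearningSwitch:
    """Toy switch: learns source MACs, floods unknown destinations."""

    def __init__(self, ports):
        self.ports = set(ports)
        self.mac_table = {}  # MAC address -> port it was last seen on

    def receive(self, in_port, src_mac, dst_mac):
        """Learn the source, then return the set of ports the frame goes out."""
        self.mac_table[src_mac] = in_port
        if dst_mac in self.mac_table:
            return {self.mac_table[dst_mac]}
        return self.ports - {in_port}  # unknown destination: flood

CLUSTER_MAC = "02-bf-c0-a8-02-0b"
BROADCAST = "ff-ff-ff-ff-ff-ff"
switch = LearningSwitch(ports=[1, 2, 3, 4])

# NLB nodes on ports 1 and 2 announce themselves with masked source MACs.
switch.receive(1, "02-01-c0-a8-02-0b", BROADCAST)
switch.receive(2, "02-02-c0-a8-02-0b", BROADCAST)

# A client on port 4 sends to the bf MAC, which the switch never learned.
flooded = switch.receive(4, "00-11-22-33-44-55", CLUSTER_MAC)
print(sorted(flooded))  # [1, 2, 3]

# The reply from node 1 goes only to the client's port.
reply = switch.receive(1, "02-01-c0-a8-02-0b", "00-11-22-33-44-55")
print(sorted(reply))  # [4]

print(CLUSTER_MAC in switch.mac_table)  # False
```

The client's frame goes out every other port, so both nodes see it, while the bf address never lands in the MAC table.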
Of course, all of this leads us to the question about switch flooding and how to limit it. For this information see my blog entry on Unicast vs. Multicast.