[Why is this under "Programmer Hubris"? Because it's about developers who find "an easy fix" and apply it, without trying to figure out why it made things appear to work better.]
I like to read Larry Osterman's and Raymond Chen's blogs, because they've seen most things, and learned most of the lessons right. Today, Larry posted on network optimisation - reminding me once again:
Trying to fix network speed problems by enabling TCP_NODELAY is almost always wrong. Setting TCP_NODELAY disables the Nagle algorithm.
When you do this, it's an indication that either your program is broken, or your protocol is broken, or quite possibly both.
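For reference, this is the switch in question, which is handy to know when you're auditing code that somebody has already "fixed". A minimal sketch in Python; the helper name nagle_disabled is mine, not part of any library, and the socket is never connected to anything:

```python
import socket


def nagle_disabled(sock: socket.socket) -> bool:
    """Report whether TCP_NODELAY is set (i.e. Nagle is off) on this socket."""
    return bool(sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY))


sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
print(nagle_disabled(sock))  # Nagle is normally on by default

# This is the "easy fix" the rest of this post argues against.
sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
print(nagle_disabled(sock))
sock.close()
```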
"But what about FTP?" people say.
Yes, the protocol is broken. It requires a send / send / recv exchange, and you have to enable TCP_NODELAY (disabling the Nagle algorithm in the process) to make it work properly, so that you can transfer more than five files per second.
So what does Nagle do, exactly? Well, John Nagle spends his time pushing dead bodies down staircases. The Nagle algorithm, named after John despite his humble protestations, does a couple of simple things every time you try to send data:
If there is unacknowledged data in the queue, then don't send until:
- all data is acknowledged, or
- you have a segment's worth to send.
- Uh... that's it.
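If it helps to see that rule as code, here is a rough sketch of the decision, not a real TCP stack. The function name and the 1460-byte default segment size are my assumptions (1460 is a common value on Ethernet, but yours may differ):

```python
def nagle_may_send(unacked_bytes: int, queued_bytes: int, mss: int = 1460) -> bool:
    """Decide whether queued data may go out now, per the rule above.

    unacked_bytes: data already sent but not yet acknowledged
    queued_bytes:  data the application has asked to send
    mss:           one segment's worth (assumed value, see lead-in)
    """
    if queued_bytes <= 0:
        return False  # nothing to send
    if unacked_bytes == 0:
        return True   # nothing in flight: send at once, however small
    return queued_bytes >= mss  # data in flight: hold out for a full segment


print(nagle_may_send(unacked_bytes=0, queued_bytes=1))      # True
print(nagle_may_send(unacked_bytes=10, queued_bytes=1))     # False
print(nagle_may_send(unacked_bytes=10, queued_bytes=1460))  # True
```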
The idea is to take a program that, rather stupidly, sends one character at a time, resulting in a 1:40 data:framing ratio, and turn it into one that sends several characters at a time, using the network bandwidth more efficiently and not becoming a network hog.
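To put numbers on that ratio, here is a quick back-of-the-envelope calculation. I'm assuming the 40 bytes of framing are a minimal 20-byte IPv4 header plus a minimal 20-byte TCP header, with no options and ignoring the link layer:

```python
HEADER_BYTES = 40  # assumed: 20-byte IPv4 header + 20-byte TCP header, no options


def payload_efficiency(payload_bytes: int, header_bytes: int = HEADER_BYTES) -> float:
    """Fraction of each packet that is actual data rather than framing."""
    return payload_bytes / (payload_bytes + header_bytes)


for size in (1, 10, 100, 1460):
    print(f"{size:>5} byte payload: {payload_efficiency(size):.1%} of the packet is data")
```

One character per packet means well under 3% of what you put on the wire is data; batch up even a hundred bytes and you are over 70%.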
Way back when, this was a perfect idea, and could have remained perfect, if it weren't for the delayed ACK algorithm, whose author is not remembered fondly enough to have the algorithm named after him. Delayed ACK states that you should not send an unaccompanied ACK until either:
- you receive two segments, or
- 200ms have elapsed since the first piece of data was received.
[An ACK is included in every TCP segment, so it's not an overhead of any kind except when you're sending nothing but an ACK.]
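The delayed ACK rule is just as small when written down. Again, this is a sketch of the rule as described here rather than any particular stack's implementation; the function name is mine, and the 200ms default is the figure quoted above:

```python
def delayed_ack_should_send(segments_unacked: int, ms_since_first: float,
                            timeout_ms: float = 200.0) -> bool:
    """Decide whether to send a bare ACK (one with no data to ride along with).

    segments_unacked: segments received but not yet acknowledged
    ms_since_first:   time since the first of those segments arrived
    """
    if segments_unacked <= 0:
        return False  # nothing to acknowledge
    return segments_unacked >= 2 or ms_since_first >= timeout_ms


print(delayed_ack_should_send(1, 50.0))   # False: one segment, timer still running
print(delayed_ack_should_send(2, 50.0))   # True: two segments received
print(delayed_ack_should_send(1, 200.0))  # True: timer expired
```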
In most environments, this is still good, because most protocols are "client sends command, server sends response" over and over again, so each side is doing one send, then one recv. Model this behaviour in your head, and you'll see that the Nagle algorithm won't stop any outgoing traffic, nor will delayed ACK hold up any acknowledgements.
If one side hiccups and calls send() twice (on short data) before receiving, however, things come to (in networking terms) a screeching halt. The second send queues up because the Nagle algorithm is waiting for an acknowledgement of the first send, and the other side's delayed ACK algorithm won't send one until it has exhausted its timer.
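If modelling it in your head doesn't appeal, here is a deliberately crude model of one request/response exchange. It assumes the response is a single small send, that the receiver replies the moment the whole request has arrived (so the ACK rides on the response), and a hypothetical 10ms round trip; the function name is mine:

```python
def exchange_ms(small_sends: int, rtt_ms: float, delayed_ack_ms: float = 200.0) -> float:
    """Rough time for one request/response when the request is sent
    as `small_sends` short send() calls before the first recv()."""
    one_way = rtt_ms / 2
    if small_sends <= 1:
        # The single segment leaves at once; nothing is unacknowledged.
        request_done = one_way
    else:
        # First segment arrives, the receiver sits on its ACK until the
        # timer expires, the ACK travels back, and only then does Nagle
        # release the rest of the request (coalesced into one segment).
        request_done = one_way + delayed_ack_ms + one_way + one_way
    return request_done + one_way  # plus the response coming back


for sends in (1, 2):
    total = exchange_ms(sends, rtt_ms=10.0)
    print(f"{sends} send(s): {total:.0f} ms per exchange, {1000 / total:.1f} exchanges/s")
```

With those assumed numbers the split request manages fewer than five exchanges a second, against a hundred for the single send.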
The answer is always "don't send / send / recv". Always group logically-associated data together in one call to send, unless you're sending large amounts of data, in which case you can happily send / send / send until you've exhausted your data.
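In practice the fix is usually this small. A minimal sketch, with a made-up helper name (send_command) and a made-up command; the socket pair is only there so the example runs without a network:

```python
import socket


def send_command(sock: socket.socket, *parts: bytes) -> int:
    """Join the pieces of one logical message and hand them to the
    network stack in a single call, instead of one send() per piece."""
    message = b"".join(parts)
    sock.sendall(message)
    return len(message)


# Instead of: sock.send(b"STAT "); sock.send(b"example.txt"); sock.send(b"\r\n")
client, server = socket.socketpair()
with client, server:
    sent = send_command(client, b"STAT ", b"example.txt", b"\r\n")
    print(sent, server.recv(1024))
```

The stack now sees the whole command at once, so Nagle has nothing to hold back and the other side has no reason to sit on its ACK.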
Setting TCP_NODELAY will look like it makes your program perform at top speed, but it's now become the network hog that you programmed it to be, and that Nagle was helpfully preventing you from being. Fixing the program will make your program perform faster than it would with TCP_NODELAY alone, and you will find that the TCP_NODELAY setting will have no further effect, on or off. Your program is now working smoothly, a good network citizen, and the Nagle net-cop allows it to go about its business unimpeded.
So, my final piece of advice - if TCP_NODELAY looks like it makes your program perform better, fix your damn program! There's too much crappy networking software out there already, and you don't want to add to it.