How FTP Data Connections Work Part 2 (OR: Fun With Port 20)
As we mentioned in the 1st part of this series, FTP is a more complex protocol than many, using one control connection and one data connection.
A recap of the first post…
In typical Stream Mode operation, a new data connection is opened and closed for each data transfer, whether that’s an upload, a download, or a directory listing. To avoid confusion between different data connections, and as a recognition of the fact that networks may have old packets shuttling around for some time, these connections need to be distinguishable from one another.
In the previous article, we noted that two network sockets are distinguished by the five elements of “Local Address”, “Local Port”, “Protocol”, “Remote Address”, and “Remote Port”. For a data connection associated with any particular request, the local and remote addresses are fixed, as the addresses of the client and server. The protocol is TCP, and only the two ports are variable.
For a PASV, or passive data connection, the client-side port is chosen randomly by the client, and the server-side port is similarly chosen randomly by the server. The client connects to the server.
For a PORT, or active data connection, the client-side port is chosen randomly by the client, and the server-side port is set to port 20. The server connects to the client.
All of these work through firewalls and NAT routers, because firewalls and NAT routers contain an Application Layer Gateway (ALG) that watches for PORT and PASV commands, and modifies the control (in the case of a NAT) and/or uses the values provided to open up a firewall hole.
Isn’t there a totally predictable data connection?
For the default data connection (what happens if no PORT or PASV command is sent before the first data transfer command), the client-side port is predictable (it’s the same as the source port the client used when connecting the control channel), and the server-side port is 20. Again, the server connects to the client.
Because firewalls and NATs open up a ‘reverse’ hole for TCP sockets, the default data port works with firewalls and NATs that aren’t running an ALG, or whose ALG cannot scan for PORT and PASV commands.
Why would an ALG stop scanning for PORT and PASV commands?
There are a couple of reasons – the first is that it doesn’t know that the service connected to is running the FTP protocol. This is common if the server is running on a port other than the usual port 21.
The second reason is that the FTP control connection doesn’t look like it contains FTP commands – usually because the connection is encrypted. This can happen because you’re tunneling the FTP control connection through an encrypted tunnel such as SSH (don’t laugh – it does happen!), or hopefully it’s because you’re running FTP over SSL, so that the control and data connections can be encrypted, and you can authenticate the identity of the FTP server.
So how do you get FTP over SSL to work through a firewall?
In the words of Deep Thought: “Hmm… tricky”.
There are a couple of classic solutions:
- Allow PASV data connections, select a wide range of ports, and open that range for incoming traffic from all external addresses in your firewall configuration; hope that your FTP server can be configured to use only that range of ports (WFTPD Pro can), and that it has protections against traffic stealing attacks (again, WFTPD Pro has). Still, this option seems really risky.
- Block all PASV connections, and make the clients responsible for opening up holes in their firewalls. If you’re convinced the risk is too great to do this on your server, how does it look to convince your users that they should accept that risk?
- After you’ve authenticated the server and provided your username and password in the encrypted control connection, issue the “CCC” (Clear Control Channel) command, to switch the control connection back into clear-text. I dislike this as a solution, because it requires the ALG pay attention to a lot of SSL traffic in the hope that there might be clear-text coming up, and because you may want the control channel to remain encrypted.
Awright, clever clogs, you solve the problem.
The astute reader can probably see where I’m going with this.
The default data port is predictable – if the client connects from port U to port L at the server (L is usually 21), then the default data port will be opened from port L-1 at the server to port U at the client.
The default data port doesn’t need the firewall to do anything other than allow reverse connections back along the port that initiated the connection. You don’t need to open huge ranges at the server’s firewall (in fact you should be able to simply open port 21 inbound to your server).
The default data port is required to be supported by FTP servers going back a long way- at least a couple of decades. Yes, really, that long.
If it’s that simple, why isn’t everyone doing it?
Good point, that, and a great sentence to use whenever you wish to halt innovation in its tracks.
Okay, it’s obvious that there are some drawbacks:
- In stream mode, the data transfer is ended by closing the stream. This means that you have to open a new control connection. Not good, given the number of round-trips you need for a logon, and the work needed to start an SSL connection.
- Most FTP clients view the default data connection as, at best, a fail-over in case the PORT or PASV commands fail to work. Obviously, that means it’s not likely to be a well-tested or favoured solution on these clients.
Even with those drawbacks, there are still further solutions to apply – the first being to use Block-mode instead of Stream-mode. In Stream-mode, each data transfer requires opening and closing the data connection; in Block-mode, which is a little like HTTP’s chunked mode, blocks of data are sent, and followed by an “EOF” marker (End of File), so that the data connection doesn’t need to be closed. If you can convince your FTP client to request Block-mode with the default data connection, and your FTP server supports it (WFTPD Pro has done so for several years), you can achieve FTP over SSL through NATs and firewalls simply by opening port 21.
For the second problem, it’s worth noting that many FTP client authors implemented default data connections out of a sense of robustness, so default data connections will often work if you can convince the PORT and PASV commands to fail – by, for instance, putting restrictive firewalls or NATs in the way, or perhaps by preventing the FTP server from accepting PORT or PASV commands in some way.
Clearly, since Microsoft’s IIS 7.5 downloadable FTP Server supports FTPS in block mode with the default data port, there has been some consideration given to my whispers to them that this could solve the FTP over SSL through firewall problem.
Other than my own WFTPD Explorer, I am not aware of any particular clients that support the explicit use of FTP over SSL with Block-mode on the default data connection – I’d love to hear of your experiments with this mode of operation, to see if it works as well for you as it does for me.