The TCP Window may be one of the most critical parts of the data transfer process to understand, especially in data centers where backups are a daily process. In data centers today, massive bandwidth and high-speed links make it easier than ever to provide several gigabits per second of throughput between application servers and backup servers. With WAN connections getting larger, faster, and more efficient, end users have access to more throughput than ever before.
With all this cutting-edge technology available, it is a wonder that some applications still seem to lag. Even backups over 10 Gbps links can still take hours and hours. One of the causes of delay in some data transfer applications is the TCP Window. Some applications are still using a legacy TCP stack that was written for networks back in the 1980s, when a 2,400 bps dialup seemed like a speedy connection.
Some applications today have ignored the impact that the TCP Window has on data transfer, and network engineers may skip the transport layer when troubleshooting slow backups and application delays. In this article we will clearly define the TCP Window, look at how it can impact performance, and show how this can be monitored using Wireshark.
What is a TCP Window?
When discussing TCP Windows, we are most often referring to the TCP Receive Window. Simply put, a TCP Receive Window is a buffer on each side of the TCP connection that temporarily holds incoming data. The data in this buffer is passed to the application, clearing more room for incoming data. If this buffer fills up, the receiver of the data will alert the sender that no more data can be received until the buffer is cleared. There are several more details involved, but that is the basic function. A device advertises the current size of its TCP Window in the TCP header.
In the screenshot above, the sender of this packet is telling the other side of the connection that it has a TCP receive buffer of 65,535 bytes. This is the maximum standard TCP Window size, because the Window field in the TCP header is only 16 bits wide. The TCP window scale option can make it bigger, but for now let’s work with this as a maximum.
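For readers who want to see where that number lives, here is a minimal Python sketch that pulls the Window field out of a raw TCP header. The function name and the sample header are made up for illustration; the only fact relied on is that the Window field occupies bytes 14 and 15 of the TCP header in network byte order.

```python
import struct


def advertised_window(tcp_header: bytes) -> int:
    """Return the raw 16-bit Window field from a TCP header."""
    if len(tcp_header) < 20:
        raise ValueError("a TCP header is at least 20 bytes long")
    # Bytes 14-15 hold the Window field, in network (big-endian) byte order.
    (window,) = struct.unpack_from("!H", tcp_header, 14)
    return window


# Hypothetical header built only for this example:
# ports 49152 -> 80, seq 1, ack 1, data offset 5 (20 bytes),
# ACK flag set, window 65,535, checksum 0, urgent pointer 0.
sample_header = struct.pack(
    "!HHIIBBHHH", 49152, 80, 1, 1, 5 << 4, 0x10, 65535, 0, 0
)

print(advertised_window(sample_header))
```

Because the field is 16 bits, no header can carry a raw value above 65,535. When window scaling is in use, the real window is this raw value multiplied by a scale factor, which this sketch does not handle.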
Each side of the TCP connection has its own TCP Receive Window, so at any point these two windows may be different. For example, a web server mostly sends data to users rather than receiving data from them. For this reason, a web server doesn’t need as large a TCP Window as a user may need. So the web server may advertise a receive window of 8,192 bytes, while the client has a window of 65,535.
How can a TCP Window impact performance?
During a file transfer, data is flowing from one machine to another. The receiver of the data needs to keep its TCP Window from dropping down to zero, which would indicate that the window has filled. If a TCP Window ever goes to zero, or gets close to zero, this alerts the sender of the data that no more room is left in the receiver for more data. The file transfer will be halted until an update is sent showing the buffer has been cleared.
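To make the mechanics concrete, here is a toy Python model of a receive buffer. It is a sketch and not a real TCP implementation: the function name, the one-segment-per-tick timing, and all the numbers are assumptions chosen for illustration. The sender transmits a segment only when it fits in the free space, and the application drains a fixed number of bytes per tick.

```python
def simulate_receive_window(buffer_size, segment_size, app_read_per_tick, ticks):
    """Return the window advertised after each tick of a toy receiver model."""
    buffered = 0  # bytes sitting in the receive buffer, not yet read by the app
    advertised = []
    for _ in range(ticks):
        free = buffer_size - buffered
        # The sender transmits one segment only if it fits in the free space.
        if free >= segment_size:
            buffered += segment_size
        # The application reads some of the buffered data.
        buffered -= min(buffered, app_read_per_tick)
        # What the receiver would advertise in its next ACK.
        advertised.append(buffer_size - buffered)
    return advertised


# Hypothetical numbers: the application reads more slowly than data arrives.
slow_app = simulate_receive_window(65535, 1460, 1000, 200)
# Same buffer, but the application keeps up with the sender.
fast_app = simulate_receive_window(65535, 1460, 2000, 200)

print("lowest window, slow application:", min(slow_app))
print("lowest window, fast application:", min(fast_app))
```

With the slow application, the advertised window shrinks a little on every tick until a full segment no longer fits and the sender has to wait. With the fast application, the window never moves off its maximum.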
In the sample trace file for this article, there is an example of a dropping TCP Window that eventually halts the file transfer. (Note: This trace file was sliced, so Wireshark will tell you that the packet size was limited on the large packets.) In the trace, look at the window size of packet number 72. Data is flowing between the two machines, then in this packet we see the TCP Window start to drop from the maximum of 65,535. The traffic doesn’t stop though. As we look at all packets from this machine (the receiver of the data), we see that the TCP Window continues to drop. Finally, in packet number 138, it advertises a receive window of 2,299 bytes. We see one more full-size data packet, then the sender halts for 19 ms. It can’t send any more data because the TCP Receive Window on the receiver is full.
After the 19 ms delay, the client sends an ACK with an updated TCP Window of 65,535. It has cleared the TCP receive buffer and is ready for more data. The data source then fires away with more data. It may not seem like 19 ms is much time, but if we suffer this delay thousands, if not hundreds of thousands, of times during a backup, the stalls can add up to minutes or even hours of delay.
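A quick back-of-the-envelope calculation shows how these small stalls add up. The 19 ms figure comes from the trace above; the stall counts are assumed for illustration, and the function name is made up.

```python
def total_stall_time(stall_count: int, stall_seconds: float) -> float:
    """Total time lost if every stall lasts stall_seconds."""
    return stall_count * stall_seconds


STALL_SECONDS = 0.019  # the 19 ms stall seen in the sample trace

# Hypothetical stall counts for a long backup job.
for stalls in (1_000, 100_000, 500_000):
    seconds = total_stall_time(stalls, STALL_SECONDS)
    print(f"{stalls:>7} stalls x 19 ms = {seconds:,.0f} s ({seconds / 3600:.2f} h)")
```

At 100,000 stalls the lost time is roughly 32 minutes, and at 500,000 stalls it is about 2.6 hours, all spent with the sender sitting idle.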
What can we do about it? If an application is using a full size window, and this is still not enough, there is little we can do from an analysis standpoint other than put the blame in the right place. That is, unless you are intimately familiar with the code of the application and feel comfortable rewriting it.
A TCP Window is tracked per connection. If other TCP connections are in progress between the two machines, a stalled window on one connection will not stall the others. So in multi-threaded applications, data can still be moving on other connections while one is recovering.
Using Wireshark, how can I monitor the TCP Window?
During large file transfers, keep an eye on the TCP Window being advertised in the TCP ACK packets of the receiver. There are a few ways to do this with Wireshark. By default, the TCP window of the packet sender is displayed in the info summary view for each ACK packet.
Another way to show this is to use the I/O Graphs and look for drops in the TCP Window size. To do this, use the tcp.analysis.window_update filter.
This graph shows the full size TCP Window dropping to nothing several times. While the window is down near zero, data is halted while the sender waits for the receive buffer to clear. Watch for these dips during large data transfers. The I/O graph makes them easier to see than combing through packet by packet!
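The same check can be scripted once the window values are out of the capture. The Python sketch below scans a list of (timestamp, advertised window) pairs and reports each run that falls below a threshold. The function name, the threshold, and the sample data are all assumptions for illustration; in practice the pairs might come from an export of the receiver's packets with the tcp.window_size field.

```python
def find_window_dips(samples, threshold):
    """Return (start_time, end_time, lowest_window) for each run below threshold."""
    dips = []
    current = None  # [start_time, end_time, lowest_window] of the open dip
    for timestamp, window in samples:
        if window < threshold:
            if current is None:
                current = [timestamp, timestamp, window]
            else:
                current[1] = timestamp
                current[2] = min(current[2], window)
        elif current is not None:
            # The window recovered, so close the dip.
            dips.append(tuple(current))
            current = None
    if current is not None:
        dips.append(tuple(current))
    return dips


# Made-up samples: seconds since start of capture, advertised window in bytes.
samples = [
    (0.000, 65535), (0.010, 40000), (0.020, 12000), (0.030, 2299),
    (0.049, 65535), (0.060, 30000), (0.070, 1000), (0.080, 65535),
]

for start, end, lowest in find_window_dips(samples, threshold=4096):
    print(f"dip from {start:.3f}s to {end:.3f}s, lowest window {lowest} bytes")
```

The threshold is a judgment call. A value of a few full-size segments is a reasonable starting point, since the sender stalls once a full segment no longer fits.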
If there are any questions about the sample trace file, or more about the function of the TCP Window, feel free to email or comment.
The next article in the TCP Series will feature sequence and acknowledgement numbers.
Thanks for all the feedback on this series!