Hey Network - Leave the Packets Alone!
Or… if you are going to change things, at least don’t break things!
Remember the first time you learned about NAT? Or PAT? Or changing TCP MSS at the router level?
Cool stuff. The network became involved in adjusting, changing, shaping, or translating packet header values as data passed through. These alterations to packet headers were made for several reasons - everything from extending the life of IPv4 address ranges to making header room for WAN acceleration technologies - allowing engineers to keep data flowing in today’s networks at top speed. That is pretty impressive given that the two main data protocols, IP and TCP, are about to turn 40 years old. (Wow!)
These adjustments to header values by the network are very much in use today, which is a great thing - until something goes wrong. At times, the configurations on routers and other network devices that make these adjustments can be mistyped, misunderstood, miscalculated, or some combination of the three. This can lead to problems with both the connectivity and the performance of applications, which can appear to lag or break altogether.
Troubleshooting these kinds of problems can be a real pain, especially when digging in at the packet level. It is difficult to know if the header values are the ones that were really sent by the original sender, or if the network made some adjustments along the way.
As an example, I was involved in troubleshooting a problem with an application that was lagging during data transfers. After several rounds of packet capture, we found that the server and client had different ideas of what the TCP MSS was configured to be. Usually, each side advertises this value in the TCP handshake, and neither side sends segments larger than what the other advertised, so the lower number tends to win out. However, a router between the client and server was changing the MSS in only the return (SYN/ACK) direction, causing a mismatch. The client sent big packets to the server, only to have these dropped by the network with no ICMP warning.
From the packet perspective, it was difficult to know if this was due to a compromised TCP stack or if some mystery device in the middle was mucking with the header values.
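To make the MSS story a bit more concrete, here is a minimal Python sketch of how the MSS option can be pulled out of a raw TCP header and compared at two capture points. The `build_tcp_header` helper, the port numbers, and the 1460 and 1380 values are all hypothetical and exist only for this demo; they are not the values from the case above.

```python
import struct

MSS_OPTION = 2  # TCP option kind for Maximum Segment Size


def build_tcp_header(options=b"", flags=0x12):
    """Hypothetical demo helper: a bare TCP header (checksum left at 0)."""
    options += b"\x00" * (-len(options) % 4)      # pad options to a 32-bit boundary
    data_offset = (20 + len(options)) // 4        # header length in 32-bit words
    offset_flags = (data_offset << 12) | flags    # 0x12 = SYN + ACK
    fixed = struct.pack("!HHIIHHHH", 443, 50000, 0, 0, offset_flags, 65535, 0, 0)
    return fixed + options


def tcp_mss(header):
    """Return the MSS advertised in a TCP header, or None if the option is absent."""
    end = min((header[12] >> 4) * 4, len(header))  # data offset marks where options end
    i = 20                                         # options start after the fixed header
    while i < end:
        kind = header[i]
        if kind == 0:                              # End of Option List
            break
        if kind == 1:                              # NOP is a single byte
            i += 1
            continue
        if i + 1 >= end:
            break
        length = header[i + 1]
        if length < 2:                             # malformed option, stop parsing
            break
        if kind == MSS_OPTION and length == 4 and i + 4 <= end:
            return struct.unpack("!H", header[i + 2:i + 4])[0]
        i += length
    return None


# The same SYN/ACK as it left the server and as it reached the client (made-up values).
syn_ack_sent = build_tcp_header(struct.pack("!BBH", MSS_OPTION, 4, 1460))
syn_ack_received = build_tcp_header(struct.pack("!BBH", MSS_OPTION, 4, 1380))

if tcp_mss(syn_ack_sent) != tcp_mss(syn_ack_received):
    print("MSS was rewritten in transit:",
          tcp_mss(syn_ack_sent), "->", tcp_mss(syn_ack_received))
```

In practice the header bytes would come from the two trace files rather than from a helper like this, but the comparison at the end is the part that matters: same packet, two vantage points, different MSS.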
The best way to troubleshoot when you suspect the network is adjusting header values
Dual-side, simultaneous capture.
Collecting data at both client and server, as close as possible to the point of transmission without actually being ON the endpoint, goes a long way in analyzing and identifying these problems. A packet can be seen as it is originally sent and finally received. Any changes can be compared and analyzed. If the network is making some adjustments along the way that are causing problems, these can be easier to identify and resolve. Misconfigurations, misunderstandings, and miscalculations can be much easier to find.
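Once both traces are in hand, the comparison itself can be sketched in a few lines of Python. This is only a rough illustration: it assumes the packets have already been decoded into dictionaries (for example, exported from your capture tool), the field names are made up, and it assumes the TCP sequence number and flags survive the path well enough to pair packets up, which is not true behind every middlebox.

```python
# Fields that every routed hop is expected to rewrite; differences here are normal.
EXPECTED_TO_CHANGE = {"ttl", "ip_checksum"}


def diff_fields(sent, received):
    """Return {field: (sent_value, received_value)} for unexpected changes."""
    return {
        field: (value, received.get(field))
        for field, value in sent.items()
        if field not in EXPECTED_TO_CHANGE and received.get(field) != value
    }


def compare_captures(sender_side, receiver_side, match_on=("tcp_seq", "tcp_flags")):
    """Pair packets across two captures and report modified or missing ones."""
    received_by_key = {tuple(p[f] for f in match_on): p for p in receiver_side}
    report = []
    for packet in sender_side:
        key = tuple(packet[f] for f in match_on)
        arrived = received_by_key.get(key)
        if arrived is None:
            report.append((key, "missing at receiver"))
        else:
            changes = diff_fields(packet, arrived)
            if changes:
                report.append((key, changes))
    return report


# Hypothetical decoded packets from the two capture points.
sender_capture = [
    {"tcp_seq": 1000, "tcp_flags": "SA", "ttl": 64, "ip_checksum": 0x1A2B, "mss": 1460},
    {"tcp_seq": 1001, "tcp_flags": "PA", "ttl": 64, "ip_checksum": 0x1A2C, "length": 512},
    {"tcp_seq": 1513, "tcp_flags": "PA", "ttl": 64, "ip_checksum": 0x1A2D, "length": 1460},
]
receiver_capture = [
    {"tcp_seq": 1000, "tcp_flags": "SA", "ttl": 61, "ip_checksum": 0x1D2B, "mss": 1380},
    {"tcp_seq": 1001, "tcp_flags": "PA", "ttl": 61, "ip_checksum": 0x1D2C, "length": 512},
]

for key, finding in compare_captures(sender_capture, receiver_capture):
    print(key, finding)
```

Ignoring TTL and the IP checksum is a deliberate choice here, since those change at every routed hop anyway. Everything else that differs, or never arrives, is worth a second look.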
These traces are also great for gaining a deeper understanding of how these technologies work after implementation. Rather than reading a book about how NAT works, grab a pre- and post-NAT capture and compare the two. Then you can really see which values are adjusted by the network device and which are not.
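As a small taste of that exercise, here is a Python sketch that parses the same IPv4 header before and after a simple source NAT and lists the fields that differ. The addresses come from private and documentation ranges, the `build_ipv4` helper is hypothetical, and a plain source NAT with one routed hop is assumed; a real device may touch more (or fewer) fields than this.

```python
import ipaddress
import struct

IPV4_FORMAT = "!BBHHHBBH4s4s"  # fixed 20-byte IPv4 header, no options
FIELDS = ("ver_ihl", "tos", "total_len", "ip_id", "flags_frag",
          "ttl", "proto", "checksum", "src", "dst")


def ipv4_checksum(header):
    """One's-complement sum over the ten 16-bit words of a 20-byte header."""
    total = sum(struct.unpack("!10H", header))
    while total >> 16:                       # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_ipv4(src, dst, ttl, ip_id=0x1234):
    """Hypothetical demo helper: a TCP-carrying IPv4 header with a valid checksum."""
    def pack(checksum):
        return struct.pack(IPV4_FORMAT, 0x45, 0, 40, ip_id, 0x4000, ttl, 6, checksum,
                           ipaddress.IPv4Address(src).packed,
                           ipaddress.IPv4Address(dst).packed)
    return pack(ipv4_checksum(pack(0)))      # checksum is computed with the field zeroed


def parse_ipv4(header):
    """Decode the fixed IPv4 header into a dict of named fields."""
    fields = dict(zip(FIELDS, struct.unpack(IPV4_FORMAT, header[:20])))
    fields["src"] = str(ipaddress.IPv4Address(fields["src"]))
    fields["dst"] = str(ipaddress.IPv4Address(fields["dst"]))
    return fields


def changed_fields(pre, post):
    """Return {field: (before, after)} for every header field that differs."""
    before, after = parse_ipv4(pre), parse_ipv4(post)
    return {name: (before[name], after[name])
            for name in FIELDS if before[name] != after[name]}


# The same packet captured before and after the NAT device (made-up addresses).
pre_nat = build_ipv4("192.168.1.10", "203.0.113.5", ttl=64)
post_nat = build_ipv4("198.51.100.7", "203.0.113.5", ttl=63)

for name, (before, after) in changed_fields(pre_nat, post_nat).items():
    print(f"{name}: {before} -> {after}")
```

With these inputs, the source address, the TTL, and the header checksum differ, while the destination and everything else stay put. Running the same kind of diff on real traces is where it gets interesting.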
Hey network, you are doing a good job with your adjustments and alterations, most of the time. But there are those occasions when you make things difficult.
Then again, you were configured to make those changes… So I shouldn’t shoot the messenger.
Got a tough network problem? Get in touch!