Ever Hear Of A Routing Audit?
I get a lot of pleasure from ensuring that things work as advertised.
Whether I am local or remote, troubleshooting or teaching, I always have a mental list of things I like to check. The list changes on the fly depending on the topology, equipment and location, but I always have a rough idea of what I want to look for.
Some of these tests may seem pretty straightforward, like a throughput test, measuring how long a failover takes to work itself out, or hunting for errors. Other tests I propose, like routing and stability audits, get me those sarcastic “Why?” and “Huh?” stares. In this article, let’s cover the basics of what I call a routing review or audit. I usually hear comments like “Why bother, everything is obviously working” or “We aren’t getting any complaints about that”.
Let me walk you through how this typically unfolds. After reviewing the network diagram, or creating one with post-it notes, I sit down with the client to determine how many hops it should take to get from one host to another and which path packets should take. As long as ICMP isn’t blocked, a simple traceroute from a client computer will do. In other cases, we run the traceroute from a network device instead, since some devices provide additional diagnostics with their results.
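To make the comparison concrete, here is a minimal Python sketch of checking an observed path against the one agreed on with the client. The function name `compare_path` and all addresses are hypothetical, chosen only for illustration; the observed hops would normally be copied from a real traceroute.

```python
from typing import List


def compare_path(expected: List[str], observed: List[str]) -> List[str]:
    """Return a list of findings; an empty list means the path matched."""
    findings = []
    # A different hop count usually means an extra or missing router.
    if len(observed) != len(expected):
        findings.append(
            f"hop count: expected {len(expected)}, observed {len(observed)}"
        )
    # Compare hop by hop for as many hops as both lists have.
    for hop, (want, got) in enumerate(zip(expected, observed), start=1):
        if want != got:
            findings.append(f"hop {hop}: expected {want}, observed {got}")
    return findings


# Hypothetical example data (documentation address ranges).
expected = ["192.0.2.1", "192.0.2.5", "198.51.100.10"]
observed = ["192.0.2.1", "192.0.2.9", "192.0.2.5", "198.51.100.10"]

for finding in compare_path(expected, observed):
    print(finding)
```

Anything this reports is only a prompt for a conversation, not a verdict: the design may have changed since the diagram was drawn.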
Routing is usually taken for granted, in the sense that if you are getting there, it must obviously be working. I am not trying to prove whether it’s working or not; I’m trying to determine how WELL it is working. Let’s face it, in the past 10 years or so things don’t break the way they did in the ’90s, but things sure slow down.
In the past, I have uncovered routing loops, multiple routes and extra hops. The important thing to keep in mind is that when you go through this exercise and discover something odd, you should step back, perform multiple tests and truly understand why it is happening. Create a plan for your proposed change and a backup plan. Lastly, don’t forget to test to ensure your changes had the intended impact.
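One rough way to spot a possible loop is to look for the same address answering at more than one hop. The sketch below assumes the hops have already been collected into a list, with `*` marking a timeout; the function name and addresses are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List


def find_repeated_hops(hops: List[str]) -> Dict[str, List[int]]:
    """Map each address seen at more than one hop to its hop numbers."""
    seen = defaultdict(list)
    for number, address in enumerate(hops, start=1):
        if address == "*":  # timeouts carry no address, so skip them
            continue
        seen[address].append(number)
    return {addr: nums for addr, nums in seen.items() if len(nums) > 1}


# Hypothetical trace in which two routers keep handing the packet back.
trace = ["192.0.2.1", "192.0.2.5", "192.0.2.9", "192.0.2.5", "192.0.2.9", "*"]

for address, numbers in find_repeated_hops(trace).items():
    print(f"{address} answered at hops {numbers}")
```

A repeated address is a hint, not proof. Run the test several times before drawing conclusions.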
Here’s an example of a traceroute from a layer 3 Cisco switch:
  1 10.16.30.252 0 msec
    10.16.30.254 8 msec
    10.16.30.243 0 msec
  2 18.104.22.168 16 msec 8 msec
    10.16.30.252 4 msec
The results from this layer 3 switch identify 3 different routes: the first hop alone answers from three different addresses. Unfortunately, this wasn’t the intended design. The client started asking his team a series of questions: is that the order in which the routes are used, what is the cost of those routes, what protocols are being used, and so on.
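If you collect output like this regularly, a small script can flag hops that answer from more than one address. The following is a minimal sketch, assuming the output has the shape shown above (a hop number only on the first line of each hop, then an address and timings); the function name `responders_per_hop` is hypothetical, and real device output may need extra handling.

```python
from typing import Dict, List

# The traceroute output from the article, pasted as text.
SAMPLE = """\
  1 10.16.30.252 0 msec
    10.16.30.254 8 msec
    10.16.30.243 0 msec
  2 18.104.22.168 16 msec 8 msec
    10.16.30.252 4 msec
"""


def responders_per_hop(text: str) -> Dict[int, List[str]]:
    """Group the responding addresses by hop number, in order of appearance."""
    hops: Dict[int, List[str]] = {}
    current = None
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        # A leading integer starts a new hop; otherwise the line continues one.
        if fields[0].isdigit():
            current = int(fields[0])
            fields = fields[1:]
        if current is None or not fields:
            continue
        address = fields[0]
        if address != "*" and address not in hops.setdefault(current, []):
            hops[current].append(address)
    return hops


for hop, addresses in responders_per_hop(SAMPLE).items():
    if len(addresses) > 1:
        print(f"hop {hop}: {len(addresses)} responders: {', '.join(addresses)}")
```

The script only counts responders. Questions like route order, cost and protocol still have to be answered from the routing table and the configuration.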
You should occasionally verify and validate your routes. This also helps build what I call 'tribal knowledge', which is invaluable when troubleshooting or working on network upgrades.
Don’t forget that there is a tremendous amount of value in verifying that everything is working as designed.
Just because it works doesn’t mean it works well.