
..Another Upgrade.. well you know

I was working with a long-time client on a fairly major upgrade for a remote site. We were replacing equipment and cabling, and performing some firmware upgrades.

I was invited to sit in on the planning meetings since I know the network and staff fairly well. During our first meeting, we documented all the equipment that would be 'touched', the pros and cons, a dependency analysis, and the typical backup plan if things should go south.

A few meetings later the team reviewed the actual action plan, roles and responsibilities, and timelines. It was in this meeting that I suggested they make the changes in discrete 'phases' so we could better monitor whether each change worked and back out more quickly if things went wrong. The fact that this place is fairly remote can complicate troubleshooting further.

I was met with a chorus of teasing and the titles of "Grim Reaper" and "Negative Tony". I took it all in good fun since I know them so well, but cautioned them that if they made all the changes at once and something went wrong, it would take a lot longer to diagnose the issue. They were so confident that they "got it" that they even suggested I sit this one out. I gladly responded with "Great, I'll be at the cottage".

Fast forward to the change: they went with their wholesale approach. I heard nothing during the change window and went with "no news is good news" ;) I turned off my phone and enjoyed my weekend.

Monday morning I had a ton of voicemails, texts, and emails asking me to call immediately. Unfortunately, they had run into issues and had completely backed out, but were still down. I immediately hopped in my car and went out to help. During my 2-hour drive, I asked all the typical questions and nothing jumped out at me. They blamed the firmware upgrade, the new router, and anything else they had changed, but if it was all backed out, why was the site still down?

I started with my standard "Have you walked the site yet?", and they replied, "Of course we did and found nothing." I replied, "So it won't take long to do it again with me, right?" Here's where the fun starts...

As we walked around, I immediately noticed there was no grounding for the new outdoor AP, and they responded, "We'll take care of that later". Then I noticed that one of the enclosures was jam-packed with Ethernet cables. After further investigation, I realized there were five 12-foot cables when all we needed were 3-footers. They told me the installer forgot the short cables and would be back. Then I pointed out that the main backhaul enclosure had no power. That's when the finger-pointing started amongst themselves, trying to figure out who, if anyone, had physically checked. I told them we could figure out who to blame later; let's fix the problem first.

I traced the power and main Ethernet cables only to find that someone had nicked both of them in a door. Ironically, this was the first thing on their list for the change.

Believe me, I understand that we want a change to be over with as quickly as possible, and I totally understand that in the midst of troubleshooting a 'down' scenario, you have to believe what someone tells you. But at some point you have to start from scratch and verify everything that has been reported.

Check out the cables below.



