Why Is Network Troubleshooting so Hard?

July 27, 2017

Networks are critical to the day-to-day business and operations of today’s companies. Unfortunately, this means that when an outage occurs, the business can come to a standstill. The result? Revenue loss, damaged customer loyalty, and more.

Even more grave is the common state of troubleshooting for these outages. Network troubleshooting remains notoriously difficult for many businesses, and unfortunately, networks are difficult to test by their very nature.

The first step towards any solution is fully understanding the problem. Below, we’re going to take a close look at the manual processes that make network troubleshooting so error-prone, and outline each pain point encountered along the way.

Network Troubleshooting Methodology

Let’s get started, shall we?

Critical Network Troubleshooting Questions

A large part of what makes network troubleshooting so hard is that there’s so much information to work with!

As network administrators, we all spend an incredible amount of time stuck in the diagnosis phase, gathering and analyzing data, et cetera. Frankly, there is inadequate visibility at the network level, and this inhibits successful troubleshooting.

Below, we’re going to take each of the following network troubleshooting questions in turn and examine how it is “answered” in a conventional network setup:

  1. “What’s the path?”
  2. “How is it configured?”
  3. “What’s happening on the network?”
  4. “What’s changed?”

Network Troubleshooting Question: “What’s the path?”

In a conventional network troubleshooting scenario, you first have to identify what the traffic path is. What is the traffic flow for a given problem?
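
When no current diagram is at hand, the usual fallback for answering this question is traceroute. Here is a minimal sketch of turning that output into a list of hops, assuming Linux-style `traceroute -n` text; the sample output, the addresses, and the `parse_hops` helper are all hypothetical and only there for illustration:

```python
import re

# Matches the start of one hop line of `traceroute -n` output, e.g.
# " 2  10.0.1.1  1.204 ms  1.101 ms  1.087 ms" or " 3  * * *".
HOP_RE = re.compile(r"^\s*(\d+)\s+(\S+)")


def parse_hops(output):
    """Return (hop_number, address) tuples; address is None when a hop timed out."""
    hops = []
    for line in output.splitlines():
        match = HOP_RE.match(line)
        if not match:
            continue  # skip the header line and any blank lines
        number, address = int(match.group(1)), match.group(2)
        hops.append((number, None if address == "*" else address))
    return hops


# Hypothetical sample output; the addresses are made up.
SAMPLE = """\
traceroute to 192.0.2.10 (192.0.2.10), 30 hops max, 60 byte packets
 1  10.0.0.1  0.412 ms  0.388 ms  0.371 ms
 2  10.0.1.1  1.204 ms  1.101 ms  1.087 ms
 3  * * *
 4  192.0.2.10  2.950 ms  2.871 ms  2.802 ms
"""

for number, address in parse_hops(SAMPLE):
    print(number, address or "no reply")
```

Keep in mind that this only shows the hops that answered for one probe at one moment. It says nothing about how the path was designed, which is exactly why the next question matters.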

Network Troubleshooting Question: “How is it configured?”

From here, you attempt to divine how the traffic is designed to flow along the aforementioned path and discover how it’s configured. Each step must be performed in turn before we even attempt to ascertain what’s actually happening on the network.

Network Troubleshooting Question: “What’s happening on the network?”

After you’re able to clear up what’s happening with the traffic path, you turn to “What’s happening on the network?”

Specifically, during this stage of network troubleshooting, you attempt to answer what’s happening on live devices. In addition, you check whether any devices or interfaces have issues: whether they are up or down, and whether they’re stable.
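
To make “up, down, or stable” a bit more concrete, here is a rough sketch that labels an interface from a series of polled states. It assumes you already collect those states somehow; the interface names, the sample data, and the `max_transitions` threshold are hypothetical:

```python
def count_transitions(samples):
    """Count how many times consecutive samples differ (up -> down or down -> up)."""
    return sum(1 for prev, cur in zip(samples, samples[1:]) if prev != cur)


def classify(samples, max_transitions=1):
    """Label an interface from its polled states, oldest first."""
    if not samples:
        return "unknown"
    if count_transitions(samples) > max_transitions:
        return "flapping"
    return samples[-1]  # stable enough: report the latest state


# Hypothetical poll results, oldest sample first.
polls = {
    "Gi0/1": ["up", "up", "up", "up"],
    "Gi0/2": ["up", "down", "up", "down", "up"],
    "Gi0/3": ["up", "down", "down", "down"],
}

for name, samples in polls.items():
    print(name, classify(samples))
```

The threshold is a judgment call: a single transition is treated as a real state change, while anything more within the polling window is treated as instability.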

Network Troubleshooting Question: “What’s changed?”

The final question during a conventional network troubleshooting exercise is, “What’s changed?” This is a crucial step, as visibility into what changed will give you visibility into 50% of the problem.
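
If you keep saved copies of device configurations, a plain text diff already answers part of this question. Below is a minimal sketch using Python’s standard `difflib` module; the configuration snippets, the device name, and the `config_changes` helper are hypothetical:

```python
import difflib


def config_changes(before, after, name="device"):
    """Return unified-diff lines between two configuration snapshots."""
    return list(difflib.unified_diff(
        before.splitlines(),
        after.splitlines(),
        fromfile=f"{name} (before)",
        tofile=f"{name} (after)",
        lineterm="",
    ))


# Hypothetical snapshots of the same interface, taken before and after a change.
BEFORE = """\
interface GigabitEthernet0/1
 description uplink
 ip address 10.0.0.1 255.255.255.0
 no shutdown
"""

AFTER = """\
interface GigabitEthernet0/1
 description uplink
 ip address 10.0.0.2 255.255.255.0
 no shutdown
"""

for line in config_changes(BEFORE, AFTER, name="edge-router"):
    print(line)
```

Lines starting with `-` were removed and lines starting with `+` were added. A diff like this tells you what changed on one device, but not who changed it or why, so it complements change logs rather than replacing them.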

Now, all of the above may not seem like much when you skim the steps on paper. However, the reality of being in the trenches is alarmingly different.

What Happens in the Trenches

Network Troubleshooting Problems
Let’s take a brief look at the common network troubleshooting experience when you actually have to deal with the questions and problems described above:

1. First, you’re presented with an automatic monitoring alert indicating there’s a big issue that requires your attention. As part of determining the path, you work with network diagrams.

Even though these diagrams are the best resource you have, you’re stuck asking both “Where is the diagram?” and “Is the diagram up to date?” The latter is a particularly tough question because networks change so frequently, and unfortunately, without updated maps, you only have ping and traceroute to understand how traffic is propagating.

2. After this point, you’re stuck “getting in the weeds” with the command-line interface (CLI), probing configurations in an effort to ascertain the design. This entails pulling up design docs, et cetera.

3. Finally, you’re left with frantic consultations with your team to determine “What’s changed?” This involves poring over change logs, which is a ton of work.

Network Automation as an Alternative

The alternative to all the time-consuming nonsense described above is leaning on a network automation platform to powerfully transform the process of network troubleshooting.

Intrigued? Learn more about NetBrain’s solution to Network Troubleshooting.