
Using Automation to Avoid the VTP Bomb

April 6, 2018

Early in my career I laughed along with engineers making jokes about the dreaded VTP bomb, but I always thought the disaster stories were more of an exaggeration than reality. Surely VTP hadn’t destroyed networks as much as they suggested? There’s something about the simplicity of the VLAN Trunking Protocol that seems to make it dangerous in the hands of a careless engineer. And, unfortunately, I’ve been that careless engineer.

The VLAN Trunking Protocol, or VTP, is a technology used to make configuring VLANs faster and easier. Dynamic VLAN propagation ensures that all switches within the VTP domain have consistent VLAN configurations, and it also means that adding new switches is simpler because they dynamically inherit VLAN information once connected.

VTP is normally found in the access layer more than in other parts of a network. Especially in large organizations, access switches are moved, swapped, added and re-added relatively often. Sometimes it’s to replace a failed closet switch, and sometimes it’s to upgrade an aggregation switch to something with more ports or greater bandwidth. In a large network with many VLANs, this makes VTP a compelling technology for making switch deployment more efficient.

However, when not handled carefully, VTP can do tremendous damage.

Using VTP requires a strong knowledge of the network, including which switches are acting as VTP servers, which are in transparent mode, which are in client mode, and so on. When introducing new switches into a VTP domain in particular, this kind of network awareness is critical to avoiding a VTP incident.

The Old Way to Prevent a VTP Bomb

In the days when the ink on my CCNA certification was still wet, I worked on a switch refresh project for a large school district. The customer gave us several parameters such as default gateways, DNS servers, SNMP community strings, hostnames, and — you guessed it — VTP information, but we never did a discovery of their network. We didn’t have the hours to dig into their infrastructure, so instead of finding out what switch was the VTP server and what revision number it had, we simply slapped configs on switches and planned our cutovers.

The deployment progressed quickly as we spent evenings swapping out closet switches, and I enjoyed it very much. The school was quiet, and we had massive amounts of pizza and coffee to keep us going. At the end of the hall we had a small stereo playing our favorite classic rock station.

While we debated who was the best grunge band of all time, a security guard stopped by to tell us the bus garage was offline. He had no access to the security cameras, email, or the internet. He wasn’t very concerned, so neither were we.

However, when we saw all the APs blinking and wall-phones unregistered, we knew we had a problem. We were working on only one closet which had its share of connections for APs and phones, but it looked like the entire building was down. In fact, the bus garage was a separate building altogether, so we knew something was seriously hosed.

[Image: show vtp status CLI] Configuring VTP requires only a handful of commands, but manually investigating every switch, one by one, is error-prone and time-consuming.

Thankfully, figuring out the issue didn’t take long. We logged into the core switch and poked around. It didn’t take a room full of CCIEs to see that there were no VLANs on the switch. After more investigation we saw there were none on the entire network, other than VLAN 1, of course.

We had experienced the dreaded VTP bomb.

Someone plugged in a switch with a higher VTP revision number than everything else and wiped the entire network’s VLAN configuration. We didn’t know which switch was the culprit or which one of us did it, but I’m pretty sure it was me because my coworker was focused on cabling.
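The mechanism behind that wipe can be sketched in a few lines of Python. This is a deliberately simplified model of VTP version 1/2 behavior, not Cisco's implementation: a switch in server or client mode that hears an advertisement for its own domain with a higher configuration revision replaces its VLAN database with the advertised one. The `Switch` class and its `receive_advertisement` method are hypothetical names invented for illustration, and details such as passwords, pruning, and advertisement types are left out.

```python
from dataclasses import dataclass, field


@dataclass
class Switch:
    """Toy model of a VTP v1/v2 participant (illustrative only)."""
    name: str
    domain: str
    mode: str                      # "server", "client", or "transparent"
    revision: int = 0
    vlans: set = field(default_factory=lambda: {1})

    def receive_advertisement(self, sender: "Switch") -> bool:
        """Apply the sender's VLAN database if it looks newer.

        Returns True when this switch overwrote its own database.
        """
        # Transparent switches neither originate nor apply VTP updates.
        if "transparent" in (self.mode, sender.mode):
            return False
        # Advertisements from another domain are ignored.
        if sender.domain != self.domain:
            return False
        # Only a strictly higher revision wins; nothing checks whether
        # the advertised VLAN list is the one the operator wanted.
        if sender.revision <= self.revision:
            return False
        self.vlans = set(sender.vlans)
        self.revision = sender.revision
        return True


# Production switches carrying the real VLANs.
core = Switch("core", "SCHOOL", "server", revision=12, vlans={1, 10, 20, 30})
closet = Switch("closet-3", "SCHOOL", "client", revision=12, vlans={1, 10, 20, 30})

# A replacement switch with only VLAN 1 but a higher revision number.
new_switch = Switch("new-closet", "SCHOOL", "server", revision=57)

for sw in (core, closet):
    sw.receive_advertisement(new_switch)

print(core.vlans)    # prints {1}
print(closet.vlans)  # prints {1}
```

In this model the only thing protecting the production VLANs is the revision comparison, which is why knowing the revision number of a switch before connecting it matters so much.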

VTP bombs may not be quite as serious an issue anymore with the advent of VTP version 3, which introduced safeguards against this kind of thing. However, the real solution to VTP incidents is a thorough understanding of the network and proper configuration of new devices.

Configuring VTP requires only a few commands, and getting a good grasp of how VTP is operating on a network likewise requires only a handful of commands. The problem is that the access layer has many devices, making a thorough investigation tedious and easy for an engineer to dismiss.

  • show vtp status displays information such as the VTP version, configuration revision number, VTP domain name, and operating mode.
  • show vtp devices queries the VTP domain and displays discovered VTP servers and clients (a VTP version 3 command).
  • show vtp counters shows the VTP advertisement activity on a particular device.

In a network of even modest size, this would require going from switch to switch until an engineer was sure he or she covered every device. At best, doing this manually is error-prone and tedious. At worst, engineers avoid doing it altogether.
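To give a sense of what automating even the first of those commands involves, here is a minimal Python sketch. It assumes the text of show vtp status has already been collected from each switch by some other means (the collection step is out of scope here) and that the output uses "Label : value" lines. The two sample outputs are abbreviated and purely illustrative; the exact labels and layout vary by platform and software release, so the field names used below are assumptions to adjust. The function names `parse_vtp_status` and `summarize` are hypothetical.

```python
import re

# Matches "Some Label      : value"; requires whitespace before the colon
# so that timestamps such as 00:00:00 are not mistaken for fields.
FIELD_RE = re.compile(r"^\s*(.+?)\s+:\s*(.*)$")


def parse_vtp_status(text: str) -> dict:
    """Collect 'Label : value' lines into a dict keyed by lower-cased label."""
    fields = {}
    for line in text.splitlines():
        match = FIELD_RE.match(line)
        if match:
            fields[match.group(1).lower()] = match.group(2).strip()
    return fields


def summarize(outputs: dict[str, str]) -> list[dict]:
    """Build one row per switch, highest configuration revision first."""
    rows = []
    for host, text in outputs.items():
        f = parse_vtp_status(text)
        rows.append({
            "host": host,
            "domain": f.get("vtp domain name", ""),
            "mode": f.get("vtp operating mode", ""),
            # Assumed label; treat a missing or empty value as revision 0.
            "revision": int(f.get("configuration revision") or 0),
        })
    return sorted(rows, key=lambda r: r["revision"], reverse=True)


# Illustrative, abbreviated samples (not captured from real devices).
CORE_SAMPLE = """\
VTP Version                     : 2
Configuration Revision          : 12
VTP Operating Mode              : Server
VTP Domain Name                 : SCHOOL
"""

NEW_SAMPLE = """\
VTP Version                     : 2
Configuration Revision          : 57
VTP Operating Mode              : Server
VTP Domain Name                 : SCHOOL
"""

report = summarize({"core": CORE_SAMPLE, "new-closet": NEW_SAMPLE})
for row in report:
    print(row)
```

Sorting by revision puts the riskiest device at the top. A switch that is about to be connected and shows a higher revision than the production servers should have its revision reset first, which is commonly done by changing its VTP domain name or by switching it to transparent mode and back.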

The New Way to Prevent a VTP Bomb

NetBrain takes the pain out of doing this, or any other repetitive manual task across many devices. Built specifically to automate these sorts of tasks, its Runbook technology gives an engineer a framework for gathering all the relevant VTP information from a network with only a couple of clicks. In the screenshot below you can see a Dynamic Map of a discovered network on the right and a Runbook on the left used to automate the gathering of VTP information across the entire infrastructure.

[Image: check VTP status] The Dynamic Map highlights VTP roles (server, client, transparent), while the VTP domain name, VTP mode, running VTP version, configuration version, and VTP pruning mode are embedded as device-level data tables.

Also consider one of the more common VTP-related issues: a password mismatch. This is incredibly tedious to check one device at a time and is extremely prone to human error. Because a Runbook works from a Dynamic Map of discovered devices and has within it all the commands an engineer would run, NetBrain is able to automate an entire troubleshooting workflow — completing tasks in seconds rather than hours.
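The logic of such a check is simple to express once the data is in one place. The sketch below is not how NetBrain implements it; it only illustrates the comparison, assuming the VTP password of each switch has already been gathered (for example from show vtp password) into a list of hypothetical (hostname, domain, password) tuples. The function name `find_password_mismatches` is made up for this example.

```python
from collections import Counter, defaultdict


def find_password_mismatches(devices):
    """Flag switches whose VTP password differs from their domain's majority.

    devices: iterable of (hostname, domain, password) tuples.
    Returns {domain: [hostnames]} for domains with at least one outlier.
    If two passwords are equally common, the one seen first is treated
    as the reference, so ties deserve a manual look.
    """
    by_domain = defaultdict(list)
    for host, domain, password in devices:
        by_domain[domain].append((host, password))

    mismatches = {}
    for domain, members in by_domain.items():
        majority, _count = Counter(pw for _host, pw in members).most_common(1)[0]
        outliers = sorted(host for host, pw in members if pw != majority)
        if outliers:
            mismatches[domain] = outliers
    return mismatches


# Hypothetical inventory; closet-2 has a typo in its password.
inventory = [
    ("core", "SCHOOL", "s3cret"),
    ("closet-1", "SCHOOL", "s3cret"),
    ("closet-2", "SCHOOL", "secret"),
    ("lab-1", "LAB", "lab-pass"),
]

print(find_password_mismatches(inventory))  # prints {'SCHOOL': ['closet-2']}
```

The comparison itself takes milliseconds; the hours go into logging in to every switch to collect the values, which is the part automation removes.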

In the screenshot below, notice that the Runbook specifically runs actions, called Qapps, to find common VTP configuration problems such as password mismatches.

[Image: VTP password mismatch] Runbooks perform all the steps you would, only automatically instead of manually, command by command and device by device. Here, it checks for password mismatches and highlights the results right on the Dynamic Map.

But the power of Dynamic Maps and Runbooks is in how they enhance an entire workflow. In the next screenshot you can see that after checking for VTP password mismatches, the Runbook moves right into the next action to check VTP interface mismatches. In this way, the Dynamic Map and Runbook work together to create a completely automated environment for an engineer instead of having to use console connections and out-of-date spreadsheets.

[Image: VTP interface mismatch] Then the Runbook automatically runs another Qapp to check for interface mismatches, again across all devices in one fell swoop.

My co-worker and I were able to get the VLANs back onto the core switch and repair some of the damage, but we still needed to go switch by switch to find the stragglers and paste VLANs into them. We got an earful for our carelessness, but we ultimately got everything up and running, though it took hours of manual configuration.

Because I experienced my own VTP-related incident, I don’t laugh much anymore when someone makes a VTP joke. Unfortunately, I was the cause of it, but it wasn’t because I didn’t understand VTP or didn’t know the commands. I didn’t do my due diligence.

Make sure to take advantage of NetBrain’s free Instant Trial to see for yourself how powerful Dynamic Maps and Runbooks are in automating an enterprise network. Dozens of pre-built labs focused on specific technologies, including VTP, provide a sandbox to experiment with all of NetBrain’s features.

Discovering a network is tedious and time-consuming, but neglecting it can lead to huge problems. Automation software that mitigates human error and abstracts away the pain not only makes our jobs easier but also safeguards against outages such as the dreaded VTP bomb.
