Early in my career I laughed along with engineers making jokes about the dreaded VTP bomb, but I always thought the disaster stories were more of an exaggeration than reality. Surely VTP hadn’t destroyed networks as much as they suggested? There’s something about the simplicity of the VLAN Trunking Protocol that seems to make it dangerous in the hands of a careless engineer. And, unfortunately, I’ve been that careless engineer.
The VLAN Trunking Protocol, or VTP, is a technology used to make configuring VLANs faster and easier. Dynamic VLAN propagation ensures that all the switches within the entire VTP domain have consistent VLAN configurations, and it also means that adding new switches is simpler because they dynamically inherit VLAN information once connected.
VTP is normally found in the access layer more than in other parts of a network. Especially in large organizations, access switches are moved, swapped, added and re-added relatively often. Sometimes it’s to replace a failed closet switch, and sometimes it’s to upgrade an aggregation switch to something with more ports or greater bandwidth. In a large network with many VLANs, this makes VTP a compelling technology for making switch deployment more efficient.
However, when not handled carefully, VTP can do tremendous damage.
Using VTP requires a strong knowledge of the network, including which switches are acting as VTP servers and which mode every other switch is operating in, whether client or transparent. Especially when introducing new switches into a VTP domain, it’s critical to have this type of network awareness in order to avoid a VTP incident.
What is VTP in Networking?
VTP, short for VLAN Trunking Protocol, is a Layer 2 protocol that automatically manages and propagates VLAN configurations throughout a switched network. Switches exchange VTP advertisements to synchronize their VLAN databases, which keeps the VLAN configuration consistent within a VTP domain. A network technician can monitor the VTP advertisements from the server to verify that VLAN configuration updates are reaching all switches in the domain.
Because VTP simplifies and automates VLAN configuration across multiple switches in the same VTP domain, organizations often prefer to use it. While administrators of small networks may find it simple enough to configure each switch manually, using VTP on larger networks saves administrators from having to go from switch to switch.
Imagine having to physically access and manually configure hundreds of switches. VTP simplifies configuration because it works from a central server, synchronizing databases and ensuring consistency across the domain.
For organizations whose networks keep expanding, VTP handles the VLAN configuration of newly added switches. It automates the switch onboarding process, which reduces the administrators’ manual work and the errors that come with it. When you create a new VLAN, VTP automatically propagates it throughout the entire VTP management domain.
It’s important to note that the same features that make VTP useful for automation, namely automatic propagation and mandatory database synchronization, are what make a VTP bomb possible: an incorrect VLAN configuration update spreads just as readily as a correct one. These issues typically arise on switches that are not set to “vtp mode transparent”, which poses a significant risk to network stability and security. Read on to find out how to prevent a VTP bomb.
The Old Way to Prevent a VTP Bomb
In the days when the ink on my CCNA certification was still wet, I worked on a switch refresh project for a large school district. The customer gave us several parameters such as default gateways, DNS servers, SNMP community strings, hostnames, and, you guessed it, VTP information, but we never did a discovery of their network. We didn’t have the hours to dig into their infrastructure, so instead of finding out which switch was the VTP server and what revision number it had, we simply slapped configs on switches and planned our cutovers.
The deployment progressed quickly as we spent evenings swapping out closet switches, and I enjoyed it very much. The school was quiet, and we had massive amounts of pizza and coffee to keep us going. At the end of the hall we had a small stereo playing our favorite classic rock station.
While we debated who was the best grunge band of all time, a security guard stopped by to tell us the bus garage was offline. He had no access to the security cameras, email, or the internet. He wasn’t very concerned, so neither were we.
However, when we saw all the APs blinking and wall-phones unregistered, we knew we had a problem. We were working on only one closet which had its share of connections for APs and phones, but it looked like the entire building was down. In fact, the bus garage was a separate building altogether, so we knew something was seriously hosed.
Thankfully, figuring out the issue didn’t take long. We logged into the core switch and poked around. It didn’t take a room full of CCIEs to see that there were no VLANs on the switch. After more investigation we saw there were none on the entire network, other than VLAN 1, of course.
We had experienced the dreaded VTP bomb.
Someone plugged in a switch with a higher VTP revision number than everything else and wiped the entire network’s VLAN configuration. We didn’t know which switch was the culprit or which one of us did it, but I’m pretty sure it was me because my coworker was focused on cabling.
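To make the mechanism concrete, here is a minimal sketch of the update rule as it behaves in VTP versions 1 and 2: a switch in the same domain accepts any advertisement carrying a higher configuration revision, even from a client. The `Switch` class, the `receive_advertisement` function, and the sample names and numbers are all hypothetical, and the model ignores passwords, pruning, and VLAN 1, so treat it as an illustration rather than a faithful protocol implementation.

```python
from dataclasses import dataclass, field


@dataclass
class Switch:
    """Hypothetical, heavily simplified model of a switch's VTP state."""
    name: str
    domain: str
    mode: str                                   # "server", "client", or "transparent"
    revision: int = 0
    vlans: dict = field(default_factory=dict)   # user VLAN ID -> name (VLAN 1 not modeled)


def receive_advertisement(receiver: Switch, sender: Switch) -> bool:
    """Apply a simplified VTP v1/v2 rule.

    Returns True if the receiver's VLAN database was overwritten.
    """
    # Transparent switches neither originate nor apply database updates.
    if "transparent" in (receiver.mode, sender.mode):
        return False
    # Advertisements from another domain are ignored.
    if receiver.domain != sender.domain:
        return False
    # Only a strictly higher revision replaces the local database.
    if sender.revision <= receiver.revision:
        return False
    receiver.vlans = dict(sender.vlans)
    receiver.revision = sender.revision
    return True


# Assumed example values: a production core and a recycled closet switch.
core = Switch("core", "SCHOOL", "server", revision=12,
              vlans={10: "staff", 20: "students", 30: "voice"})
new_closet = Switch("closet-7", "SCHOOL", "client", revision=40, vlans={})

overwritten = receive_advertisement(core, new_closet)
print(overwritten, core.revision, core.vlans)
```

In this sketch the core ends up with revision 40 and an empty VLAN table, which is the situation we found ourselves in. Setting `new_closet.mode` to `"transparent"` before connecting it leaves the core untouched.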
VTP bombs may not be quite as serious an issue anymore with the advent of VTP version 3, which allows only a designated primary server to change the VLAN database. However, the real solution to VTP incidents is a thorough understanding of the network and proper configuration of new devices.
Configuring VTP requires only a few commands, and getting a good grasp of how VTP is operating on a network likewise requires only a handful of commands. The problem is that the access layer has many devices, making a thorough investigation tedious and easy for an engineer to dismiss.
- show vtp status displays information such as the VTP version, revision number, VTP domain name, and operating mode.
- show vtp devices queries the VTP domain and displays discovered VTP servers and clients.
- The output from the show vtp counters command will show an engineer the VTP activity on a particular device.
In a network of even modest size, this would require going from switch to switch until an engineer was sure he or she covered every device. At best, doing this manually is error-prone and tedious. At worst, engineers avoid doing it altogether.
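Part of that tedium can be scripted once the command output has been collected. The sketch below pulls the domain, mode, and revision number out of text shaped like show vtp status output. The sample text is abbreviated and illustrative, the exact field labels vary between platforms and software versions, and `parse_vtp_status` and `summarize` are hypothetical helper names, so the label strings would need adjusting for real devices.

```python
import re

# Illustrative, abbreviated sample; real output differs by platform and version.
SAMPLE_OUTPUT = """\
VTP Version                     : 2
Configuration Revision          : 7
VTP Operating Mode              : Server
VTP Domain Name                 : SCHOOL
VTP Pruning Mode                : Disabled
"""


def parse_vtp_status(text: str) -> dict:
    """Turn 'Label : value' lines into a dict keyed by the lower-cased label."""
    fields = {}
    for line in text.splitlines():
        match = re.match(r"^\s*(.+?)\s*:\s*(.*)$", line)
        if match:
            fields[match.group(1).lower()] = match.group(2).strip()
    return fields


def summarize(text: str) -> dict:
    """Extract the three values that matter most before connecting a switch."""
    fields = parse_vtp_status(text)
    return {
        "domain": fields.get("vtp domain name"),
        "mode": fields.get("vtp operating mode"),
        # Assumed label; falls back to 0 if the line is missing.
        "revision": int(fields.get("configuration revision", "0")),
    }


summary = summarize(SAMPLE_OUTPUT)
print(summary)
```

Running the same function over the saved output of every switch gives a quick table of who is a server, who is a client, and which device carries the highest revision number.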
The New Way to Prevent a VTP Bomb
NetBrain takes the pain out of doing this, or any other manual configuration on many devices. Built specifically to automate these sorts of tasks, its Runbook technology gives an engineer a framework to gather all the relevant VTP information from a network with only a couple of clicks. In the screenshot below you can see a Dynamic Map of a discovered network on the right and a Runbook on the left used to automate the gathering of VTP information across the entire infrastructure.
The Dynamic Map highlights VTP roles (server, client, and transparent), while the VTP domain name, VTP mode, VTP running version, configuration version, and VTP pruning mode are embedded as device-level data tables.
Also consider one of the more common VTP-related issues: a password mismatch. This is incredibly tedious to check one device at a time and is extremely prone to human error. Because a Runbook works from a Dynamic Map of discovered devices and has within it all the commands an engineer would run, NetBrain is able to automate an entire troubleshooting workflow — completing tasks in seconds rather than hours.
In the screenshot below, notice that the Runbook specifically runs actions, called Qapps, to find common VTP configuration problems such as password mismatches.
Runbooks perform all the steps you would — only automatically instead of manually command by command, device by device. Here, it checks for password mismatches and highlights the results right on the Dynamic Map.
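For readers who want to see the underlying logic, a password mismatch check boils down to comparing one value across every device. The following is a generic sketch of that idea, not a description of how a Qapp is implemented. It assumes the VTP password for each switch has already been collected into a dictionary (the `collected` data and the `find_password_mismatches` name are hypothetical) and that the most common value is the intended one.

```python
from collections import Counter


def find_password_mismatches(passwords: dict) -> list:
    """Return the names of devices whose VTP password differs from the most common one."""
    if not passwords:
        return []
    # Assumption: the majority value is the intended password.
    # On a tie, Counter keeps the value it encountered first.
    majority, _ = Counter(passwords.values()).most_common(1)[0]
    return sorted(name for name, pw in passwords.items() if pw != majority)


# Hypothetical inventory gathered from each switch beforehand.
collected = {
    "core": "s3cret",
    "closet-1": "s3cret",
    "closet-2": "secret",
    "closet-3": "s3cret",
}

mismatched = find_password_mismatches(collected)
print(mismatched)  # ['closet-2']
```

The comparison itself is trivial. The hours go into collecting the value from every device, which is the part worth automating.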
But the power of Dynamic Maps and Runbooks is in how they enhance an entire workflow. In the next screenshot you can see that after checking for VTP password mismatches, the Runbook moves right into the next action to check VTP interface mismatches. In this way, the Dynamic Map and Runbook work together to create a completely automated environment for an engineer instead of having to use console connections and out-of-date spreadsheets.
Then the Runbook automatically runs another Qapp to check for interface mismatches — again, across all devices in one fell swoop.
My co-worker and I were able to get the VLANs back onto the core switch and repair some of the damage, but we still needed to go switch by switch to find the stragglers and paste VLANs into them. We got an earful for our carelessness, but we ultimately got everything up and running, though it took hours of manual configuration.
Because I experienced my own VTP-related incident, I don’t laugh much anymore when someone makes a VTP joke. Unfortunately, I was the cause of it, but it wasn’t because I didn’t understand VTP or didn’t know the commands. I didn’t do my due diligence.
Discovering a network is tedious and time-consuming, but neglecting it can lead to huge problems. Automation software that mitigates human error and abstracts away the pain not only makes our jobs easier but also safeguards against outages such as the dreaded VTP bomb.