We get a lot of interest in our Problem Diagnosis Automation System (PDAS) these days, as the very core of how businesses manage their network operations has changed so dramatically over the past couple of years. The global pandemic was just the latest in a series of fundamental changes that are impacting the infrastructure world, so it’s no surprise that CIOs and their teams are looking for ways of operating networks smarter, not just bigger. CIOs know that business transformation has been instrumental to their growth, but it has also raised the stakes on the very nature of IT service delivery. They know what they need to deliver from the top down; it’s the translation of the business’s needs into network components where things get tough. At the end of the day, these CIOs know that when their network stops or degrades, their business stops or degrades, and the repercussions of any hiccups in IT service delivery can be disastrous to the long-term bottom line.
Given that, one thing that keeps surprising me is how much people want to know about NetBrain’s Dynamic Maps while avoiding the higher-impact automation parts of operational efficiency that are available now. Take, for example, NetBrain’s Runbook Automation. It’s one of the core technologies of NetBrain’s PDAS and a very compelling feature for network engineers looking to act more efficiently and get in front of the growing chaos regularly appearing in large enterprises. And when you couple network-intent-based management, an abstraction layer, with no-code network automation, the paradigm of managing networks is stood on its head… top-down management is finally here!
I get that the word “automation” might be intimidating to network engineers, since countless automation projects have come and gone with huge budgets and long development cycles, in many cases yielding no tangible results. Many network engineers continue to wait for network automation to become usable for THEIR problems, and they think their problems are so unique that they never really consider how those problems could be automated. So they continue to solve problems manually, one at a time, over and over, brute-force and ad hoc. Escalations to SMEs abound, and knowledge sharing is not part of the equation.
NetBrain has produced an incredible tool here, which is not only user-defined but is done entirely without scripts. By parsing individual tasks into the Runbook Automation tools, the network engineer can suddenly create complex and abstract tasks without ever having to write a line of code.
In this post, I’ve gathered and parsed information from our support staff and investigated five of the most common automation use cases NetBrain clients are implementing on their networks.
#5: Collecting Data for Incident Escalation
Here, the client has created a Runbook to gather OSPF data from their network. Once all of the data is collected and visualized, it will be made available to more senior members of the staff who are troubleshooting a network issue.
This might look like a complex diagram, but let’s break down exactly what is occurring here.
A custom Qapp is highlighting and gathering all of the routing information present in the designated environment, and compiling this information into several notes that are placed on the Dynamic Map. Here, in this example, all of the relevant OSPF information –
A second Qapp is monitoring the running status of OSPF – it’s simply highlighting the neighbor count and OSPF routes on the map.
A batch CLI node (which is the automated execution of CLI commands, more on that in #2) is collecting relevant information about these devices so that more useful information can be escalated to higher-tier engineers.
This is important for several reasons.
First – this reduces the overhead involved in troubleshooting; team members no longer need to duplicate the retrieval of basic information from the network in order to do their job properly. It also enables multiple teams, like networking and security, to work together in the event of breaches or security incidents since everyone’s working from the same frame of reference.
Second – this Runbook also creates a snapshot of the network at a certain point in time and can be used when planning out future network changes to avoid possible errors or outages.
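For readers who like to see the idea in code, here is a minimal sketch of what this Runbook does conceptually: parse OSPF neighbor output, count the neighbors, and bundle a timestamped snapshot for escalation. NetBrain does this without scripts, so none of this is NetBrain code. The `EscalationPackage` class, the function names, and the sample output (written in the style of Cisco IOS `show ip ospf neighbor`) are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Illustrative sample only, in the style of Cisco IOS "show ip ospf neighbor".
SAMPLE_OUTPUT = """\
Neighbor ID     Pri   State           Dead Time   Address         Interface
10.0.0.2          1   FULL/DR         00:00:36    192.168.1.2     GigabitEthernet0/0
10.0.0.3          1   EXSTART/BDR     00:00:31    192.168.1.3     GigabitEthernet0/1
"""


@dataclass
class EscalationPackage:
    """Hypothetical container for what gets handed to senior engineers."""
    device: str
    collected_at: str
    neighbors: list = field(default_factory=list)

    @property
    def neighbor_count(self) -> int:
        return len(self.neighbors)

    def not_full(self) -> list:
        # Anything other than FULL is listed for review. Note that 2WAY can be
        # normal between non-designated routers on a broadcast segment.
        return [n for n in self.neighbors if not n["state"].startswith("FULL")]


def parse_ospf_neighbors(output: str) -> list:
    """Turn the neighbor table into a list of dicts, one per neighbor row."""
    neighbors = []
    for line in output.splitlines():
        parts = line.split()
        # Data rows start with a dotted-decimal neighbor ID; the header does not.
        if len(parts) >= 6 and parts[0].count(".") == 3:
            neighbors.append({
                "neighbor_id": parts[0],
                "state": parts[2],
                "address": parts[-2],
                "interface": parts[-1],
            })
    return neighbors


def build_package(device: str, output: str) -> EscalationPackage:
    """Bundle parsed data with a UTC timestamp so it doubles as a snapshot."""
    return EscalationPackage(
        device=device,
        collected_at=datetime.now(timezone.utc).isoformat(),
        neighbors=parse_ospf_neighbors(output),
    )


package = build_package("core-rtr-1", SAMPLE_OUTPUT)
print(package.neighbor_count, [n["neighbor_id"] for n in package.not_full()])
```

The timestamp is the part that makes the same bundle useful later as a point-in-time reference when planning changes.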
#4: Integration with ServiceNow / “Just-in-time” Automation
Any IT department that gets large enough is going to need a ticketing system, among other services. Where this gets tough is when people find themselves hopping through different applications and silos to find all of the information they need to troubleshoot an issue, or when a change process becomes stuck because it hasn’t received a stamp of approval yet.
NetBrain offers its own internal verification system, but it can also integrate with third-party ticketing systems like ServiceNow.
Being able to link ServiceNow tickets to a map comes with multiple benefits.
Change management events can still use ServiceNow as an integrated approval platform, instead of creating multiple areas where people need to sign off on network modifications.
The map of a problem area or potential network change can be directly linked to the ServiceNow ticket, allowing everyone who has access to the ticket to also have access to the area of the network being affected, creating an enhanced degree of visibility for IT processes.
Problems that receive a ticket can trigger NetBrain via API to create a map of the problem area (aka “Just-in-time” automation).
This diagram shows a ServiceNow ticket being created and linked with a NetBrain Dynamic Map, as seen through the NetBrainMapURL in the top right corner. This map represents the ‘problem area’ identified with the ticket. During ‘Just-in-time’ automation, the system performs basic diagnostics such as an Overall Health Monitor and Raw CLI data collection (seen in Runbook Automation Tools) and provides this information to the engineer as they arrive on the scene, eliminating the time they’d otherwise spend performing the same tasks.
The best part is, this isn’t limited to one application. Any ticketing system with an API can be integrated into NetBrain to achieve the same effect.
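To make the “Just-in-time” flow concrete, here is a rough sketch of the sequence: a new ticket arrives, a map is requested for the affected device, basic diagnostics run, and the ticket is updated with the map link. This is not the NetBrain or ServiceNow API. The `create_map` and `run_diagnostics` callables are hypothetical stand-ins for whatever API calls your integration actually makes, and the ticket fields other than `NetBrainMapURL` are my own assumptions.

```python
from typing import Callable


def handle_new_ticket(
    ticket: dict,
    create_map: Callable[[str], str],
    run_diagnostics: Callable[[str], dict],
) -> dict:
    """Return a copy of the ticket enriched with a map link and diagnostics.

    create_map and run_diagnostics are injected so the flow can be shown
    without assuming any real endpoint or client library.
    """
    device = ticket["affected_device"]  # hypothetical field name
    map_url = create_map(device)
    diagnostics = run_diagnostics(device)
    # Build a new dict so the incoming ticket is left untouched.
    return {**ticket, "NetBrainMapURL": map_url, "diagnostics": diagnostics}


# Stand-ins for the real integrations; the URL below is a placeholder.
def fake_create_map(device: str) -> str:
    return f"https://netbrain.example.invalid/map?device={device}"


def fake_run_diagnostics(device: str) -> dict:
    return {"overall_health": "not collected (stub)", "raw_cli": []}


ticket = {"number": "INC0001", "affected_device": "core-rtr-1"}
updated = handle_new_ticket(ticket, fake_create_map, fake_run_diagnostics)
print(updated["NetBrainMapURL"])
```

Passing the integrations in as functions is a deliberate choice here: the same flow would apply to any ticketing system with an API, and only the two callables would change.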
#3: Compliance Checks
This Runbook automates compliance checks and records the results. In this specific scenario, the client needs to check devices for unsafe community strings, password encryption strength, SSH access, and VTY timeouts beyond safe thresholds. Afterwards, it will take a configuration snapshot that will be archived for the user.
This type of Runbook is useful for several reasons:
First – If you are ever in a position where you need to provide security information about your network to a third party, this Runbook will streamline your compliance timeline down to a matter of minutes.
Second – This information is incredibly valuable to security and networking teams, as they’re notified of potential security issues that may be lurking in parts of the network they don’t deal with on a daily basis. Once they have this information, the #2 Runbook would come in handy!
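If you are curious what checks like these boil down to, here is a small sketch that scans a configuration for the four items named above. It assumes Cisco IOS-style configuration syntax, and the 10-minute VTY timeout threshold and the list of unsafe community strings are assumptions of mine, not NetBrain defaults. NetBrain does this without scripting, so treat it only as an illustration of the logic.

```python
import re

UNSAFE_COMMUNITIES = {"public", "private"}  # assumed list of unsafe strings
MAX_VTY_TIMEOUT_MIN = 10                    # assumed safe threshold, in minutes

SAMPLE_CONFIG = """\
snmp-server community public RO
enable password cisco123
line vty 0 4
 exec-timeout 30 0
 transport input telnet ssh
"""


def check_compliance(config: str) -> list:
    """Return a list of human-readable findings for an IOS-style config."""
    findings = []
    in_vty = False
    for raw in config.splitlines():
        line = raw.strip()
        # Unindented lines start a new section; track whether it is a VTY block.
        if not raw.startswith(" "):
            in_vty = line.startswith("line vty")

        match = re.match(r"snmp-server community (\S+)", line)
        if match and match.group(1).lower() in UNSAFE_COMMUNITIES:
            findings.append(f"unsafe SNMP community: {match.group(1)}")

        if line.startswith("enable password"):
            findings.append("enable password used instead of enable secret")

        if in_vty:
            match = re.match(r"exec-timeout (\d+)(?: (\d+))?", line)
            if match:
                minutes = int(match.group(1)) + int(match.group(2) or 0) / 60
                # A timeout of 0 0 means the session never times out.
                if minutes == 0 or minutes > MAX_VTY_TIMEOUT_MIN:
                    findings.append(f"VTY timeout out of range: {line}")
            if line.startswith("transport input"):
                allowed = set(line.split()[2:])
                if allowed & {"telnet", "all"}:
                    findings.append("VTY allows non-SSH access")
    return findings


for finding in check_compliance(SAMPLE_CONFIG):
    print(finding)
```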
#2: Automating Command Line Execution
Here, the user is issuing commands in batches to multiple devices on the network. Once these commands have been executed, they can be saved as their own Runbook. Thanks to a robust CLI parser library, NetBrain can execute a wide range of data collection and troubleshooting commands on devices from just about any vendor that has a CLI. If you need to regularly pull specific information from a device (for example, interface configurations, ACLs, VLANs, virtualized infrastructure, or routing tables), NetBrain likely already knows the specific actions you want to enter and has them available in the ‘Execute CLI Commands’ pane. From there, it logs into the device or devices and collects the information for you.
This is useful because it essentially turns a troubleshooting procedure into a button click. If the Runbook itself clearly defines what the commands are meant to do, it can be passed around the networking team and used to resolve the issue quickly. Speaking of which, this hews very close to the #1 choice in this list.
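As a rough picture of what batch execution means, the sketch below runs a list of commands against a list of devices and collects the output per device. The `run_command` callable is a hypothetical stand-in for however a session to the device is actually opened; the stub here only echoes the command, and the device names are made up.

```python
from typing import Callable, Dict, Iterable


def run_batch(
    devices: Iterable[str],
    commands: Iterable[str],
    run_command: Callable[[str, str], str],
) -> Dict[str, Dict[str, str]]:
    """Run every command on every device and collect output by device."""
    commands = list(commands)  # allow reuse across devices
    results: Dict[str, Dict[str, str]] = {}
    for device in devices:
        results[device] = {}
        for command in commands:
            try:
                results[device][command] = run_command(device, command)
            except Exception as exc:
                # Record the failure and keep going with the other devices.
                results[device][command] = f"ERROR: {exc}"
    return results


# Stub runner for illustration: echoes the command, fails for one device.
def fake_run_command(device: str, command: str) -> str:
    if device == "unreachable-sw":
        raise ConnectionError("device did not respond")
    return f"{device}# {command}"


output = run_batch(
    ["core-rtr-1", "unreachable-sw"],
    ["show ip route", "show vlan brief"],
    fake_run_command,
)
print(output["core-rtr-1"]["show ip route"])
```

Recording errors per device instead of stopping at the first failure mirrors the point of a saved Runbook: one run should give the team a complete picture, including which devices could not be reached.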
#1: Troubleshooting Repetitive Complex Issues
An aggregation of the strongest parts of the previous automation use cases, this is a clear front-runner for “most automated task.” Half-documentation, half-troubleshooting, a use case like this also demonstrates the if/then nature of the Runbooks – depending on the results of certain steps, different branches within the decision tree are activated.
This diagram seems rather complex, but it follows the same (if more involved) tree of logic that all Runbooks follow. Once the original Qapps (Monitor OSPF Neighbors, Highlight OSPF Configuration) are done gathering intelligence, we encounter the first ‘branched’ Runbook.
If the OSPF states of certain devices are not established, the system will present the user with 5 variant Qapps to troubleshoot various ‘non-functioning’ states (Not Established, INIT state 8, Loading, Two-Way State, and Exstart or Exchange State). Based on the output of the data collection, the user can take multiple branching troubleshooting paths through the runbook, each containing a slightly different set of processes and commands to help tackle the issue at hand.
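The if/then branching described above can be pictured as a simple lookup from the observed OSPF neighbor state to a troubleshooting branch. The branch labels are taken from the list above; the mapping from raw state strings to those labels, and the rule that unknown states fall back to “Not Established”, are my assumptions for the sake of the sketch.

```python
from typing import Optional

# Assumed mapping from a raw neighbor state to a branch label in the Runbook.
BRANCHES = {
    "INIT": "INIT state 8",
    "LOADING": "Loading",
    "2WAY": "Two-Way State",
    "EXSTART": "Exstart or Exchange State",
    "EXCHANGE": "Exstart or Exchange State",
}


def choose_branch(state: str) -> Optional[str]:
    """Pick a troubleshooting branch, or None if the adjacency is FULL."""
    # States are often reported with a role suffix, e.g. "FULL/DR".
    base = state.split("/")[0].strip().upper()
    if base == "FULL":
        return None  # adjacency is established, nothing to troubleshoot
    return BRANCHES.get(base, "Not Established")


for observed in ["FULL/DR", "EXSTART/BDR", "2WAY/DROTHER", "DOWN"]:
    print(observed, "->", choose_branch(observed))
```

Keeping the mapping in one table means a new branch is a one-line addition, which is roughly the flexibility the branched Runbook gives you without any code at all.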
As you can see, NetBrain has a lot of out-of-the-box automation capabilities that can help with troubleshooting, documentation, and service integrations. Automation is as essential to the product as its robust discovery and Dynamic Mapping features.
It’s easy enough for beginners to pick up, but robust enough that power users won’t easily reach the upper limits of what it can accomplish. With a Runbook pointed at the appropriate network or business processes, an engineer is empowered beyond what they’d normally be capable of.