Network Automation (PDAS)

Preventive Automation

NetBrain’s Intent-Based Automation proactively validates that your hybrid network is doing the job expected of it for all of its applications, catching problems before they impact the business and conducting immediate root cause analysis to enable faster incident resolution.

Preventive automation is the ability for network operational tasks to be performed continuously without any operator intervention. Capture design intent and troubleshooting knowledge in Network Intents (without any coding expertise) and run them continuously in the background.

At scale, Network Intents look for various changes to conditions (e.g., config drift due to human error, policy changes, performance degradation, security holes) and allow you to proactively resolve them by enacting policy enforcement of design rules, best practices, and security policies. This provides intelligent and proactive network status monitoring and automation of the health check of your entire network. Compare the current configurations of thousands of devices for changes that may have occurred outside of the normal processes or verify the end-to-end performance of the connection available between two business services (network connectivity intents).
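The configuration comparison mentioned above can be illustrated with a minimal sketch. It is not NetBrain's implementation; it only assumes that a baseline and a current configuration are available as plain text, and it uses Python's standard `difflib` module. The function name `config_drift` and the sample configurations are hypothetical.

```python
import difflib

def config_drift(baseline: str, current: str) -> list[str]:
    """Return the lines removed from ('- ') or added to ('+ ') the baseline."""
    diff = difflib.ndiff(baseline.splitlines(), current.splitlines())
    # ndiff also emits unchanged ('  ') and hint ('? ') lines; keep only changes.
    return [line for line in diff if line.startswith(("+ ", "- "))]

# Hypothetical sample data: the NTP server was changed outside the normal process.
baseline_cfg = """hostname core-sw1
ntp server 10.0.0.1
snmp-server community s3cret RO"""

current_cfg = """hostname core-sw1
ntp server 10.0.0.9
snmp-server community s3cret RO"""

drift = config_drift(baseline_cfg, current_cfg)
for line in drift:
    print(line)
```

An empty result means the device still matches its baseline; any other result is a candidate for review.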

Preventive Automation Framework (PAF) Component Overview

NetBrain’s preventive automation consists of the following two components:

  • Adaptive Monitoring
  • Network Intent/Network Intent Cluster (NI/NIC) Automation

Adaptive Monitoring

NetBrain’s adaptive monitoring system uses customizable software probes to detect early warning signals before end users or applications experience negative impacts. Examples include:

  • First-Occurrence Issues: Config Change, Failover, etc.
  • Transient Problems: Link Utilization Spikes, Route Flapping, STP Oscillation, etc.

Adaptive monitoring leverages flash probes to identify certain network alerts. Flash probes can be defined on SNMP/CLI parser variables or can receive alerts directly from external systems such as Splunk. Unlike traditional network monitoring tools, which use SNMP for monitoring, adaptive monitoring lets you create more network-specific probes for monitoring and identifying potential problems.

 

Network Intent/Network Intent Cluster (NI/NIC) Automation

Network Intent/Network Intent Cluster (NI/NIC) automation is the automated diagnosis and rule check triggered by early warning alerts from a set of logic probes. Together, these two components make PAF different from traditional monitoring systems in that:

  • PAF is designed to enforce design and security rule checks automatically, rather than to monitor system errors.
  • PAF must be customized based on network design; it is not one-size-fits-all.
  • PAF can serve as a next-gen compliance check solution: a 24x7 compliance check.

Network Intent

Network Intent provides users a no-code way to define the network design for a specific network device, its design baselines, and how to verify design enforcement. It documents network design intent, allowing other engineers to quickly understand a device’s design and its baseline, or normal, state. More importantly, it provides a way to validate and verify network design without any code. When a network problem occurs, one or more NIs are violated. In the postmortem stage of the problem, the violated NIs are codified and automatically monitored. The next time a similar situation occurs, it can be resolved automatically, significantly reducing MTTR.

Network Intent as Automation Unit

As part of Adaptive Monitoring Automation, a NI is triggered automatically by a Flash Probe, which runs as a backend process and periodically monitors the status of the entire network. When NetBrain detects a flash alert, the system automatically sends notifications to the appropriate NetOps personnel. You can click the Incident/Map hyperlinks to open the map or incident in NetBrain Workstation. A corresponding runbook with time-of-event data is also available in the map interface to assist with root cause analysis. End users view the triggered NI results, together with the flash probe, via a Decision Tree.

Monitor Network Intents

NetBrain Adaptive Monitoring Automation uses a set of scalable, hierarchical logical flash probes as monitoring units to detect data anomalies on a single device via SNMP/CLI data polling and advanced anomaly analysis algorithms. As soon as a probe detects an anomaly, the system immediately executes the pre-defined network intents, whose results provide critical references for root cause analysis and speed up the troubleshooting process.

 

Trigger the Execution of Network Intent (NI)

Flash probes generate flash alerts to trigger the execution of a NI. When an incident occurs, the adaptive monitoring system captures the problem at the time of the incident. The NI executes, comparing the network status with the pre-determined threshold, previous status, or baseline data, and shows the results in the IBA dashboard or shares them with the user. The diagnosis is provided at the time of the event, allowing the operator to check the problem before end users experience performance degradation or outages, which prevents a serious network impact.
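The three kinds of comparison named above (threshold, previous status, and baseline) can be sketched as follows. This is an illustration only, not how NetBrain evaluates a Network Intent; the `Check` class, the `evaluate` function, and the sample values are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Check:
    """One hypothetical intent check on a single numeric variable."""
    name: str
    threshold: Optional[float] = None  # alert if the value exceeds this
    baseline: Optional[float] = None   # expected "normal" value
    tolerance: float = 0.0             # allowed deviation from the baseline

def evaluate(check: Check, value: float, previous: Optional[float] = None) -> list[str]:
    """Compare a value against the threshold, previous value, and baseline."""
    findings = []
    if check.threshold is not None and value > check.threshold:
        findings.append(f"{check.name}: {value} exceeds threshold {check.threshold}")
    if previous is not None and value != previous:
        findings.append(f"{check.name}: changed from {previous} to {value}")
    if check.baseline is not None and abs(value - check.baseline) > check.tolerance:
        findings.append(f"{check.name}: {value} deviates from baseline {check.baseline}")
    return findings

# Hypothetical data captured at the time of a flash alert.
cpu = Check(name="cpu_percent", threshold=80.0, baseline=35.0, tolerance=20.0)
findings = evaluate(cpu, value=92.0, previous=40.0)
for finding in findings:
    print(finding)
```

Each finding corresponds to one of the comparison modes, so a single execution can report a threshold breach, a change since the last poll, and a departure from the baseline at the same time.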

Scalable Adaptive Monitoring

The Adaptive Monitoring System scales horizontally through:

  • Distributed analysis on front servers: data retrieval and flash probe calculation are executed locally on front servers, so the system can scale to a very large network by adding distributed front servers.
  • Hierarchical analysis from Primary Probe -> Secondary Probe -> Network Intent: the hierarchical design allows the system to use resources efficiently when running Network Intent automation across the entire network.
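The resource saving of the hierarchical design can be shown with a short sketch: a cheap primary probe runs on every reading, a secondary probe runs only when the primary fires, and the full diagnosis runs only when both fire. The function names, the reading fields, and the thresholds below are assumptions made for illustration; they are not NetBrain probe definitions.

```python
from typing import Callable, Optional

Reading = dict  # e.g. {"if_util": 0.95, "crc_errors": 250}

def run_hierarchy(
    reading: Reading,
    primary: Callable[[Reading], bool],
    secondary: Callable[[Reading], bool],
    intent: Callable[[Reading], str],
) -> Optional[str]:
    """Run the primary probe first and escalate only when it fires."""
    if not primary(reading):
        return None         # nothing unusual, stop here
    if not secondary(reading):
        return None         # the primary alert was not confirmed
    return intent(reading)  # the full diagnosis runs only now

# Hypothetical probes and diagnosis.
def high_utilization(r: Reading) -> bool:
    return r["if_util"] > 0.9

def crc_errors_rising(r: Reading) -> bool:
    return r["crc_errors"] > 100

def diagnose_link(r: Reading) -> str:
    return f"diagnose link: util={r['if_util']}, crc={r['crc_errors']}"

readings = [
    {"if_util": 0.40, "crc_errors": 0},
    {"if_util": 0.95, "crc_errors": 0},
    {"if_util": 0.97, "crc_errors": 250},
]
results = [
    run_hierarchy(r, high_utilization, crc_errors_rising, diagnose_link)
    for r in readings
]
print(results)
```

Only the third reading reaches the diagnosis step, which is the point of the hierarchy: the expensive work is reserved for confirmed anomalies.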

Intent Library Subscription Service

Examples of common problem diagnosis situations contained in the PDAS Intent Library

NetBrain’s PDAS Intent Library service provides a continuously expanding library of pre-built, expertise-based automation units that are ready to use out of the box. These automation units address the most common scenarios seen in the vast majority of enterprises, both for event-driven responses (such as those reported via a network helpdesk service ticket) and for proactive design-level compliance, security, and application performance support verifications.

The Intent Library is extensible as well. Through no-code mechanisms built into the PDA System, your own subject matter experts can create additional situation- and site-specific automation routines and add them to the Intent Library. Any network engineer or operator can use these routines to quickly and accurately solve problems when they recur. Subject matter expertise remains available even when the subject matter experts are not.

The NetBrain Intent Library is leveraged throughout the PDA System. When coupled with an ITSM/ITOM system, NetBrain’s PDAS-triggered automation draws from the Intent Library to implement the most useful set of diagnostics in response to specific events.

 

Enforce Configuration & Design Rules

Each network has its own design intent and configuration standards. By leveraging intent-based automation, you can encode the design intent and configuration standards into the Network Intents. These intents can be shared across the entire team and can be verified periodically.

Proactively Monitor Application Performance

IT operations professionals face a number of challenges in their efforts to proactively monitor IT infrastructure and application performance and mitigate performance degradations. Today’s complex enterprise IT environments are characterized by a mix of physical and virtualized infrastructure located across multiple remote sites and data centers, in addition to cloud environments and “as-a-Service” platforms.

Preventive Automation allows you to define the monitoring parameters via SNMP/CLI, giving you a variety of ways to achieve problem-based monitoring, and you can easily view the results via the dashboard.

Automate Compliance and Security Check

Preventive Automation allows you to define the compliance and security check rules within Network Intents and expand the check to the entire network via Network Intent Cluster, which the system executes automatically.
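As a rough illustration of applying the same compliance rules to many devices, the sketch below checks each configuration against a small rule set. In NetBrain the rules are defined without code inside Network Intents; the `RULES` table, the `check_config` function, and the device names here are hypothetical, and the rules are expressed as regular expressions only for the sake of the example.

```python
import re

# Hypothetical rules: (description, regex, must_match).
# must_match=True means the line is required; False means it is forbidden.
RULES = [
    ("SSH v2 enabled", r"^ip ssh version 2$", True),
    ("Telnet not allowed on VTY lines", r"^\s*transport input .*telnet", False),
]

def check_config(config: str) -> list[str]:
    """Return the descriptions of all rules that the configuration violates."""
    violations = []
    for description, pattern, must_match in RULES:
        found = re.search(pattern, config, flags=re.MULTILINE) is not None
        if found != must_match:
            violations.append(description)
    return violations

# Hypothetical device configurations.
configs = {
    "edge-rtr1": "ip ssh version 2\nline vty 0 4\n transport input ssh",
    "edge-rtr2": "line vty 0 4\n transport input telnet ssh",
}
report = {device: check_config(cfg) for device, cfg in configs.items()}
print(report)
```

Running one rule set over every device is the same idea, in miniature, as expanding a check from a single Network Intent to the whole network.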

Automate Well-Known Diagnosis

Network issues can significantly reduce productivity. Every time a network issue occurs, it takes time to identify and fix it. Some problems occur multiple times, so you need to identify these “well-known” problems in your network and prevent them from happening in the future. Intent-based automation allows you to run network intents automatically without human intervention. This automates the diagnosis of all “well-known” problems in your network and identifies them instantly, reducing the volume of service tasks.

Notify Users or 3rd-party Systems with Alerts

Generate intent-based automation alerts and share them with other users by email. You can subscribe to certain types of alerts and stream them to your alert or incident management system for handling.

Email alerts can be used to create tickets in 3rd-party systems (e.g., ServiceNow). NetBrain creates internal tickets with the NI/NIC diagnosis results, so you can view the ticket created in the ticketing system and use the link to open the incident in NetBrain.

Triggered Automation

Network problems are often organized by a ticket system in the form of incidents. In the real world, 95% of network problems are repetitive in nature: identical or similar problems happen again and again, yet each one is diagnosed the same way every time without automation. NetBrain provides the Triggered Automation Framework (TAF) in NetBrain PDAS to fill this gap. The TAF is a set of NetBrain components that process API calls in various formats, dynamically classify them into incident types, execute the related NI/NIC automation, and deliver diagnosis and visualization results. The diagnosis output of TAF is listed inside NetBrain’s incident pane as a series of machine-generated messages with hyperlinks.

 

Integrate with ITSM Systems for Problem-Triggered Diagnosis

Integrating NetBrain’s PDAS with ITSM systems such as ServiceNow or BMC Remedy enables subsequent tickets to be diagnosed by the automation engine.

Integration between NetBrain and the ITSM system can be done in one of two ways:

  • Through a purpose-built integration app, such as the ServiceNow App or the Splunk App.
  • Through the REST API library, as with the BMC integration.

The Triggered Automation Framework (TAF) within PDAS has been upgraded to 2.0: the integration with the external ticket system needs to happen only once, and customization of triggered diagnosis can then occur continuously inside the NetBrain system, often without any coding (integrate once, customize any time).

Using Network Intent Cluster for Triggered Diagnosis

After ticket system integration, a new ticket that has fields matching the trigger rules will send an API call to NetBrain with a pre-defined payload; NetBrain in turn will launch the following actions:

  • Incoming API calls are classified into a pre-defined “incident type”.
  • Incoming incidents are merged into an existing NetBrain incident or a newly created one.
  • A dynamic map is created or opened for the incident.
  • The matched diagnosis is executed.
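The sequence above can be sketched in a few lines. This is a conceptual illustration, not the TAF API: the payload fields (`ticket_id`, `device`, `description`), the keyword-based classification, and the diagnosis names are all hypothetical, and the map-creation step is left out because it has no meaningful stand-alone equivalent.

```python
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical trigger rules: keyword in the ticket description -> incident type.
INCIDENT_TYPES = {
    "slow": "application-slowness",
    "down": "interface-down",
}

# Hypothetical mapping from incident type to the diagnoses to execute.
DIAGNOSES = {
    "application-slowness": ["path-latency-check", "qos-policy-check"],
    "interface-down": ["interface-status-check"],
}

@dataclass
class Incident:
    incident_type: str
    device: str
    tickets: list = field(default_factory=list)
    diagnoses: list = field(default_factory=list)

def classify(payload: dict) -> Optional[str]:
    """Step 1: classify the incoming call into a pre-defined incident type."""
    text = payload["description"].lower()
    for keyword, incident_type in INCIDENT_TYPES.items():
        if keyword in text:
            return incident_type
    return None

def handle_ticket(payload: dict, open_incidents: dict) -> Optional[Incident]:
    incident_type = classify(payload)
    if incident_type is None:
        return None  # no trigger rule matched
    # Step 2: merge into an existing incident or create a new one.
    key = (incident_type, payload["device"])
    incident = open_incidents.get(key)
    if incident is None:
        incident = Incident(incident_type, payload["device"])
        # Final step: select the diagnoses matched to this incident type.
        incident.diagnoses = list(DIAGNOSES[incident_type])
        open_incidents[key] = incident
    incident.tickets.append(payload["ticket_id"])
    return incident

incidents = {}
first = handle_ticket(
    {"ticket_id": "INC001", "device": "core-sw1", "description": "Interface Gi0/1 is down"},
    incidents,
)
second = handle_ticket(
    {"ticket_id": "INC002", "device": "core-sw1", "description": "Uplink down again"},
    incidents,
)
print(first.tickets, first.diagnoses)
```

Because both tickets describe the same incident type on the same device, the second one is merged into the incident created by the first instead of opening a new one.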

Display Map and Diagnosis Result in NetBrain Incident

The NetBrain incident contains the created map and diagnosis results for an incoming ticket and serves as the central place for all troubleshooting-related information. Users can log into NetBrain or the NetBrain Incident Portal to view this data.

Besides being viewable within NetBrain or the NetBrain Incident Portal, the NetBrain incident, map, and diagnosis results can also be displayed in the integrated ticket system, such as ServiceNow.

Self-Service Automation

Network performance problems and other service-level incidents may come to the attention of various support teams outside of the typical NetOps channels, so allowing any support team to interact with NetBrain’s intelligence is imperative. Users can trigger problem diagnosis automation directly from ITSM solutions (such as ServiceNow), via Microsoft Teams, or even with nothing more than an email to quickly reduce data gathering time, MTTR, and further escalations.

Self-service automation empowers all levels of the support process. With self-service options, any support personnel, not just network engineers, can participate in the high-level diagnosis of the network and resolve issues long before network engineers are assigned. System, security, application, and even level-1 helpdesk engineers can access network automation to quickly diagnose issues in real time, while the problem is being observed. Self-service options can be customized by role, and results can be provided from either the management console or the incident portal, allowing for access at all levels of the escalation path.

Self-service from ITSM

Self-service automation resides within incident-based collaboration systems including NetBrain’s Incident Portal and incident pane or integrated ITSM tools like ServiceNow. Much like cases where ITSM triggers NetBrain automatically, self-service allows the problem to be remediated in a fraction of the time a typical NetOps response would incur.

Integration with ITSM system serves two resource-related purposes:

  • For non-structured tickets created manually, when automated trigger rules are not activated, users can manually launch a diagnosis with a similar effect to the automated trigger.
  • For senior network engineers resolving an IT problem, it provides a way to share automated diagnosis functions with junior network engineers or non-network engineers without requiring them to log in to the NetBrain system.

 

Microsoft Teams Automation Bot

Initiate automated maps, diagnoses, or rule checks directly from Microsoft Teams with NetBrain’s Automation Teams Bot. Engage the bot from the traditional Microsoft Teams Chat window and follow short prompts to trigger the exact automation desired. See the results directly from the NetBrain console or from the Incident Portal to collaborate with non-NetBrain users with read-only access. The Teams bot goes to work diagnosing problems in real-time during the Chat conversation, making an incident portal available to all support resources that wish to collaborate to resolve the problem.

Customize NetBrain’s Automation Bot for Microsoft Teams for specific user roles. Control the automation and devices available per role.

Email Automation Bot

If Teams is not an option or is unreachable for any reason, NetBrain’s Automation Bot is also available via email. Similar to the Microsoft Teams bot, the email bot has a simple send-and-response protocol. With clear command keywords and formatted inputs, emails can be sent to NetBrain to trigger the same kind of automation available with Teams and ITSM access. Results are provided via email reply from NetBrain, with links to the management console for NetBrain users and to the Incident Portal for collaboration and for non-NetBrain users.

Interactive Automation

Interactive Automation is when NetBrain intelligence records network engineers’ diagnostic steps to create automation for their own use, enabling them to be more productive by getting data from multiple devices, looking for changes by executing a comparison automatically, and monitoring and getting alerts for threshold changes. It offers guardrails for any operator or engineer to make informed decisions based on real-time network status.

As networks get larger and more complex, automated documentation becomes more important, especially as networks incorporate technologies like SDN, SD-WAN, and public cloud. NetBrain not only provides end-to-end visibility across hybrid networks but also the ability to drill down into each segment and isolate network issues on a Dynamic Map that can be updated in real time. This greatly accelerates problem identification and resolution. When devices are added to or removed from the network, the change is quickly reflected in an updated network map, which is exportable as diagrams to Microsoft Visio and Word.

With NetBrain, NetOps professionals leverage their own expertise to automatically record standardized procedures in a runbook as they detect, diagnose, and fix issues. The steps they use to diagnose issues, including the use of CLI commands on multiple devices, are automated in Runbooks.

Collaborative Automation

Collaborative Automation is where engineers and operators leverage the knowledge of peers with software that captures subject matter experts’ knowledge to create executable automation units that others can then add to their own diagnoses. Expertise is available even when the expert is not. NetBrain helps when troubleshooting a problem, assessing the state of the network, or making sense of complex technology. It allows the NetOps staff to gather, analyze, and visualize thousands of KPIs in seconds. It can be coupled with our collaboration capabilities to allow multiple ops teams (SecOps, DevOps, NetOps, ServerOps) to interactively resolve problems that span multiple technology domains without time-consuming handoffs, which result in delays. Everyone can get online at the same time to interact and make updates and remediations to the model through a shared analysis console.

NetBrain captures and codifies the SME knowledge using Runbooks, Data Views, and Network Intents. Automatically capturing this information allows experienced engineers to codify and share their knowledge with junior staff, effectively shifting knowledge left, from experienced users to less experienced team members. The next time the problem occurs, the runbook is executed by responders without in-depth knowledge or training. Even complicated network issues no longer need to be handled exclusively by experts. You are essentially using the knowledge of these highly skilled level-3 workers when they are otherwise unavailable (due to location or availability).

Incident Portal

NetBrain’s Incident Portal enables collaboration among multiple users working on the same troubleshooting task. An incident represents a ticket in NetBrain to track a network problem or a network change. End users can organize and share maps, devices, individual insights, and findings targeting a specific troubleshooting task and collaborate with more colleagues to resolve issues and reduce MTTR. In addition, Incident Portal offers an independent portal page for each incident. External users without NetBrain Workstation seat licenses can access a portal to join the collaboration session by viewing maps, posting messages, etc.

The Function Portal feature enables network engineers to collaborate with their NetOps colleagues and with members of other operational teams who are not initially involved with a service ticket. This is one of the key approaches to achieving the goal of reducing service ticket overhead and improving team productivity and MTTR. With Function Portal, users from multiple teams (IT engineers, security engineers, etc.) work together to resolve complicated problems that would otherwise require hand-offs and waiting for resources to become available.

Executable Runbook

Executable Runbooks are a set of visual operational steps that engineers create by capturing their step-by-step workflows to allow the automation of future problem diagnosis, network data collection, and troubleshooting tasks. Runbooks provide a visual way to codify the network troubleshooting process into an executable, reusable, and documentable workflow, to elevate collaboration efficiency. Subject matter experts can digitize their knowledge into a runbook template to capture best practices and remediations that other operators can use. Once Runbooks are executed, the results can be shared with anyone in the organization, facilitating collaboration and enabling higher-level engineers to code their advanced knowledge into repeatable automation units.

Runbooks appear as interactive panels within each Dynamic Map. They contain actions that can perform complex network tasks automatically, providing the user:

  • Command Line Automation
  • Enhanced Incident Collaboration
  • Streamlined Knowledge Sharing

Any network task can be automated in a few clicks. NetBrain’s automation tasks are designed to be vendor-agnostic, so the same action can be executed across heterogeneous, multi-vendor networks without incident.

Executable Runbooks can leverage datasets from third-party tools, enabling users to visualize information from all their existing tools in NetBrain’s Dynamic Map.

Executable Runbooks At-A-Glance

  • Automate network operations on a large scale
  • Effortlessly collaborate with multiple engineers using the same Dynamic Map
  • Share and save network insights as the groundwork for future troubleshooting

Step-by-step troubleshooting procedures from subject matter experts

Guidebook

Replace static troubleshooting playbooks with digital guidebooks

The manually defined Guidebook is a container that includes Data View Template, Runbook Template, and Network Intent to describe a troubleshooting process for a specific network issue. It can be dynamically qualified to match eligible devices. The guidebook is designed to replace users’ static troubleshooting playbooks.

NetBrain also provides a low-code visual programming environment for users who are more comfortable with, or have previous experience with, light programming approaches, writing scripts, and using other programming tools. Over time most users gravitate to NetBrain’s No-Code technology, but both types of automation can co-exist on the same instance of NetBrain.

Automation Assets

Visual Parser – Gateway to Programmability without Code

NetBrain’s Visual Parser allows you to quickly turn device CLI/SNMP command output or configuration file text into programmable variables without coding to enable “What you see is what you can program.” Network engineers can parse the configuration file and CLI/SNMP command output for many problems and produce no-code diagnosis automation that can be used by any network engineer.

NetBrain allows you to create a visual parser to parse your entire network across all devices for status, values, and errors at any time. You can then easily locate the suitable parsers with Parser Discovery and use those variables in other automation assets such as Network Intent, Preventive Automation, and Data View Template to create automation for the diagnosis of specific issues.

Parse CLI output or config into variables, without any coding

The system allows you to define multiple types of parsers or parser groups to parse variables. For each type of parser group, a set of parser rules works together to define how variables are extracted from raw text. With valid rules, the parser result is displayed instantly in the output pane, and the corresponding raw data is highlighted on mouseover.
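To make the idea of parser rules concrete, the sketch below extracts named variables from raw CLI text. The Visual Parser does this without code through its UI; here each rule is written as a regular expression purely for illustration. The rule names, the `parse` function, and the abbreviated sample output are hypothetical.

```python
import re

# Abbreviated, hypothetical sample of interface command output.
RAW = """GigabitEthernet0/1 is up, line protocol is up
  MTU 1500 bytes, BW 1000000 Kbit/sec
  5 minute input rate 2000 bits/sec, 3 packets/sec
     0 input errors, 12 CRC, 0 frame"""

# Each rule maps a variable name to a regex with exactly one capture group.
PARSER_RULES = {
    "intf_name": r"^(\S+) is (?:up|down)",
    "mtu": r"MTU (\d+) bytes",
    "crc_errors": r"(\d+) CRC",
}

def parse(raw: str, rules: dict) -> dict:
    """Apply every rule to the raw text and collect the captured values."""
    variables = {}
    for name, pattern in rules.items():
        match = re.search(pattern, raw, flags=re.MULTILINE)
        if match:
            variables[name] = match.group(1)
    return variables

variables = parse(RAW, PARSER_RULES)
print(variables)
```

Once values such as `crc_errors` exist as variables, they can feed checks like threshold or baseline comparisons in other automation assets.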

Data View Template (DVT) – Using Automation to Create a Custom Map View

Data View Template provides the capability to visualize various network information and drill-down actions on a map to help network engineers in interactive troubleshooting scenarios. Variable-based golden baselines and historical data give engineers what they need to understand how the overall status of the data has changed over time. Using NetBrain’s no-code approach, the intuitive UI and associated Visual Parser allow end users to easily define the Data View Templates.

Drill-down actions can be organized by various types of automation features, like Network Intent, Runbook Template, CLI, Compare, etc. Engineers can leverage drill-down automation actions to perform further diagnoses.

NetBrain’s Single Pane of Glass capability, which visualizes all network data, is powered by the Data View Template and extended by parsers.

Runbook Automation

Executable Runbooks are stored in NetBrain’s Intent Library. You can create Runbooks easily by using NetBrain to capture your subject matter experts’ knowledge in the no-code platform during troubleshooting and other normal operations tasks. Anyone in the organization can leverage Runbooks. Stored Runbook Automation units allow you to execute operational processes and procedures at the speed of a machine and with a higher level of accuracy and consistency.

NetBrain captures IT operations best practices and transforms them into automation called executable Runbooks. Any network operator and support staff can access the automation for similar recurring problems without escalation.

Network Intent Cluster

Network Intent Cluster (NIC) expands the scope of a Network Intent (NI) from a specific network design to one type of network design with similar diagnosis logic. While a NI effectively documents and validates a network design, it applies to only one network device or a set of devices at a time. Therefore, creating NIs for a large network can take a great deal of repetitive effort. NIC is designed to expand the logic of a NI (the seed NI) from one device or a set of devices to the whole network. Furthermore, a NIC can be triggered to run in the Triggered Automation Framework, and its results can significantly reduce MTTR. NIC requires no coding skills and has an intuitive user interface for creating and debugging.

A NIC is composed of a group of NIs (Member NIs) cloned from a Seed NI via a 7-step, no-code process. A NIC may have thousands of Member NIs, corresponding to a specific network diagnosis. A subset of Member NIs can be selected for execution according to user-defined matching logic based on (1) the devices inside the Member NI (member devices), (2) the unique tag of each Member NI, or (3) the signature variables assigned to the Member NI.
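The three matching options can be pictured with a small sketch. It is a conceptual model only, not NetBrain's data model: the `MemberNI` class, the `select_members` function, and the site, device, and variable names are hypothetical. As a simplifying assumption, the function applies every criterion that is supplied, so passing a single criterion reproduces each of the three options on its own.

```python
from dataclasses import dataclass, field

@dataclass
class MemberNI:
    name: str
    devices: set                                   # member devices
    tag: str                                       # unique tag
    signature: dict = field(default_factory=dict)  # signature variables

def select_members(members, device=None, tag=None, signature=None):
    """Return the Member NIs that satisfy every criterion that was supplied."""
    selected = []
    for member in members:
        if device is not None and device not in member.devices:
            continue
        if tag is not None and tag != member.tag:
            continue
        if signature is not None and any(
            member.signature.get(key) != value for key, value in signature.items()
        ):
            continue
        selected.append(member)
    return selected

# Hypothetical Member NIs cloned from one seed NI, one per site.
members = [
    MemberNI("hsrp-boston", {"bos-rtr1", "bos-rtr2"}, "site:boston", {"hsrp_vip": "10.1.1.1"}),
    MemberNI("hsrp-austin", {"aus-rtr1", "aus-rtr2"}, "site:austin", {"hsrp_vip": "10.2.1.1"}),
]

by_device = select_members(members, device="aus-rtr2")
by_tag = select_members(members, tag="site:boston")
by_signature = select_members(members, signature={"hsrp_vip": "10.2.1.1"})
print([m.name for m in by_device], [m.name for m in by_tag], [m.name for m in by_signature])
```

Selecting a subset this way means a trigger that names one device, tag, or signature value runs only the relevant Member NIs rather than the whole cluster.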

The following diagram shows a sample NIC that clones a seed NI to check the HSRP running status for a network site. By creating a NIC for this purpose, you can expand the diagnosis of one site to your entire network. Each Member NI has its own tag and signature variable, in this case the HSRP virtual IP address.