Aid first-response engineers with guided troubleshooting
When an engineer first arrives on the scene to troubleshoot, there is a set of common questions they usually ask:
- What’s changed in the network?
- Is the network in a normal or abnormal state?
- What should I do next?
NetBrain offers a set of tools to help answer these questions, first putting critical data at an engineer’s fingertips, then helping to identify abnormalities, and finally guiding next steps with contextualized “recommended actions”.
NetBrain Data View Templates put virtually any network data at your team’s fingertips. Clicking a Data View toggles layers of data on top of a Dynamic Map, making it easy to visualize the network from any perspective. For example, if you’re troubleshooting a routing issue, turn on a BGP Data View to visualize BGP configuration or neighbor statuses. If you’re diagnosing packet drops, turn on a Data View to visualize interface errors such as input drops or CRC errors.
Data Views not only display raw data but also flag abnormalities in that data across thousands of parameters. For example, the Golden Baseline may indicate that a BGP router should normally have four active neighbors. If that router loses a neighbor, an alert is raised on the map, which may be a clue that something is wrong.
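The baseline comparison described above can be illustrated with a minimal Python sketch. It is not NetBrain's implementation or API: the `Alert` class, the `find_deviations` function, the device name `core-rtr-1`, and the parameter name `bgp_active_neighbors` are all hypothetical, chosen only to mirror the four-neighbor example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Alert:
    """One deviation between the baseline and the live network (hypothetical type)."""
    device: str
    parameter: str
    expected: object
    observed: object


def find_deviations(baseline, observed):
    """Return an Alert for every baseline parameter whose live value differs."""
    alerts = []
    for device, expected_params in baseline.items():
        live = observed.get(device, {})
        for parameter, expected in expected_params.items():
            actual = live.get(parameter)  # None if the value was not collected
            if actual != expected:
                alerts.append(Alert(device, parameter, expected, actual))
    return alerts


# Assumed data: the baseline expects four active BGP neighbors, one has dropped.
baseline = {"core-rtr-1": {"bgp_active_neighbors": 4}}
observed = {"core-rtr-1": {"bgp_active_neighbors": 3}}

alerts = find_deviations(baseline, observed)
for alert in alerts:
    print(f"{alert.device}: {alert.parameter} expected {alert.expected}, "
          f"observed {alert.observed}")
```

A parameter that is missing from the live data is treated as a deviation here, on the assumption that a value that cannot be collected is itself worth flagging.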
To steer engineers toward more advanced troubleshooting and help minimize the need for escalation, you can also define Recommended Actions that guide them down a troubleshooting path. For example, if an alert indicates a BGP neighbor dropped, the Recommended Action may offer a BGP Troubleshooting Runbook which the engineer can execute as their next step. This runbook may have been customized by the architect who designed the BGP network.
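Conceptually, a Recommended Action is a lookup from the kind of alert to the runbook that should be offered next. The sketch below shows that idea in Python under stated assumptions: the `RECOMMENDED_ACTIONS` table, the `recommend` function, the parameter names, and the "Interface Error Runbook" entry are hypothetical and do not reflect how NetBrain stores or resolves these actions.

```python
# Hypothetical mapping from an alerting parameter to the runbook to offer.
RECOMMENDED_ACTIONS = {
    "bgp_active_neighbors": "BGP Troubleshooting Runbook",
    "interface_crc_errors": "Interface Error Runbook",  # assumed example entry
}

# Assumed fallback when no runbook has been defined for the alert.
DEFAULT_ACTION = "Escalate to the next support tier"


def recommend(parameter):
    """Return the runbook suggested for an alert on the given parameter."""
    return RECOMMENDED_ACTIONS.get(parameter, DEFAULT_ACTION)


print(recommend("bgp_active_neighbors"))
print(recommend("ospf_adjacency_count"))
```

Keeping the mapping as data rather than logic means the architect who owns a given technology can add or change an entry without touching the code that raises alerts.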
Improve team collaboration during active troubleshooting
Since many incidents require escalation, troubleshooting is a team event. There is a need to get everyone looking at the same thing, at the same time, to reduce redundancy and streamline collaboration.
A single NetBrain URL contains a Dynamic Map of the area under investigation and all troubleshooting steps performed against it. This troubleshooting record is documented automatically within an Executable Runbook. As teams troubleshoot collaboratively, they can share this URL, along with the history of the incident from the perspective of the network.
This ability to get teams on the same page facilitates better handoffs and avoids duplication of work.
Automatically push changes and assess the impact
Quickly restoring business services is the primary goal of incident response. But deploying a fix can be time-consuming and introduces the risk of collateral damage. It’s critical to resolve outages effectively while also mitigating risk during problem remediation.
From design to implementation to verification, NetBrain’s Change Management automates the entire change management process. You can push complex changes to multiple devices simultaneously and even integrate with Ansible, if that’s your tool of choice for change orchestration.
NetBrain helps to quickly assess and visualize the impact of a change on the network and the applications running on it. This is possible by running benchmarks against the environment before and after the change and by leveraging NetBrain’s Application Assurance Engine to validate the impact the change had at the application level. If any problems are discovered within the change window, you can roll back to the previous state with one click.
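The before-and-after comparison can be sketched in a few lines of Python. This is an illustration of the general technique only, not NetBrain's benchmark format or rollback mechanism: the `diff_benchmarks` and `should_roll_back` functions, the snapshot keys, and the sample values are all assumptions.

```python
def diff_benchmarks(before, after):
    """Return {key: (old, new)} for every value that differs between snapshots."""
    changes = {}
    for key in sorted(before.keys() | after.keys()):
        old, new = before.get(key), after.get(key)
        if old != new:
            changes[key] = (old, new)
    return changes


def should_roll_back(changes, expected_keys):
    """Suggest a rollback if anything changed that the change plan did not intend."""
    return any(key not in expected_keys for key in changes)


# Assumed snapshots: the plan adds one ACL line, but a BGP neighbor also dropped.
before = {"rtr1:bgp_neighbors": 4, "rtr1:acl_101_lines": 12, "app:latency_ok": True}
after = {"rtr1:bgp_neighbors": 3, "rtr1:acl_101_lines": 13, "app:latency_ok": True}

changes = diff_benchmarks(before, after)
rollback = should_roll_back(changes, expected_keys={"rtr1:acl_101_lines"})

for key, (old, new) in changes.items():
    print(f"{key}: {old} -> {new}")
print("Roll back:", rollback)
```

Declaring the intended changes up front is what separates the planned ACL edit from the unplanned neighbor loss; without that list, every difference would look equally suspicious.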