Cisco vManage / SD-WAN Manager: The Operator's Guide

Cisco vManage (renamed Cisco Catalyst SD-WAN Manager in 2023, as of release 20.12) is the operator-facing component of the Cisco SD-WAN fabric. It is where you configure templates, push policy, monitor the fabric, audit operator actions, push software upgrades, and manage certificates. If your operators are unhappy with vManage, they are unhappy with the SD-WAN deployment.

This article walks through what vManage actually is, the constructs you work with daily (feature templates, device templates, policies, dashboards), the REST API surface for automation, the HA model, and the operational realities of running it at scale. If you are configuring your first vManage cluster, training new operators, or just trying to figure out where a particular setting lives, this is the reference.

What vManage Actually Is

vManage is a heavyweight VM-based application combining several roles:

Web UI for operators (HTTPS, role-based access, IdP integration)
REST API for automation - the SD-WAN API surface used by Ansible/Terraform/custom scripts
Template engine that generates per-WAN Edge configurations from operator-authored templates
Policy engine that compiles intent-style policies into OMP attribute manipulations and per-edge ACLs
Telemetry collector ingesting metrics and logs from every WAN Edge
Monitoring database (the bulk of vManage's resource footprint)
Software image repository for cEdge and vEdge firmware
Certificate authority for the fabric's mutual-TLS authentication
Application Quality of Experience (AppQoE) reporting backend

The deployment model is a clustered set of identical VMs (3 nodes minimum, 6 for large fabrics) with a shared database backend. Each cluster member runs the same application stack; load balancers (or DNS) distribute operator UI sessions across them.

Templates: The Core Operating Concept

vManage's central abstraction is the template. Instead of configuring each WAN Edge directly, you author templates that describe what a class of edges should look like, then attach edges to a template.

Two layers of templates:

Feature template. Describes: one feature on a device (interface config, OMP, BGP, NAT, AAA, NTP, etc.). Granularity: per-feature.
Device template. Describes: the full configuration of a device, composed of feature templates. Granularity: per-device-class.

The pattern: build a library of feature templates (one per concern: Interface-WAN, Interface-LAN, OMP-Default, BGP-Branch, etc.), then assemble them into device templates per branch type (Branch-Small, Branch-Large, Hub-Primary, Hub-Backup). Attach the WAN Edges to the appropriate device template.

When you change a feature template, vManage recomputes the configurations for all attached devices and pushes the diffs. A single change to "Interface-WAN" can update 500 branches simultaneously.
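The fan-out from one feature template to many devices can be modeled with a simple lookup. The sketch below is a toy model only: the dictionaries, the hostnames, the "BGP-Hub" template, and the function name affected_devices are hypothetical and are not vManage data structures or API calls.

```python
# Toy model of the template hierarchy; every name here is hypothetical.

# Device template -> the feature templates it is composed of.
DEVICE_TEMPLATES = {
    "Branch-Small": ["Interface-WAN", "Interface-LAN", "OMP-Default"],
    "Branch-Large": ["Interface-WAN", "Interface-LAN", "OMP-Default", "BGP-Branch"],
    "Hub-Primary": ["Interface-WAN", "OMP-Default", "BGP-Hub"],
}

# WAN Edge hostname -> the device template it is attached to.
ATTACHMENTS = {
    "edge-001": "Branch-Small",
    "edge-002": "Branch-Small",
    "edge-101": "Branch-Large",
    "hub-01": "Hub-Primary",
}


def affected_devices(feature_template, device_templates, attachments):
    """Return the sorted edges whose configuration would be regenerated
    if `feature_template` changed."""
    impacted_templates = {
        name
        for name, features in device_templates.items()
        if feature_template in features
    }
    return sorted(
        edge for edge, tmpl in attachments.items() if tmpl in impacted_templates
    )


if __name__ == "__main__":
    # A narrowly used template touches few edges; a shared one touches all.
    print(affected_devices("BGP-Branch", DEVICE_TEMPLATES, ATTACHMENTS))
    print(affected_devices("Interface-WAN", DEVICE_TEMPLATES, ATTACHMENTS))
```

Running this kind of lookup before editing a widely shared feature template is a cheap way to see how many edges a change will reach.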

The discipline this requires is real. Templates with too many variables become unmanageable; templates with too few variables proliferate. Most production deployments end up with 20-50 feature templates and 5-15 device templates after 12 months of operation.

Policy: Centralized vs Localized

vManage distinguishes between two policy types:

Centralized policy. Lives where: vSmart controllers (pushed via OMP). Use for: routing manipulation, traffic engineering, application-aware routing, service chaining.
Localized policy. Lives where: WAN Edge (pushed via template). Use for: QoS, ACL filtering, route maps applied per-edge.

Centralized policy is the powerful one. It uses lists (data prefix lists, application lists, site lists, TLOC lists) and policy definitions (control policy, data policy, app-route policy) to express intent. The vSmart compiles these into the OMP route updates and service advertisements that change WAN Edge behavior.

An example centralized policy: "Voice traffic from any site should prefer TLOCs with color=mpls; if those are unavailable, fall back to color=biz-internet." That single policy expression turns into per-flow path selection across thousands of WAN Edges automatically.
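The preference-with-fallback logic in that example can be sketched in a few lines. This is a toy model of the intent, not the algorithm the vSmart or the WAN Edge actually runs; the function pick_tlocs, the tuple layout, and the TLOC labels are hypothetical.

```python
# Toy model of color preference with fallback; not the real path-selection code.

def pick_tlocs(available, preferred_colors):
    """Return the TLOCs of the first preferred color that has any TLOC up.

    `available` is a list of (tloc_label, color, is_up) tuples.
    `preferred_colors` is ordered from most to least preferred.
    """
    for color in preferred_colors:
        matches = [label for label, c, is_up in available if c == color and is_up]
        if matches:
            return matches
    return []  # none of the preferred colors is currently usable


# Voice: prefer mpls, fall back to biz-internet.
VOICE_PREFERENCE = ["mpls", "biz-internet"]

if __name__ == "__main__":
    tlocs = [
        ("hub1-mpls", "mpls", False),          # MPLS transport is down
        ("hub1-inet", "biz-internet", True),
    ]
    print(pick_tlocs(tlocs, VOICE_PREFERENCE))
```

The ordered list is the whole policy: changing the fallback behavior means editing one list, not touching each edge.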

Localized policy is more about QoS and per-port ACL hygiene. It is conceptually simpler and looks more like traditional Cisco IOS configuration.

Dashboards and Monitoring

vManage's monitoring surface includes:

Network dashboard - top-level fabric health (sites up/down, alarm counts, control connections)
Application Performance dashboard - per-application metrics across the fabric (Office 365, Salesforce, Zoom, custom apps)
WAN Edge details - per-edge view with tunnel state, BFD sessions, OMP peers, interface stats
Tunnel health - per-tunnel SLA metrics (latency, jitter, loss) over time
Audit log - who changed what, when
Real-time troubleshooting - the "Real Time" menu lets you query a remote WAN Edge for current OMP, BGP, BFD, ARP state without SSHing in

The Real Time view is the most useful operator tool. Instead of opening an SSH session to a remote edge to run "show sdwan omp summary", you click into the device in vManage, and the GUI runs the command via a backchannel and renders the output. This works for hundreds of show commands.
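The same real-time data is reachable through the REST API. The sketch below only builds the request URL and unpacks a response body; it makes no network call. It assumes the real-time resources live under /dataservice/device/ and take a deviceId query parameter holding the edge's system IP, and that responses wrap their rows in a "data" key. Confirm both against the API documentation of your release. The helper names and the hostname are hypothetical.

```python
import json
from urllib.parse import urlencode


def realtime_url(base_url, resource, system_ip):
    """Build a real-time monitoring URL for one WAN Edge.

    `resource` is a path fragment such as "omp/summary" or "bfd/sessions"
    (assumed resource names; check the API docs on your instance).
    """
    query = urlencode({"deviceId": system_ip})
    return f"{base_url.rstrip('/')}/dataservice/device/{resource}?{query}"


def rows(response_text):
    """Return the list stored under the "data" key of a JSON response body."""
    return json.loads(response_text).get("data", [])


if __name__ == "__main__":
    print(realtime_url("https://vmanage.example.com", "omp/summary", "10.1.0.5"))
```

Sending the request requires an authenticated session, which the REST API section covers.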

The REST API

vManage's REST API (the "SD-WAN API") covers virtually everything the GUI does. The API is widely used for:

Bulk template attachment / detachment via Ansible or Python scripts
CI/CD pipelines that validate template changes before pushing them
External monitoring integrations (push fabric metrics into Prometheus, Grafana, Datadog)
Custom dashboards that combine SD-WAN data with other sources
ITSM integrations (auto-create tickets when a tunnel goes down)

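Authentication uses session cookies: log in with credentials, receive a session cookie, and include it in subsequent requests. State-changing requests (POST, PUT, DELETE) also require a cross-site request forgery token, which you fetch after login and send as a header. For automation, generate API tokens with restricted scopes where your release supports them; for ad-hoc scripting, the session-cookie pattern is fine.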
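A minimal sketch of the session-cookie pattern, using only the Python standard library. It assumes the commonly documented endpoints: a form POST to /j_security_check with j_username and j_password, followed by a GET to /dataservice/client/token whose plain-text body is sent back in an X-XSRF-TOKEN header. Verify these against the API documentation on your own instance. The function names are hypothetical.

```python
import http.cookiejar
import urllib.request
from urllib.parse import urlencode


def build_login_request(base_url, username, password):
    """Form-encoded POST to the login endpoint; the reply sets the session cookie."""
    body = urlencode({"j_username": username, "j_password": password}).encode()
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/j_security_check",
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


def build_token_request(base_url):
    """GET whose plain-text body is the XSRF token."""
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/dataservice/client/token", method="GET"
    )


def open_session(base_url, username, password):
    """Log in and return (opener, headers) for later calls.

    The opener carries the session cookie; `headers` holds the XSRF token
    to attach to state-changing requests.
    """
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    with opener.open(build_login_request(base_url, username, password)) as resp:
        body = resp.read()
    # Assumption: a rejected login returns an HTML page, not an error status.
    if b"<html" in body.lower():
        raise RuntimeError("login rejected")
    with opener.open(build_token_request(base_url)) as resp:
        token = resp.read().decode().strip()
    return opener, {"X-XSRF-TOKEN": token}
```

An instance with a self-signed certificate will fail TLS verification under urllib's defaults; install a trusted certificate or supply an ssl context that trusts your CA rather than disabling verification.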

The API is documented at https://<vmanage>/apidocs on every running instance. Version compatibility matters - the REST API surface evolves with each major release.

High Availability

vManage HA is implemented as a clustered deployment:

3 nodes. Quorum: 2 of 3. Edge support: up to 2,000 WAN Edges.
6 nodes. Quorum: 4 of 6. Edge support: up to 6,000 WAN Edges.

The cluster shares a database. Loss of any single node is operationally invisible (other nodes serve UI/API requests). Loss of two nodes in a 3-node cluster (or three in a 6-node) takes vManage down until quorum is restored.
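The quorum figures above are consistent with a strict-majority rule. Assuming that is the rule in play, the arithmetic is short enough to check directly; the function names below are hypothetical.

```python
def quorum(cluster_size):
    """Smallest strict majority of the cluster."""
    return cluster_size // 2 + 1


def has_quorum(cluster_size, failed_nodes):
    """True if the surviving nodes still form a majority."""
    return cluster_size - failed_nodes >= quorum(cluster_size)


if __name__ == "__main__":
    for size in (3, 6):
        tolerated = size - quorum(size)
        print(f"{size} nodes: quorum {quorum(size)}, tolerates {tolerated} failure(s)")
```

Under this rule a 6-node cluster tolerates two failures, not three, which is why losing half of a 6-node cluster is an outage.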

Important: vManage downtime does not affect the data plane. WAN Edges keep forwarding traffic with their last-known policy. Operators just cannot make changes or view fresh telemetry until vManage recovers. Plan around this: vManage maintenance windows can be scheduled during business hours because traffic is not affected.

On-Premises vs Cloud-Hosted

Three deployment models:

On-premises VMs. You run vManage in your own VMware/KVM infrastructure. Full control; you manage upgrades and capacity.
Cisco-hosted vManage (Cloud-hosted). Cisco runs the vManage cluster as a service in their cloud. You manage the SD-WAN fabric; Cisco manages the vManage infrastructure. Common for smaller deployments and customers without strong on-prem ops.
Cisco SD-WAN Cloud (formerly Viptela Cloud). Even more managed; Cisco operates the entire control plane.

The on-prem option gives the most control and is what large enterprises typically choose. The hosted options trade control for operational simplicity.

Software Upgrades

vManage manages firmware upgrades for the entire fabric. The flow:

Upload a new cEdge or vEdge image to vManage's image repository.
Select target devices (single edge, group of edges, or whole sites).
Schedule the upgrade. vManage pushes the image to the targets, then activates it.
cEdge images use ISSU (In-Service Software Upgrade) on supported platforms - the upgrade happens with minimal forwarding interruption.

Controller upgrades (vManage, vBond, vSmart) need more care. Cisco's recommended sequence is vManage first (cluster rolling upgrade), then vBond, then vSmart (rolling), and the WAN Edges last. Mixed-version operation is supported but should be a transient state, not a long-term one.

Performance Realities at Scale

vManage is the bottleneck in large fabrics. Common pain points after 12-18 months of growth:

Database bloat. Telemetry retention defaults are conservative; turning them up at scale balloons disk requirements.
Template push slowness. Pushing a feature template change to 1,000 edges is not instantaneous; it can take minutes.
UI rendering delays. Dashboards that aggregate across thousands of edges get slow if the database is not tuned.
API rate limits. Heavy automation can hit internal rate limits; pace requests.

The standard mitigations: more memory (vManage benefits from RAM more than CPU), tune the database, scale the cluster up to 6 nodes if you are above 2,000 edges, and use the API instead of the GUI for bulk operations.
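Pacing requests on the client side is straightforward. The sketch below enforces a minimum interval between calls; the class name Pacer and the 0.5-second interval are hypothetical, and the right interval for a given vManage release and cluster size is something to measure rather than assume.

```python
import time


class Pacer:
    """Enforce a minimum interval between calls (client-side pacing)."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock      # injectable for testing
        self._sleep = sleep      # injectable for testing
        self._last = None        # time of the previous call, if any

    def wait(self):
        """Block until at least `min_interval` seconds have passed since the
        previous call to wait(); the first call returns immediately."""
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


if __name__ == "__main__":
    pacer = Pacer(min_interval=0.5)  # illustrative value
    for edge in ["edge-001", "edge-002", "edge-003"]:
        pacer.wait()
        print("would call the API for", edge)
```

Call wait() immediately before each API request in a bulk loop; the injectable clock and sleep make the behavior easy to unit-test without real delays.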

Summary

vManage (Cisco SD-WAN Manager) is the management plane of Cisco Catalyst SD-WAN. It is the operator UI, the REST API, the template engine, the policy compiler, the monitoring backend, the certificate authority, and the upgrade orchestrator - all in one heavyweight clustered application. Run it in 3-node or 6-node clusters depending on fabric size. Your day-2 operations experience is dominated by how you use vManage, not by the underlying fabric protocol.

Master templates, policy expressions, and the REST API early. Treat the GUI as the entry point but graduate to API-driven automation for any change touching more than a handful of edges. Bookmark the SD-WAN cluster pillar and the Cisco Catalyst SD-WAN architecture article for the broader fabric picture, and lab every template change before pushing to production.
