Why STP Convergence Matters
When a critical link fails, the network needs to reroute traffic to a backup path. STP must detect the failure and recalculate the topology. In 802.1D (legacy), this process takes 30–50 seconds. In that time, traffic is dropped, sessions are terminated, and users perceive an outage.
Rapid PVST+ (Cisco's per-VLAN implementation of RSTP, IEEE 802.1w) reduces convergence to a few seconds or less. Understanding the difference and knowing how to diagnose slow convergence is critical for production reliability.
The Convergence Problem: 802.1D vs Rapid PVST+
802.1D Convergence: The 50-Second Outage
In 802.1D STP, when a switch loses its root port and a blocked port has to take over:
- Detection (0–2 sec): The switch detects link down at the physical layer (the speed depends on the hardware and media, not on CDP/LLDP)
- Root Port Failure Recognition (0–2 sec): Non-root switch notices root port is gone
- Port Transition to Listening (15 sec): The new port sends and receives BPDUs to settle its role; it neither learns MAC addresses nor forwards frames
- Port Transition to Learning (15 sec): The port builds the MAC table from received frames but still does not forward
- Port Transition to Forwarding (0 sec): Port finally forwards traffic
Total: 30–50 seconds of traffic loss (with default timers: forward delay 15 sec × 2 = 30 sec, plus detection delays and, for indirect failures, up to 20 sec of Max Age).
Default 802.1D Timers
SW1# show spanning-tree vlan 10 | include Hello Time|Max Age|Forward Delay
Hello Time 2 sec Max Age 20 sec Forward Delay 15 sec
- Hello Time (2 sec): Root bridge sends BPDUs every 2 seconds
- Max Age (20 sec): Non-root switches wait up to 20 seconds for a BPDU before assuming the source is dead
- Forward Delay (15 sec): Duration of listening and learning states (15 sec each)
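If the path to the root fails indirectly (the link on the root port stays up, but BPDUs stop arriving):
- The non-root switch waits up to 20 seconds (Max Age) for its stored BPDU to expire
- The blocking port is selected as the new root port
- The port goes through listening (15 sec) and learning (15 sec)
- Total: 20 + 15 + 15 = 50 seconds
If the root port's own link goes down, the switch skips the Max Age wait and the outage is closer to 30 seconds.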
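The timer arithmetic above can be sketched in a few lines of Python. The function name and the direct/indirect flag are illustrative assumptions, not part of any Cisco tooling; the sketch only adds up the default timers described in this section.

```python
def dot1d_worst_case_seconds(max_age=20, forward_delay=15, indirect=True):
    """Worst-case 802.1D reconvergence time in seconds.

    indirect=True:  the failure is upstream, so the switch waits for
                    Max Age to expire before it reacts.
    indirect=False: the switch's own root port went down, so it reacts
                    at once and only pays listening + learning.
    """
    wait = max_age if indirect else 0
    return wait + 2 * forward_delay  # listening + learning


print(dot1d_worst_case_seconds())                # 50
print(dot1d_worst_case_seconds(indirect=False))  # 30
print(dot1d_worst_case_seconds(40, 30))          # 100
```

The last call shows why doubled timers hurt: the same failure now costs up to 100 seconds.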
Rapid PVST+ Convergence: Sub-Second to Seconds
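Rapid PVST+ runs Rapid STP (RSTP, IEEE 802.1w) per VLAN instead of 802.1D. Key differences:
- Pre-computed Backup: A non-designated port that receives superior BPDUs from another switch holds the "alternate" role (shown as Altn). It stays in the discarding state but is already selected, so it can take over immediately if the root port fails
- Fewer States: Blocking and listening are merged into a single discarding state, followed by learning and forwarding
- BPDU-based Handshake: On point-to-point links, proposals and agreements are exchanged in BPDUs, so ports move to forwarding without waiting for timers
- Limited Timer Dependency: Convergence doesn't depend on Max Age or Forward Delay, except on shared (half-duplex) links or toward 802.1D neighbors, where RSTP falls back to timers
Result: 1–6 seconds maximum, depending on link detection speed.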
Configuration: Enable Rapid PVST+
SW1(config)# spanning-tree mode rapid-pvst
SW2(config)# spanning-tree mode rapid-pvst
SW3(config)# spanning-tree mode rapid-pvst
SW1(config)# exit
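Every switch should run Rapid PVST+ to get the full benefit. Rapid PVST+ interoperates with 802.1D, but any port facing an 802.1D switch falls back to 802.1D behavior and timers, so one legacy switch slows convergence for its part of the topology.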
Verification: Check STP Mode
SW1# show spanning-tree summary
Switch is in rapid-pvst mode
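With more than a handful of switches, checking this line by eye gets tedious. A minimal sketch, assuming you have already captured the output of show spanning-tree summary from each switch as text (how you collect it is up to you; the hostnames and the SW3 sample below are hypothetical):

```python
import re


def stp_mode(summary_text):
    """Return the mode named in 'show spanning-tree summary' output, or None."""
    match = re.search(r"Switch is in (\S+) mode", summary_text)
    return match.group(1) if match else None


def switches_not_rapid(summaries):
    """summaries maps hostname -> captured output; returns hosts not in rapid-pvst."""
    return sorted(host for host, text in summaries.items()
                  if stp_mode(text) != "rapid-pvst")


captured = {
    "SW1": "Switch is in rapid-pvst mode",
    "SW2": "Switch is in rapid-pvst mode",
    "SW3": "Switch is in pvst mode",  # hypothetical switch still on legacy PVST+
}
print(switches_not_rapid(captured))  # ['SW3']
```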
Diagnosing Slow Convergence: Where is the Delay?
When a failover takes longer than expected, diagnose each stage:
Stage 1: Link Failure Detection (0–10 sec)
How quickly does the switch detect that a link is down?
Hardware detection: Most modern switches detect layer 1 link failure within 1 second.
Software detection: STP relies on BPDUs. In 802.1D, if a BPDU isn't received for Max Age (20 sec), the switch assumes the source is dead.
Check link detection speed:
SW1# show interfaces Gi0/0 | include line protocol is
line protocol is up
SW1# show interfaces Gi0/0 | include Last input
Last input 0:00:03, output 0:00:02
Monitor these fields. When a link fails:
SW1# show interfaces Gi0/0 | include line protocol is
line protocol is down
SW1# show interfaces Gi0/0 | include Last input
Last input 0:00:45, output never
The line protocol went down as soon as the hardware lost the link. For a direct failure like this, STP on SW1 reacts immediately because the port itself is gone. Switches further away, which lose BPDUs without losing a link, still hold the old information until it ages out. This is where the Max Age timer matters.
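How long a switch holds stale information when BPDUs stop arriving on a link that stays up differs by mode. A rough sketch, assuming the standard behavior that 802.1D keeps a stored BPDU for up to Max Age while RSTP gives up after three missed hellos (the function and mode strings are illustrative):

```python
def bpdu_loss_detection_seconds(mode, hello=2, max_age=20):
    """Upper bound on the time to age out a neighbor whose BPDUs silently stop."""
    if mode == "rapid-pvst":
        return 3 * hello   # RSTP: three consecutive missed hellos
    if mode == "pvst":
        return max_age     # 802.1D: stored BPDU expires after Max Age
    raise ValueError(f"unknown mode: {mode}")


print(bpdu_loss_detection_seconds("pvst"))        # 20
print(bpdu_loss_detection_seconds("rapid-pvst"))  # 6
```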
Stage 2: STP Topology Recalculation (10–50 sec)
After detecting the link is down, how long does it take for STP to recalculate?
Check port states during a failure:
Simulate a failure by shutting down a designated port:
SW1(config)# interface Po1
SW1(config-if)# shutdown
SW1(config-if)# exit
Immediately check neighboring switches:
SW2# show spanning-tree vlan 10 | include Po1
Po1 Root P2Se.1 FWD 19000 19000 10
<After 5 seconds>
SW2# show spanning-tree vlan 10 | include Po1
Po1 Root P2Se.1 LRN 19000 19000 10
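Po1 transitioned from FWD to LRN (learning). It is populating the MAC address table from the new topology but not yet forwarding. In 802.1D a port that takes over normally sits in listening (LIS) for one Forward Delay before it reaches learning, so sample more often if you want to catch that state. Observe when it goes to FWD: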
<After 20 more seconds>
SW2# show spanning-tree vlan 10 | include Po1
Po1 Root P2Se.1 FWD 19000 19000 10
Total time: ~25 seconds. The port was in learning state for ~20 seconds, which is in line with one Forward Delay (15 sec) plus the coarse sampling interval used here.
For Rapid PVST+, this is much faster:
SW1(config)# interface Po1
SW1(config-if)# shutdown
SW2# show spanning-tree vlan 10 | include Po1
Po1 Root P2Se.1 FWD 19000 19000 10
<After 1 second>
SW2# show spanning-tree vlan 10 | include Po1
Po1 Root P2Se.1 FWD 19000 19000 10
In Rapid PVST+, the alternate port (previously blocked) is in the discarding state, but its role as backup root port is already computed. When the root port fails, the alternate port moves straight to forwarding (no timer-driven listening or learning delay), which is why a one-second sample shows no interruption.
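To compare runs like the two above with numbers rather than by eye, you can record the port state at known times and compute how long each state lasted. A small sketch; the sample timings below are hypothetical values chosen to match default 802.1D timers, not captured output:

```python
def state_durations(samples):
    """samples: (seconds_since_failure, state) tuples in time order.

    Returns (state, duration) for every state that was observed to end.
    """
    durations = []
    start_time, current = samples[0]
    for t, state in samples[1:]:
        if state != current:
            durations.append((current, t - start_time))
            start_time, current = t, state
    return durations


# Hypothetical 802.1D takeover sampled once per second
samples = [(0, "BLK"), (1, "LIS"), (16, "LRN"), (31, "FWD")]
print(state_durations(samples))  # [('BLK', 1), ('LIS', 15), ('LRN', 15)]
```

The resolution is only as good as the sampling interval, so poll at least once per second during a test.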
Stage 3: Check Which Timer is Causing Delay
Use this command to see configured timers:
SW1# show spanning-tree vlan 10 | include Hello Time|Max Age|Forward Delay
Hello Time 2 sec Max Age 20 sec Forward Delay 15 sec
If convergence is slow, check if these timers are set too high. In legacy deployments, you might see:
Hello Time 2 sec Max Age 40 sec Forward Delay 30 sec
These timers are very conservative (doubling the default). This is sometimes done to prevent topology oscillations in unstable networks, but it dramatically increases convergence time.
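Check the source of timer configuration:
SW1# show running-config | include spanning-tree
spanning-tree mode rapid-pvst
spanning-tree vlan 1-4094 hello-time 2
spanning-tree vlan 1-4094 max-age 40
spanning-tree vlan 1-4094 forward-time 30
In PVST+ and Rapid PVST+, timers are configured per VLAN (a range such as 1-4094 covers every VLAN), and the Forward Delay keyword is forward-time. Only the values on the root bridge matter, because the root advertises its timers in BPDUs and the other switches adopt them. Timers for a single VLAN look like this:
spanning-tree vlan 10 hello-time 2
spanning-tree vlan 10 max-age 40
spanning-tree vlan 10 forward-time 30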
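For Rapid PVST+, keep timers at default (set them on the root bridge, which advertises its timers to every other switch):
SW1(config)# spanning-tree vlan 10 hello-time 2
SW1(config)# spanning-tree vlan 10 max-age 20
SW1(config)# spanning-tree vlan 10 forward-time 15
SW1(config)# exit
Don't override Forward Delay in Rapid PVST+. On point-to-point links, ports reach forwarding through the proposal/agreement handshake, and Forward Delay is used only as a fallback on shared links or toward 802.1D neighbors.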
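If you do inherit non-default timers, it is worth checking that they at least satisfy the relationship IEEE 802.1D requires: 2 × (Forward Delay − 1) ≥ Max Age ≥ 2 × (Hello Time + 1). A minimal sketch of that check (the function name is illustrative):

```python
def timers_consistent(hello, max_age, forward_delay):
    """True if the timers satisfy the IEEE 802.1D relationship
    2 * (forward_delay - 1) >= max_age >= 2 * (hello + 1)."""
    return 2 * (forward_delay - 1) >= max_age >= 2 * (hello + 1)


print(timers_consistent(2, 20, 15))  # True  (defaults)
print(timers_consistent(2, 40, 30))  # True  (doubled, valid but slow)
print(timers_consistent(2, 20, 10))  # False (Forward Delay too short for Max Age)
```

Passing this check only means the values are legal, not that they are a good idea; the doubled timers are valid and still double the outage.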
Unidirectional Link Failures and Loop Guard
A unidirectional link failure is when one direction of a full-duplex link fails while the link still shows as up:
- Link from SW2 to SW1 works
- Link from SW1 to SW2 is broken
This is dangerous because a non-designated port stays blocked only as long as it keeps receiving BPDUs from its neighbor. If that direction fails, the receiving side stops hearing BPDUs and might promote a blocking port to forwarding, creating a loop.
Scenario: Unidirectional Fiber Failure
SW1 ──(Tx)──→ SW2 (BROKEN)
SW1 ←──(Rx)── SW2
SW1 sends BPDUs to SW2 continuously. SW2 doesn't receive them (the SW1-to-SW2 fiber strand is broken), yet its port stays up. Once SW2's stored BPDU ages out (Max Age, 20 sec, in 802.1D), SW2 assumes nothing superior exists on that link and moves its blocked port to forwarding, creating a loop back to SW1.
Without Loop Guard:
- SW1 sends BPDUs on Po1 toward SW2, but they are lost on the broken strand
- SW2 doesn't receive BPDUs from SW1 on its blocked port
- After 20 seconds, SW2 ages out the information it had stored for that port
- SW2 promotes its blocked port to forwarding
- A loop exists: SW2 now forwards on both its root port and the formerly blocked port, and the SW2 → SW1 direction still carries traffic, so frames circulate
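With Loop Guard:
Loop Guard watches for missing BPDUs on non-designated ports (root and alternate ports), which should always be receiving them. If BPDUs stop arriving while the link is still up, the port is moved to the loop-inconsistent state (blocked for that VLAN) instead of being promoted. The port is not err-disabled, and it recovers on its own when BPDUs return:
- SW2's root port receives BPDUs from SW1 (working)
- SW2 has an alternate port that blocks traffic
- The alternate port also expects BPDUs from its neighbor, and normally keeps receiving them even though it blocks data traffic
- When BPDUs stop arriving on the alternate port and the stored information ages out, Loop Guard places the port in loop-inconsistent state rather than letting it forward
- No loop forms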
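The behavior in the list above can be modeled as a tiny state machine. This is a toy sketch for reasoning about the feature, not how the switch implements it; the class name, state strings, and the 6-second timeout (three 2-second hellos) are assumptions for illustration.

```python
class LoopGuardPort:
    """Toy model of one alternate port with Loop Guard enabled."""

    def __init__(self, timeout=6):
        self.timeout = timeout   # assumed: seconds without a BPDU before acting
        self.last_bpdu = 0
        self.state = "BLK"       # alternate port, normally discarding

    def receive_bpdu(self, now):
        self.last_bpdu = now
        if self.state == "LOOP_INCONSISTENT":
            self.state = "BLK"   # recovers automatically once BPDUs return

    def tick(self, now):
        if self.state == "BLK" and now - self.last_bpdu >= self.timeout:
            # Without Loop Guard the port would head toward forwarding here.
            self.state = "LOOP_INCONSISTENT"
        return self.state


port = LoopGuardPort()
port.receive_bpdu(0)
print(port.tick(4))   # BLK
print(port.tick(6))   # LOOP_INCONSISTENT
port.receive_bpdu(10)
print(port.state)     # BLK
```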
Enable Loop Guard on Point-to-Point Trunks:
SW1(config)# interface Po1
SW1(config-if)# spanning-tree guard loop
SW1(config-if)# exit
SW2(config)# interface Po1
SW2(config-if)# spanning-tree guard loop
SW2(config-if)# exit
Verify:
SW1# show running-config interface Po1 | include guard
spanning-tree guard loop
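Test Unidirectional Failure:
Bouncing the port with shutdown / no shutdown takes the link down in both directions, so it does not exercise Loop Guard. In a lab, one way to simulate lost BPDUs on a link that stays up is to filter them on the upstream neighbor:
SW1(config)# interface Po1
SW1(config-if)# spanning-tree bpdufilter enable
SW1(config-if)# exit
Check the port on SW2:
SW2# show spanning-tree inconsistentports
Name Interface Inconsistency
VLAN0010 Port-channel1 Loop Inconsistent
SW2# show logging | include Loop guard
*Mar 25 15:20:10.345: %SPANTREE-2-LOOPGUARD_BLOCK: Loop guard blocking port Po1 on VLAN0010.
Loop Guard detected the missing BPDUs and blocked the port before a loop could form. Remove the BPDU filter on SW1 afterwards; the port on SW2 recovers without manual intervention once BPDUs return.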
Legacy Features: UplinkFast and BackboneFast
Older networks used UplinkFast and BackboneFast to speed up convergence before Rapid PVST+ was available. These are deprecated and should not be used in new deployments.
UplinkFast (Legacy)
Purpose: Speed up convergence when an uplink fails on an access layer switch.
How it works:
- Detects when the root port's link goes down (within 1 second)
- Immediately promotes the best alternative port (blocked port)
- Bypasses listening and learning states
- Sends dummy multicast frames on behalf of locally learned MAC addresses so upstream switches update their MAC tables quickly
Configuration (obsolete, for reference only):
SW1(config)# spanning-tree uplinkfast
SW1(config)# exit
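Why it's obsolete: Rapid PVST+ is faster and more reliable, and RSTP's alternate port role provides the same behavior natively. UplinkFast is a Cisco-proprietary extension to 802.1D and is not needed in rapid-pvst mode.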
BackboneFast (Legacy)
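Purpose: Skip the Max Age wait when a switch learns of an indirect link failure (a failure that is not on one of its own ports).
How it works:
- Triggered when a switch receives an inferior BPDU on its root port or a blocked port
- Sends Root Link Query (RLQ) messages to check whether its path to the root is still valid
- Expires the stored BPDU immediately instead of waiting up to 20 sec, cutting worst-case convergence from about 50 sec to about 30 sec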
Configuration (obsolete, for reference only):
SW1(config)# spanning-tree backbonefast
SW1(config)# exit
Why it's obsolete: RSTP accepts inferior BPDUs from a designated neighbor immediately, so Rapid PVST+ handles indirect failures without waiting for Max Age. BackboneFast is no longer necessary.
Migration Path: 802.1D to Rapid PVST+
Step 1: Verify All Switches Support Rapid PVST+
SW1# show version | include IOS XE
Cisco IOS XE Software, Version 17.6.3
Catalyst 9300 with IOS XE 17.x fully supports Rapid PVST+.
Step 2: Understand Current Convergence Baseline
Before migrating, measure failover time. Simulate a root port failure:
SW1(config)# interface Po1
SW1(config-if)# shutdown
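Use ping to test when connectivity is restored (IOS prints "!" for each reply and "." for each timeout):
SW4# ping 10.0.20.1 repeat 60 timeout 1
!!! (replies before the failure)
... (timeouts until convergence; expect roughly 30–50 on 802.1D)
!!! (replies resume once the new path is forwarding)
Record the failover time. This is your baseline.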
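Counting dots by hand is error-prone on long runs. A small sketch that estimates the outage from the string of IOS ping result characters, assuming "!" marks a reply, "." marks a timeout, and each lost ping costs about one timeout interval (the function name and sample strings are illustrative):

```python
import re


def longest_outage_seconds(ping_marks, timeout=1.0):
    """Estimate the longest outage from IOS ping marks ('!' reply, '.' timeout)."""
    runs = re.findall(r"\.+", ping_marks)
    longest = max((len(run) for run in runs), default=0)
    return longest * timeout


print(longest_outage_seconds("!!!!!" + "." * 32 + "!!!!!"))  # 32.0
print(longest_outage_seconds("!!!!!..!!!!!"))                # 2.0
```

The estimate is only as fine-grained as the ping timeout, so it cannot distinguish sub-second outages from no outage at all.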
Step 3: Enable Rapid PVST+ on All Switches
Start with non-critical switches first:
SW3(config)# spanning-tree mode rapid-pvst
SW3(config)# exit
Verify it converges with 802.1D neighbors:
SW3# show spanning-tree summary
Switch is in rapid-pvst mode
Root bridge for: (none, waiting for election)
SW3# show spanning-tree vlan 10
Root ID Priority 32768
Address 0023.47a1.ef80
SW3 has converged and joined the topology.
Then migrate the backup root:
SW2(config)# spanning-tree mode rapid-pvst
SW2(config)# exit
Verify topology is stable (no unexpected topology changes):
SW2# show spanning-tree vlan 10 detail | include changes|from
(the topology change counter should stop increasing shortly after the migration)
Finally, migrate the primary root:
SW1(config)# spanning-tree mode rapid-pvst
SW1(config)# exit
Verify all switches report Rapid PVST+:
show spanning-tree summary (on all switches)
Step 4: Re-measure Failover Time
SW1(config)# interface Po1
SW1(config-if)# shutdown
SW4# ping 10.0.20.1 repeat 60 timeout 1
!!! (replies before the failure)
. (timeouts until RSTP moves the alternate port to forwarding)
!!! (1–2 seconds of loss total, vs. 30–50 sec before)
Convergence is now sub-second to a few seconds.
Convergence Troubleshooting Symptom → Cause → Fix
Symptom: Failover Takes 30+ Seconds in Rapid PVST+ Mode
Cause: Convergence is still governed by 802.1D timers (Max Age 20 sec, Forward Delay 15 sec) rather than the RSTP proposal/agreement handshake. This happens when a neighbor runs 802.1D or when a link is not treated as point-to-point.
Fix:
- Verify all switches are Rapid PVST+:
show spanning-tree summary
- If any switch is 802.1D, migrate it:
SW(config)# spanning-tree mode rapid-pvst
- Verify no unnecessary timer overrides:
show running-config | include spanning-tree
- Remove high timer values (timers are configured per VLAN, and the Forward Delay keyword is forward-time):
SW(config)# no spanning-tree vlan 10 max-age
SW(config)# no spanning-tree vlan 10 forward-time
Symptom: Blocked Port Never Transitions to Forwarding After Root Port Fails
Cause: The port does not hold the alternate role, so Rapid PVST+ has no pre-selected backup path to promote, or a guard feature is holding the port down.
Fix:
- Check port role:
show spanning-tree vlan 10
If the Role column does not show "Altn", the port is not an alternate port and won't be promoted.
- Verify the topology allows the port to be an alternate. This requires:
- The port receives BPDUs from another switch that offers a path to the root
- The port hasn't been blocked for other reasons (BPDU Guard, Root Guard)
- If the topology is correct, restart protocol negotiation so ports facing upgraded neighbors return to RSTP, and let it reconverge:
clear spanning-tree detected-protocols
Symptom: Frequent TCNs Preventing Stable Convergence
Cause: Ports are flapping (going up/down repeatedly), triggering topology changes on each transition.
Fix:
- Identify flapping ports:
show logging | include UPDOWN
- Check physical layer health:
show interfaces | include errors
- Replace faulty cables. If errors persist, test optics:
show interfaces transceiver
- Once links are stable, disable any legacy features that might be causing oscillations:
no spanning-tree uplinkfast
no spanning-tree backbonefast
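When the log is long, a short script can rank interfaces by how often they went down. A minimal sketch, assuming the standard %LINK-3-UPDOWN syslog format and log lines already saved as text; the timestamps and interface names in the sample are hypothetical:

```python
import re
from collections import Counter

UPDOWN = re.compile(r"%LINK-3-UPDOWN: Interface (\S+), changed state to (up|down)")


def flap_counts(log_lines):
    """Count link-down events per interface."""
    counts = Counter()
    for line in log_lines:
        match = UPDOWN.search(line)
        if match and match.group(2) == "down":
            counts[match.group(1)] += 1
    return counts


# Hypothetical log excerpt
sample = [
    "*Mar 25 15:20:10.345: %LINK-3-UPDOWN: Interface GigabitEthernet1/0/1, changed state to down",
    "*Mar 25 15:20:12.101: %LINK-3-UPDOWN: Interface GigabitEthernet1/0/1, changed state to up",
    "*Mar 25 15:20:40.772: %LINK-3-UPDOWN: Interface GigabitEthernet1/0/1, changed state to down",
    "*Mar 25 15:21:02.008: %LINK-3-UPDOWN: Interface GigabitEthernet1/0/2, changed state to down",
]
print(flap_counts(sample).most_common(1))  # [('GigabitEthernet1/0/1', 2)]
```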
Best Practices for Fast Convergence
- Use Rapid PVST+ on all switches (not 802.1D)
- Keep timers at default (Hello 2 sec, Max Age 20 sec, Forward Delay 15 sec)
- Enable Loop Guard on point-to-point trunks to prevent unidirectional link loops
- Enable BPDU Guard on access ports to prevent rogue switches
- Maintain cable quality to avoid flapping links
- Test failover regularly to ensure sub-second convergence is working
- Remove legacy UplinkFast and BackboneFast from all configurations
Verification Checklist After Migration to Rapid PVST+
- All switches report rapid-pvst mode
- No "Listening" port states in show spanning-tree (only "Forwarding", "Blocked", "Disabled", "Learning")
- TCN messages are rare (less than one per minute in stable state)
- Failover time is under 10 seconds for most topologies, sub-second for optimized networks
- No ports are stuck in "Learning" state for more than a few seconds
- All critical uplinks have "Root" or "Desg" roles, none are "Block"
What's Next
In the next article (Article 21), we'll dive into advanced STP configuration topics: load balancing across VLANs by electing different root bridges per VLAN, and configuring STP paths to optimize traffic flow in complex topologies. This is where "art" meets "science" in spanning tree design.