Cisco ASA Common Outage Scenarios and Fixes


Every Cisco ASA that runs long enough hits a small set of repeatable failure modes: a NAT pool exhausting, an ACL line that shadows another and silently breaks one app, a VPN tunnel that comes up and then carries no traffic, a failover pair that goes split-brain. The patterns are small, the symptoms look identical to dozens of other problems, and the fix is usually one show command away. This article is a field guide to the four most common ASA outage scenarios, the diagnostic command that decides each, and the fix - all backed by real output from the live ASAv 9.23(1) in the PingLabz ASA reference lab.

The pattern across all four: the symptom users report is generic ("the app is slow", "VPN does not work"), but the firewall has a single command output that names the problem. Knowing which command for which symptom is the entire diagnostic skill.

Scenario Map: Symptom to Confirming Command

User report: "New connections work but slow / sometimes failing"
Likely cause: NAT pool exhaustion, PAT slot starvation
Confirming command: show xlate count + show asp drop | include nat-no-xlate

User report: "This one app stopped working but everything else is fine"
Likely cause: ACL line shadowing - a deny higher up matches before the intended permit
Confirming command: show access-list <NAME> with hit counters

User report: "VPN tunnel is up but traffic is not flowing"
Likely cause: encaps incrementing but decaps stuck at 0; routing or NAT exemption missing on far side
Confirming command: show crypto ipsec sa | include peer|encaps|decaps

User report: "Failover pair shows both units active"
Likely cause: split-brain - failover link down or peer authentication mismatch
Confirming command: show failover history + show failover state

Scenario 1: NAT / PAT Pool Exhaustion

Symptom users see: existing connections work fine, but new connections from inside hosts intermittently fail to establish. The application teams think the application is broken; the network team sees an ASA that is not alerting on anything. Both are wrong - the ASA is healthy but cannot allocate a new PAT mapping.

The underlying mechanism: PAT translates many internal source addresses to one external IP by varying the source port. The external port range is 1024-65535, so a single external IP supports at most ~64,500 simultaneous outbound flows. Once exhausted, new flows hit nat-no-xlate-to-pat-pool and drop silently.

The diagnostic chain:

ASA-PERIM# show xlate count
14 in use, 16 most used

ASA-PERIM# show asp drop | include nat-no-xlate
nat-no-xlate-to-pat-pool                                                    0

In the lab those numbers are healthy (14 in use is nowhere near 64K). On a perimeter under load you might see "in use" climbing to 50K+ and the asp-drop counter incrementing - that is your unambiguous signal. The next show command identifies which source is the busiest:

ASA-PERIM# show xlate detail | include PAT
UDP PAT from inside:10.10.10.1/58380 to outside:203.0.113.2/58380 flags ri idle 0:00:37 timeout 0:00:00
UDP PAT from inside:10.10.10.1/52513 to outside:203.0.113.2/52513 flags ri idle 0:00:57 timeout 0:00:00
UDP PAT from inside:10.10.10.1/64356 to outside:203.0.113.2/64356 flags ri idle 0:01:17 timeout 0:00:00
UDP PAT from inside:10.10.10.1/51684 to outside:203.0.113.2/51684 flags ri idle 0:00:17 timeout 0:00:00
... [many more] ...

If one inside host has thousands of PAT entries, that host is the noisemaker. Either it is a real production app burning ports legitimately (in which case widen the pool) or it is leaking sockets (fix the app). Three production fixes:

  1. Widen the PAT pool. Put a second IP in the pool object and reference it with the pat-pool keyword: nat (inside,outside) source dynamic INSIDE-NET pat-pool PAT-POOL interface. The ASA spreads allocations across every address in the pool, and the trailing interface keyword falls back to the outside interface address if the pool still exhausts (see the sketch after this list).
  2. Switch to per-session PAT. Per-session rules (xlate per-session permit ...) release a PAT port the moment the connection closes instead of holding it through the xlate timeout. TCP and DNS are already per-session by default on 9.x, so this mainly helps if a previous admin disabled it or for other UDP traffic.
  3. Lower the translation idle timer. timeout xlate 0:01:00 reduces how long an idle translation survives (the default is 3:00:00). Risky for long-poll applications - test before you ship.
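
Fix 1 as config, for reference. A minimal sketch - the object names and the second 203.0.113.x address are illustrative, not from the lab:

object network INSIDE-NET
 subnet 10.10.10.0 255.255.255.0
object network PAT-POOL
 range 203.0.113.10 203.0.113.11
nat (inside,outside) source dynamic INSIDE-NET pat-pool PAT-POOL interface

Two addresses in PAT-POOL roughly doubles the usable port space to ~129,000 concurrent flows; the trailing interface keyword gives the ASA a last-resort fallback to the outside interface address.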

Scenario 2: ACL Line Shadowing

Symptom users see: one specific application stopped working, every other app on that host works fine. Often happens after someone adds a "tighten that down" deny rule above the catch-all permits. The new rule matches more than intended.

The diagnostic: show access-list <NAME> with hit counters, looking for whether the deny line that was added is firing on the protected destination.

ASA-PERIM# show access-list OUTSIDE_IN | begin OUTSIDE_IN
access-list OUTSIDE_IN; 5 elements; name hash: 0xe01d8199
access-list OUTSIDE_IN line 1 extended deny tcp host 198.51.100.99 any (hitcnt=1) (Last Hit=00:01:09 UTC May 10 2026)
access-list OUTSIDE_IN line 2 extended permit tcp any object DMZ-WEB eq www (hitcnt=1) (Last Hit=23:27:41 UTC May 9 2026)
access-list OUTSIDE_IN line 3 extended permit tcp any object DMZ-WEB eq https (hitcnt=2) (Last Hit=00:01:09 UTC May 10 2026)
access-list OUTSIDE_IN line 4 extended permit icmp any object DMZ-WEB (hitcnt=2) (Last Hit=01:55:20 UTC May 10 2026)
access-list OUTSIDE_IN line 5 extended deny ip any any log informational interval 300 (hitcnt=59) (Last Hit=01:55:47 UTC May 10 2026)

Read the hit counters from top to bottom. Line 1 (an explicit deny against a specific source) has 1 hit - that is a one-off scanner, not the problem. Line 5 (the catch-all explicit deny with logging) has 59 hits - that is the smoking gun. 59 packets matched deny ip any any instead of any of the lines above.

To find what those 59 packets were, the log informational on line 5 is doing the work for you - each match generates a %ASA-4-106023 syslog that lands in show logging:

ASA-PERIM# show logging | include 106023
%ASA-4-106023: Deny tcp src outside:8.8.8.8/56321 dst dmz:192.168.50.10/8080 by access-group "OUTSIDE_IN"
... [more entries] ...

Now you know exactly what was being denied: TCP from 8.8.8.8 to 192.168.50.10 on port 8080. If port 8080 is the application that broke, you have a permit gap (line 2 covers port 80 only, line 3 covers 443 only, nothing covers 8080).

The fix is to add a permit for the missing port BEFORE the catch-all deny, then re-test:

access-list OUTSIDE_IN line 4 extended permit tcp any object DMZ-WEB eq 8080

Inserting at line 4 puts it before the catch-all deny (the original line 5, which auto-renumbers to line 6). After the change, watch the new line 4 hitcnt climb and the line 6 deny stop accumulating hits for the previously-blocked traffic.
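
To prove the fix without waiting for users to retry, replay the denied flow with packet-tracer, using the exact addresses from the 106023 syslog:

ASA-PERIM# packet-tracer input outside tcp 8.8.8.8 56321 192.168.50.10 8080

Before the change the ACCESS-LIST phase reports DROP; after it, the final result should read ALLOW.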

Scenario 3: VPN Tunnel Up But No Traffic

Symptom users see: VPN status icon green; traffic to the remote subnet times out. The classic asymmetric tunnel failure.

The diagnostic: show crypto ipsec sa with focus on the encaps and decaps counters:

ASA-PERIM# show crypto ipsec sa | include peer|encaps|decaps
      current_peer: 203.0.113.6

      #pkts encaps: 248, #pkts encrypt: 248, #pkts digest: 248
      #pkts decaps: 0, #pkts decrypt: 0, #pkts verify: 0

Encaps = 248 (we are sending plenty). Decaps = 0 (we are receiving nothing). The tunnel is up - both peers agreed on Phase 1 and Phase 2 - but the far side either is not sending return traffic or its return traffic never reaches us. Three causes, in priority order:

  1. The far side has no return route to our protected subnet. Most common. Log in to the far peer and check its routing table for our protected subnet. If there is no route, traffic from hosts on the far side never reaches the tunnel.
  2. The far side is NATing return traffic before encryption. NAT-after-encrypt is fine; NAT-before-encrypt rewrites the source and breaks the proxy ACL match. Check NAT exemption on the far side (see NAT exemption for VPNs, and the sketch after this list).
  3. The far side has its proxy ACL inverted. The crypto map's interesting traffic must be the mirror image of ours: our local subnet = their remote subnet, our remote = their local. If they are flipped, return packets fail the SA selector and are dropped at decap.
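
Causes 2 and 3 each come down to a couple of lines on the far peer. A hedged sketch - the object names are hypothetical and 172.16.0.0/24 stands in for the far side's protected subnet, with 10.10.10.0/24 as ours:

access-list VPN-TRAFFIC extended permit ip 172.16.0.0 255.255.255.0 10.10.10.0 255.255.255.0
nat (inside,outside) source static FAR-LOCAL FAR-LOCAL destination static NEAR-REMOTE NEAR-REMOTE no-proxy-arp route-lookup

The proxy ACL is the exact mirror of ours (their subnet first, ours second), and the identity-NAT line exempts tunnel traffic from translation before encryption.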

If decaps is non-zero but smaller than encaps, you have packet loss in the far-to-near direction - typically a path MTU problem with the encapsulation overhead. Lower the TCP MSS clamp (sysopt connection tcp-mss, ASA default 1380) or clear the DF bit for IPsec (crypto ipsec df-bit clear-df outside) and retest.

This pattern is so common that the lab has it documented on the dedicated VPN troubleshooting page; see troubleshoot IPsec phases on Cisco ASA for the full Phase-1/Phase-2 decision tree.


Scenario 4: Failover Pair Split-Brain

Symptom users see: intermittent connection drops, sometimes traffic flows, sometimes it does not, packet captures show the same flow being answered by two different MAC addresses. The pair has both units announcing themselves as Active.

Split-brain happens when the failover link goes down between two ASAs that are both healthy. Each unit sees its peer as failed and promotes itself to Active. Now both units forward traffic, both NAT, both register ARP for the same gateway IPs - hence the alternating MACs and the broken sessions.

The diagnostic on each unit:

ASA-PRIMARY# show failover state
                This host -   Primary
                              Active           None
                Other host -  Secondary
                              Failed           Comm Failure
                Stateful Failover Logical Update Statistics

ASA-PRIMARY# show failover history
==========================================================================
From State                 To State                   Reason
==========================================================================
13:14:01 UTC May 10 2026
Active Standby Ready        Active Drain               Other unit wants me Active
13:14:01 UTC May 10 2026
Active Drain                Active Applying Config    Other unit wants me Active
13:14:01 UTC May 10 2026
Active Applying Config      Active Config Applied     Other unit wants me Active
13:14:01 UTC May 10 2026
Active Config Applied       Active                    Other unit wants me Active
13:34:22 UTC May 10 2026
Active                      Active                    Comm Failure
==========================================================================

Both units will show "This host: Active" with reason "Comm Failure" against the peer. That is the split-brain signature. The "Active Active" transition with reason "Comm Failure" in the history is the timestamp at which the failover link went down.

The fix has two parts:

  1. Bring the failover link back up before making any traffic decisions. Look at show interface for the configured failover LAN interface; if it is down/down, the cable, port, or transceiver has failed. Until the link is up the units have no way to renegotiate roles.
  2. Force one unit to standby once the link is healthy. Pick the unit you want to remain Active - typically the one with the most recent valid config - and on the OTHER unit run no failover active (shown below). That unit transitions to Standby Ready and pulls config from the active. Sessions are briefly disrupted as the duplicate ARP entries clear, but the pair is consistent again.
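
The yield itself is one command plus a verification - shown on the secondary here, with the hostname purely illustrative:

ASA-SECONDARY# no failover active
ASA-SECONDARY# show failover state

After the first command, show failover state on that unit should report Standby Ready, and on the surviving unit the peer's state flips from Failed to Standby Ready as the config sync completes.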

Prevention: make the failover link loud when it dies. Failover state changes generate %ASA-1-105xxx syslogs; pair logging trap warnings with a syslog server you actually read, and run the failover LAN over a direct cable or dedicated path so a single switch failure cannot split the pair.
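
For context, everything in this scenario assumes an ordinary LAN-based active/standby pair. A minimal skeleton - interface, addressing, and key are placeholders, not lab values:

failover lan unit primary
failover lan interface FAILOVER GigabitEthernet0/3
failover link FAILOVER GigabitEthernet0/3
failover interface ip FAILOVER 10.255.255.1 255.255.255.252 standby 10.255.255.2
failover key <shared-secret>
failover

The failover key must match on both units; a mismatch is exactly the "peer authentication mismatch" from the scenario map.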

Other Fast Checks Worth Knowing

Less common but worth keeping in your head:

show resource usage
Per-context resource utilization in multi-context mode, or per-platform in single mode: conn count, xlate count, syslog rate, AAA tx rate against the platform maximum.

show service-policy global
Inspection-engine hit counters. If "tcp-options drops" or "esmtp drops" climb, an inspection is rejecting your application.

show route summary
Number of routes per source. If the static count drops unexpectedly, someone deleted routes.

show vpn-sessiondb summary
Live VPN session count. If a remote-access VPN keeps dropping users, this trends the count.

show clock + show ntp associations
Clock drift breaks AnyConnect certificate validation. Worth an early check on any VPN issue.

Key Takeaways

Four scenarios cover most "the firewall is broken" tickets: NAT pool exhaustion (look at show xlate count + asp-drop), ACL line shadowing (look at show access-list hit counters), VPN encaps-without-decaps (look at show crypto ipsec sa), and failover split-brain (look at show failover history). Each has a single confirming command, then a small set of fixes. The full Cisco ASA reference cluster goes deeper on each: NAT order and types, ACL troubleshooting, IPsec phase troubleshooting, active/standby failover configuration, and the operational toolkit for live diagnosis: packet-tracer, CLI packet capture, asp-drop counters, and conn / xlate troubleshooting.
