A BGP session that won't reach Established state is the most common BGP issue you'll face. The good news: the FSM state (Idle, Active, OpenSent, OpenConfirm) tells you exactly where the problem is. The bad news: the actual root cause within each state can be one of many things. This article provides a systematic methodology for isolating and fixing adjacency issues on Cisco IOS XE.
Step 1: Check the FSM State
R1-HQ# show ip bgp summary
Neighbor        V           AS MsgRcvd MsgSent   TblVer  InQ OutQ Up/Down  State/PfxRcd
172.16.0.2      4        65010       0       0        1    0    0 never    Idle
172.16.0.6      4        65020      12      14        1    0    0 00:05:12 Active
State/PfxRcd shows the current state. Let's troubleshoot each one.
Idle State
BGP hasn't started the TCP connection. The router knows about the neighbor but isn't trying to connect.
Common Causes
- Neighbor administratively shut down: neighbor 172.16.0.2 shutdown is configured
- Maximum-prefix limit exceeded: The peer previously sent too many prefixes and the session is being held down (shown as Idle (PfxCt) in the summary)
- No route to peer address: The destination IP isn't in the routing table at all
- Update-source issue: The source interface for the BGP session is down
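The maximum-prefix cause above is easier to recognize once you have seen how the limit is usually configured. The fragment below is a minimal sketch, not taken from this article's lab: the local AS 65001 and the limit, threshold, and restart values are assumptions chosen for illustration. With the restart keyword, IOS re-establishes the session automatically after the stated number of minutes; without it, the neighbor stays down until it is cleared manually.

```
! Hypothetical local AS (65001) and limit values, for illustration only
router bgp 65001
 neighbor 172.16.0.2 remote-as 65010
 ! Tear the session down above 1000 prefixes, log a warning at 80%,
 ! and retry automatically after 5 minutes
 neighbor 172.16.0.2 maximum-prefix 1000 80 restart 5
```

If the process uses address families, the maximum-prefix line would normally sit under the relevant address-family instead.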
Diagnostic Commands
R1-HQ# show ip bgp neighbors 172.16.0.2 | include state|shut|Idle
BGP state = Idle
Administratively shut down
! Check if there's a route to the peer
R1-HQ# show ip route 172.16.0.2
% Network not in table
Fixes
! Remove administrative shutdown
R1-HQ(config-router)# no neighbor 172.16.0.2 shutdown
! Clear the prefix-limit penalty
R1-HQ# clear ip bgp 172.16.0.2
Active State
BGP is trying to establish the TCP connection but keeps failing. Despite the name, Active is a problem state — the router is actively retrying and failing.
Common Causes
- ACL blocking TCP/179: Interface or control-plane ACL filtering BGP packets
- Wrong neighbor IP: Configured IP doesn't match what the peer expects
- Interface down: The outgoing interface toward the peer is down
- eBGP TTL issue: Peer is more than 1 hop away but ebgp-multihop isn't configured
- Update-source mismatch: For iBGP over loopbacks, update-source is missing or the loopback is not in the IGP
- Firewall in the path: Stateful firewall blocking the return TCP SYN-ACK
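Three of the causes above map to a line or two of configuration each. The sketch below shows them side by side: a multihop eBGP neighbor, an iBGP neighbor sourced from a loopback, and ACL entries that permit BGP in both directions. The local AS 65001, the multihop peer 192.0.2.2, the hop count of 2, Loopback0, and the ACL name BGP-PEERS are all assumptions, as is treating 2.2.2.2 as an iBGP peer; adapt them to your topology.

```
! Hypothetical values: local AS 65001, peer 192.0.2.2, hop count 2,
! Loopback0, and ACL name BGP-PEERS
router bgp 65001
 ! eBGP peer that is not directly connected: raise the TTL above the default of 1
 neighbor 192.0.2.2 remote-as 65010
 neighbor 192.0.2.2 ebgp-multihop 2
 ! iBGP peer reached over loopbacks: source the session from Loopback0
 neighbor 2.2.2.2 remote-as 65001
 neighbor 2.2.2.2 update-source Loopback0
!
! Entries to add to an existing inbound ACL on the peer-facing interface.
! Either side may initiate, so match TCP/179 as destination and as source port.
ip access-list extended BGP-PEERS
 permit tcp host 172.16.0.2 host 172.16.0.1 eq bgp
 permit tcp host 172.16.0.2 eq bgp host 172.16.0.1
```

The two permit lines matter because only one of the two peers ends up as the TCP client; an ACL that allows port 179 in one direction only can work or fail depending on which side opens the connection first.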
Diagnostic Commands
! Basic reachability
R1-HQ# ping 172.16.0.2
!!!!!
! Check interface status
R1-HQ# show ip interface brief | include 172.16.0
GigabitEthernet0/0 172.16.0.1 YES manual up up
! Check for ACLs
R1-HQ# show ip access-lists
! Look for any ACL denying TCP 179
! Check TCP connection attempts
R1-HQ# show ip bgp neighbors 172.16.0.2 | include TCP
Transport(tcp) path-mtu-discovery is enabled
Connection state is ESTAB or SYN_SENT
TCP connection attempts: 47
TCP open sent: 0
! For iBGP over loopbacks
R1-HQ# ping 2.2.2.2 source 1.1.1.1
!!!!!
Debug (use with caution in production)
R1-HQ# debug ip bgp 172.16.0.2 events
*Mar 27 10:15:22.114: BGP: 172.16.0.2 Active open failed - Loss of TCP connection; retry in 30 seconds
OpenSent State
TCP connected, OPEN sent, waiting for peer's OPEN. The session gets here but doesn't progress.
Common Causes
- AS number mismatch: remote-as doesn't match what the peer advertises in its OPEN
- Router-ID conflict: Both peers have the same router-ID
- Capability mismatch: One side requires a capability the other doesn't support
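The first two causes above are both corrected under the BGP process. A minimal sketch follows; the local AS 65001 and the router-ID 1.1.1.1 are assumptions (the latter borrowed from the loopback address used in the ping example earlier), and 65010 is the AS shown for this neighbor in the summary output.

```
! Hypothetical local AS (65001) and router-ID (1.1.1.1)
router bgp 65001
 ! Give this router a unique BGP identifier (changing it resets existing sessions)
 bgp router-id 1.1.1.1
 ! remote-as must equal the AS the peer places in its OPEN message
 neighbor 172.16.0.2 remote-as 65010
```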
Diagnostic Commands
R1-HQ# show ip bgp neighbors 172.16.0.2 | include Notification
Notification: 2/2 (OPEN Message Error/Bad Peer AS) 3 times
Last reset 00:01:14, due to BGP Notification received: peer AS mismatch
OpenConfirm State
Both OPENs exchanged, waiting for KEEPALIVE confirmation. Rarely stays here long — if it does, check for TCP issues or packet loss on the path.
Session Flapping (Established → Idle → repeat)
The session comes up but keeps dropping. Different from "won't come up."
Common Causes
- Hold timer expiry: Keepalives not arriving in time — usually CPU congestion, path MTU issues, or link quality problems
- Maximum-prefix limit hit: Peer sends more prefixes than allowed, triggering a teardown
- Route map or policy change: A hard reset (clear ip bgp without the soft keyword) issued to apply a policy change tears the session down
- Physical link instability: Interface bouncing causes TCP resets
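For the policy-change cause in particular, a soft reset re-applies policy without tearing the session down, which removes one self-inflicted source of flaps. The commands below are a short sketch using the article's example neighbor; the inbound variant assumes the peer supports the route-refresh capability or that soft-reconfiguration inbound is configured for the neighbor.

```
! Re-apply inbound policy without resetting the session
R1-HQ# clear ip bgp 172.16.0.2 soft in
! Re-send outbound updates after an outbound policy change
R1-HQ# clear ip bgp 172.16.0.2 soft out
```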
Diagnostic Commands
R1-HQ# show ip bgp neighbors 172.16.0.2 | include reset|flap|Notification
Last reset 00:02:33, due to BGP Notification sent, hold time expired
Peer had 3 resets in the last 00:15:00
! Check hold timer health
R1-HQ# show ip bgp neighbors 172.16.0.2 | include hold|read|write
Last read 00:00:45, last write 00:00:12, hold time is 180, keepalive interval is 60
! If "last read" approaches the hold time, the peer is struggling to send keepalives
Systematic Troubleshooting Checklist
- Layer 1-3 reachability: Can you ping the peer IP from the correct source? Is the interface up?
- TCP/179: Are ACLs, firewalls, or control-plane policing blocking port 179 in either direction?
- Configuration match: Does remote-as match? Does the neighbor IP match what the peer has configured?
- iBGP specifics: Is update-source set? Is the loopback in the IGP? Can you ping loopback-to-loopback?
- eBGP specifics: Is the peer directly connected (TTL 1)? If not, is ebgp-multihop configured?
- Authentication: Do both sides have the same password? Is one side using MD5 and the other not?
- Notifications: Check show ip bgp neighbors [ip] for the last NOTIFICATION sent/received. It tells you exactly what went wrong.
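The authentication item is the only one in the checklist with no command shown elsewhere in this article. As a minimal sketch, MD5 authentication is enabled per neighbor as below; the local AS 65001 and the password string are placeholders, not values from the lab.

```
! Hypothetical local AS and placeholder password; both peers must use the same string
router bgp 65001
 neighbor 172.16.0.2 password EXAMPLE-KEY
```

A password mismatch typically does not produce a BGP NOTIFICATION, because the TCP segments are discarded before BGP sees them. The session tends to sit in Active, and the clue is usually an MD5-related TCP message in the log.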
Key Takeaways
- The FSM state is your first diagnostic: Idle means not trying, Active means TCP failing, OpenSent means the OPEN exchange is stalled or being rejected, Established means healthy.
- Active is the most common problem state. Check reachability, ACLs, source IP, and TTL.
- Always check the last NOTIFICATION message — error code and subcode pinpoint the exact issue.
- show ip bgp neighbors [ip] is the single most useful command for BGP troubleshooting: it shows state, timers, counters, capabilities, and error history.
- For iBGP issues, the top two causes are missing update-source and loopback not reachable via IGP.