BGP Neighbor States: The 6-State FSM and How to Diagnose It

BGP's finite state machine has six states: Idle, Connect, Active, OpenSent, OpenConfirm, and Established. This article covers what each one means, why Active is a problem state, and which real-world cause each stuck state points to.

BGP runs over TCP/179 and forms a session between two configured peers. That session walks through a state machine before it can exchange routing information, and getting stuck somewhere along the way is the most common BGP failure mode. The good news: every state has a specific meaning, and being stuck in a particular state tells you exactly where to look for the problem.

This article walks through all six BGP neighbor states (Idle, Connect, Active, OpenSent, OpenConfirm, Established), what each state actually means, the events that drive transitions between them, and the diagnostic that maps each "stuck" state back to a real-world cause. If you have ever stared at show ip bgp summary wondering why a peer is in the Active state, this is the reference.

The Six States

Per RFC 4271 the BGP finite state machine has six states. The first three are about establishing the TCP session; the last three are about negotiating BGP itself on top of that TCP session.

State | What it means | Healthy steady state?
Idle | BGP process is starting; not yet trying to connect | No (transient on session bring-up or after teardown)
Connect | Trying to complete the TCP three-way handshake | No (transient; should move to Active or OpenSent)
Active | TCP failed; waiting to retry | No (problem state)
OpenSent | Sent our OPEN message; waiting for theirs | No (transient)
OpenConfirm | Received their OPEN; waiting for KEEPALIVE | No (transient)
Established | Session is up; updates can flow | Yes (this is what you want)

"Active" is famously misleading: in casual English you might think Active means "doing something good." In BGP, Active means "TCP did not work, I am actively trying to retry." It is a problem state, not a healthy one.
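The state progression can be sketched as a small lookup table. This is a simplified model for illustration only: the event names are made-up labels rather than the numbered events in RFC 4271, and any event without an entry is treated as an error that drops the session back to Idle.

```python
from enum import Enum


class State(Enum):
    IDLE = "Idle"
    CONNECT = "Connect"
    ACTIVE = "Active"
    OPENSENT = "OpenSent"
    OPENCONFIRM = "OpenConfirm"
    ESTABLISHED = "Established"


# (current state, event) -> next state.
# Event names are illustrative labels, not RFC 4271 event numbers.
TRANSITIONS = {
    (State.IDLE, "start"): State.CONNECT,
    (State.CONNECT, "tcp_ok"): State.OPENSENT,
    (State.CONNECT, "tcp_fail"): State.ACTIVE,
    (State.ACTIVE, "tcp_ok"): State.OPENSENT,
    (State.ACTIVE, "connect_retry_expired"): State.CONNECT,
    (State.OPENSENT, "open_received"): State.OPENCONFIRM,
    (State.OPENCONFIRM, "keepalive_received"): State.ESTABLISHED,
    (State.ESTABLISHED, "keepalive_received"): State.ESTABLISHED,
    (State.ESTABLISHED, "update_received"): State.ESTABLISHED,
}


def step(state, event):
    """Return the next state; unknown events fall back to Idle (simplification)."""
    return TRANSITIONS.get((state, event), State.IDLE)


def run(events, state=State.IDLE):
    """Apply a sequence of events starting from the given state."""
    for event in events:
        state = step(state, event)
    return state


healthy = ["start", "tcp_ok", "open_received", "keepalive_received"]
print(run(healthy).value)                  # Established
print(run(["start", "tcp_fail"]).value)    # Active
```

The second call shows why Active is a problem state in this model: the only way to reach it is a failed TCP attempt.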

State: Idle

When BGP first starts (or after a hard reset), the peer transitions to Idle. In Idle, the BGP process is loaded but not actively trying to connect. It refuses incoming connections.

The transition out of Idle happens on:

  • Manual start. An administrator runs clear ip bgp or first configures the peer.
  • Automatic start. The router determines it has a route to the peer's IP and begins the TCP attempt.

If a peer is stuck in Idle, the most common cause is the IdleHoldTimer. After a session goes down, BGP enters Idle and waits a damping interval before retrying. The interval grows exponentially up to a cap to prevent flap loops. If you just configured a session and it is in Idle, check whether the route to the peer is actually present in the IP routing table.
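To make the exponential damping concrete, here is a minimal sketch of a capped backoff schedule. The initial delay, growth factor, and cap are hypothetical values chosen for illustration; real implementations use their own, vendor-specific numbers.

```python
def idle_hold_delays(initial=1.0, factor=2.0, cap=120.0, failures=8):
    """Return the wait (in seconds) applied after each consecutive failure.

    All default values are assumptions for illustration, not vendor defaults.
    """
    delays = []
    delay = initial
    for _ in range(failures):
        delays.append(min(delay, cap))  # never wait longer than the cap
        delay *= factor                 # grow exponentially per failure
    return delays


print(idle_hold_delays())
# [1.0, 2.0, 4.0, 8.0, 16.0, 32.0, 64.0, 120.0]
```

A peer that keeps flapping therefore spends progressively longer in Idle, which is why a session can look stuck there even though the underlying fault has already cleared.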

Cisco-specific gotcha: the BGP process can also be administratively shut down via neighbor X.X.X.X shutdown. The peer shows up in show ip bgp summary but with state Idle and a (Admin) suffix.

State: Connect

The router attempts a TCP connect to the peer's IP on port 179. If the TCP handshake completes, transition to OpenSent. If the TCP handshake fails (timeout, RST), transition to Active.

Connect is fast (sub-second on a healthy network). If you see Connect in show ip bgp summary, you have caught the session in the brief window between Idle and OpenSent or Active. It is not a stuck state in normal operation.

Long stays in Connect generally indicate slow TCP handshakes (extreme latency, packet loss, asymmetric paths through firewalls).

State: Active

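Active is the famous problem state. It means the outbound TCP attempt failed, and the router is now listening for an inbound connection from the peer while it waits to retry. When the ConnectRetryTimer expires (default 120 seconds on Cisco, configurable), the router retries the TCP connection.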

Why TCP fails:

Cause | Diagnostic | Fix
No route to peer's IP | show ip route X.X.X.X returns nothing | Check IGP, static routes, or interface config
ACL blocking TCP/179 | show access-lists, look for matches | Permit TCP/179 between peers
Source IP mismatch | Peer's neighbor command expects a different source than we are sending from | neighbor X update-source LoopbackN
Peer's IP wrong | Pinging the peer fails | Verify the IP of the loopback or interface on the peer
eBGP TTL exhausted | Peer is more than 1 hop away on eBGP, default TTL is 1 | neighbor X ebgp-multihop N
Peer not configured to accept us | Peer's BGP is not configured for our IP/AS | Coordinate with the other side
MD5 authentication mismatch | Logs show "MD5 mismatch" | Verify the secret on both sides
Peer firewall, NAT, or routing | traceroute fails, peer hard to reach | Network path investigation

The single most useful sanity check when a peer is stuck in Active: try to telnet to the peer's IP on port 179. If you cannot complete the TCP handshake from the BGP source IP to the peer's IP, BGP cannot either.

R1# telnet 10.0.12.2 179 /source-interface Loopback0
Trying 10.0.12.2, 179 ... Open
[Press ctrl-shift-6 then x, then "disconnect"]

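Open = TCP works. Refused = the peer answered with a reset, so it is reachable but is not accepting a connection on port 179 from our source IP. Timeout = nothing came back, which points to an ACL or routing problem along the path.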
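The same three-way classification can be scripted from a Linux host or jump box. The sketch below uses only the Python standard library; the function name probe_bgp_port and the example addresses are assumptions for illustration. It tests reachability from the machine running the script, not from the router itself.

```python
import socket


def probe_bgp_port(peer_ip, port=179, source_ip=None, timeout=3.0):
    """Attempt a TCP handshake and classify the result.

    Returns "open", "refused", "timeout", or "error: <reason>".
    source_ip, if given, must be an address configured on this host.
    """
    source = (source_ip, 0) if source_ip else None
    try:
        with socket.create_connection(
            (peer_ip, port), timeout=timeout, source_address=source
        ):
            return "open"       # handshake completed
    except ConnectionRefusedError:
        return "refused"        # peer replied with a TCP reset
    except socket.timeout:
        return "timeout"        # no reply: suspect an ACL or routing
    except OSError as exc:
        return f"error: {exc}"  # e.g. no route to host


# Example call (addresses are placeholders):
# print(probe_bgp_port("10.0.12.2", source_ip="10.0.0.1"))
```

Binding the source address matters for the same reason update-source does: a probe from the wrong source IP can succeed or fail differently from the real BGP session.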

State: OpenSent

The TCP session is up. We sent our OPEN message (containing our AS number, hold time, BGP version, capabilities) and are waiting for the peer's OPEN.

OpenSent is brief. If you see a peer stuck in OpenSent, the peer's OPEN never arrived or was malformed.

Common causes of stuck OpenSent:

  • BGP version mismatch. Today both ends should run BGP-4; this is rare in 2026.
  • Hold time too low. If our advertised HoldTime is below 3 seconds (and not 0), the peer rejects us. Default is 180s; do not configure below 3.
  • Capability mismatch. One side advertises a capability (Route Refresh, 4-byte AS, Graceful Restart) that the other does not understand. Modern routers handle this gracefully via capability negotiation, but ancient code can hard-fail.
  • BGP NOTIFICATION received. The peer sent a NOTIFICATION rejecting our OPEN. Common reasons in the NOTIFICATION error code: bad peer AS, bad BGP identifier (router ID), authentication failure.

The diagnostic: debug ip bgp on Cisco shows the OPEN parameters and any NOTIFICATIONs.
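To see what the peer is actually checking, the following sketch builds and validates a minimal OPEN message using the fixed-field layout from RFC 4271 (16-byte marker, length, type, then version, AS, hold time, BGP identifier, and optional-parameter length). It is a simplified illustration: it assumes 2-byte AS numbers, carries no optional parameters or capabilities, and the function names are hypothetical.

```python
import ipaddress
import struct

# OPEN Message Error subcodes from RFC 4271 (NOTIFICATION error code 2)
OPEN_ERROR_SUBCODES = {
    1: "Unsupported Version Number",
    2: "Bad Peer AS",
    3: "Bad BGP Identifier",
    6: "Unacceptable Hold Time",
}

OPEN_BODY = "!BHH4sB"  # version, my AS, hold time, BGP identifier, opt len


def build_open(my_as, hold_time, router_id, version=4):
    """Build an OPEN with no optional parameters (2-byte ASN assumed)."""
    body = struct.pack(
        OPEN_BODY, version, my_as, hold_time,
        ipaddress.IPv4Address(router_id).packed, 0,
    )
    # Header: 16-byte all-ones marker, total length, message type 1 (OPEN)
    header = b"\xff" * 16 + struct.pack("!HB", 19 + len(body), 1)
    return header + body


def check_open(message, expected_peer_as):
    """Return an OPEN error subcode, or None if the OPEN is acceptable."""
    version, peer_as, hold_time, router_id, _opt_len = struct.unpack(
        OPEN_BODY, message[19:29]
    )
    if version != 4:
        return 1
    if peer_as != expected_peer_as:
        return 2
    if router_id == b"\x00" * 4:
        return 3
    if hold_time in (1, 2):  # must be 0 or at least 3 seconds
        return 6
    return None


msg = build_open(my_as=65002, hold_time=180, router_id="10.0.12.2")
print(check_open(msg, expected_peer_as=65002))                       # None
print(OPEN_ERROR_SUBCODES[check_open(msg, expected_peer_as=65003)])  # Bad Peer AS
```

A wrong remote-as on either side produces the Bad Peer AS case, which is one of the NOTIFICATION reasons listed above.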

State: OpenConfirm

We sent OPEN, received OPEN, and now we are waiting for the first KEEPALIVE from the peer to confirm the session is up.

OpenConfirm is also brief. If you see a stuck OpenConfirm, the peer's KEEPALIVE never arrived. This is usually a unidirectional path problem (we can send to them but they cannot send to us, often a one-way ACL).

The transition to Established happens when we receive the first valid KEEPALIVE.

State: Established

The session is up. Both peers can now send UPDATE messages with prefixes and KEEPALIVE messages every (HoldTime / 3) seconds (default 60s on Cisco, with HoldTime 180s).

This is the healthy steady state for a working BGP peer. show ip bgp summary shows it as a number (the count of prefixes received from the peer) instead of a state name.

If a peer drops out of Established, the cause is one of:

  • HoldTime expired. No KEEPALIVE or UPDATE received within HoldTime seconds. The session goes back to Idle.
  • NOTIFICATION received. The peer sent a NOTIFICATION because of a problem (cease, hold timer expired, parsing error, etc.). Session goes to Idle.
  • TCP reset. Underlying TCP died. Session goes to Idle.
  • Manual clear. Administrator ran clear ip bgp. Session goes to Idle.

HoldTime expiration is the most common production failure: a network glitch interrupts BGP traffic for HoldTime seconds, both ends decide the other is dead, and the session resets. The mitigation is BFD, which detects failures in under a second; see below.

Diagnostic Cheat Sheet by State

Stuck state | Likely cause | First diagnostic
Idle | No route to peer; admin shutdown; flapping (IdleHoldTimer) | show ip route <peer>
Active | TCP handshake failing | telnet <peer> 179
OpenSent | Peer's OPEN never arrived or rejected | debug ip bgp + check NOTIFICATIONs
OpenConfirm | Peer's first KEEPALIVE never arrived (often one-way path) | traceroute both directions; check ACLs both ways
Established but flapping | HoldTime expiration due to packet loss or path issue | BFD; investigate path stability

Timers That Affect the FSM

Timer | Default | Purpose
ConnectRetry | 120s | How long Active waits before retrying TCP
HoldTime | 180s | Negotiated min of both peers' HoldTime; 0 means do not check
KeepAlive | HoldTime / 3 (60s default) | How often KEEPALIVE messages are sent in Established
IdleHoldTimer | Variable, exponential backoff | Damping after session failures; prevents flap loops

HoldTime negotiation: each peer advertises its HoldTime in OPEN, and they use the lower of the two. A HoldTime of 0 disables the check entirely (no KEEPALIVEs are sent); values of 1 or 2 seconds are invalid.
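The negotiation rule is simple enough to express in a few lines. This is a minimal sketch with a hypothetical function name; it uses integer division for the KeepAlive interval, which matches the HoldTime / 3 convention described above.

```python
def negotiate_timers(local_hold, peer_hold):
    """Return (hold_time, keepalive_interval) in seconds.

    A hold time must be 0 or at least 3; the session uses the lower value.
    A result of (0, 0) means hold-time checking and KEEPALIVEs are disabled.
    """
    for value in (local_hold, peer_hold):
        if value < 0 or value in (1, 2):
            raise ValueError(f"invalid hold time: {value}")
    hold = min(local_hold, peer_hold)
    return hold, hold // 3


print(negotiate_timers(180, 180))  # (180, 60)
print(negotiate_timers(180, 90))   # (90, 30)
print(negotiate_timers(180, 0))    # (0, 0)
```

Note the last case: if either side advertises 0, the minimum is 0 and dead-peer detection through the hold timer is off for that session.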

BFD: Sub-Second BGP Failure Detection

The default HoldTime of 180 seconds is sane for protocol stability but far too slow for production failover. If the failure is not visible as a local link-down event, a path can be dead for three minutes before BGP notices, during which traffic blackholes. BFD (Bidirectional Forwarding Detection) plugs this gap.

BFD runs as a separate protocol below BGP, exchanging tiny UDP packets at 50-300 ms intervals. When BFD detects loss (3 consecutive missed packets, default), it tells BGP to bring the session down immediately, regardless of the HoldTime.

Configuration:

R1(config)# interface <peer-facing-interface>
R1(config-if)# bfd interval 100 min_rx 100 multiplier 3
R1(config-if)# exit
R1(config)# router bgp 65001
R1(config-router)# neighbor 10.0.12.2 fall-over bfd

Result: BGP neighbor failure detection in 300ms instead of 180s. BGP Convergence: Timers, BFD, and Reducing Failover Time covers the full pattern.
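The 300 ms figure is just the interval multiplied by the multiplier. A quick sketch of the arithmetic, assuming both ends use the same interval (in practice the values are negotiated between the two BFD peers):

```python
def bfd_detection_time_ms(interval_ms, multiplier):
    """Time to declare the peer down after packets stop arriving.

    Simplification: assumes both peers use the same interval.
    """
    return interval_ms * multiplier


HOLD_TIME_MS = 180 * 1000  # default BGP HoldTime in milliseconds

detect = bfd_detection_time_ms(interval_ms=100, multiplier=3)
print(detect)                  # 300
print(HOLD_TIME_MS // detect)  # 600
```

With the configuration above, detection is roughly 600 times faster than waiting for the default hold timer.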

The Three Commands You Need

R1# show ip bgp summary
Neighbor        V    AS MsgRcvd MsgSent   TblVer  InQ OutQ Up/Down  State/PfxRcd
10.0.12.2       4 65002       1       1        0    0    0 never    Active
10.0.13.3       4 65003   12345   12345     1234    0    0 02:30:15        4521

R1# show ip bgp neighbors 10.0.12.2 | section state
  BGP state = Active
  Neighbor sessions:
    0 active, is multisession capable

R1# debug ip bgp 10.0.12.2 events
*Apr 24 10:12:34: BGP: 10.0.12.2 went from Active to Idle
*Apr 24 10:12:34: BGP: 10.0.12.2 active rejected for connect retry timer expired

The State/PfxRcd column in show ip bgp summary is the daily diagnostic. A number = Established + count of prefixes received. A state name = problem in that state.
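That rule (number means Established, name means trouble) is easy to automate. The sketch below parses the sample output shown above; it assumes the ten-column IOS layout from that sample, and the function name parse_summary is hypothetical. Other platforms and software versions format this table differently.

```python
SAMPLE = """\
Neighbor        V    AS MsgRcvd MsgSent   TblVer  InQ OutQ Up/Down  State/PfxRcd
10.0.12.2       4 65002       1       1        0    0    0 never    Active
10.0.13.3       4 65003   12345   12345     1234    0    0 02:30:15        4521
"""


def parse_summary(text):
    """Map each neighbor to (state, prefixes_received or None)."""
    peers = {}
    for line in text.splitlines():
        fields = line.split()
        # Skip the header and anything that is not a full neighbor row
        if len(fields) < 10 or fields[0] == "Neighbor":
            continue
        neighbor = fields[0]
        last = " ".join(fields[9:])  # keeps suffixes such as "Idle (Admin)"
        if last.isdigit():
            peers[neighbor] = ("Established", int(last))
        else:
            peers[neighbor] = (last, None)
    return peers


for neighbor, (state, prefixes) in parse_summary(SAMPLE).items():
    print(neighbor, state, prefixes)
# 10.0.12.2 Active None
# 10.0.13.3 Established 4521
```

Any neighbor whose state is not Established is a candidate for the per-state diagnostics in the cheat sheet.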

Summary

BGP's six-state finite state machine is unusual in that "Active" is a problem state, not a healthy one. The healthy progression is Idle to Connect to OpenSent to OpenConfirm to Established. Anything else is a stuck transition that maps to a specific cause: routing, ACLs, configuration mismatch, or path failure.

If you only remember three things: Active means TCP failed, OpenSent stuck means OPEN was rejected or dropped, and Established but flapping means HoldTime expiration (use BFD). Bookmark this article alongside the BGP cluster pillar as your day-one debug reference.

© 2025 Ping Labz. All rights reserved.