BGP runs over TCP/179 and forms a session between two configured peers. That session walks through a state machine before it can exchange routing information, and getting stuck somewhere along the way is the most common BGP failure mode. The good news: every state has a specific meaning, and being stuck in a particular state tells you exactly where to look for the problem.
This article walks through all six BGP neighbor states (Idle, Connect, Active, OpenSent, OpenConfirm, Established), what each state actually means, the events that drive transitions between them, and the diagnostic that maps each "stuck" state back to a real-world cause. If you have ever stared at show ip bgp summary wondering why a peer is in the Active state, this is the reference.
The Six States
Per RFC 4271 the BGP finite state machine has six states. The first three are about establishing the TCP session; the last three are about negotiating BGP itself on top of that TCP session.
| State | What it means | Healthy steady state? |
|---|---|---|
| Idle | BGP process is starting; not yet trying to connect | No (transient on session bring-up or after teardown) |
| Connect | Trying to complete the TCP three-way handshake | No (transient; should move to Active or OpenSent) |
| Active | TCP failed; waiting to retry | No (problem state) |
| OpenSent | Sent our OPEN message; waiting for theirs | No (transient) |
| OpenConfirm | Received their OPEN; waiting for KEEPALIVE | No (transient) |
| Established | Session is up; updates can flow | Yes (this is what you want) |
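The table can also be read as a transition map. The sketch below is a deliberately simplified model of it, not the full RFC 4271 state machine: the event names are illustrative labels invented for this example, and any event not listed is assumed to drop the session back to Idle.

```python
from enum import Enum


class State(Enum):
    IDLE = "Idle"
    CONNECT = "Connect"
    ACTIVE = "Active"
    OPEN_SENT = "OpenSent"
    OPEN_CONFIRM = "OpenConfirm"
    ESTABLISHED = "Established"


# (current state, event) -> next state.
# Event names are hypothetical labels, not the numbered events of RFC 4271.
TRANSITIONS = {
    (State.IDLE, "start"): State.CONNECT,
    (State.CONNECT, "tcp_established"): State.OPEN_SENT,
    (State.CONNECT, "tcp_failed"): State.ACTIVE,
    (State.ACTIVE, "connect_retry_expired"): State.CONNECT,
    (State.ACTIVE, "tcp_established"): State.OPEN_SENT,
    (State.OPEN_SENT, "open_received"): State.OPEN_CONFIRM,
    (State.OPEN_CONFIRM, "keepalive_received"): State.ESTABLISHED,
    (State.ESTABLISHED, "keepalive_received"): State.ESTABLISHED,
    (State.ESTABLISHED, "update_received"): State.ESTABLISHED,
}


def next_state(state, event):
    """Return the next state; unlisted events are assumed to reset to Idle."""
    return TRANSITIONS.get((state, event), State.IDLE)


def run(events, state=State.IDLE):
    """Feed a sequence of events through the table and return the final state."""
    for event in events:
        state = next_state(state, event)
    return state


healthy = ["start", "tcp_established", "open_received", "keepalive_received"]
print(run(healthy).value)
print(run(["start", "tcp_failed"]).value)
```

The first call follows the healthy path; the second stops in Active because the TCP attempt failed.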
"Active" is famously misleading: in casual English you might think Active means "doing something good." In BGP, Active means "TCP did not work, I am actively trying to retry." It is a problem state, not a healthy one.
State: Idle
When BGP first starts (or after a hard reset), the peer transitions to Idle. In Idle, the BGP process is loaded but not actively trying to connect. It refuses incoming connections.
The transition out of Idle happens on:
- Manual start. An administrator runs clear ip bgp or first configures the peer.
- Automatic start. The router determines it has a route to the peer's IP and begins the TCP attempt.
If a peer is stuck in Idle, the most common cause is the IdleHoldTimer. After a session goes down, BGP enters Idle and waits a damping interval before retrying. The interval grows exponentially up to a cap to prevent flap loops. If you just configured a session and it is in Idle, check whether the route to the peer is actually present in the IP routing table.
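To make the exponential growth concrete, here is a minimal sketch of a capped backoff schedule. The starting value of 1 second and the cap of 120 seconds are assumptions chosen for illustration; real implementations use their own vendor-specific values.

```python
def idle_hold_delays(initial=1.0, cap=120.0, failures=8):
    """Return the wait in seconds applied after each consecutive failure.

    initial and cap are hypothetical values, not vendor defaults.
    """
    delays = []
    delay = initial
    for _ in range(failures):
        delays.append(min(delay, cap))  # never wait longer than the cap
        delay *= 2                      # double after every failure
    return delays


print(idle_hold_delays())
```

With these assumed values, a peer that has flapped seven times waits the full cap before the next attempt, which is why a session can sit in Idle far longer than you expect.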
Cisco-specific gotcha: a neighbor can also be administratively shut down via neighbor X.X.X.X shutdown. The peer still appears in show ip bgp summary, but its state reads Idle (Admin).
State: Connect
The router attempts a TCP connect to the peer's IP on port 179. If the TCP handshake completes, transition to OpenSent. If the TCP handshake fails (timeout, RST), transition to Active.
Connect is fast (sub-second on a healthy network). If you see Connect in show ip bgp summary, you have caught the session in the brief window between Idle and OpenSent or Active. It is not a stuck state in normal operation.
Long stays in Connect generally indicate slow TCP handshakes (extreme latency, packet loss, asymmetric paths through firewalls).
State: Active
Active is the famous problem state. It means: I tried TCP, it failed, I am waiting to retry. The retry interval is the ConnectRetryTimer (default 120 seconds on Cisco, configurable).
Why TCP fails:
| Cause | Diagnostic | Fix |
|---|---|---|
| No route to peer's IP | show ip route X.X.X.X returns nothing | Check IGP, static routes, or interface config |
| ACL blocking TCP/179 | show access-lists, look for matches | Permit TCP/179 between peers |
| Source IP mismatch | Peer's neighbor command expects a different source than we are sending from | neighbor X update-source LoopbackN |
| Peer's IP wrong | Pinging the peer fails | Verify the IP of the loopback or interface on the peer |
| eBGP TTL exhausted | Peer is more than 1 hop away on eBGP, default TTL is 1 | neighbor X ebgp-multihop N |
| Peer not configured to accept us | Peer's BGP is not configured for our IP/AS | Coordinate with the other side |
| MD5 authentication mismatch | Logs show %TCP-6-BADAUTH (Invalid MD5 digest or No MD5 digest) | Verify the secret on both sides |
| Peer firewall, NAT, or routing | traceroute fails, peer hard to reach | Network path investigation |
The single most useful sanity check when a peer is stuck in Active: try to telnet to the peer's IP on port 179. If you cannot complete the TCP handshake from the BGP source IP to the peer's IP, BGP cannot either.
R1# telnet 10.0.12.2 179 /source-interface Loopback0
Trying 10.0.12.2, 179 ... Open
[Press ctrl-shift-6 then x, then "disconnect"]
Open = TCP works. Refused = something at the peer is closing the port. Timeout = ACL or routing problem along the path.
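The same check can be run from a Linux host or jump box that sits on the BGP source address. This is a minimal sketch using only the Python standard library; the function name and result labels are made up for this example, it handles IPv4 only, and it proves nothing beyond the TCP handshake.

```python
import socket


def probe_bgp_port(peer_ip, source_ip=None, port=179, timeout=3.0):
    """Attempt a TCP handshake; return 'open', 'refused', 'timeout', or 'unreachable'."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.settimeout(timeout)
    try:
        if source_ip is not None:
            # Bind to the address BGP would source from (port 0 = any free port).
            sock.bind((source_ip, 0))
        sock.connect((peer_ip, port))
        return "open"
    except ConnectionRefusedError:
        return "refused"      # an RST came back: nothing accepted the connection
    except socket.timeout:
        return "timeout"      # no answer at all: suspect an ACL or routing
    except OSError:
        return "unreachable"  # e.g. no route to host, or source_ip not local
    finally:
        sock.close()
```

A call such as probe_bgp_port("10.0.12.2", source_ip="10.0.0.1") mirrors the telnet test; the source address here is a placeholder for whatever your update-source resolves to.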
State: OpenSent
The TCP session is up. We sent our OPEN message (containing our AS number, hold time, BGP version, capabilities) and are waiting for the peer's OPEN.
OpenSent is brief. If you see a peer stuck in OpenSent, the peer's OPEN never arrived or was malformed.
Common causes of stuck OpenSent:
- BGP version mismatch. Today both ends should run BGP-4; this is rare in 2026.
- Hold time too low. If our advertised HoldTime is below 3 seconds (and not 0), the peer rejects us. Default is 180s; do not configure below 3.
- Capability mismatch. One side advertises a capability (Route Refresh, 4-byte AS, Graceful Restart) that the other does not understand. Modern routers handle this gracefully via capability negotiation, but ancient code can hard-fail.
- BGP NOTIFICATION received. The peer sent a NOTIFICATION rejecting our OPEN. Common reasons in the NOTIFICATION error code: bad peer AS, bad BGP identifier (router ID), authentication failure.
The diagnostic: debug ip bgp on Cisco shows the OPEN parameters and any NOTIFICATIONs.
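It can help to see what is actually inside the OPEN. The sketch below encodes and decodes the fixed part of an OPEN message as laid out in RFC 4271, with no optional parameters (so no capabilities), and applies a subset of the checks a receiver might make. The function names are invented for this example, and it only handles 2-byte AS numbers.

```python
import socket
import struct

BGP_MARKER = b"\xff" * 16  # RFC 4271: the marker field is all ones
MSG_TYPE_OPEN = 1
HEADER_LEN = 19            # marker (16) + length (2) + type (1)


def build_open(my_as, hold_time, router_id, version=4):
    """Encode a minimal OPEN with an empty optional-parameters field."""
    body = struct.pack("!BHH4sB", version, my_as, hold_time,
                       socket.inet_aton(router_id), 0)
    header = BGP_MARKER + struct.pack("!HB", HEADER_LEN + len(body), MSG_TYPE_OPEN)
    return header + body


def parse_open(message):
    """Decode the fixed part of an OPEN message into a dict."""
    length, msg_type = struct.unpack("!HB", message[16:19])
    if message[:16] != BGP_MARKER or msg_type != MSG_TYPE_OPEN:
        raise ValueError("not a BGP OPEN message")
    version, my_as, hold_time, bgp_id, opt_len = struct.unpack(
        "!BHH4sB", message[19:29])
    return {"length": length, "version": version, "my_as": my_as,
            "hold_time": hold_time, "router_id": socket.inet_ntoa(bgp_id),
            "opt_param_len": opt_len}


def open_problems(fields, expected_as):
    """List reasons a receiver might reject this OPEN (illustrative subset)."""
    problems = []
    if fields["version"] != 4:
        problems.append("unsupported version")
    if fields["my_as"] != expected_as:
        problems.append("bad peer AS")
    if fields["hold_time"] in (1, 2):
        problems.append("unacceptable hold time")
    if fields["router_id"] == "0.0.0.0":
        problems.append("bad BGP identifier")
    return problems


msg = build_open(my_as=65002, hold_time=180, router_id="10.0.12.2")
print(len(msg), open_problems(parse_open(msg), expected_as=65002))
```

Each entry returned by open_problems corresponds to a reason the peer would answer with a NOTIFICATION instead of its own OPEN.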
State: OpenConfirm
We sent OPEN, received OPEN, and now we are waiting for the first KEEPALIVE from the peer to confirm the session is up.
OpenConfirm is also brief. If you see a stuck OpenConfirm, the peer's KEEPALIVE never arrived. This is usually a unidirectional path problem (we can send to them but they cannot send to us, often a one-way ACL).
The transition to Established happens when we receive the first valid KEEPALIVE.
State: Established
The session is up. Both peers can now send UPDATE messages with prefixes and KEEPALIVE messages every (HoldTime / 3) seconds (default 60s on Cisco, with HoldTime 180s).
This is the healthy steady state for a working BGP peer. In show ip bgp summary, the State/PfxRcd column shows a number (the count of prefixes received from the peer) instead of a state name.
If a peer drops out of Established, the cause is one of:
- HoldTime expired. No KEEPALIVE or UPDATE received within HoldTime seconds. The session goes back to Idle.
- NOTIFICATION received. The peer sent a NOTIFICATION because of a problem (cease, hold timer expired, parsing error, etc.). Session goes to Idle.
- TCP reset. Underlying TCP died. Session goes to Idle.
- Manual clear. Administrator ran clear ip bgp. Session goes to Idle.
The HoldTime expiration is the most common production failure: a network glitch interrupts BGP traffic for HoldTime seconds, both ends decide the other is dead, and the session resets. Mitigation is BFD (sub-second detection), see below.
Diagnostic Cheat Sheet by State
| Stuck state | Likely cause | First diagnostic |
|---|---|---|
| Idle | No route to peer; admin shutdown; flapping (IdleHoldTimer) | show ip route <peer> |
| Active | TCP handshake failing | telnet <peer> 179 |
| OpenSent | Peer's OPEN never arrived or rejected | debug ip bgp + check NOTIFICATIONs |
| OpenConfirm | Peer's first KEEPALIVE never arrived (often one-way path) | traceroute both directions; check ACLs both ways |
| Established but flapping | HoldTime expiration due to packet loss or path issue | BFD; investigate path stability |
Timers That Affect the FSM
| Timer | Default | Purpose |
|---|---|---|
| ConnectRetry | 120s | How long Active waits before retrying TCP |
| HoldTime | 180s | Negotiated min of both peers' HoldTime; 0 means do not check |
| KeepAlive | HoldTime / 3 (60s default) | How often KEEPALIVE messages are sent in Established |
| IdleHoldTimer | Variable, exponential backoff | Damping after session failures; prevents flap loops |
HoldTime negotiation: each peer advertises its HoldTime in OPEN, and they use the lower of the two. Setting HoldTime to 0 disables the check entirely (no KEEPALIVEs are sent); a value of 1 or 2 is invalid.
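The negotiation rule is small enough to write out. This sketch follows the description above (lower value wins, 0 disables the check, 1 and 2 are invalid) and derives the KEEPALIVE interval as one third of the result; a real implementation may also honor a separately configured keepalive value, which is ignored here.

```python
def negotiate_timers(local_hold, peer_hold):
    """Return (hold_time, keepalive_interval) in seconds for the session."""
    for value in (local_hold, peer_hold):
        # HoldTime is a 16-bit field; 1 and 2 are not allowed.
        if value in (1, 2) or not 0 <= value <= 65535:
            raise ValueError(f"invalid HoldTime: {value}")
    hold = min(local_hold, peer_hold)
    if hold == 0:
        return 0, 0  # hold timer disabled, no periodic KEEPALIVEs
    return hold, hold // 3


print(negotiate_timers(180, 90))
print(negotiate_timers(180, 0))
```

The first call shows that a peer advertising 90 seconds pulls the whole session down to a 90-second hold time with a 30-second KEEPALIVE; the second shows that a single side advertising 0 disables the check for both.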
BFD: Sub-Second BGP Failure Detection
The default HoldTime of 180 seconds is sane for protocol stability but unacceptable for operations. A link can be dead for three minutes before BGP notices, during which traffic blackholes. BFD (Bidirectional Forwarding Detection) plugs this gap.
BFD runs as a separate protocol below BGP, exchanging tiny UDP packets at 50-300 ms intervals. When BFD detects loss (3 consecutive missed packets, default), it tells BGP to bring the session down immediately, regardless of the HoldTime.
Configuration:
R1(config-if)# bfd interval 100 min_rx 100 multiplier 3
R1(config)# router bgp 65001
R1(config-router)# neighbor 10.0.12.2 fall-over bfd
Result: BGP neighbor failure detection in 300ms instead of 180s. BGP Convergence: Timers, BFD, and Reducing Failover Time covers the full pattern.
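The 300 ms figure is the interval times the multiplier, but only when both ends use the same values. As a rough sketch of the asynchronous-mode rule in RFC 5880, the local detection time is the remote multiplier times the larger of the local min_rx and the remote transmit interval. The function and parameter names below are invented for this example.

```python
def bfd_detection_time_ms(local_min_rx_ms, remote_tx_ms, remote_multiplier):
    """Approximate local BFD detection time in milliseconds (asynchronous mode)."""
    # The remote end cannot send faster than we are willing to receive,
    # so the effective interval is the slower of the two.
    effective_interval = max(local_min_rx_ms, remote_tx_ms)
    return effective_interval * remote_multiplier


print(bfd_detection_time_ms(100, 100, 3))
print(bfd_detection_time_ms(100, 300, 3))
```

The second call illustrates why timers should be checked on both sides: a peer transmitting every 300 ms stretches detection to 900 ms even though the local interface is configured for 100 ms.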
The Three Commands You Need
R1# show ip bgp summary
Neighbor V AS MsgRcvd MsgSent TblVer InQ OutQ Up/Down State/PfxRcd
10.0.12.2 4 65002 1 1 0 0 0 never Active
10.0.13.3 4 65003 12345 12345 1234 0 0 02:30:15 4521
R1# show ip bgp neighbors 10.0.12.2 | section state
BGP state = Active
Neighbor sessions:
0 active, is multisession capable
R1# debug ip bgp 10.0.12.2 events
*Apr 24 10:12:34: BGP: 10.0.12.2 went from Active to Idle
*Apr 24 10:12:34: BGP: 10.0.12.2 active rejected for connect retry timer expired
The State/PfxRcd column in show ip bgp summary is the daily diagnostic. A number = Established + count of prefixes received. A state name = problem in that state.
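If you collect show ip bgp summary output from many routers, the State/PfxRcd rule is easy to automate. The sketch below assumes the ten-column IPv4 layout shown in the sample output and is not a general parser; the function name is made up for this example.

```python
def classify_peers(summary_text):
    """Map neighbor IP -> ('Established', prefix_count) or (state_name, None)."""
    peers = {}
    for line in summary_text.splitlines():
        fields = line.split()
        # Neighbor rows start with an address and have at least 10 columns.
        if len(fields) < 10 or not fields[0][0].isdigit():
            continue
        neighbor = fields[0]
        last = " ".join(fields[9:])  # keeps multi-word states like "Idle (Admin)"
        if last.isdigit():
            peers[neighbor] = ("Established", int(last))
        else:
            peers[neighbor] = (last, None)
    return peers


sample = """\
Neighbor V AS MsgRcvd MsgSent TblVer InQ OutQ Up/Down State/PfxRcd
10.0.12.2 4 65002 1 1 0 0 0 never Active
10.0.13.3 4 65003 12345 12345 1234 0 0 02:30:15 4521
"""
for neighbor, (state, prefixes) in classify_peers(sample).items():
    print(neighbor, state, prefixes)
```

Any peer whose result is not Established can then be looked up against the per-state causes described earlier in the article.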
Summary
BGP's six-state finite state machine is unusual in that "Active" is a problem state, not a healthy one. The healthy progression is Idle to Connect to OpenSent to OpenConfirm to Established. Anything else is a stuck transition that maps to a specific cause: routing, ACLs, configuration mismatch, or path failure.
If you only remember three things: Active means TCP failed, OpenSent stuck means OPEN was rejected or dropped, and Established but flapping means HoldTime expiration (use BFD). Bookmark this article alongside the BGP cluster pillar as your day-one debug reference.