Queueing, policing, and shaping are the three traffic-management primitives every Cisco QoS deployment uses. They sound similar in casual conversation - "they limit traffic, right?" - but each does something different and they are not interchangeable. Confusing them is the second most common QoS mistake (after trust-boundary failures), and the symptoms are subtle: drops in unexpected places, queues growing without bound, traffic that gets policed before it gets shaped.
This article walks through what each primitive does, where it applies, when to use which, the Cisco IOS XE configuration, and the standard production patterns. If you are configuring QoS for the first time or auditing a deployment that "feels off," this is the reference.
The Three Primitives
| Primitive | What it does | Excess traffic | Where it applies |
|---|---|---|---|
| Queueing | Decides the order packets leave a congested interface | Stays in queues until served (or dropped if queue full) | Egress only (output direction) |
| Policing | Enforces a hard rate limit; drops or re-marks excess | Dropped (or re-marked) | Ingress or egress |
| Shaping | Smooths bursts to fit a target rate; buffers excess | Buffered and delayed | Egress only |
Queueing answers "in what order do these waiting packets leave?" Policing and shaping both answer "how do we keep traffic at or below rate X?" but with very different mechanics: policing drops, shaping buffers.
Queueing: Scheduling on Congested Interfaces
An interface is "congested" when more packets want to leave than the interface can transmit in a given time slice. Without congestion, queueing does not matter - packets transmit in arrival order. With congestion, the queueing scheduler chooses what leaves next.
Cisco supports several queueing schedulers, in order of historical introduction:
| Scheduler | Behavior | Use in 2026 |
|---|---|---|
| FIFO (First-In-First-Out) | Single queue; no priority | Default for uncongested interfaces; never on production WAN |
| Priority Queueing (PQ) | Four queues; higher always served first; can starve lower | Legacy; deprecated |
| Custom Queueing (CQ) | 16 queues with byte-count round-robin | Legacy; deprecated |
| Weighted Fair Queueing (WFQ) | Per-flow queues with proportional service based on weight | Default on slow serial; rarely tuned in 2026 |
| Class-Based Weighted Fair Queueing (CBWFQ) | Per-class queues with configured bandwidth guarantees | Modern default for non-real-time classes |
| Low-Latency Queueing (LLQ) | CBWFQ plus a strict-priority queue with built-in policer | Modern default when voice or video shares with data |
LLQ is the dominant queueing strategy in modern Cisco deployments. It gives voice a strict priority queue (lowest latency, lowest jitter) but applies a built-in policer, enforced when the interface is congested, so the priority queue cannot starve everything else if voice traffic explodes. The remaining bandwidth divides among other classes via CBWFQ proportional service.
LLQ Configuration
policy-map WAN-EGRESS
 class VOICE
  priority percent 10 ! Strict priority, capped at 10%
 class VIDEO
  bandwidth percent 30 ! Guaranteed 30%
  random-detect dscp-based ! WRED
 class TRANSACTIONAL
  bandwidth percent 25 ! Guaranteed 25%
  random-detect
 class SCAVENGER
  bandwidth percent 1 ! Tiny guarantee
 class class-default
  bandwidth percent 34
  fair-queue
  random-detect
interface GigabitEthernet0/0/0
 service-policy output WAN-EGRESS
The percentages must sum to no more than 100. The priority statement implicitly counts against the total. If you only specify priority without a percentage, the priority queue is unbounded - never do this in production; a misbehaving SIP gateway can starve everything else.
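The sum rule is easy to check offline before a change window. The following is a minimal sketch, assuming the policy has been transcribed by hand into a dictionary; `check_policy` and `WAN_EGRESS` are hypothetical names for illustration, not part of any Cisco tooling.

```python
def check_policy(classes):
    """Return (total_percent, is_valid) for a dict of class name -> percent.

    Priority and bandwidth percentages are summed together, because the
    priority statement counts against the same 100 percent budget.
    """
    total = sum(classes.values())
    return total, total <= 100


# Percentages transcribed from the WAN-EGRESS policy in this section.
WAN_EGRESS = {
    "VOICE": 10,          # priority percent 10
    "VIDEO": 30,          # bandwidth percent 30
    "TRANSACTIONAL": 25,  # bandwidth percent 25
    "SCAVENGER": 1,       # bandwidth percent 1
    "class-default": 34,  # bandwidth percent 34
}

total, ok = check_policy(WAN_EGRESS)
print(f"total={total}% valid={ok}")
```

For the policy above the total is exactly 100, so the check passes with no headroom; raising any class requires lowering another.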
Bandwidth Statements
Three forms exist:
| Form | Behavior |
|---|---|
| bandwidth percent X | X percent of the interface bandwidth (or the parent shaper rate in hierarchical policies) |
| bandwidth X (kbps) | Absolute kbps guarantee |
| bandwidth remaining percent X | X percent of bandwidth remaining after priority queues are accounted for |
The remaining-percent form is useful when priority bandwidth varies (e.g. voice scales with call count). The classes that take "remaining" do not need to be re-tuned every time the priority allocation changes.
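A small worked example makes the point concrete. This is a simplified model, assuming the "remaining" pool is simply the link rate minus the priority allocation; `remaining_share_kbps` and the 50 Mbps link are hypothetical values chosen for illustration.

```python
def remaining_share_kbps(link_kbps, priority_kbps, remaining_percent):
    """Bandwidth (kbps) for a class using 'bandwidth remaining percent'.

    Simplified model: the pool being divided is whatever is left once
    the priority allocation has been subtracted from the link rate.
    """
    return (link_kbps - priority_kbps) * remaining_percent / 100


# Hypothetical 50 Mbps link; one class holds 'bandwidth remaining percent 40'.
for voice_kbps in (5_000, 10_000):
    share = remaining_share_kbps(50_000, voice_kbps, 40)
    print(f"voice={voice_kbps} kbps -> class share={share:.0f} kbps")
```

When the voice allocation grows from 5 Mbps to 10 Mbps, the class share moves from 18 Mbps to 16 Mbps on its own; the configured value of 40 never changes.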
WRED: Graceful Degradation
WRED (Weighted Random Early Detection) randomly drops packets from a queue as it approaches full, biased toward higher drop precedence. The benefit: TCP flows respond to dropped packets by slowing down, which empties the queue gracefully rather than letting it fill and tail-drop everything.
WRED works best on classes with lots of TCP traffic. Voice and video (UDP) do not benefit because UDP does not back off; for those classes, just rely on the priority queue's policer.
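The drop behavior can be sketched with the usual WRED curve: no drops below a minimum threshold, a linearly rising drop probability up to a maximum threshold, and tail-drop beyond it. The function name and the per-DSCP thresholds below are illustrative assumptions, not values read from a device.

```python
def wred_drop_probability(avg_depth, min_th, max_th, mark_prob_denominator=10):
    """Drop probability for one WRED profile at a given average queue depth."""
    if avg_depth < min_th:
        return 0.0  # queue is shallow: never drop
    if avg_depth >= max_th:
        return 1.0  # past the max threshold: behaves like tail-drop
    max_p = 1.0 / mark_prob_denominator  # probability reached just below max_th
    return max_p * (avg_depth - min_th) / (max_th - min_th)


# Hypothetical (min, max) thresholds in packets: the higher the drop
# precedence, the earlier random drops begin.
PROFILES = {"af21": (32, 40), "af22": (28, 40), "af23": (24, 40)}

for dscp, (min_th, max_th) in PROFILES.items():
    p = wred_drop_probability(30, min_th, max_th)
    print(f"{dscp}: drop probability at depth 30 = {p:.4f}")
```

At an average depth of 30 packets, AF23 is already being dropped more often than AF22, while AF21 is still untouched, which is the bias toward higher drop precedence described above.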
class TRANSACTIONAL
 bandwidth percent 25
 random-detect dscp-based ! Drop AF23 first, then AF22, then AF21
Policing: Hard Rate Limits
A policer enforces a maximum rate. Packets above the rate are dropped immediately or re-marked to a lower-priority DSCP. Cisco implementations use a token-bucket algorithm:
- Tokens accumulate in a bucket at the configured rate (e.g. 10 Mbps).
- Each arriving packet consumes tokens proportional to its size.
- If enough tokens are available, the packet conforms (passes).
- If too few tokens remain, the packet exceeds and is dropped or re-marked, depending on the configuration.
- The bucket depth (the burst size) allows short bursts above the rate; the bucket refills continuously at the configured rate.
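The steps above can be sketched as a single-rate, two-color token bucket. This is a teaching model under simple assumptions (byte-granular tokens, caller-supplied timestamps), not a description of how any particular Cisco platform implements its policer; `TokenBucketPolicer` is a hypothetical name.

```python
class TokenBucketPolicer:
    """Single-rate, two-color policer: each packet conforms or exceeds."""

    def __init__(self, rate_bps, burst_bytes):
        self.rate_bytes_per_s = rate_bps / 8
        self.burst_bytes = burst_bytes
        self.tokens = float(burst_bytes)  # the bucket starts full
        self.last_time = 0.0

    def offer(self, now, packet_bytes):
        """Classify a packet arriving at time `now` (seconds)."""
        # Refill for the elapsed time, never beyond the bucket depth.
        elapsed = now - self.last_time
        self.last_time = now
        self.tokens = min(self.burst_bytes,
                          self.tokens + elapsed * self.rate_bytes_per_s)
        if packet_bytes <= self.tokens:
            self.tokens -= packet_bytes
            return "conform"
        return "exceed"  # nothing is buffered; the packet is dropped or re-marked


# 10 Mbps policer with a deliberately small 3000-byte burst.
policer = TokenBucketPolicer(rate_bps=10_000_000, burst_bytes=3000)
back_to_back = [policer.offer(0.0, 1500) for _ in range(3)]
after_refill = policer.offer(0.002, 1500)
print(back_to_back, after_refill)
```

Three back-to-back 1500-byte packets exhaust the 3000-byte bucket, so the third exceeds. Two milliseconds later the bucket has refilled by 2500 bytes and the next packet conforms.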
Configuration:
policy-map RATE-LIMIT-INBOUND
 class class-default
  police 10000000 1500000 ! 10 Mbps with 1.5 MB burst
   conform-action transmit
   exceed-action drop
interface GigabitEthernet0/0/0
 service-policy input RATE-LIMIT-INBOUND
Policing's superpower: instant rate enforcement without buffering. Use it for:
- SLA enforcement at network boundaries (carriers police their customers)
- Subscriber rate plans (ISP enforcing a customer's contracted rate)
- Protection against traffic floods (per-source-IP policers)
- The built-in policer on LLQ priority queues (prevents voice from running away)
Policing's weakness: it drops perfectly good packets. TCP responds by retransmitting and shrinking its congestion window, so TCP throughput against a policer often lands well below the policed rate, especially when the burst size is small. If you can shape bursty TCP traffic instead of policing it, do.
Re-marking Instead of Dropping
A common pattern: instead of dropping excess, re-mark to a lower-priority class:
class SCAVENGER
 police 5000000
  conform-action set-dscp-transmit cs1 ! Conform: keep CS1
  exceed-action set-dscp-transmit default ! Exceed: re-mark to BE (DSCP 0)
Excess scavenger traffic is not dropped - just demoted. If the network has spare capacity, the demoted traffic still gets through. If the network is congested, the demoted traffic is the first to drop. This "soft policing" gives you the rate enforcement without the hard cliff.
Shaping: Smoothing Bursts
A shaper buffers excess traffic and releases it at the configured rate. Like a policer, it uses a token-bucket algorithm; unlike a policer, the action for excess is "buffer for later" instead of "drop."
Configuration:
policy-map SHAPE-WAN
 class class-default
  shape average 50000000 ! Shape to 50 Mbps average
interface GigabitEthernet0/0/0
 service-policy output SHAPE-WAN
Shape average smooths bursts to the configured rate. Shape peak (rarely used) allows bursting to a higher rate based on accumulated credit.
The classic shaping use case: your branch has a 1 Gbps physical interface but a 50 Mbps contracted Metro Ethernet handoff. Without shaping, you send bursts at 1 Gbps and the carrier polices (drops) the excess. With shaping, you smooth output to 50 Mbps, so the carrier's policer has nothing to drop.
Shaping has one obvious cost: latency. Buffered packets wait. For voice and other latency-sensitive traffic, this is a problem. The solution is hierarchical shaping.
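The latency cost is easy to quantify with a deliberately simple model: treat the shaper as a single FIFO drained at the shaped rate, ignoring the burst allowance (Bc) a real shaper grants. `shaper_departures` is a hypothetical helper written for this illustration.

```python
def shaper_departures(arrivals, rate_bps):
    """Departure time (seconds) of each packet through an idealized shaper.

    arrivals: list of (arrival_time_s, packet_bytes), sorted by arrival time.
    A packet is never dropped here; it waits until the packets ahead of it
    have been released at rate_bps.
    """
    departures = []
    free_at = 0.0  # time at which the shaper finishes the previous packet
    for arrival_time, size_bytes in arrivals:
        start = max(arrival_time, free_at)
        free_at = start + size_bytes * 8 / rate_bps
        departures.append(free_at)
    return departures


# Ten 1500-byte packets arriving together (a line-rate burst), shaped to 50 Mbps.
burst = [(0.0, 1500)] * 10
departures = shaper_departures(burst, 50_000_000)
delays_ms = [round(d * 1000, 3) for d in departures]
print(delays_ms)
```

Each 1500-byte packet occupies 0.24 ms at 50 Mbps, so the tenth packet of the burst leaves about 2.4 ms after it arrived. Nothing was lost, but every packet behind the first paid for it in delay, and a voice packet queued behind a larger data burst would wait correspondingly longer.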
Hierarchical Shaping: The Production Pattern
Hierarchical shaping wraps a queueing policy inside a shaping policy. The shaper smooths to the contracted rate; the inner queueing policy applies LLQ + CBWFQ within the shaped pipe.
policy-map CHILD-WAN-EGRESS
 class VOICE
  priority percent 30
 class VIDEO
  bandwidth percent 30
 class class-default
  bandwidth percent 40
  fair-queue
policy-map PARENT-SHAPER
 class class-default
  shape average 50000000 ! Shape to 50 Mbps
  service-policy CHILD-WAN-EGRESS ! Apply child queueing inside
interface GigabitEthernet0/0/0
 service-policy output PARENT-SHAPER
This pattern dominates production WAN edges. The parent shapes to the contracted rate (so the carrier never polices). The child applies LLQ inside the shaped pipe, so voice still gets priority queueing - but inside the 50 Mbps shaped pipe, not in the 1 Gbps physical interface.
The percentages in the child policy are percentages of the parent shape rate, not the physical interface. priority percent 30 in the child means 30 percent of 50 Mbps = 15 Mbps reserved for the priority queue.
When to Use Each: A Decision Matrix
| Scenario | Use | Why |
|---|---|---|
| Voice and other real-time on a congested WAN | LLQ with built-in policer | Strict priority + protection against priority abuse |
| Multiple business apps competing for bandwidth | CBWFQ with bandwidth guarantees | Proportional service across classes |
| Sub-rate WAN handoff (1 Gbps interface, 50 Mbps contract) | Hierarchical shaping | Avoid carrier policing; preserve LLQ inside shaped pipe |
| SLA enforcement at network boundary | Policing | Hard rate limit with no buffering |
| Per-subscriber rate plans | Policing or hierarchical shaping per subscriber | Simple at scale |
| TCP traffic with bursty sources | Shaping (egress) | Smooths bursts; preserves TCP throughput |
| Protection against UDP floods | Policing | Hard cap; UDP does not back off |
| Demote-not-drop excess scavenger traffic | Policing with re-mark action | Soft enforcement; uses spare capacity when available |
Verification
! See the policy structure and counters
Router# show policy-map interface GigabitEthernet0/0/0
 GigabitEthernet0/0/0
  Service-policy output: PARENT-SHAPER
   Class-map: class-default
    shape (average) cir 50000000, bc 200000, be 200000
    target shape rate 50000000
    Service-policy : CHILD-WAN-EGRESS
     Class-map: VOICE (match-any)
      12345 packets, 1234567 bytes
      30 second offered rate 12000 bps, drop rate 0 bps
      Match: dscp ef (46)
      Priority: 30% (15000 kbps), burst bytes 375000, b/w exceed drops: 0
! See the output hold-queue depth on a specific interface
Router# show interfaces GigabitEthernet0/0/0
 Output queue: 0/40 (size/max) ! Healthy
 Output queue: 38/40 (size/max) ! Bordering on tail-drops
The per-class "drop rate" line is the most important diagnostic. Zero drops at a significant offered rate means the class has enough bandwidth; a rising drop rate means the class is starved.
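When auditing many interfaces, the per-class rates can be pulled out of saved command output with a short script. This is a rough sketch, assuming output shaped like the sample above; the VIDEO figures in `SAMPLE` are invented for illustration, and `class_rates` is a hypothetical helper. In hierarchical output class-default appears at both parent and child level, and this simple version keeps only the last occurrence.

```python
import re

# VOICE lines are copied from the sample output; VIDEO is made-up data.
SAMPLE = """\
Class-map: VOICE (match-any)
  12345 packets, 1234567 bytes
  30 second offered rate 12000 bps, drop rate 0 bps
Class-map: VIDEO (match-any)
  99999 packets, 88888888 bytes
  30 second offered rate 4000000 bps, drop rate 250000 bps
"""

CLASS_RE = re.compile(r"Class-map:\s+(\S+)")
RATE_RE = re.compile(r"offered rate (\d+) bps, drop rate (\d+) bps")


def class_rates(output):
    """Map class name -> (offered_bps, drop_bps) from show policy-map text."""
    rates = {}
    current = None
    for line in output.splitlines():
        class_match = CLASS_RE.search(line)
        if class_match:
            current = class_match.group(1)
            continue
        rate_match = RATE_RE.search(line)
        if rate_match and current is not None:
            rates[current] = (int(rate_match.group(1)), int(rate_match.group(2)))
    return rates


rates = class_rates(SAMPLE)
starved = [name for name, (_, drop_bps) in rates.items() if drop_bps > 0]
print(rates, starved)
```

Any class that lands in `starved` is a candidate for a larger guarantee or for a closer look at what is being classified into it.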
Anti-Patterns
- Bare priority in LLQ. Always use priority percent X or priority X in kbps. The unbounded form lets a runaway flow destroy the rest of the policy.
- Policing where shaping would do. If the source is under your control, shape (preserve TCP throughput). Police only at boundaries you cannot trust.
- Forgetting hierarchical shaping for sub-rate handoffs. Without it, carrier polices, you get drops, voice quality suffers.
- WRED on UDP-only classes. UDP does not back off; WRED becomes random drops with no benefit. Tail-drop is fine for UDP-heavy classes.
- Bandwidth percentages summing to over 100. Many releases reject such a policy outright; where the config is accepted, behavior is unpredictable.
Summary
Queueing schedules order on congested interfaces. Policing drops excess. Shaping buffers excess. The three are complementary, not interchangeable. LLQ is the modern queueing default; hierarchical shaping with LLQ inside is the production WAN pattern; policing is for SLA enforcement and protecting priority queues from abuse.
Master the LLQ + CBWFQ + hierarchical shaping triple, and you have covered 90 percent of production QoS configurations. Bookmark this article alongside the QoS cluster pillar and the Cisco MQC walkthrough; lab every change before pushing to production. The penalty for misconfigured queueing/policing/shaping is voice quality issues that only show up under load.