QoS Queueing, Policing, and Shaping Compared (LLQ, CBWFQ)

Queueing, policing, and shaping are the three traffic-management primitives every Cisco QoS deployment uses. They sound similar in casual conversation - "they limit traffic, right?" - but each does something different and they are not interchangeable. Confusing them is the second most common QoS mistake (after trust-boundary failures), and the symptoms are subtle: drops in unexpected places, queues growing without bound, traffic that gets policed before it gets shaped.

This article walks through what each primitive does, where it applies, when to use which, the Cisco IOS XE configuration, and the standard production patterns. If you are configuring QoS for the first time or auditing a deployment that "feels off," this is the reference.

The Three Primitives

Primitive	What it does	Excess traffic	Where it applies
Queueing	Decides the order packets leave a congested interface	Stays in queues until served (or dropped if queue full)	Egress only (output direction)
Policing	Enforces a hard rate limit; drops or re-marks excess	Dropped (or re-marked)	Ingress or egress
Shaping	Smooths bursts to fit a target rate; buffers excess	Buffered and delayed	Egress only

Queueing answers "in what order do these waiting packets leave?" Policing and shaping both answer "how do we keep traffic at or below rate X?" but with very different mechanics: policing drops, shaping buffers.

Queueing: Scheduling on Congested Interfaces

An interface is "congested" when more packets want to leave than the interface can transmit in a given time slice. Without congestion, queueing does not matter - packets transmit in arrival order. With congestion, the queueing scheduler chooses what leaves next.

Cisco supports several queueing schedulers, in order of historical introduction:

Scheduler	Behavior	Use in 2026
FIFO (First-In-First-Out)	Single queue; no priority	Default for uncongested interfaces; never on production WAN
Priority Queueing (PQ)	Four queues; higher always served first; can starve lower	Legacy; deprecated
Custom Queueing (CQ)	16 queues with byte-count round-robin	Legacy; deprecated
Weighted Fair Queueing (WFQ)	Per-flow queues with proportional service based on weight	Default on slow serial; rarely tuned in 2026
Class-Based Weighted Fair Queueing (CBWFQ)	Per-class queues with configured bandwidth guarantees	Modern default for non-real-time classes
Low-Latency Queueing (LLQ)	CBWFQ plus a strict-priority queue with built-in policer	Modern default when voice or video shares with data

LLQ is the dominant queueing strategy in modern Cisco deployments. It gives voice a strict priority queue (lowest latency, lowest jitter) but applies a built-in policer to prevent the priority queue from starving everything else if voice traffic explodes. The remaining bandwidth divides among other classes via CBWFQ proportional service.
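As a rough illustration of why the built-in policer matters, the following Python sketch models a single congested interval. It is a simplification, not how the IOS XE scheduler is implemented; the function name and the offered-load figures are made up for the example.

```python
def llq_priority_split(link_kbps, priority_pct, priority_offered_kbps):
    """Model one congested interval: cap the priority class, leave the rest for CBWFQ.

    Returns (priority_served, priority_dropped, left_for_cbwfq), all in kbps.
    """
    cap_kbps = link_kbps * priority_pct // 100
    served = min(priority_offered_kbps, cap_kbps)
    dropped = priority_offered_kbps - served
    return served, dropped, link_kbps - served


# Normal day on a 1 Gbps link with a 10% priority cap: voice offers 20 Mbps.
normal = llq_priority_split(1_000_000, 10, 20_000)

# Runaway voice source offers 150 Mbps while the link is congested.
runaway = llq_priority_split(1_000_000, 10, 150_000)

print(normal)   # (20000, 0, 980000)
print(runaway)  # (100000, 50000, 900000)
```

In the runaway case the priority class is held to its 100 Mbps cap, so the other classes still have 900 Mbps to share.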

LLQ Configuration

policy-map WAN-EGRESS
 class VOICE
  priority percent 10                  ! Strict priority, capped at 10%
 class VIDEO
  bandwidth percent 30                 ! Guaranteed 30%
  random-detect dscp-based             ! WRED
 class TRANSACTIONAL
  bandwidth percent 25                 ! Guaranteed 25%
  random-detect
 class SCAVENGER
  bandwidth percent 1                  ! Tiny guarantee
 class class-default
  bandwidth percent 34
  fair-queue
  random-detect

interface GigabitEthernet0/0/0
 service-policy output WAN-EGRESS

The percentages must sum to no more than 100, and the priority statement counts against that total. If you configure priority with no rate or percentage, the priority queue has no built-in cap - never do this in production; a misbehaving SIP gateway can starve everything else.
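The arithmetic behind this rule can be sketched in Python. The helper below is hypothetical (it is not a Cisco tool); it takes the WAN-EGRESS percentages from the policy above and a reference rate, assumed here to be the full 1 Gbps interface.

```python
# Percentages copied from the WAN-EGRESS policy; the dict layout is illustrative.
WAN_EGRESS = {
    "VOICE": ("priority", 10),
    "VIDEO": ("bandwidth", 30),
    "TRANSACTIONAL": ("bandwidth", 25),
    "SCAVENGER": ("bandwidth", 1),
    "class-default": ("bandwidth", 34),
}


def allocate(policy, reference_kbps):
    """Check that priority + bandwidth percentages fit, then convert to kbps."""
    total = sum(pct for _, pct in policy.values())
    if total > 100:
        raise ValueError(f"percentages sum to {total}, which exceeds 100")
    return {name: reference_kbps * pct // 100 for name, (_, pct) in policy.items()}


rates = allocate(WAN_EGRESS, 1_000_000)  # assumed reference: 1 Gbps interface
for name, kbps in rates.items():
    print(f"{name:15s} {kbps:>8d} kbps")
```

Running the same check against a draft policy before pasting it into a router is a cheap way to catch an over-allocated class.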

Bandwidth Statements

Three forms exist:

Form	Behavior
`bandwidth percent X`	X percent of the interface bandwidth (or the parent shaper rate in hierarchical policies)
`bandwidth X` (kbps)	Absolute kbps guarantee
`bandwidth remaining percent X`	X percent of bandwidth remaining after priority queues are accounted for

The remaining-percent form is useful when priority bandwidth varies (e.g. voice scales with call count). The classes that take "remaining" do not need to be re-tuned every time the priority allocation changes.
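A small Python sketch makes the point concrete. It follows the description in the table above (remaining bandwidth is what is left after the priority allocation); the 50 Mbps rate, the class names, and the 40/60 split are assumptions for the example only.

```python
def remaining_shares(link_kbps, priority_kbps, remaining_pct):
    """Split whatever is left after the priority allocation by 'remaining percent'."""
    remaining_kbps = link_kbps - priority_kbps
    return {name: remaining_kbps * pct // 100 for name, pct in remaining_pct.items()}


classes = {"VIDEO": 40, "DATA": 60}  # hypothetical 'bandwidth remaining percent' values

# Priority allocation of 5 Mbps on a 50 Mbps link.
few_calls = remaining_shares(50_000, 5_000, classes)

# Priority allocation raised to 10 Mbps; the other classes are not touched.
more_calls = remaining_shares(50_000, 10_000, classes)

print(few_calls)   # {'VIDEO': 18000, 'DATA': 27000}
print(more_calls)  # {'VIDEO': 16000, 'DATA': 24000}
```

The 40/60 ratio between the two classes holds in both cases without any re-tuning.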

WRED: Graceful Degradation

WRED (Weighted Random Early Detection) randomly drops packets from a queue as it approaches full, biased toward higher drop precedence. The benefit: TCP flows respond to dropped packets by slowing down, which empties the queue gracefully rather than letting it fill and tail-drop everything.

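WRED works best on classes with lots of TCP traffic. Voice and most video run over UDP, which does not back off when packets are dropped, so they gain little from it. For the priority class, rely on the built-in policer; for other UDP-heavy classes, plain tail drop is usually enough.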

class TRANSACTIONAL
 bandwidth percent 25
 random-detect dscp-based             ! Drop AF23 first, then AF22, then AF21
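The drop decision can be sketched with the classic RED curve: no drops below a minimum threshold, a linearly rising drop probability up to a maximum threshold, and everything dropped beyond it. The thresholds below are hypothetical values picked to show the AF23-before-AF22-before-AF21 ordering, not IOS XE defaults, and the mark probability denominator of 10 is likewise an assumed value.

```python
def wred_drop_probability(avg_depth, min_th, max_th, mark_prob_denominator=10):
    """Drop probability for one WRED profile at a given average queue depth."""
    if avg_depth < min_th:
        return 0.0                      # queue is shallow: never drop
    if avg_depth > max_th:
        return 1.0                      # past the max threshold: drop everything
    max_p = 1.0 / mark_prob_denominator  # probability reached at max_th
    return max_p * (avg_depth - min_th) / (max_th - min_th)


# Hypothetical (min, max) thresholds in packets; higher drop precedence starts earlier.
PROFILES = {"af21": (30, 40), "af22": (25, 40), "af23": (20, 40)}

avg_depth = 28
probs = {dscp: wred_drop_probability(avg_depth, lo, hi)
         for dscp, (lo, hi) in PROFILES.items()}

print({dscp: round(p, 3) for dscp, p in probs.items()})
# {'af21': 0.0, 'af22': 0.02, 'af23': 0.04}
```

At an average depth of 28 packets AF23 is already being dropped twice as often as AF22, while AF21 is still untouched.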

Policing: Hard Rate Limits

A policer enforces a maximum rate. Packets above the rate are dropped immediately or re-marked to a lower-priority DSCP. Cisco implementations use a token-bucket algorithm:

Tokens accumulate in a bucket at the configured rate (e.g. 10 Mbps).
Each arriving packet consumes tokens proportional to its size.
If enough tokens are available, the packet conforms (passes).
If not enough tokens, the packet exceeds (drop or re-mark, depending on config).
The bucket has a burst size that allows short bursts above the rate; it refills continuously at the configured rate.
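The steps above can be sketched as a small single-rate, two-colour token bucket in Python. This is an illustration of the algorithm, not Cisco's implementation; the class name and the 15,000-byte burst are chosen for the demo.

```python
class TokenBucketPolicer:
    """Single-rate policer: packets either conform or exceed; nothing is buffered."""

    def __init__(self, rate_bps, burst_bytes):
        self.rate_bytes_per_s = rate_bps / 8
        self.burst_bytes = burst_bytes
        self.tokens = float(burst_bytes)   # bucket starts full
        self.last_time = 0.0

    def offer(self, now, size_bytes):
        # Refill for the time that has passed, but never beyond the burst size.
        elapsed = now - self.last_time
        self.last_time = now
        self.tokens = min(self.burst_bytes,
                          self.tokens + elapsed * self.rate_bytes_per_s)
        if size_bytes <= self.tokens:
            self.tokens -= size_bytes
            return "conform"
        return "exceed"                    # caller drops or re-marks


policer = TokenBucketPolicer(rate_bps=10_000_000, burst_bytes=15_000)

# Twenty 1500-byte packets arrive back to back at t=0.
burst = [policer.offer(0.0, 1500) for _ in range(20)]
print(burst.count("conform"), burst.count("exceed"))   # 10 10

# Two milliseconds later the bucket has refilled by about 2500 bytes.
later = [policer.offer(0.002, 1500), policer.offer(0.002, 1500)]
print(later)                                           # ['conform', 'exceed']
```

The burst size decides how many back-to-back packets get through; the rate only decides how fast that allowance comes back.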

Configuration:

policy-map RATE-LIMIT-INBOUND
 class class-default
  police 10000000 1500000               ! 10 Mbps with 1.5 MB burst
   conform-action transmit
   exceed-action drop

interface GigabitEthernet0/0/0
 service-policy input RATE-LIMIT-INBOUND

Policing's superpower: instant rate enforcement without buffering. Use it for:

SLA enforcement at network boundaries (carriers police their customers)
Subscriber rate plans (ISP enforcing a customer's contracted rate)
Protection against traffic floods (per-source-IP policers)
The built-in policer on LLQ priority queues (prevents voice from running away)

Policing's weakness: dropping perfectly good packets. TCP responds by retransmitting and shrinking its congestion window, so TCP throughput against a policer often settles well below the policed rate, especially when the burst size is small. If you can shape instead of police, do, particularly for bursty TCP traffic.

Re-marking Instead of Dropping

A common pattern: instead of dropping excess, re-mark to a lower-priority class:

class SCAVENGER
 police 5000000
   conform-action set-dscp-transmit cs1   ! Conform: keep CS1
   exceed-action set-dscp-transmit cs0    ! Exceed: re-mark to BE

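Excess scavenger traffic is not dropped - just re-marked. If the network has spare capacity, the re-marked traffic still gets through. If the network is congested, it is the first to drop, provided the downstream queueing policy treats the new marking worse than the original. Check that before relying on the pattern: in many designs CS1 (scavenger) is already served below best effort, in which case re-marking to CS0 would promote the excess rather than demote it. This "soft policing" gives you the rate enforcement without the hard cliff.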

Shaping: Smoothing Bursts

A shaper buffers excess traffic and releases it at the configured rate. Like a policer, it uses a token-bucket algorithm; unlike a policer, the action for excess is "buffer for later" instead of "drop."

Configuration:

policy-map SHAPE-WAN
 class class-default
  shape average 50000000               ! Shape to 50 Mbps average

interface GigabitEthernet0/0/0
 service-policy output SHAPE-WAN

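Shape average releases up to the committed burst (Bc) per interval, which holds output to the configured rate. Shape peak (rarely used) releases up to Bc plus the excess burst (Be) per interval, so output can run above the configured rate.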

The classic shaping use case: your branch has a 1 Gbps physical interface but a 50 Mbps contracted Metro Ethernet handoff. Without shaping, you send bursts at 1 Gbps and the carrier polices (drops) the excess. With shaping, you smooth output to 50 Mbps, so the carrier has nothing to police; any drops that remain happen in your own queues, where your policy decides what goes first.
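To put numbers on that scenario, here is a minimal Python sketch of a shaper as a pacing queue. It deliberately ignores Bc/Be intervals and assumes an unlimited buffer, so treat it as a model of the idea rather than of IOS XE behavior; the function name and the 100-packet burst are invented for the example.

```python
def shape(arrivals, rate_bps):
    """Pace packets out at rate_bps.

    arrivals: list of (arrival_time_s, size_bytes) in arrival order.
    Returns a list of (departure_time_s, delay_s), one entry per packet.
    """
    link_free_at = 0.0
    departures = []
    for arrival, size_bytes in arrivals:
        start = max(arrival, link_free_at)            # wait behind earlier packets
        link_free_at = start + size_bytes * 8 / rate_bps
        departures.append((link_free_at, link_free_at - arrival))
    return departures


# 100 packets of 1500 bytes arriving back to back at 1 Gbps (12 microseconds apart).
burst = [(i * 12e-6, 1500) for i in range(100)]
shaped = shape(burst, rate_bps=50_000_000)

last_departure, last_delay = shaped[-1]
print(f"{len(shaped)} packets sent, last one delayed {last_delay * 1000:.1f} ms")
# 100 packets sent, last one delayed 22.8 ms
```

Every packet in the burst is eventually sent, but the last one has waited almost 23 ms in the shaper's queue.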

Shaping has one obvious cost: latency. Buffered packets wait. For voice and other latency-sensitive traffic, this is a problem. The solution is hierarchical shaping.

Hierarchical Shaping: The Production Pattern

Hierarchical shaping wraps a queueing policy inside a shaping policy. The shaper smooths to the contracted rate; the inner queueing policy applies LLQ + CBWFQ within the shaped pipe.

policy-map CHILD-WAN-EGRESS
 class VOICE
  priority percent 30
 class VIDEO
  bandwidth percent 30
 class class-default
  bandwidth percent 40
  fair-queue

policy-map PARENT-SHAPER
 class class-default
  shape average 50000000               ! Shape to 50 Mbps
  service-policy CHILD-WAN-EGRESS      ! Apply child queueing inside

interface GigabitEthernet0/0/0
 service-policy output PARENT-SHAPER

This pattern dominates production WAN edges. The parent shapes to the contracted rate (so the carrier never polices). The child applies LLQ inside the shaped pipe, so voice still gets priority queueing, measured against the 50 Mbps shaped rate rather than the 1 Gbps physical interface.

The percentages in the child policy are percentages of the parent shape rate, not the physical interface. priority percent 30 in the child means 30 percent of 50 Mbps = 15 Mbps reserved for the priority queue.
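The same arithmetic for all three child classes, as a short Python sketch. The helper name is hypothetical; the percentages and rates are the ones from CHILD-WAN-EGRESS and PARENT-SHAPER above.

```python
def child_rates_kbps(child_pct, interface_kbps, parent_shape_kbps=None):
    """Resolve child percentages against the parent shaper if there is one."""
    if parent_shape_kbps is not None:
        reference_kbps = parent_shape_kbps
    else:
        reference_kbps = interface_kbps
    return {name: reference_kbps * pct // 100 for name, pct in child_pct.items()}


CHILD_WAN_EGRESS = {"VOICE": 30, "VIDEO": 30, "class-default": 40}

with_shaper = child_rates_kbps(CHILD_WAN_EGRESS, 1_000_000, parent_shape_kbps=50_000)
without_shaper = child_rates_kbps(CHILD_WAN_EGRESS, 1_000_000)

print(with_shaper)     # {'VOICE': 15000, 'VIDEO': 15000, 'class-default': 20000}
print(without_shaper)  # {'VOICE': 300000, 'VIDEO': 300000, 'class-default': 400000}
```

The second result is what the same percentages would mean if the policy were attached directly to the 1 Gbps interface, which is 20 times more than the carrier will accept.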

When to Use Each: A Decision Matrix

Scenario	Use	Why
Voice and other real-time on a congested WAN	LLQ with built-in policer	Strict priority + protection against priority abuse
Multiple business apps competing for bandwidth	CBWFQ with bandwidth guarantees	Proportional service across classes
Sub-rate WAN handoff (1 Gbps interface, 50 Mbps contract)	Hierarchical shaping	Avoid carrier policing; preserve LLQ inside shaped pipe
SLA enforcement at network boundary	Policing	Hard rate limit with no buffering
Per-subscriber rate plans	Policing or hierarchical shaping per subscriber	Simple at scale
TCP traffic with bursty sources	Shaping (egress)	Smooths bursts; preserves TCP throughput
Protection against UDP floods	Policing	Hard cap; UDP does not back off
Demote-not-drop excess scavenger traffic	Policing with re-mark action	Soft enforcement; uses spare capacity when available

Verification

! See the policy structure and counters
Router# show policy-map interface GigabitEthernet0/0/0
 GigabitEthernet0/0/0
  Service-policy output: PARENT-SHAPER
    Class-map: class-default
      shape (average) cir 50000000, bc 200000, be 200000
        target shape rate 50000000
        Service-policy : CHILD-WAN-EGRESS
          Class-map: VOICE (match-any)
            12345 packets, 1234567 bytes
            30 second offered rate 12000 bps, drop rate 0 bps
            Match: dscp ef (46)
            Priority: 30% (15000 kbps), burst bytes 375000, b/w exceed drops: 0

! See the software output queue depth on a specific interface (this is not the hardware TX-ring)
Router# show interfaces GigabitEthernet0/0/0
  Output queue: 0/40 (size/max)            ! Healthy
  Output queue: 38/40 (size/max)           ! Bordering on tail-drops

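The "drop rate" line per class is the most important diagnostic. Zero drops at a significant offered rate means the class has enough bandwidth. A growing drop count means the class is being offered more than its allocation allows during congestion. On the priority class, also watch "b/w exceed drops", which counts packets dropped by the built-in policer.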
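The bc and burst figures in the sample output above can be sanity-checked with a few lines of Python. The 4 ms shaping interval and the 200 ms priority burst used here are assumptions that happen to reproduce the numbers shown; if your platform or release uses different defaults, expect different values.

```python
cir_bps = 50_000_000           # shape average 50000000
priority_pct = 30              # priority percent 30 in the child policy

ASSUMED_TC_MS = 4              # assumed shaping interval
ASSUMED_PRIORITY_BURST_MS = 200  # assumed priority burst duration

# Committed burst in bits: what the shaper may send per interval.
bc_bits = cir_bps * ASSUMED_TC_MS // 1000

# Priority rate and its burst allowance in bytes.
priority_bps = cir_bps * priority_pct // 100
priority_burst_bytes = priority_bps * ASSUMED_PRIORITY_BURST_MS // 1000 // 8

print(bc_bits)               # 200000
print(priority_bps // 1000)  # 15000
print(priority_burst_bytes)  # 375000
```

If the values on a live router do not match what you calculate from the configured rates, that is usually a sign the policy is resolving its percentages against a different reference rate than you expected.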

Anti-Patterns

Bare priority in LLQ. Always use priority percent X or priority X in kbps. The unbounded form lets a runaway flow destroy the rest of the policy.
Policing where shaping would do. If the source is under your control, shape (preserve TCP throughput). Police only at boundaries you cannot trust.
Forgetting hierarchical shaping for sub-rate handoffs. Without it, carrier polices, you get drops, voice quality suffers.
WRED on UDP-only classes. UDP does not back off; WRED becomes random drops with no benefit. Tail-drop is fine for UDP-heavy classes.
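Bandwidth percentages summing to over 100. Most IOS XE releases reject this outright with an error; if a platform does accept it, do not rely on the resulting behavior. Plan the allocations, priority included, to total 100 or less.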

Summary

Queueing schedules order on congested interfaces. Policing drops excess. Shaping buffers excess. The three are complementary, not interchangeable. LLQ is the modern queueing default; hierarchical shaping with LLQ inside is the production WAN pattern; policing is for SLA enforcement and protecting priority queues from abuse.

Master the LLQ + CBWFQ + hierarchical shaping triple, and you have covered the large majority of production QoS configurations. Keep this article alongside the QoS cluster pillar and the Cisco MQC walkthrough, and lab every change before pushing to production. The penalty for misconfigured queueing, policing, or shaping is voice quality problems that only show up under load.
