QoS (Quality of Service): The Complete Guide for Cisco Engineers

QoS (Quality of Service) is the set of network mechanisms that decide which packets win when there is not enough bandwidth, buffer space, or scheduling priority to accommodate every flow. Without QoS, every packet is best-effort - the network treats your VoIP call the same as a Windows update download. With QoS, the network knows that voice packets matter more than backup traffic and behaves accordingly.

This is the cluster overview for the full PingLabz QoS series: classification and marking, the Cisco MQC (Modular QoS CLI), DSCP and IP precedence, queueing/policing/shaping, voice and video QoS, and how QoS works in modern SD-WAN and wireless deployments. We will work through what QoS actually does, the four-step model every Cisco QoS implementation follows, the marking standards you must understand, and the trust-boundary discipline that separates a working QoS deployment from a misconfigured one. If you are studying for CCNP/CCIE, designing a QoS rollout, or troubleshooting why voice quality went bad after the last firmware push, start here.

What QoS Solves

Every network has finite resources: link bandwidth, switch buffers, scheduler slots, ingress and egress queues. When demand exceeds those resources (transient bursts, sustained congestion, or scheduled events like the 9 AM video-call rush), packets must be dropped or delayed. The question is which packets.

Without QoS, the answer is "whichever packet happened to arrive when the queue was full." The network is fair in a literal sense and useless in a practical sense: voice gets dropped equally with backup traffic, and your CFO's video call breaks while a server pulls a 50-GB OS image from a CDN.

QoS gives you four levers to bias the outcome:

  - Classification and marking: identify traffic at the edge and label it so every downstream device agrees on what it is.
  - Queueing and scheduling: decide which packets leave a congested interface first.
  - Policing: drop or re-mark traffic that exceeds a configured rate.
  - Shaping: buffer and delay excess traffic so it fits a downstream rate.

Combined, these four mechanisms let you express policy like "voice gets priority queueing with no rate limit; business-critical apps get a guaranteed 30 percent of the link; bulk traffic gets the rest with no guarantees." That policy compiles into per-interface configurations on every router and switch in the path.

The Four-Step QoS Model

Every Cisco QoS deployment follows the same four-step pattern, sometimes called the QoS toolset:

| Step | Where it happens | Cisco mechanism |
|---|---|---|
| 1. Classify | Ingress to the QoS domain (typically access port) | class-map matching ACL, NBAR, DSCP, CoS, etc. |
| 2. Mark | Same place as classification | policy-map with set dscp / set cos / set mpls experimental |
| 3. Queue and schedule | Egress on every congested interface | policy-map with priority / bandwidth / fair-queue |
| 4. Police or shape | Either ingress (policing) or egress (shaping) | policy-map with police / shape |

The discipline: classify and mark once at the network edge, then trust those marks throughout the QoS domain. Re-classifying at every hop is wasteful (CPU, memory) and error-prone. The trust boundary is where QoS-capable infrastructure starts; everything inside trusts the markings, everything outside is suspect.

The Cisco implementation of this four-step model uses the MQC (Modular QoS CLI). MQC has three constructs: class-map (defines classification), policy-map (defines what to do with each class), and service-policy (applies a policy to an interface). See (article forthcoming) for the deep dive.
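As a minimal sketch of how the three constructs fit together (class, policy, and interface names here are illustrative, not from any particular deployment):

```
! class-map: classify. Here, match packets already marked DSCP EF.
class-map match-any VOICE
 match dscp ef

! policy-map: act on each class. Strict priority for voice,
! fair queueing for everything else.
policy-map WAN-EDGE
 class VOICE
  priority percent 10
 class class-default
  fair-queue

! service-policy: apply the policy to an interface, in a direction.
interface GigabitEthernet0/0/0
 service-policy output WAN-EDGE
```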

Marking: DSCP, CoS, IP Precedence, MPLS EXP

The whole point of marking is to label a packet at the edge so every downstream device can treat it consistently without re-classifying. Four marking fields exist depending on layer and protocol:

| Marking | Where it lives | Bits | Used at |
|---|---|---|---|
| IP Precedence | IPv4 ToS field, top 3 bits | 3 (8 values) | Layer 3; legacy, mostly replaced by DSCP |
| DSCP (Differentiated Services Code Point) | IPv4 ToS / IPv6 Traffic Class field, top 6 bits | 6 (64 values) | Layer 3; modern standard |
| CoS (802.1p Priority) | 802.1Q tag, PCP field | 3 (8 values) | Layer 2 (only on tagged frames) |
| MPLS EXP (Traffic Class) | MPLS label, EXP/TC field | 3 (8 values) | MPLS-labeled traffic |

DSCP is the dominant marking in modern networks. Its 6-bit value space gives you 64 possible classes, but in practice you use a small standardized set:

| DSCP value | Decimal | Class | Typical use |
|---|---|---|---|
| EF (Expedited Forwarding) | 46 | Voice | RTP voice payload; lowest jitter, lowest loss |
| CS5 | 40 | Broadcast video | One-way streaming video |
| AF41 | 34 | Video conferencing | Two-way real-time video (Zoom, Teams) |
| AF31 | 26 | Multimedia streaming | One-way audio/video streaming |
| AF21 | 18 | Transactional apps | Salesforce, ERP, low-latency business apps |
| AF11 | 10 | Bulk data | Email, file transfers |
| CS1 | 8 | Scavenger | Lower than best-effort; backups, peer-to-peer |
| BE (Best Effort, Default) | 0 | Default | Everything else |

The IETF's RFC 4594 ("Configuration Guidelines for DiffServ Service Classes") defines the 12-class model that most enterprises base their QoS designs on. Cisco's published "QoS Best Practices" maps that to 8-class and 4-class simplifications for smaller deployments.
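A sketch of edge marking against the table above (the class names and ACL name are invented, and the NBAR protocol match depends on your platform's protocol pack):

```
class-map match-any MARK-VOICE
 match protocol rtp                  ! real-time media payload
class-map match-any MARK-BULK
 match access-group name BULK-APPS   ! hypothetical ACL for backup/file-transfer traffic

policy-map EDGE-MARKING
 class MARK-VOICE
  set dscp ef          ! 46, Expedited Forwarding
 class MARK-BULK
  set dscp af11        ! 10, bulk data
 class class-default
  set dscp default     ! 0, best effort

interface GigabitEthernet1/0/10
 service-policy input EDGE-MARKING
```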

For the byte-level explanation of DSCP and IP Precedence, including how the 6 bits map to per-hop behaviors, see (article forthcoming).

Trust Boundary: The Single Most Important Concept

If you take one thing away from QoS, make it this: classify and mark once at a controlled edge, then trust those marks everywhere inside the QoS domain.

The trust boundary is the perimeter where QoS-trusted infrastructure starts. Inside, every device respects the marking on incoming packets and applies policy accordingly. Outside (host PCs, IP phones, BYOD devices), markings are suspect and must be re-validated at ingress.

Common trust boundary placements:

| Location | Trust model |
|---|---|
| Cisco IP phone behind PC | Trust the phone's voice VLAN markings; re-mark or zero everything from the data VLAN |
| Trusted application server | Trust DSCP set by the application |
| BYOD / guest device | Never trust; mark all traffic to BE or scavenger |
| Inter-switch trunk inside the QoS domain | Trust DSCP and CoS |
| WAN edge to ISP | Trust outbound markings (your own); re-mark or zero inbound (ISP's) |

The classic failure: trusting markings from a host PC. A user's machine can mark every packet as DSCP EF (voice) and starve the actual voice traffic. Always re-mark or police at the access port for any device whose markings you cannot vouch for.
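One way to enforce that discipline (a sketch; the policy name is invented, and trust defaults vary by platform, so treat this as the explicit version of "never trust the PC"):

```
! Re-mark everything arriving on an untrusted access port to best effort
policy-map UNTRUSTED-INGRESS
 class class-default
  set dscp default

interface GigabitEthernet1/0/24
 description User PC - untrusted
 service-policy input UNTRUSTED-INGRESS
```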

Queueing: How Cisco Decides What Leaves Next

When an interface is congested, multiple packets are waiting to leave. The queueing scheduler decides the order. Cisco supports several scheduling algorithms:

| Scheduler | Behavior | Use for |
|---|---|---|
| FIFO (First-In-First-Out) | One queue, no priority | Default on uncongested links; never on production WAN |
| Priority Queueing (PQ) | 4 queues; higher always served first; can starve lower | Legacy; rarely used today |
| Weighted Fair Queueing (WFQ) | Per-flow queues with proportional service | Default on slow serial interfaces; rarely tuned |
| Class-Based Weighted Fair Queueing (CBWFQ) | Per-class queues; configured bandwidth guarantees | Modern default for non-real-time traffic |
| Low-Latency Queueing (LLQ) | CBWFQ plus a strict-priority queue with policer | Modern default when voice/video share with data |

LLQ is the dominant queueing strategy in 2026. It gives voice a strict priority queue (lowest latency, lowest jitter) but applies a built-in policer to prevent the voice queue from starving everything else if voice traffic explodes (e.g. a misbehaving SIP gateway). The remaining bandwidth is divided among the other classes proportionally via CBWFQ.
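A minimal LLQ sketch of that behavior (class names and percentages are illustrative):

```
policy-map LLQ-WAN
 class VOICE
  priority percent 10        ! strict priority; implicit policer caps it at 10%
 class BUSINESS-CRITICAL
  bandwidth percent 30       ! CBWFQ guarantee, not a cap
 class SCAVENGER
  bandwidth percent 1        ! starved first under congestion
 class class-default
  fair-queue

interface GigabitEthernet0/0/0
 service-policy output LLQ-WAN
```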

For the configuration walkthrough of LLQ, CBWFQ, and how priority+bandwidth statements interact, see (article forthcoming).

Policing vs Shaping

Both policing and shaping limit traffic to a configured rate. They differ in what happens to the excess:

| Mechanism | Excess traffic | Where applied | Use for |
|---|---|---|---|
| Policing | Dropped (or re-marked) | Ingress or egress | Hard rate limits; SLA enforcement |
| Shaping | Buffered and delayed | Egress only | Smoothing bursts to fit downstream link |

The classic shaping use case: your branch has a 50 Mbps Metro Ethernet handoff but the carrier rate-limits you to 20 Mbps. Without shaping, your switch sends bursts at 50 Mbps and the carrier drops the excess. With egress shaping at 20 Mbps, the switch buffers the burst and feeds the link at the rate the carrier expects. Nothing is dropped; the excess is briefly delayed instead of lost.
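That scenario could be sketched as a hierarchical policy (parent shaper, child queueing); the child policy name QUEUE-POLICY is a placeholder for whatever LLQ/CBWFQ policy you run inside the shaper:

```
policy-map SHAPE-TO-CARRIER
 class class-default
  shape average 20000000         ! shape to the carrier's 20 Mbps CIR
  service-policy QUEUE-POLICY    ! queueing happens inside the shaped rate

interface GigabitEthernet0/0/1
 description 50 Mbps handoff, carrier-policed to 20 Mbps
 service-policy output SHAPE-TO-CARRIER
```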

The classic policing use case: an SLA contract says "you can send 100 Mbps; anything above gets dropped." The ISP polices at ingress; your egress shaping ensures you do not get policed in the first place.

Both can re-mark instead of drop. A common pattern: police the scavenger class to a percentage of the link, and re-mark traffic that exceeds the cap to the same class but with a higher drop precedence.
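That re-mark pattern, sketched for an AF1x bulk class where "higher drop precedence" is meaningful (the class name and 5 percent cap are illustrative):

```
policy-map INGRESS-POLICE
 class BULK
  police cir percent 5
   conform-action transmit
   exceed-action set-dscp-transmit af12   ! same AF class, higher drop precedence than af11
```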

Voice and Video QoS

The dominant real-time use case. Voice (VoIP / RTP) has tight requirements:

  - One-way latency: 150 ms or less (the ITU-T G.114 target).
  - Jitter: 30 ms or less.
  - Packet loss: 1 percent or less.
  - Bandwidth: roughly 80 kbps per G.711 call including Layer 3 overhead.

The standard pattern: classify voice traffic at the access port (typically by voice VLAN), mark to DSCP EF, place into LLQ priority queue at every egress interface in the path, and reserve enough bandwidth so the priority queue never has to drop. For a 1 Gbps WAN link supporting 100 simultaneous VoIP calls (each ~80 kbps including overhead), you reserve 8 Mbps for the priority queue.
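The arithmetic above (100 calls x ~80 kbps = 8,000 kbps) maps directly onto the priority command, which takes a rate in kbps (policy and class names are illustrative):

```
policy-map WAN-1G
 class VOICE
  priority 8000        ! strict-priority queue, implicitly policed to 8 Mbps
```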

Video conferencing (Zoom, Teams, Webex) has slightly looser requirements but is more bandwidth-hungry. Mark to AF41 or CS4 depending on your design; place in a guaranteed-bandwidth class with WRED for graceful degradation under congestion.

For the full configuration walkthrough including IP phone trust, voice VLAN, LLQ tuning, and verification commands, see (article forthcoming).

QoS in Modern Networks: SD-WAN and Wireless

Two modern contexts where QoS is implemented differently from the legacy MPLS WAN model:

SD-WAN. Application-aware steering replaces a lot of legacy QoS complexity. Instead of marking traffic and trusting the WAN to honor markings, the SD-WAN edge identifies applications via DPI and steers them across multiple transports based on real-time SLA measurements. QoS still exists - voice still gets a priority queue at each WAN egress - but the policy is expressed in terms of applications, not DSCP values, and the SD-WAN handles per-tunnel SLA monitoring. See the SD-WAN cluster pillar for the architecture and the SD-WAN architecture article for the per-tunnel SLA model.

Wireless. Wireless QoS uses 802.11e WMM (Wi-Fi Multimedia) with four access categories: Voice, Video, Best Effort, Background. The Catalyst 9800 maps DSCP values from incoming wired traffic into WMM categories on the air, and CoS-marked frames from wireless clients into DSCP for forwarding upstream. Auto QoS and AVC (Application Visibility and Control) handle most of the configuration automatically. See C9800 QoS Configuration: Auto QoS, DSCP Mapping, and Wireless Profiles for the wireless-specific walkthrough.

QoS Deep Dives in This Cluster

The articles in this cluster, in reading order:

  1. DSCP and IP Precedence Explained Byte by Byte
  2. Cisco MQC (Modular QoS CLI): The Operator's Walkthrough
  3. QoS Classification, Marking, and Trust Boundaries
  4. QoS Queueing, Policing, and Shaping Compared
  5. Voice and Video QoS on Cisco IOS XE
  6. QoS in Modern Networks: SD-WAN, Cloud, and Application-Aware Steering
  7. C9800 QoS Configuration: Auto QoS, DSCP Mapping, and Wireless Profiles

This list grows as new articles are published. Check back for vendor-specific deep dives, configuration walkthroughs, and troubleshooting references.

Frequently Asked Questions

What does QoS stand for?

QoS stands for Quality of Service. It refers to the set of network mechanisms (classification, marking, queueing, policing, shaping) that control how a network treats different kinds of traffic when resources are constrained. The same acronym is used loosely in other contexts (MQTT QoS levels, Kubernetes QoS classes, Queen of Spades cultural references), but in networking it means the IP/Ethernet QoS toolkit covered here.

Do I need QoS on a network that is not congested?

Probably not on the data plane. If your links are routinely under 50 percent utilization and you have no real-time applications (no VoIP, no video conferencing, no industrial control systems), best-effort works fine. You should still mark traffic at the edge (it costs nothing and lets the network be ready when it does become congested), but elaborate queueing and shaping are overkill.

You almost certainly do need QoS on WAN edges, wireless, and any link prone to bursting (cloud egress, sub-1 Gbps WAN, congested interfaces during business hours).

What is the difference between DSCP and CoS?

DSCP is a Layer 3 marking in the IP header (6 bits, 64 values) and is preserved end-to-end across IP routing. CoS is a Layer 2 marking in the 802.1Q tag (3 bits, 8 values) and only exists on tagged Ethernet frames; it is stripped at every routed hop. Most networks classify and mark DSCP at the edge; CoS is only relevant inside switched Layer 2 segments. See the 802.1Q VLAN Tag Explained for where CoS lives in the frame.

What is LLQ and how does it relate to CBWFQ?

LLQ (Low-Latency Queueing) is CBWFQ (Class-Based Weighted Fair Queueing) plus a strict-priority queue with a built-in policer. The priority queue is for voice and other real-time traffic; everything else is in CBWFQ classes with configured bandwidth guarantees. The built-in policer prevents the priority queue from starving the rest of the policy if voice traffic explodes. LLQ is the modern Cisco default for any link sharing voice with data.

When should I use policing vs shaping?

Policing drops or re-marks excess traffic; use it for hard rate limits and SLA enforcement. Shaping buffers and delays excess traffic; use it for smoothing bursts to fit a downstream link. Common pattern: shape egress to match the carrier's policed rate so you do not get policed in the first place. Detail in the queueing/policing/shaping article.

Does QoS still matter in cloud?

Differently. The cloud provider does not honor your DSCP markings inside their network (you do not own that path). What matters for cloud-bound traffic is QoS at your WAN edge (so voice and video get priority leaving your network) and at the cloud on-ramp (Direct Connect, ExpressRoute) where you have control. Once the packet enters AWS or Azure, your markings are irrelevant. SD-WAN cloud on-ramps re-establish per-tunnel QoS over those links.

Where should I mark traffic?

As close to the source as possible, on a device you trust. The classic pattern: classify and mark at the access switch port for end hosts, at the WAN edge router for inbound flows from outside, and at the application server for traffic the application can mark itself. Re-marking elsewhere should be limited to scavenger-class enforcement at the trust boundary.

Key Takeaways

If you take one thing away from this guide, make it this: QoS is mostly about discipline at the edge. Classify and mark once at a controlled trust boundary, trust those markings everywhere inside the QoS domain, and apply queueing/policing/shaping at congestion points. The protocol mechanics (DSCP values, LLQ, MQC) are simpler than the operational discipline of getting the trust boundary right and keeping policy consistent across hundreds of devices.

Bookmark this page, work through the cluster articles in reading order, and lab every change. QoS is the kind of network design where small misconfigurations cause subtle problems that only show up under load - exactly when you cannot afford them. The cluster articles will give you the patterns; the discipline of edge marking plus consistent end-to-end policy is what makes them work in production.

© 2025 Ping Labz. All rights reserved.