QoS (Quality of Service) is the set of network mechanisms that decide which packets win when there is not enough bandwidth, buffer space, or scheduling priority to accommodate every flow. Without QoS, every packet is best-effort: the network treats your VoIP call the same as a Windows update download. With QoS, the network knows that voice packets matter more than backup traffic and behaves accordingly.
This is the cluster overview for the full PingLabz QoS series: classification and marking, the Cisco MQC (Modular QoS CLI), DSCP and IP precedence, queueing/policing/shaping, voice and video QoS, and how QoS works in modern SD-WAN and wireless deployments. We will work through what QoS actually does, the four-step model every Cisco QoS implementation follows, the marking standards you must understand, and the trust-boundary discipline that separates a working QoS deployment from a misconfigured one. If you are studying for CCNP/CCIE, designing a QoS rollout, or troubleshooting why voice quality went bad after the last firmware push, start here.
What QoS Solves
Every network has finite resources: link bandwidth, switch buffers, scheduler slots, ingress and egress queues. When demand exceeds those resources (transient bursts, sustained congestion, or scheduled events like the 9 AM video-call rush), packets must be dropped or delayed. The question is which packets.
Without QoS, the answer is "whichever packet happened to arrive when the queue was full." The network is fair in a literal sense and useless in a practical sense: voice gets dropped equally with backup traffic, and your CFO's video call breaks while a server pulls a 50-GB OS image from a CDN.
QoS gives you four levers to bias the outcome:
- Classification. Identify what kind of traffic each packet belongs to. Is this VoIP? Is this Salesforce? Is this YouTube?
- Marking. Apply a label (DSCP, CoS, MPLS EXP) so downstream devices can act on the classification without redoing the work.
- Queueing and Scheduling. Decide the order in which packets leave a congested interface. Voice goes first; bulk backups go last.
- Policing and Shaping. Decide which packets to drop or delay when the offered load exceeds an agreed rate.
Combined, these four mechanisms let you express policy like "voice gets priority queueing with no rate limit; business-critical apps get a guaranteed 30 percent of the link; bulk traffic gets the rest with no guarantees." That policy compiles into per-interface configurations on every router and switch in the path.
The Four-Step QoS Model
Every Cisco QoS deployment follows the same four-step pattern, sometimes called the QoS toolset: classify, mark, queue and schedule, then police or shape.
The discipline: classify and mark once at the network edge, then trust those marks throughout the QoS domain. Re-classifying at every hop is wasteful (CPU, memory) and error-prone. The trust boundary is where QoS-capable infrastructure starts; everything inside trusts the markings, everything outside is suspect.
The Cisco implementation of this four-step model uses the MQC (Modular QoS CLI). MQC has three constructs: class-map (defines classification), policy-map (defines what to do with each class), and service-policy (applies a policy to an interface). See (article forthcoming) for the deep dive.
Marking: DSCP, CoS, IP Precedence, MPLS EXP
The whole point of marking is to label a packet at the edge so every downstream device can treat it consistently without re-classifying. Four marking fields exist depending on layer and protocol: DSCP and IP Precedence in the IP header, CoS in the 802.1Q tag, and EXP in the MPLS label.
DSCP is the dominant marking in modern networks. Its 6-bit value space gives you 64 possible classes, but in practice you use a small standardized set, such as EF (46) for voice, AF41 (34) for video, and default (0) for best effort.
The IETF's RFC 4594 ("Configuration Guidelines for DiffServ Service Classes") defines the 12-class model that most enterprises base their QoS designs on. Cisco's published "QoS Best Practices" maps that to 8-class and 4-class simplifications for smaller deployments.
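The standardized DSCP names are not arbitrary: each one follows from the 6-bit layout. The sketch below derives the common codepoints arithmetically. The helper names (`af`, `cs`, `tos_byte`) are made up for illustration and do not come from any library.

```python
# Sketch: derive common DSCP codepoints from their bit layout.
# Helper names are illustrative, not part of any library.

EF = 46        # Expedited Forwarding, binary 101110
DEFAULT = 0    # best effort


def af(cls: int, drop: int) -> int:
    """Assured Forwarding AFxy: class x in 1..4, drop precedence y in 1..3."""
    if not (1 <= cls <= 4 and 1 <= drop <= 3):
        raise ValueError("AF class must be 1-4 and drop precedence 1-3")
    return 8 * cls + 2 * drop


def cs(n: int) -> int:
    """Class Selector CSn: lines up with IP Precedence n (0..7)."""
    if not 0 <= n <= 7:
        raise ValueError("class selector must be 0-7")
    return 8 * n


def tos_byte(dscp: int) -> int:
    """DSCP is the upper 6 bits of the old ToS byte; the low 2 bits are ECN."""
    return dscp << 2


print(af(4, 1), cs(4), EF, hex(tos_byte(EF)))  # 34 32 46 0xb8
```

This is why a packet capture shows voice traffic with a ToS byte of 0xb8 even though the configuration says `set dscp ef`.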
The marking step itself is short. Define a class-map that picks out the traffic, then a policy-map that sets the DSCP value, then attach the policy as service-policy input on the access port closest to the source:
! Identify voice traffic at the edge (typically by VLAN
! or DSCP coming from a trusted IP phone)
class-map match-any VOICE-EDGE
 match access-group name VOICE-RTP
! Mark it to DSCP EF so every downstream device
! knows it is voice
policy-map MARK-VOICE
 class VOICE-EDGE
  set dscp ef
 class class-default
  set dscp default
! Apply at ingress to the access port
interface Vlan20
 service-policy input MARK-VOICE
Once the packet is marked, every router and switch from here to the WAN edge acts on the DSCP value without re-classifying. That single edge marking is the entire point of the marking step.
For the byte-level explanation of DSCP and IP Precedence, including how the 6 bits map to per-hop behaviors, see (article forthcoming).
Trust Boundary: The Single Most Important Concept
If you take one thing away from QoS, make it this: classify and mark once at a controlled edge, then trust those marks everywhere inside the QoS domain.
The trust boundary is the perimeter where QoS-trusted infrastructure starts. Inside, every device respects the marking on incoming packets and applies policy accordingly. Outside (host PCs, IP phones, BYOD devices), markings are suspect and must be re-validated at ingress.
Common trust boundary placements:
The classic failure: trusting markings from a host PC. A user's machine can mark every packet as DSCP EF (voice) and starve the actual voice traffic. Always re-mark or police at the access port for any device whose markings you cannot vouch for.
Queueing: How Cisco Decides What Leaves Next
When an interface is congested, multiple packets are waiting to leave. The queueing scheduler decides the order. Cisco supports several scheduling algorithms, of which CBWFQ and LLQ are the two this overview relies on.
LLQ is the dominant queueing strategy in 2026. It gives voice a strict priority queue (lowest latency, lowest jitter) but applies a built-in policer to prevent the voice queue from starving everything else if voice traffic explodes (e.g. a misbehaving SIP gateway). The remaining bandwidth is divided among the other classes proportionally via CBWFQ.
A minimum-viable LLQ + CBWFQ policy on a WAN-facing interface looks like this. The class-maps match on DSCP at the egress; the policy-map is the new piece:
! Match the DSCP markings at the WAN egress
class-map match-any VOICE
 match dscp ef
class-map match-any VIDEO
 match dscp af41
! LLQ + CBWFQ: voice in priority queue, video and
! default with guaranteed bandwidth percentages
policy-map WAN-OUT
 class VOICE
  priority percent 20
 class VIDEO
  bandwidth percent 30
 class class-default
  bandwidth percent 25
! Apply at egress on the WAN interface
interface Ethernet0/1
 description WAN to provider
 bandwidth 100000
 service-policy output WAN-OUT
This policy reserves 20 percent of the configured 100 Mbps interface bandwidth for voice (strict priority), 30 percent as a CBWFQ minimum for video, and 25 percent for everything else. The unallocated 25 percent is held in reserve for control-plane traffic. Use show policy-map interface to confirm that the policy is attached and classifying correctly; the verification section below covers the output.
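The percentages only become real numbers once they are multiplied by the interface bandwidth statement. A minimal sketch of that arithmetic, using the values from the WAN-OUT policy above (the function and variable names are illustrative):

```python
# Sketch: turn the WAN-OUT percentages into absolute rates.
# 100000 kbps comes from the "bandwidth 100000" interface statement.

INTERFACE_KBPS = 100_000

policy_percent = {"VOICE": 20, "VIDEO": 30, "class-default": 25}


def allocations(link_kbps: int, percents: dict) -> dict:
    """Return the per-class rate in kbps for each configured percentage."""
    return {name: link_kbps * pct // 100 for name, pct in percents.items()}


alloc = allocations(INTERFACE_KBPS, policy_percent)
unallocated_pct = 100 - sum(policy_percent.values())

print(alloc)            # {'VOICE': 20000, 'VIDEO': 30000, 'class-default': 25000}
print(unallocated_pct)  # 25
```

If the bandwidth statement does not reflect the real provisioned rate, every one of these absolute values is wrong by the same factor, which is worth checking before tuning percentages.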
For the configuration walkthrough of LLQ, CBWFQ, and how priority+bandwidth statements interact, see (article forthcoming).
Policing vs Shaping
Both policing and shaping limit traffic to a configured rate. They differ in what happens to the excess: policing drops (or re-marks) it, while shaping buffers and delays it.
The classic shaping use case: your branch has a 50 Mbps Metro Ethernet handoff but the carrier rate-limits you to 20 Mbps. Without shaping, your switch sends bursts of 50 Mbps and the carrier drops the excess. With egress shaping at 20 Mbps, the switch buffers the burst and feeds the link at the rate the carrier expects, so the carrier has nothing to drop. The cost is added delay for the buffered packets.
The classic policing use case: an SLA contract says "you can send 100 Mbps; anything above gets dropped." The ISP polices at ingress; your egress shaping ensures you do not get policed in the first place.
Both can re-mark instead of drop. A common pattern: police the scavenger class to a percentage of the link, and re-mark traffic that exceeds the cap to the same class but with a higher drop precedence.
The Cisco MQC syntax for both is short. Policing drops the excess; shaping buffers and delays it:
! Policer: cap at 10 Mbps, drop excess
policy-map LIMIT-IN
 class class-default
  police 10000000
! Shaper: cap at 20 Mbps, queue and pace excess
policy-map SHAPE-OUT
 class class-default
  shape average 20000000
The policer is one configured line (police with a rate in bits per second) plus optional burst sizes and exceed/violate actions. The shaper is one line (shape average) plus optional buffer-tuning knobs. Most production designs combine the two: shape outbound to fit the carrier's policed rate, so your traffic never gets policed in the first place.
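To make the drop-versus-delay difference concrete, here is a small simulation. It is a simplified model, not how any Cisco platform implements these features: the policer is a single-rate token bucket that starts full, and the shaper paces packets at the configured rate with an unbounded queue (both are assumptions). Time is in microseconds and sizes in bits so the arithmetic stays in integers.

```python
# Sketch: the same burst through a policer and a shaper.
# Simplified model; bucket starts full, shaper queue is unbounded.


def police(packets, rate_bits_per_us, burst_bits):
    """Single-rate token bucket policer.

    packets: list of (arrival_us, size_bits), sorted by arrival time.
    Returns one boolean per packet: True = forwarded, False = dropped.
    """
    tokens = burst_bits
    last = 0
    verdicts = []
    for arrival, size in packets:
        # Refill for the elapsed time, capped at the bucket depth.
        tokens = min(burst_bits, tokens + (arrival - last) * rate_bits_per_us)
        last = arrival
        if size <= tokens:
            tokens -= size
            verdicts.append(True)
        else:
            verdicts.append(False)  # excess is dropped, never delayed
    return verdicts


def shape(packets, rate_bits_per_us):
    """Simplified shaper: returns the departure time (us) of each packet."""
    next_free = 0
    departures = []
    for arrival, size in packets:
        start = max(arrival, next_free)  # wait until the link is free
        departures.append(start)
        # Ceiling division: time needed to serialize this packet at the rate.
        next_free = start + -(-size // rate_bits_per_us)
    return departures


# Five 10,000-bit packets arrive at once; rate is 10 Mbps (10 bits per us).
burst = [(0, 10_000)] * 5
print(police(burst, rate_bits_per_us=10, burst_bits=20_000))
# [True, True, False, False, False]
print(shape(burst, rate_bits_per_us=10))
# [0, 1000, 2000, 3000, 4000]
```

The policer forwards what fits in the bucket and discards the rest; the shaper delivers all five packets but the last one leaves 4 ms late. That trade is why shaping suits bulk traffic and is usually kept away from voice.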
Voice and Video QoS
The dominant real-time use case. Voice (VoIP / RTP) has tight requirements:
- Latency < 150ms end-to-end (one-way; ITU-T G.114)
- Jitter < 30ms
- Packet loss < 1 percent (preferably much lower)
The standard pattern: classify voice traffic at the access port (typically by voice VLAN), mark to DSCP EF, place into LLQ priority queue at every egress interface in the path, and reserve enough bandwidth so the priority queue never has to drop. For a 1 Gbps WAN link supporting 100 simultaneous VoIP calls (each ~80 kbps including overhead), you reserve 8 Mbps for the priority queue.
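The 8 Mbps figure is simple multiplication, but it helps to see where the ~80 kbps per call comes from. The sketch below assumes G.711 with 20 ms packetization (160-byte payload, 50 packets per second) and counts only IP, UDP, and RTP headers. The codec choice is an assumption for illustration; Layer 2 overhead is not included and would push the number higher.

```python
# Sketch: size the priority queue for a given call count.
# Assumes G.711 at 20 ms packetization; Layer 2 overhead is excluded.


def per_call_bps(payload_bytes: int, header_bytes: int, pps: int) -> int:
    """Bits per second for one call direction at Layer 3."""
    return (payload_bytes + header_bytes) * 8 * pps


IP_UDP_RTP_HEADER = 20 + 8 + 12  # bytes

call_bps = per_call_bps(payload_bytes=160, header_bytes=IP_UDP_RTP_HEADER, pps=50)
calls = 100
priority_kbps = call_bps * calls // 1000

link_kbps = 1_000_000  # 1 Gbps WAN link from the example
share_pct = priority_kbps * 100 / link_kbps

print(call_bps)       # 80000
print(priority_kbps)  # 8000
print(share_pct)      # 0.8
```

On a 1 Gbps link, 100 calls need under 1 percent of the bandwidth. The same 8 Mbps on a 20 Mbps branch circuit is 40 percent, which is where priority-queue sizing starts to matter.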
The MQC implementation of LLQ is short. Define a priority queue for voice and let video and class-default share the rest with bandwidth-percent guarantees:
policy-map WAN-OUT
 class VOICE
  ! strict priority + built-in policer
  priority percent 20
 class VIDEO
  ! CBWFQ guaranteed minimum
  bandwidth percent 30
 class class-default
  ! CBWFQ for everything else
  bandwidth percent 25
The priority percent 20 line is what makes this LLQ rather than plain CBWFQ. It sets up a strict-priority queue for the VOICE class but caps it at 20 percent of the interface bandwidth so misbehaving voice traffic cannot starve video or data. The bandwidth percent lines are CBWFQ minimums: classes can use more bandwidth when available, but are guaranteed at least their configured share under congestion.
Video conferencing (Zoom, Teams, Webex) has slightly looser requirements but is more bandwidth-hungry. Mark to AF41 or CS4 depending on your design; place in a guaranteed-bandwidth class with WRED for graceful degradation under congestion.
For the full configuration walkthrough including IP phone trust, voice VLAN, LLQ tuning, and verification commands, see (article forthcoming).
Verifying QoS
The single most useful command for verifying that a QoS policy is doing what you designed is show policy-map interface. It reports per-class packet and byte counters, the queue depth, total drops, and (for LLQ) the priority-queue policer state. After driving 200 voice-marked, 200 video-marked, and 200 unmarked test packets through the lab's WAN-OUT policy on Cisco IOS XE 17.16, the populated output looks like this:
R1# show policy-map interface Ethernet0/1
 Ethernet0/1
  Service-policy output: WAN-OUT
    queue stats for all priority classes:
      Queueing
      queue limit 64 packets
      (queue depth/total drops/no-buffer drops) 0/131/0
      (pkts output/bytes output) 69/69966
    Class-map: VOICE (match-any)
      200 packets, 202800 bytes
      5 minute offered rate 0000 bps, drop rate 0000 bps
      Match: dscp ef (46)
        200 packets, 202800 bytes
      Priority: 20% (20000 kbps), burst bytes 500000, b/w exceed drops: 0
    Class-map: VIDEO (match-any)
      200 packets, 282800 bytes
      5 minute offered rate 3000 bps, drop rate 0000 bps
      Match: dscp af41 (34)
        200 packets, 282800 bytes
      Queueing
      queue limit 64 packets
      (queue depth/total drops/no-buffer drops) 0/131/0
      (pkts output/bytes output) 69/97566
      bandwidth 30% (30000 kbps)
    Class-map: class-default (match-any)
      241 packets, 309892 bytes
      5 minute offered rate 5000 bps, drop rate 2000 bps
      Match: any
      Queueing
      queue limit 64 packets
      (queue depth/total drops/no-buffer drops) 0/131/0
      (pkts output/bytes output) 110/111558
      bandwidth 25% (25000 kbps)
Read this from the top:
- Service-policy output: WAN-OUT confirms the policy attached. If the line is missing, the service-policy output statement is not on the interface.
- Class-map: VOICE ... Match: dscp ef (46) proves classification is matching. The 200 packets matching DSCP EF were the 200 voice-marked test pings.
- Priority: 20% (20000 kbps), b/w exceed drops: 0 shows the LLQ policer is configured for 20 Mbps and the priority queue never hit its cap. The 131 drops in the priority queue stats above are queue-depth tail drops, not policer drops.
- VIDEO and class-default show the same per-class accounting plus the configured bandwidth percent in absolute terms (30 percent of the 100 Mbps interface = 30000 kbps). Change the interface bandwidth statement and these absolute numbers change with it.
- 5 minute drop rate is the rolling drop-rate counter. Non-zero means the class is being shed faster than the queue can drain - the canary for "bump this class' bandwidth or police harder upstream."
The contrast between a fresh attach (zero everywhere) and these populated counters is the whole point of QoS verification. Counters that move prove the policy is reaching the dataplane; counters that match expected packet counts prove classification is matching the right traffic; non-zero drops in the wrong class catch a misconfigured trust boundary or starving CBWFQ allocation before users complain. Pair with show class-map, show policy-map, and show running-config interface to cover the "is this thing configured?" questions on any QoS rollout.
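When you check the same counters across many interfaces, it can help to pull them out programmatically. The sketch below parses a trimmed copy of the output shown earlier into per-class packet and drop counts. It is keyed to this sample's line format only; real output varies by platform and release, and the function name and the "priority-queues" bucket label are invented for this example.

```python
import re

# Trimmed copy of the sample output; leading whitespace is ignored.
SAMPLE = """\
Service-policy output: WAN-OUT
queue stats for all priority classes:
(queue depth/total drops/no-buffer drops) 0/131/0
Class-map: VOICE (match-any)
200 packets, 202800 bytes
Match: dscp ef (46)
200 packets, 202800 bytes
Priority: 20% (20000 kbps), burst bytes 500000, b/w exceed drops: 0
Class-map: VIDEO (match-any)
200 packets, 282800 bytes
(queue depth/total drops/no-buffer drops) 0/131/0
Class-map: class-default (match-any)
241 packets, 309892 bytes
(queue depth/total drops/no-buffer drops) 0/131/0
"""

CLASS_RE = re.compile(r"^Class-map:\s+(\S+)")
PKTS_RE = re.compile(r"^(\d+) packets, (\d+) bytes")
DROPS_RE = re.compile(
    r"^\(queue depth/total drops/no-buffer drops\)\s+(\d+)/(\d+)/(\d+)"
)


def parse_policy_map(text: str) -> dict:
    """Return {class_name: {"packets": n, "total_drops": n}} from CLI text."""
    stats = {}
    current = "priority-queues"  # lines that appear before the first Class-map
    for raw in text.splitlines():
        line = raw.strip()
        if (m := CLASS_RE.match(line)):
            current = m.group(1)
        elif (m := PKTS_RE.match(line)):
            # Keep the first counter per class; later ones are per-match lines.
            stats.setdefault(current, {}).setdefault("packets", int(m.group(1)))
        elif (m := DROPS_RE.match(line)):
            stats.setdefault(current, {})["total_drops"] = int(m.group(2))
    return stats


stats = parse_policy_map(SAMPLE)
for name, counters in stats.items():
    print(name, counters)
```

A script like this is only a convenience for spotting classes with unexpected drops; the CLI output remains the source of truth.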
QoS in Modern Networks: SD-WAN and Wireless
Two modern contexts where QoS is implemented differently from the legacy MPLS WAN model:
SD-WAN. Application-aware steering replaces a lot of legacy QoS complexity. Instead of marking traffic and trusting the WAN to honor markings, the SD-WAN edge identifies applications via DPI and steers them across multiple transports based on real-time SLA measurements. QoS still exists - voice still gets a priority queue at each WAN egress - but the policy is expressed in terms of applications, not DSCP values, and the SD-WAN handles per-tunnel SLA monitoring. See the SD-WAN cluster pillar for the architecture and the SD-WAN architecture article for the per-tunnel SLA model.
Wireless. Wireless QoS uses 802.11e WMM (Wi-Fi Multimedia) with four access categories: Voice, Video, Best Effort, Background. The Catalyst 9800 maps DSCP values from incoming wired traffic into WMM categories on the air, and CoS-marked frames from wireless clients into DSCP for forwarding upstream. Auto QoS and AVC (Application Visibility and Control) handle most of the configuration automatically. See C9800 QoS Configuration: Auto QoS, DSCP Mapping, and Wireless Profiles for the wireless-specific walkthrough.
QoS Deep Dives in This Cluster
The articles in this cluster, in reading order:
- DSCP and IP Precedence Explained Byte by Byte
- Cisco MQC (Modular QoS CLI): The Operator's Walkthrough
- QoS Classification, Marking, and Trust Boundaries
- QoS Queueing, Policing, and Shaping Compared
- Voice and Video QoS on Cisco IOS XE
- QoS in Modern Networks: SD-WAN, Cloud, and Application-Aware Steering
- C9800 QoS Configuration: Auto QoS, DSCP Mapping, and Wireless Profiles
This list grows as new articles are published. Check back for vendor-specific deep dives, configuration walkthroughs, and troubleshooting references.
Frequently Asked Questions
What does QoS stand for?
QoS stands for Quality of Service. It refers to the set of network mechanisms (classification, marking, queueing, policing, shaping) that control how a network treats different kinds of traffic when resources are constrained. The same acronym is used loosely in other contexts (MQTT QoS levels, Kubernetes QoS classes, Queen of Spades cultural references), but in networking it means the IP/Ethernet QoS toolkit covered here.
Do I need QoS on a network that is not congested?
Probably not on the data plane. If your links are routinely under 50 percent utilization and you have no real-time applications (no VoIP, no video conferencing, no industrial control systems), best-effort works fine. You should still mark traffic at the edge (it costs nothing and lets the network be ready when it does become congested), but elaborate queueing and shaping is overkill.
You almost certainly do need QoS on WAN edges, wireless, and any link prone to bursting (cloud egress, sub-1 Gbps WAN, congested interfaces during business hours).
What is the difference between DSCP and CoS?
DSCP is a Layer 3 marking in the IP header (6 bits, 64 values) and is preserved end-to-end across IP routing. CoS is a Layer 2 marking in the 802.1Q tag (3 bits, 8 values) and only exists on tagged Ethernet frames; it is stripped at every routed hop. Most networks classify and mark DSCP at the edge; CoS is only relevant inside switched Layer 2 segments. See the 802.1Q VLAN Tag Explained for where CoS lives in the frame.
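Because CoS has only 3 bits, a 6-bit DSCP value has to be collapsed when a frame crosses into a tagged Layer 2 segment. One commonly used default is to keep the top three bits, which is also how DSCP lines up with the older IP Precedence field. The sketch below shows that mapping; actual platforms use configurable maps, so treat this as an illustration rather than what any given switch does.

```python
# Sketch: collapse a 6-bit DSCP into a 3-bit CoS by keeping the top 3 bits.
# Real switches use configurable DSCP-to-CoS maps; this is one common default.


def dscp_to_cos(dscp: int) -> int:
    """Return the 3 most significant bits of a DSCP value (0..63)."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP must be in the range 0-63")
    return dscp >> 3


for name, dscp in [("EF", 46), ("AF41", 34), ("default", 0)]:
    print(name, dscp, "->", dscp_to_cos(dscp))
# EF 46 -> 5
# AF41 34 -> 4
# default 0 -> 0
```

Note that the mapping loses information: AF41, AF42, and AF43 all collapse to CoS 4, so drop precedence does not survive a trip through CoS.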
What is LLQ and how does it relate to CBWFQ?
LLQ (Low-Latency Queueing) is CBWFQ (Class-Based Weighted Fair Queueing) plus a strict-priority queue with a built-in policer. The priority queue is for voice and other real-time traffic; everything else is in CBWFQ classes with configured bandwidth guarantees. The built-in policer prevents the priority queue from starving the rest of the policy if voice traffic explodes. LLQ is the modern Cisco default for any link sharing voice with data.
When should I use policing vs shaping?
Policing drops or re-marks excess traffic; use it for hard rate limits and SLA enforcement. Shaping buffers and delays excess traffic; use it for smoothing bursts to fit a downstream link. Common pattern: shape egress to match the carrier's policed rate so you do not get policed in the first place. Detail in the queueing/policing/shaping article.
Does QoS still matter in cloud?
Differently. The cloud provider does not honor your DSCP markings inside their network (you do not own that path). What matters in cloud-bound traffic is QoS at your WAN edge (so voice and video get priority leaving your network) and at the cloud-on-ramp (Direct Connect, ExpressRoute) where you have control. Once the packet enters AWS or Azure, your markings are irrelevant. SD-WAN cloud on-ramps re-establish per-tunnel QoS over those links.
Where should I mark traffic?
As close to the source as possible, on a device you trust. The classic pattern: classify and mark at the access switch port for end hosts, at the WAN edge router for inbound flows from outside, and at the application server for traffic the application can mark itself. Re-marking elsewhere should be limited to scavenger-class enforcement at the trust boundary.
Key Takeaways
If you take one thing away from this guide, make it this: QoS is mostly about discipline at the edge. Classify and mark once at a controlled trust boundary, trust those markings everywhere inside the QoS domain, and apply queueing/policing/shaping at congestion points. The protocol mechanics (DSCP values, LLQ, MQC) are simpler than the operational discipline of getting the trust boundary right and keeping policy consistent across hundreds of devices.
Bookmark this page, work through the cluster articles in reading order, and lab every change. QoS is the kind of network design where small misconfigurations cause subtle problems that only show up under load - exactly when you cannot afford them. The cluster articles will give you the patterns; the discipline of edge marking plus consistent end-to-end policy is what makes them work in production.