Table of Contents
- Introduction: When a Good Idea Became a Costly Gatekeeper
- What Was the Step 2 Clinical Skills Exam?
- The Strongest Argument for Step 2 CS
- Problem 1: The Exam Was Expensive in a Way That Felt Unfair
- Problem 2: High Pass Rates Raised Questions About Value
- Problem 3: The Feedback Was Too Thin for Real Learning
- Problem 4: One Artificial Day Cannot Capture Real Clinical Readiness
- Problem 5: Medical Schools Already Assess Clinical Skills
- Problem 6: The Exam Created Stress at the Worst Possible Time
- Problem 7: Equity Concerns Were Built Into the Structure
- Why Discontinuation Did Not Mean Clinical Skills Stopped Mattering
- What a Better System Should Look Like
- The Case Against Step 2 CS in One Sentence
- Experiences From the Step 2 CS Era: What Students Learned the Hard Way
- Conclusion: Clinical Skills Deserve Better Than Step 2 CS
Note: This article is written for informational and editorial purposes. It is based on publicly available medical education information and should not be treated as official USMLE, ECFMG, NBME, FSMB, or school-specific guidance.
Introduction: When a Good Idea Became a Costly Gatekeeper
The Step 2 Clinical Skills Exam, better known as USMLE Step 2 CS, was built on a perfectly reasonable idea: before becoming a physician, a future doctor should be able to talk to patients, gather a history, perform a focused physical exam, explain findings, document a clinical encounter, and communicate like a professional human being rather than a walking flashcard deck.
On paper, that sounds hard to argue with. No patient wants a doctor who can recite the Krebs cycle but cannot ask about chest pain without sounding like a malfunctioning airport kiosk. Clinical skills matter. Communication matters. Empathy matters. Documentation matters. The problem was never the goal. The problem was the exam itself.
The Step 2 Clinical Skills Exam became one of the most criticized parts of the medical licensing process because it was expensive, logistically awkward, stressful, limited in feedback, and arguably weak as a high-stakes measurement tool for U.S. medical students who were already being assessed repeatedly in clinical settings. When the exam was suspended in March 2020 during the COVID-19 pandemic and then discontinued in January 2021, many students, educators, and policy observers did not mourn it like a lost treasure. They reacted more like someone finding out their least favorite parking ticket had been permanently abolished.
This article makes the case against the Step 2 Clinical Skills Exam: not because clinical skills are unimportant, but because a costly national performance test was not the best way to prove them.
What Was the Step 2 Clinical Skills Exam?
USMLE Step 2 CS was a standardized patient-based examination used to assess whether medical students and graduates could demonstrate basic clinical and communication skills. Test takers interacted with actors trained as patients, gathered relevant information, performed focused physical exams, and wrote patient notes. They were scored on three subcomponents: the Integrated Clinical Encounter (ICE), Communication and Interpersonal Skills (CIS), and Spoken English Proficiency (SEP).
In theory, it tested the human side of medicine. In practice, it often tested whether a student could fly to one of only five testing centers in the country, pay a large fee, perform under artificial time pressure, and remember to wash their hands while silently praying the standardized patient’s abdominal pain was not secretly a psychiatry case wearing a gastroenterology hat.
The exam was pass/fail, but that did not make it low stress. A failure could delay graduation plans, complicate residency applications, create visa problems for international medical graduates, and add thousands of dollars in repeat costs. For an exam with high consequences, critics argued that it gave too little meaningful feedback and too little proof that it improved patient care.
The Strongest Argument for Step 2 CS
To be fair, defenders of Step 2 CS had a serious point. Medicine is not just multiple-choice mastery. A physician must listen, observe, reason, explain, document, and build trust. A national licensing system should care about whether graduates can actually interact with patients safely and respectfully.
Supporters also argued that a standardized exam created a common benchmark. Medical schools vary in curriculum, grading culture, clinical sites, and assessment rigor. A national exam could theoretically reassure licensing boards and the public that every candidate had cleared the same minimum standard.
That argument deserves respect. But respecting the goal does not require defending the tool forever. A smoke alarm is important; a smoke alarm that costs $1,600, requires airfare, and only tells you “maybe smoky” three months later needs a serious design meeting.
Problem 1: The Exam Was Expensive in a Way That Felt Unfair
The most obvious criticism of the Step 2 Clinical Skills Exam was cost. The registration fee alone ran well over $1,000 in the exam’s final years, and that was before adding travel, lodging, meals, transportation, time away from rotations, and schedule disruptions. For many students, the real price was not just the registration fee; it was the whole travel circus.
Because Step 2 CS was offered only at specific testing centers, students often had to fly, book hotels, and rearrange clinical duties. A student living near a testing center had a very different experience from someone training across the country. That difference had nothing to do with clinical ability and everything to do with geography, money, and calendar gymnastics.
Medical school is already financially intense. Students pay tuition, buy study resources, apply to residency, travel for interviews, pay exam fees, and often borrow heavily. Adding another costly requirement created a predictable equity problem: students with more money could absorb the hit more easily, while students with fewer resources felt every fee, flight, and hotel night like a tiny academic tax monster.
Problem 2: High Pass Rates Raised Questions About Value
One of the most common arguments against Step 2 CS was that most U.S. and Canadian medical students passed. A high pass rate does not automatically mean an exam is useless. Seat belts work even when most people do not crash. But when an exam costs so much and fails relatively few candidates, critics reasonably asked: what exactly is the return on investment?
If nearly everyone from accredited U.S. and Canadian medical schools passes, the exam may be functioning less as a meaningful filter and more as an expensive confirmation of what schools already know. In other words, it told the system, “Good news: the fourth-year medical student who has been observed for years can talk to a patient for 15 minutes.” Helpful? Maybe. Worth the national burden? That is where the case gets wobbly.
The exam appeared more consequential for international medical graduates, who historically faced lower pass rates and additional barriers. That raises a different policy question: if the goal was to verify readiness and communication skills across diverse training backgrounds, could that be done through a more targeted, flexible, and transparent process?
Problem 3: The Feedback Was Too Thin for Real Learning
Good assessment should do more than sort people into “pass” and “fail.” It should help learners improve. A major frustration with Step 2 CS was that students who failed did not always receive detailed, practical feedback that could guide precise remediation.
Imagine being told, “Your communication was not good enough,” without a rich explanation of whether the problem was empathy, organization, closure, patient education, language clarity, missed counseling, or something else. That is not feedback; that is a fog machine with a score report attached.
Clinical skills improve through observation, coaching, repetition, and targeted feedback. A one-day national exam could identify some problems, but it was not well designed as a teaching tool. Medical schools, clerkships, simulation centers, standardized patient programs, and faculty preceptors are better positioned to provide repeated, specific coaching over time.
Problem 4: One Artificial Day Cannot Capture Real Clinical Readiness
Medicine is performed across months and years, not in a single scripted afternoon. Real clinical competence includes showing up consistently, adapting to uncertainty, working with teams, managing time, recognizing limits, documenting accurately, responding to feedback, and communicating with patients who do not follow neat exam scripts.
Standardized patients are useful. Objective structured clinical examinations, or OSCEs, can be excellent educational tools. But a single high-stakes standardized encounter has limits. It may reward performance polish over deeper clinical judgment. It may punish test-day anxiety more than genuine incompetence. It may favor students who can master the “Step 2 CS style” rather than those who are best prepared for the messy reality of wards, clinics, and emergency departments.
Real patients interrupt. They forget details. They bring family members. They have multiple conditions. They may be scared, frustrated, embarrassed, or confused. Real clinical work is not always a clean 15-minute scene with a curtain, a clipboard, and a standardized checklist. The Step 2 CS format could test pieces of readiness, but critics argued it should not carry such heavy national licensing weight.
Problem 5: Medical Schools Already Assess Clinical Skills
Another major case against Step 2 CS is that accredited medical schools already evaluate students’ clinical skills in multiple ways. Students take histories, perform exams, write notes, present patients, receive clerkship evaluations, complete OSCEs, work with standardized patients, and get observed by residents and faculty.
These assessments are not perfect. Faculty evaluations can vary. Clinical sites differ. Some schools are stronger than others at direct observation. But the answer to imperfect local assessment is not automatically an expensive national exam. A better solution may be to strengthen school-based clinical skills assessment through clearer standards, better faculty training, more direct observation, and more robust remediation.
In fact, clinical competence is best judged longitudinally. A student who communicates well across pediatrics, surgery, internal medicine, psychiatry, family medicine, and emergency medicine has provided richer evidence than someone who performed acceptably during one exam day. If a learner struggles, the school can intervene earlier and more specifically, instead of waiting for a late-stage national failure.
Problem 6: The Exam Created Stress at the Worst Possible Time
Step 2 CS often landed during a chaotic period of medical training. Students were taking Step 2 CK, completing rotations, requesting letters of recommendation, preparing residency applications, scheduling interviews, and trying to look like calm future doctors while living on cafeteria coffee and calendar alerts.
Adding travel and a high-stakes pass/fail exam during that period created unnecessary stress. A failed attempt could complicate residency timelines. Even students who were likely to pass still had to prepare, travel, and wait for results. The exam became another hoop in a profession already famous for installing hoops, lighting them on fire, and calling the process “professional development.”
Stress alone does not make an exam invalid. Medical training is demanding for a reason. But unnecessary stress deserves scrutiny. If a requirement does not clearly improve public safety, learning, or fairness, then the burden becomes harder to justify.
Problem 7: Equity Concerns Were Built Into the Structure
The Step 2 Clinical Skills Exam raised equity concerns in several ways. First, travel costs affected students differently. Second, time away from rotations affected students differently depending on school flexibility and personal responsibilities. Third, international medical graduates faced additional complexity, including travel, certification timelines, visa pressures, and higher stakes for failure.
There were also concerns about language and communication scoring. Spoken English proficiency matters in patient care, but language assessment in a medical licensing context must be careful, transparent, and fair. Communication is not simply accent reduction. Good communication includes listening, structure, empathy, plain-language explanation, cultural humility, and checking understanding.
A better assessment system should distinguish between unsafe communication and normal linguistic diversity. It should also give learners clear pathways for improvement. A national exam that feels opaque or financially punishing can easily become a barrier rather than a safeguard.
Why Discontinuation Did Not Mean Clinical Skills Stopped Mattering
When Step 2 CS was discontinued, some people worried that clinical skills would be downgraded. That fear misunderstands the argument against the exam. The case against Step 2 CS is not a case against communication, bedside manner, physical examination, or clinical reasoning. It is a case against using one expensive, logistically burdensome exam as the main symbol of those skills.
After discontinuation, responsibility shifted more clearly toward medical schools, licensing bodies, existing USMLE components, ECFMG pathways for international graduates, and workplace-based assessment. That is not a perfect solution, but it opens the door to better ones.
Clinical skills can be assessed through repeated OSCEs, direct observation, simulation, clerkship performance, patient notes, oral presentations, entrustable professional activities, and structured remediation. These methods can produce a fuller picture of readiness than a single exam day.
What a Better System Should Look Like
A better clinical skills assessment system should be fair, affordable, educationally useful, and connected to real patient care. It should not merely ask, “Can this student pass a scripted exam?” It should ask, “Can this learner safely, respectfully, and consistently care for patients under supervision?”
1. Use Longitudinal Assessment
Clinical skills should be observed repeatedly across settings. A student who struggles during one encounter should receive coaching and another chance to improve. A student who performs well across many months gives the system more reliable evidence than a one-day test can provide.
2. Improve Direct Observation
Faculty and residents should be trained to observe students directly, not just infer ability from notes or presentations. Watching a learner interview a patient, explain a plan, or perform an exam is essential. Medicine is a contact sport, intellectually speaking; you cannot evaluate the whole game from the parking lot.
3. Keep Standardized Patients, But Use Them Wisely
Standardized patient encounters are valuable, especially for practicing sensitive conversations, documentation, and diagnostic reasoning. They should remain part of medical education. The issue is not standardized patients. The issue is turning one expensive standardized patient exam into a national gatekeeper with limited feedback.
4. Make Feedback Actionable
Any clinical skills assessment should tell learners what to fix. “Needs improvement” is not enough. Students need concrete guidance: organize the history better, close the encounter clearly, explain the differential diagnosis in plain English, show empathy before jumping into the checklist, or document pertinent negatives more consistently.
5. Protect Patients and Learners
The public deserves physicians who can communicate and examine patients competently. Students deserve assessments that are transparent, affordable, and genuinely connected to learning. A strong system should do both.
The Case Against Step 2 CS in One Sentence
The Step 2 Clinical Skills Exam tried to measure something important, but it did so through a costly, stressful, limited, and imperfect format that often added more burden than value.
That is the heart of the argument. The exam’s defenders were right about the importance of clinical skills. Its critics were right that the exam itself was a poor tool for the job. A licensing system should not confuse the value of a competency with the value of a specific test.
Experiences From the Step 2 CS Era: What Students Learned the Hard Way
For many students, Step 2 CS was less a dramatic test of doctorly wisdom and more a logistical obstacle course wearing a white coat. The experience often began months before the actual exam, when students tried to find a test date that did not collide with clerkships, sub-internships, Step 2 CK, residency application deadlines, weddings, family obligations, or the mysterious calendar black hole known as “interview season.”
A typical student might finish a hospital shift, open the scheduling portal, and discover that the best available date required flying to another city during a week that already looked like a crossword puzzle designed by a sleep-deprived dean. The student would then book a flight, reserve a hotel, budget for meals, and hope no weather delay turned the clinical skills exam into an airport skills exam.
Preparation had its own strange rhythm. Students practiced knocking on imaginary doors, greeting imaginary patients, washing imaginary hands, and saying phrases like, “I understand this must be difficult for you,” with enough warmth to sound human but not so much drama that it resembled community theater. They memorized templates for patient notes. They rehearsed transitions. They learned to ask about smoking, alcohol, sexual history, medication allergies, and family history while watching the clock like it was a tiny villain.
Some of the preparation was genuinely useful. Practicing empathy, structure, and patient-centered language can make students better clinicians. Many students became more aware of how they sounded when explaining medical ideas. They learned that “myocardial infarction” may be accurate, but “heart attack” is usually more helpful to a worried patient. They learned to summarize, clarify, and close the encounter. Those lessons were valuable.
But the exam atmosphere could distort the learning. Students sometimes focused less on authentic communication and more on performance mechanics. Did I counsel enough? Did I ask enough review-of-systems questions? Did I drape correctly? Did I include three differential diagnoses? Did I remember the doorway information? Did I look compassionate, or did I look like a person mentally calculating whether I could still make checkout at the hotel?
The waiting period after the exam added another layer of anxiety. Because the result was pass/fail and the consequences were significant, students could spend weeks replaying tiny moments. Maybe they forgot to ask one question. Maybe they ran out of time. Maybe the patient note was too brief. Maybe the standardized patient noticed the nervous laugh. Maybe the entire future of residency depended on whether they said “thank you” before leaving the room. The mind, given enough stress, can turn a missed handshake into a Greek tragedy.
For students who passed, the most common feeling was not triumph. It was relief. Many described the result as one more box checked on the long road to residency. For students who failed, the experience could be devastating, especially if the score report did not clearly explain how to improve. A failure did not necessarily mean the student was unsafe with real patients, yet it could carry serious professional consequences.
The Step 2 CS experience taught medical education an important lesson: assessment design matters. A test can promote good habits, but it can also create unnecessary cost, anxiety, and inequity. The best clinical skills assessments should make students better doctors, not merely better performers in a temporary exam room. That is why the end of Step 2 CS should not be seen as the end of clinical skills testing. It should be seen as an invitation to build something smarter.
Conclusion: Clinical Skills Deserve Better Than Step 2 CS
The case against the Step 2 Clinical Skills Exam is ultimately a case for better assessment. Future physicians must communicate clearly, reason carefully, examine patients competently, and document responsibly. Those skills are too important to be reduced to a costly one-day performance test with limited feedback and uneven burdens.
Step 2 CS served a purpose in pushing the medical education system to take clinical skills seriously. But over time, its weaknesses became difficult to ignore. It was expensive. It was inconvenient. It added stress. It provided limited educational feedback. It duplicated work already happening in medical schools. And for many students, it felt less like a meaningful measure of readiness and more like a very expensive toll booth on the road to residency.
The better path is not to abandon clinical skills assessment, but to improve it: more direct observation, stronger school-based standards, useful feedback, fair remediation, and assessments that reflect real patient care. Clinical skills are the soul of medicine. They deserve a system that measures them with intelligence, fairness, and humanity.
