High-stakes testing is the use of a test to make important decisions about the test-taker or the institution being evaluated. High-stakes tests are used in schools, employment settings, and various professions.
In the context of schools, the results of a high-stakes test can affect students, teachers, individual schools, and entire districts.
Government agencies at the local, state and federal levels use the tests as a means of accountability to ensure that students are learning and teachers are effective.
Consequences of high-stakes tests include:
- For schools, doing poorly on high-stakes standardized tests can lead to sanctions, funding cuts, and negative publicity. Schools that do well may receive accolades, increased funding, bonuses for teachers, and social recognition.
- For students, scores on high-stakes tests can mean advancing to the next grade level or being held back.
- For teachers, high-stakes testing can have implications for their salary and promotional opportunities. Teachers whose students do well on high-stakes tests may be rewarded with a bonus, a salary increase, better prospects for promotion, or a stipend for professional development.
Although often discussed in educational contexts, high-stakes testing also occurs in various professions, where it is used to make hiring decisions, determine who is promoted, or grant a license to practice.
For example, medical doctors and nurses must pass an exam to obtain a license to practice. Failing the exam means not being able to practice in that profession. Therefore a licensing exam is considered a high-stakes test.
High-Stakes vs. Low-Stakes Tests
The primary difference between a high-stakes and a low-stakes test centers on how it is used. Generally speaking, any test that is used to make an important decision is considered a high-stakes test.
For example, if the test is used to determine if a student graduates or is admitted to an academic institution, then it is a high-stakes test.
If, however, the test is used as one of several assessments that together determine a student’s grade in a course, then it is considered a low-stakes test.
Some types of tests may fall somewhere in between and are open for discussion as to whether they are “high-stakes” or not.
For example, some do not consider the SAT or ACT to be a high-stakes exam because there is not a clear pass/fail line of demarcation. Even a very low score will not prevent someone from attending a university.
Pedagogically, high-stakes tests are generally summative (the results are final) rather than formative (the results can be amended and improved as the student moves toward topic mastery). They also tend to test product (whether you got the result right or wrong) rather than process (how you went about getting to your result).
Origins of High-Stakes Testing
Although the history of high-stakes testing in the U.S. can be traced to the early 1900s, scholars have found that the first high-stakes tests were used in ancient China approximately 2,000 years ago (Bowman, 1989).
Several scholars have pointed to the Qin and Han dynasties, around 200-100 B.C.E., as the earliest verifiable use of high-stakes exams. Tests were administered to thousands of citizens to distinguish among their abilities and determine who could serve in the government (Eberhard, 1977; Hucker, 1978).
“The emperor himself gave written examinations to all nominees–probably the first written examinations of any sort in world history” (Hucker, 1978, p. 64).
Because these exams were used to select individuals for employment, they should be considered high-stakes.
Several centuries later, the exams had become highly developed and formalized. Exams were divided for use in different levels of government (municipal to national) and used efficiently for several hundred more years (Bowman, 1989).
When Europeans learned about this exam system, they were impressed with its objectivity. European governments eventually adopted similar procedures, which then made their way to North America.
High-Stakes Testing Examples
- Driver’s License Test: Probably the most important test in history from the perspective of most teenagers. Passing the test means freedom; failing means humiliation and constantly having to ask one’s parents for a ride.
- The Job Interview: Although it may not involve a formal testing procedure and is highly subjective, the job interview is essentially a pass/fail exam that determines whether someone is employed or not.
- Nurse’s License: Anyone who wants to work as a licensed nurse in the U.S., Canada, or Australia must pass the National Council Licensure Examination (NCLEX).
- The Bar Exam: Although the exact process and nature of testing is different from country to country, and in the U.S., state to state, in order to practice law an individual must go through law school and then pass the bar exam.
- U.S. Citizenship Test: In addition to other criteria, to become a citizen of the United States, an immigrant must pass the citizenship test. The test consists of 10 questions about U.S. history and government. The test-taker must get at least 6 questions correct to pass.
- The Dissertation Defense: Before being granted a doctorate, each candidate must participate in and pass an oral defense of their dissertation. In the exam, a panel of professors asks the candidate challenging questions about their dissertation. Passing means receiving the Ph.D.
- To Become a Pilot: The FAA (Federal Aviation Administration) requires anyone who wants to fly an aircraft to obtain a pilot’s license. A license is needed to fly airplanes, gyroplanes, helicopters, gliders, balloons and airships. Each test is unique, and it is a pass/fail decision.
- Computer Simulations: Several jobs require applicants to take an exam via a computer simulation. For example, becoming a public bus driver may involve driving through a simulated city, demonstrating awareness of traffic signals and avoiding various obstacles.
- Language Proficiency Exams: Many universities require that foreign students take a language proficiency exam in the language of instruction. Admission is, in part, dependent on achieving a certain minimal score.
- To Become a Navy SEAL: In addition to a multitude of other tests, individuals wanting to become a Navy SEAL must also take the C-SORT, which stands for Computerized-Special Operations Resilience Test. The test measures performance strategies, psychological resilience, and personality traits.
High-Stakes Testing Pros and Cons
Advantages of High-Stakes Testing
1. Informative
Many forms of high-stakes tests, such as achievement tests used by school districts, can be very informative. The results allow administrators and stakeholders to know which subject areas and skills are being adequately instructed and which need improving.
When it comes to large budgets and the need to allocate resources, test scores can guide data-driven decisions and lead to more effective and efficient use of resources.
2. Standardized and Refined
Many high-stakes tests are used on a large scale over a period of years, sometimes decades. They have gone through many years of refinement that involve administering the test to different populations.
This means that test items that do not meet appropriate criteria are eliminated and replaced with higher-quality items. After several iterations, the final version of the test, which is administered to all test-takers, can be far superior to its original form.
In addition, testing procedures have also been developed and refined over an extended period of time. Test administrators are well-trained and follow the same procedures no matter where and when the test is delivered. In fact, administrators will be given carefully designed scripts that are implemented in the same manner on every testing occasion.
All of these efforts lead to a testing process and environment that is as optimal as possible.
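To make the refinement step more concrete, here is a minimal Python sketch of how low-quality items might be flagged after a trial administration. The article does not say which criteria test developers apply, so everything here is an assumption for illustration: the function names (`item_statistics`, `flag_items`), the two statistics used (proportion correct as a difficulty index, and the correlation between an item and the rest of the test as a discrimination index), and the threshold values.

```python
from statistics import mean, pstdev


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either variable has no variance."""
    mx, my = mean(xs), mean(ys)
    sx, sy = pstdev(xs), pstdev(ys)
    if sx == 0 or sy == 0:
        return 0.0
    cov = mean([(x - mx) * (y - my) for x, y in zip(xs, ys)])
    return cov / (sx * sy)


def item_statistics(responses):
    """responses: one list of 0/1 item scores per test-taker.

    Returns one (difficulty, discrimination) pair per item, where
    difficulty is the proportion of test-takers who answered correctly and
    discrimination is the correlation between the item score and the score
    on the remaining items.
    """
    totals = [sum(row) for row in responses]
    stats = []
    for i in range(len(responses[0])):
        item = [row[i] for row in responses]
        rest = [total - score for total, score in zip(totals, item)]
        stats.append((mean(item), pearson(item, rest)))
    return stats


def flag_items(responses, min_difficulty=0.2, max_difficulty=0.9,
               min_discrimination=0.2):
    """Return indexes of items that fall outside the (assumed) thresholds."""
    flagged = []
    for index, (p, r) in enumerate(item_statistics(responses)):
        if not (min_difficulty <= p <= max_difficulty) or r < min_discrimination:
            flagged.append(index)
    return flagged


# Hypothetical trial data: six test-takers, five items.
responses = [
    [1, 1, 1, 1, 0],
    [1, 1, 1, 1, 0],
    [1, 1, 0, 1, 0],
    [1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1],
    [0, 0, 0, 1, 1],
]

print(flag_items(responses))
```

In this made-up data set, the fourth item is answered correctly by everyone, so it tells us nothing about differences between test-takers, and the fifth item is answered correctly mainly by those who score low on the rest of the test. Both would be candidates for replacement in the next iteration.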
3. Ensures Qualified Professionals
High-stakes tests are utilized in numerous critical professions. For example, medical doctors and other healthcare professionals must obtain passing scores on exams which are stringent and challenging.
Aviators, who are responsible for the safety of hundreds of passengers a day, must also meet strict criteria assessed via a high-stakes exam. Not passing means not flying.
This ensures that those working in the profession have met critical criteria and have a certain level of proficiency in that occupation, thus minimizing risk to the public being served.
High-Stakes Testing Criticisms
1. Issues of Validity
The term validity refers to whether the test actually measures what it purports to measure. For example, achievement tests can cover a wide range of subjects, but does a student’s score on the test really reflect their level of knowledge on each of those subjects?
Concerns regarding validity also arise when one carefully examines some modern testing procedures. For example, the increasing use of computerized testing has many benefits, but it can present challenges to students from lower-income homes who have limited access to computers.
For these students, taking a test on an unfamiliar device can be anxiety-provoking, which interferes with accurate assessment. In this scenario, the student’s test score is not an accurate reflection of their knowledge, which means the test lacks validity for that student.
2. Teaching to the Test
Teachers can feel a great deal of pressure for their students to perform well on achievement tests. This means that some teachers will have a tendency to focus on class material that they know is covered on the test (Popham, 2001).
This creates numerous problems (Volante, 2004). For instance, students may be missing out on learning important subject material not covered on the test. Teachers may focus lessons on factual information and rote memory instead of teaching students more valuable higher-order thinking skills (Herman, 1992).
In addition, research indicates that although teaching to the test may increase students’ scores, this may not reflect actual growth in learning (Shepard, 2000; Smith & Fey, 2000).
3. Test Anxiety
Test anxiety refers to intense emotional, physiological, and behavioral responses experienced by individuals when placed in a testing situation that may involve negative consequences dependent on the test-taker’s performance (Zeidner, 1998).
Students who experience test anxiety often engage in off-task behaviors and have thoughts and feelings during testing that interfere with their performance (Cizek & Burg, 2006).
Those challenges are exacerbated when students are placed in high-stakes testing environments (Segool et al., 2013).
Von der Embse et al. (2013) point out that between 10% and 40% of students experience test anxiety (Gregor, 2005). Furthermore, students with disabilities, females, and minority students report higher rates of test anxiety (Putwain, 2007; Sena et al., 2007).
This means that students suffering from test anxiety may perform even worse when confronted with high-stakes testing, which may then result in unfair negative consequences.
Conclusion
High-stakes tests are tests that involve serious consequences for test-takers. Passing or failing a high-stakes test can determine whether someone is employed or accepted into a prestigious university.
Despite the ramifications of performance, high-stakes tests have been criticized on a number of grounds. For example, some may question whether a 4-hour achievement test accurately measures the amount of knowledge a student has acquired over a period of years.
In addition, some students suffer from test anxiety which significantly impairs their test-taking. For these individuals, their score may not reflect their true abilities.
At the same time, high-stakes tests can be very informative for school boards and state governments. They allow stakeholders to see where the strengths and weaknesses in instruction are occurring and offer an opportunity for adjustments.
High-stakes tests can also help ensure the safety of the public. Because professionals must meet stringent criteria in order to be employed, the public has some assurance that they are capable of performing their duties to an acceptable standard.
References
Bowman, M. L. (1989). Testing individual differences in ancient China. American Psychologist, 44(3), 576–578. https://doi.org/10.1037/0003-066X.44.3.576.b
Eberhard, W. (1977). A history of China. University of California Press.
Gallagher, C. J. (2003). Reconciling a tradition of testing with a new learning paradigm. Educational Psychology Review, 15, 83-99.
Gregor, A. (2005). Examination anxiety: Live with it, control it or make it work for you? School Psychology International, 26, 617 – 635.
Hamilton, L. S., Stecher, B. M., & Yuan, K. (2012). Standards-based accountability in the United States: Lessons learned and future directions. Education Inquiry, 3(2), 149-170.
Herman, J. L. (1992). What research tells us about good assessment. Educational Leadership, 49(8), 74-78.
Hucker, C. O. (1978). China to 1850: A short history. Stanford University Press.
Phelps, R. P. (2019). Test frequency, stakes, and feedback in student achievement: A meta-analysis. Evaluation Review, 43(3-4), 111-151.
Phelps, R. P. (2012). The effect of testing on student achievement, 1910–2010. International Journal of Testing, 12(1), 21-43.
Popham, W. J. (2001). Teaching to the Test? Educational Leadership, 58(6), 16-21.
Putwain, D. (2008). Deconstructing test anxiety. Emotional & Behavioural Difficulties, 13, 141 – 155.
Segool, N. K., Carlson, J. S., Goforth, A. N., Von Der Embse, N., & Barterian, J. A. (2013). Heightened test anxiety among young children: Elementary school students’ anxious responses to high‐stakes testing. Psychology in the Schools, 50(5), 489-499.
Sena, J. D. W., Lowe, P. A., & Lee, S. W. (2007). Significant predictors of test anxiety among students with and without learning disabilities. Journal of Learning Disabilities, 40, 360 – 376.
Shepard, L. (2000). The role of assessment in a learning culture. Educational Researcher, 29(7), 4-14.
Smith, W. C. (2014). The global transformation toward testing for accountability. In S. Dorn & C. Ydesen (Eds.), Special issue: The comparative and international history of school accountability and testing. Education Policy Analysis Archives, 22(116), 1-34.
Smith, M. L., & Fey, P. (2000). Validity and accountability of high-stakes testing. Journal of Teacher Education, 51(5),
Togut, T. D. (2004). High-stakes testing: Educational barometer for success, or false prognosticator for failure. Hartfield, VA: Harbor House Law Center, Inc. Retrieved July, 5, 2023. https://www.harborhouselaw.com/articles/highstakes.togut.pdf
Volante, L. (2004). Teaching to the Test: What Every Educator and Policy-Maker Should Know. Canadian Journal of Educational Administration and Policy.
Von der Embse, N., Barterian, J., & Segool, N. (2013). Test anxiety interventions for children and adolescents: A systematic review of treatment studies from 2000–2010. Psychology in the Schools, 50(1), 57-71.
Zeidner, M. (1990). Does test anxiety bias scholastic aptitude test performance by gender and sociocultural group? Journal of Personality Assessment, 55, 145.
Zeidner, M. (1998). Test anxiety: The state of the art. New York: Plenum Press.
Zuriff, G. E. (1997). Accommodations for test anxiety under ADA? Journal of the American Academy of Psychiatry and the Law Online, 25(2), 197-206.
Dr. Chris Drew is the founder of the Helpful Professor. He holds a PhD in education and has published over 20 articles in scholarly journals. He is the former editor of the Journal of Learning Development in Higher Education.