Defined by a Test Score: America’s Troubling Education System

By Christopher Chen

The dreadful, horrifying 8 a.m.: the hour when millions of high school juniors and seniors file into schools around America to take the SAT. Tired, caffeine-addicted teens sit for the test in hopes that the score will earn them acceptance to their dream college. The students start their test and begin bubbling in their Scantron sheets. One question at a time, students frantically answer as many questions as they can. Several students stop and ponder the seemingly countless questions. Others don’t think; they guess. “Five minutes left.” The sound of students frantically bubbling in and double-checking their answers, the sound of kids trying to survive the cut to get into the best college, fills the room; it is this sound that makes millions of young Americans anxious about whether they will get into a good college, land a good-paying job or even be competent enough. The sound is you, your score; it is your score that determines your future and fate.

The SAT is one of the most important tests that most Americans take in their lifetime. It is an entrance exam used by most colleges and universities to make admissions decisions, to give selected individuals their coveted acceptance letter in the mail. It shapes aspects of our everyday lives: the ranking of high schools and colleges, the housing market and the perception of a “good school district,” the judgment of whether a student will be competent in his or her first year of college or even in his or her future. The standardized test has played a very important role in our lives, especially in our education system. The test has changed a lot over the past few years. Before, the test was out of 2400, a lauded score that few students could ever achieve in their high school careers. In March 2016, the SAT was changed again to reflect “the knowledge, skills and understandings that research has identified as most important for college and career readiness and success,” whereas in the past it emphasized only reasoning and analytical thinking (Compare SAT). The test procedures changed significantly: wrong answers are no longer penalized, and the maximum score is 1600, as it was on the version of the SAT used up to 2005. The test takes three hours, plus 50 minutes for the essay, an optional section that many universities and colleges nonetheless require of their applicants. The SAT has become one of the most important standardized tests in America; however, it gives disadvantaged students an unequal chance of getting into college and earning a quality education. Many critics fault the SAT for bias and racism in its test content. Studies suggest that the wording of the problems, and the different cultural interpretations of the words in them, can affect students’ scores.
Thus, it is necessary to address the issues that arise from standardized testing, such as built-in bias, which results in several groups scoring lower than others, and the limited opportunities for higher education that result from testing. This Literature Review discusses several important topics: the bad and the good of the SAT, a solution that could eliminate the disparity among groups, and the implications for the future.

There are many problems with the SAT, such as cultural and income bias, that affect students of different backgrounds. These problems are critical to a student’s future in that they can affect college admissions, scholarships and jobs. The article The SAT and Admission: Racial Bias and Economic Inequality by Ethan Biamonte gives an overview of how colleges use the SAT in their application process and argues that it is “unethical for the SAT to be used in college admission because it has cultural and economic biases that oppress low-income groups, racial minorities and females” (Biamonte 1). Biamonte believes that the SAT has different kinds of biases that affect many groups of people. For example, he presents evidence of economic bias in the SAT: as family income increases, scores on the three SAT sections go up as well, suggesting a correlation between family wealth and educational success (Biamonte 2). This doesn’t mean that being rich automatically gives you a higher score, but that money can buy coaching programs, which help a child earn a good SAT score. Income bias thus reflects the ability to hire a tutor and to buy the many test-preparation books and online services offering SAT guides and lessons, which are too costly for some families.

Income bias is not the only force widening the score gap for the millions of low-income, underprivileged students who are trying to change their future for the better; gender bias widens it as well. Biamonte suggests that there is gender bias because “females with similar ability levels to males tend to perform worse than males on the math section of the test” (Biamonte 8). He concludes that “females’ abilities are underestimated” (Biamonte 8). Why, then, do females tend to score lower than males? One possible explanation is that “females are less confident in their answers” (Biamonte 9). In a study, Ellen Lenney claims that women “display lower self-confidence than men across almost all achievement situations,” and on a timed test like the SAT this is a significant disadvantage, as women are more likely to doubt and double-check their answers while the clock puts pressure on them (Lenney 1). To add to Lenney’s claim, an article by Anemona Hartocollis of The New York Times reported that at the May 2016 SAT sitting, several people, more specifically tutors, found two items disturbing and discriminatory, one in the verbal portion and the other in the math portion. The items “posed what some test-prep experts considered a textbook example of a ‘stereotype threat’” (Hartocollis 1). The questions were telling: one math question involved “showing more boys than girls in math classes overall” and the other was a verbal passage in which students had to read and analyze “a 19th-century polemic arguing that women’s place was in the home” (Hartocollis 4). When people are reminded during a test of a negative stereotype about a group they identify with, such as a race, ethnicity or gender, psychologists say, “It creates a kind of test anxiety that leads them to underperform” (Hartocollis 3).
Interestingly, the original essay contained Christian references, which were edited out of the test passage for ‘length and focus’, although keeping those references would more likely have offended or disturbed certain audiences than changed the complexity and difficulty of the passage (Hartocollis 4). What stands out is that these passages “argued that women have a lower status than men and wield their influences through the domestic sphere” (Hartocollis 4). It is a classic example of bias, more specifically gender bias, in the SAT. Biamonte’s and Hartocollis’s articles give us different points of view. They critique the SAT and offer a perspective on the problems of America’s education system. Biamonte’s article gives us an in-depth look at what kinds of biases exist in the SAT and explains why they occur. Hartocollis’s article supports Biamonte’s claim with concrete examples of discrimination, showing that inequalities exist in our society. These two articles are crucial in that they show us the disparities in the SAT, colleges’ reliance on admission tests and the effects on the disadvantaged. College has become the path to success, and with the admission test standing as a barrier to the disadvantaged, it tells us something about life and America: not everything is fair or equal.

However, there seems to be a strong necessity for testing in the world, and we cannot do without it. Testing focuses students on essential content and skills that are useful for the future. It motivates students to excel and to improve. In 1967, Harvey S. Leviton published an article titled A Critical Analysis of Standardized Testing. Leviton addresses the criticism that standardized testing has pushed schools to follow a set curriculum and guidelines rather than to innovate or improve the system. As Leviton says, “Test producers are more followers than leaders of the curriculum,” and what people fail to realize is that standardized testing “is a natural outgrowth from the teacher’s natural evaluatory procedures” (Leviton 394). Testing is never perfect; most tests are imperfect, just as we cannot expect “perfection in any other product” (Leviton 394). Without tests in our school system, we would have to rely more on “less adequate facilities and faculties” (Leviton 394); and if we didn’t have standardized testing, is there anything better? Thus, we can apply Leviton’s idea to the SAT. The SAT is essential. As Leviton said, “There is really no alternative that can achieve what the SAT can do.” Additionally, the College Board says it has worked to make the new version “profoundly transparent” (Rosner 1). There is really no other “instrument” that can put the varied curricula of public schools on a common scale. No perfect test free of gender bias, ethnic bias, income bias and other types of bias exists, so no option is left but the SAT. The SAT is the only way to keep standardized testing and America’s educational system working as one. As David Z. Hambrick, an associate professor of psychology at Michigan State University, said, “the SAT works well” and “works for its intended purpose,” namely “predicting success in college” (Hambrick).
If the concern about this “intelligence test” is “the question of whether it is fair to use people’s scores to make decisions that profoundly affect their lives,” he said, “that’s just too bad” (Hambrick). When so many applicants from across America apply to College X, admissions officers are bound to run into grade-point averages and curricula that vary widely from school to school, and a common test gives them one shared measure.

Nonetheless, if many critics fault the SAT for its bias, are there any solutions to this problem, which affects whether millions of Americans attend college or have a prosperous future? Roy O. Freedle has proposed one. Freedle, a research psychologist for the Educational Testing Service (ETS), the world’s largest private nonprofit educational testing and assessment organization, which administers the SAT and other standardized tests such as the GRE, wrote one of the most important academic articles to consider the disparity in the SAT and offer a solution. He wrote Correcting the SAT’s Ethnic and Social-Class Bias: A Method for Reestimating SAT Scores, which describes a problem: the SAT is both “culturally and statistically biased,” an idea he adapted from Stephen Jay Gould, who observed that “a test can be biased in at least two ways, culturally and statistically” (Freedle 1). Freedle expands on Gould’s idea that a test can be culturally biased when “one group performs consistently lower than some reference population” (Freedle 2). A test can be “culturally biased if individuals from different ethnic groups interpret critical terms in many of the test items differently” (Freedle 2). Freedle used differential item functioning (DIF) to compare minority and White responses to each test item. With this method, Freedle found that “Whites tend to score better on easy items and African Americans on hard items” (Freedle 3). The most likely reason for this phenomenon is that “easy analogy items tend to contain high-frequency vocabulary words while hard analogy items tend to contain low-frequency vocabulary words” (Freedle 6). Thus, individuals of different cultural backgrounds “may well differ in their definitions of common [easy analogy] words” (Freedle 6).
He applied the DIF analysis to other tests, including several Advanced Placement tests, and found similar results: “Hispanics, Asian Americans and disadvantaged Whites perform differentially better on hard verbal and quantitative items” (Freedle 28). He claims that “cultural familiarity and semantic ambiguity play an important role in determining the relatively poor performance of minority groups on essentially the easiest test items” (Freedle 29). With these findings, Freedle proposed the R-SAT, or the Revised SAT. The revised version focuses “on hard-item performance,” which can “remove a large part of [the SAT’s] cultural and statistical bias” (Freedle 7). He claims that his solution reduces “ethnic bias and therefore has the potential to increase dramatically the number of minority individuals who might qualify for admission into our nation’s select colleges and universities” (Freedle 28). Indeed, his results show that with his model, minority groups like African Americans score better on the R-SAT than on the SAT, and the difference in performance between White and African American examinees “is shown to be substantially reduced” (Freedle 23).
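
The item-level comparison behind DIF can be made concrete with a short sketch. The Python below is a minimal illustration, not Freedle’s own procedure: it assumes one common way of computing DIF, the standardization index, which takes the difference in proportion correct between a focal group and a reference group among test takers matched on total score and weights it by the size of the focal group. The `Stratum` class, the `std_p_dif` function and every count in the example are hypothetical, chosen only to mirror the pattern described above; they are not data from his study.

```python
from dataclasses import dataclass


@dataclass
class Stratum:
    """Counts for one band of test takers matched on total score (hypothetical)."""
    focal_n: int        # focal-group test takers in this band
    focal_correct: int  # how many of them answered the item correctly
    ref_n: int          # reference-group test takers in this band
    ref_correct: int    # how many of them answered the item correctly


def std_p_dif(strata):
    """Focal-weighted mean gap in proportion correct (focal minus reference).

    Negative values mean the focal group does worse on the item than
    reference-group test takers with the same total score.
    """
    # Only bands that contain both groups allow a matched comparison.
    usable = [s for s in strata if s.focal_n > 0 and s.ref_n > 0]
    total_weight = sum(s.focal_n for s in usable)
    if total_weight == 0:
        raise ValueError("no score band contains both groups")
    weighted_gap = sum(
        s.focal_n * (s.focal_correct / s.focal_n - s.ref_correct / s.ref_n)
        for s in usable
    )
    return weighted_gap / total_weight


# Invented counts for two items, each split into a lower and a higher score band.
easy_item = [Stratum(100, 60, 200, 140), Stratum(100, 80, 200, 180)]
hard_item = [Stratum(100, 30, 200, 50), Stratum(100, 50, 200, 90)]

easy_dif = std_p_dif(easy_item)
hard_dif = std_p_dif(hard_item)
print(f"easy item DIF: {easy_dif:+.2f}")
print(f"hard item DIF: {hard_dif:+.2f}")
```

With these invented counts the easy item comes out negative (the focal group does worse than matched reference test takers) and the hard item comes out positive, which is the direction of the pattern Freedle reports. Real item-response data would be needed to say anything about an actual test.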

Implementing the R-SAT in the American education system is a sound idea, as it could address many of the societal problems that exist today. But how did Freedle arrive at a proposal that many scholars today consider a model for reducing the disparities in the SAT? Jay Mathews covered Freedle’s solution to ending the bias in the SAT in The Bias Question, an article for The Atlantic. In his article, Mathews examines Freedle’s contribution in exposing the bias of the SAT. Mathews claims that “if minorities are at a disadvantage in taking the SAT, their choice of colleges will be significantly limited, with the important implications for their financial, professional and social future” (Mathews 2). Freedle noticed this trend when he worked for ETS. He found that by “analyzing various linguistic aspects of the questions, he could predict the ones test takers in Seoul or Shanghai or Sarajevo would find easy and which would make them chew their pencils and look at the clock” (Mathews 7). He found that “simple word repetition” could lead test takers to choose an answer when its wording echoes that of the question (Mathews 8). He eventually compiled his results into a report and submitted it to his supervisors, but they kept rejecting it. After the 11th revision, the report was accepted, but little was done to apply his analysis and findings to the SAT. Freedle “wrote several reports on the subjects” but found all his research proposals “being turned down” (Mathews 9). After retiring, Freedle sent his work to the Harvard Educational Review. There, his report, Correcting the SAT’s Ethnic and Social-Class Bias: A Method for Reestimating SAT Scores, was published.
Regardless, Freedle’s study received attention, in that “this study further evoked distrust in the test and warned the colleges and universities who continued using SAT.” His study remains one of the most controversial but most valuable studies addressing concerns about the SAT, and it provides a solution that many scholars analyze and research today.

Freedle’s study emphasized an important idea: the SAT is not a reliable admission test, as it contains biases that can raise certain applicants’ scores, and it is unfair to use the “biased” score to evaluate students for college admissions, scholarships and the many other uses of the SAT. Freedle proposed the R-SAT to eliminate biases and to be a more reliable test. However, his solution raises new questions: is the R-SAT valid and reliable? Can it solve a massive problem that has existed for years? Is the R-SAT trustworthy and dependable enough to be implemented in America’s education system? A follow-up on Freedle’s research was done by Maria Veronica Santelices, an assistant professor at the Department of Education at the Catholic University of Chile, and Mark Wilson, a professor in the Graduate School of Education at the University of California, Berkeley. Their academic article, Unfair Treatment? The Case of Freedle, the SAT, and the Standardization Approach to Differential Item Functioning, presents the results of a study in which Santelices and Wilson replicated Freedle’s analysis and compared their results with his. The article aims to see if “Freedle’s phenomenon and results hold across different ethnic groups” and to verify Freedle’s results (Santelices and Wilson 112). Like Freedle, Santelices and Wilson used DIF, here to see whether Freedle’s claim still holds. The article concludes that their findings “confirm the relationship between item difficulty and DIF estimates reported by Freedle for the African American/White comparison of verbal items” (Santelices and Wilson 127).
The research, however, “did not find evidence to suggest that this issue applies to Hispanic students, nor did it find evidence to suggest that the issue applies to questions other than verbal items” (Santelices and Wilson 126). In other words, Freedle’s findings do not necessarily extend to other ethnic minorities such as “Hispanic students” (Santelices and Wilson 127). This means the data cannot simply be assumed to apply to other ethnic groups like Asian Americans, and further research is needed to determine whether Freedle’s phenomenon holds for these groups. Santelices and Wilson concluded that the “SAT continues to be one of the most influential tests in the United States” and “fairness of its results should be of utmost importance” in giving everyone a fair chance to get a desirable score without any bias in the test (Santelices and Wilson 128).

If universities dropped the requirement to include an SAT score in an application, what would happen? Rebecca Zwick’s book Fair Game?: The Use of Standardized Admissions Tests in Higher Education evaluates a study that found that “only 46 percent of four-year colleges considered test scores to be a ‘very important’ factor in admissions decisions” while “87 percent rated high school achievement as very important” (Zwick 35). So if we took the SAT score out of a regular application, would it do harm or bring benefits? In his book Uneducated Guesses: Using Evidence to Uncover Misguided Education Policies, Howard Wainer analyzes one college’s choice to eliminate the SAT requirement from its application and the results of this drastic measure. Wainer analyzes Bowdoin College, a “small, selective, liberal arts college located in Brunswick, Maine” (Wainer 9). The college eliminated the SAT requirement for its applicants in 1969, but students may still submit a score if they wish. A table shows that for the Class of 1999, 106 out of the 379 students did not submit their scores (Wainer 10). One could assume that these 106 students did not submit scores because they were too lazy to take the SAT, but intriguingly, these 106 students “did, in fact, take the SAT” (Wainer 10). In a special data-gathering investigation, their scores were retrieved: the mean score of the 106 students who didn’t submit was 1201, versus a mean of 1323 for those who did. So a large share of the class, more than a quarter, withheld their scores, and the next question is how well these students performed in their studies at this small, liberal-arts college. Results show that the non-submitters “not only did 120 points worse on the SAT, but also received [a GPA] 0.2 points lower than those who did submit their score in [their college application]” (Wainer 11).
Wainer provides a scatter plot showing students’ college performance by comparing their first-year GPA with their SAT score, that is, their combined score on the Verbal and Mathematics parts of the SAT (Wainer 19). Each dot on the graph represents a student, with its position on the x-axis representing the student’s SAT score and its position on the y-axis representing the student’s GPA. The graph includes dots for students who did and did not submit their SAT scores, and it shows no clear trend proving that a student who did not submit an SAT score will get a lower GPA than a student who did. The results are simply scattered; there is no trend to prove that a student who submitted an SAT score is generally better off in college or will earn a better first-year GPA. From this, we cannot assume that students who didn’t submit an SAT score are not smart or academically driven. We cannot assume that they are not good enough for college. What this graph emphasizes is the unpredictability of students. Though students who didn’t submit an SAT score might have a low SAT score, such a student can have a higher first-year GPA than a student who did submit a score. It is absurd to think that a student’s grades simply track his or her SAT score. Every student is different, and we cannot place heavy emphasis on these scores. We need to place less emphasis on these scores in order to allow people of different backgrounds to have access to higher education.
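
Whether a scatter plot like this shows a trend can be checked numerically rather than by eye. The sketch below first works out the score gap from the two means reported above, then computes a Pearson correlation coefficient for a handful of (SAT, GPA) pairs. The pairs are invented for illustration and are not Wainer’s data; `pearson_r` is a hypothetical helper written out by hand, and using Pearson’s r as the measure of trend is an assumption of this sketch, not something taken from the book.

```python
from math import sqrt

# Means reported in the text above.
SUBMITTED_MEAN = 1323
WITHHELD_MEAN = 1201
# Presumably the quoted "120 points" is this gap, rounded.
score_gap = SUBMITTED_MEAN - WITHHELD_MEAN


def pearson_r(xs, ys):
    """Pearson correlation: +1 or -1 for a perfect line, near 0 for no linear trend."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with at least two points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one sample is constant")
    return cov / sqrt(var_x * var_y)


# Hypothetical students: combined SAT score and first-year GPA (NOT Wainer's data).
sat_scores = [1050, 1120, 1200, 1260, 1330, 1400]
first_year_gpa = [3.1, 2.6, 3.4, 2.9, 3.0, 3.3]

r = pearson_r(sat_scores, first_year_gpa)
print(f"score gap: {score_gap} points")
print(f"correlation between SAT and GPA in the invented sample: {r:.2f}")
```

A value of r near zero would match a scattered plot, while a value near +1 would mean GPA rises steadily with SAT score. Real admissions data would be needed to say which holds at any given college.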

In the future, we must find a way to solve this problem of disparity that is affecting America greatly. It has kept many students from getting into the college of their choice, from coming closer to fulfilling their dreams, from becoming successful one day and changing the world for the better. The test is a barrier for many low-income, minority students who cannot afford the $54 SAT fee and test-preparation materials. It is a restriction on students from different cultures who interpret words differently. It is a limitation on students of many diverse ethnicities and backgrounds who need a high score to go to college. Thus, America’s education system is in a critical position. Reform must take place to educate our society for the better, so that academia can flourish all around. College is the path towards success, and we must continue the generational trend of encouraging students to go to college and pursue their dreams, the American Dream. We cannot wait. We cannot ignore. We cannot disregard this issue, as this problem concerns a great majority of Americans. In order to make a better place and a better future, we must act now before it’s too late.


Acknowledgements: I would like to thank Heather Steffen for the insightful comments, critiques, and suggestions she provided for the completion of this comprehensive Literature Review that took time and dedication to finish. I would like to also thank Lyna Moreno and the additional peers who have also helped me in enhancing this Literature Review. I would also like to thank the University of California, Santa Barbara for providing the helpful resources to find academic articles and helpful sources.


Works Cited

Biamonte, Ethan. “The SAT and Admission: Racial Bias and Economic Inequality.” The People, Ideas, and Things Journal, 15 Nov. 2013, pitjournal.unc.edu/fall2013/sites/default/files/satinequality.pdf.

“Compare SAT Specifications.” SAT Suite of Assessments, The College Board, 11 Feb. 2016, collegereadiness.collegeboard.org/sat/inside-the-test/compare-old-new-specifications.

Freedle, Roy O. “Correcting the SAT’s Ethnic and Social-Class Bias: A Method for Reestimating SAT Scores.” Harvard Educational Review, vol. 73, no. 1, 2003, pp. 1–43. doi:10.17763/haer.73.1.8465k88616hn4757.

Hambrick, David Z. “The SAT Is a Good Intelligence Test.” The New York Times, 16 Dec. 2011, nytimes.com/roomfordebate/2011/12/04/why-should-sats-matter/the-sat-is-a-good-intelligence-test.

Hartocollis, Anemona. “Tutors See Stereotypes and Gender Bias in SAT. Testers See None of the Above.” The New York Times, 26 June 2016, nytimes.com/2016/06/27/us/tutors-see-stereotypes-and-gender-bias-in-sat-testers-see-none-of-the-above.html.

Lenney, Ellen. “Women’s Self-Confidence in Achievement Settings.” Psychological Bulletin, vol. 84, no. 1, Jan. 1977, pp. 1–13. doi:10.1037/0033-2909.84.1.1.

Leviton, Harvey S. “A Critical Analysis of Standardized Testing.” The Clearing House, vol. 41, no. 7, 1 Mar. 1967, pp. 391–395. JSTOR, jstor.org/stable/10.2307/30183090?ref=search-gateway:79433eb38e8bfe2053a3202d28f778e3.

Mathews, Jay. “The Bias Question.” The Atlantic, Nov. 2003, theatlantic.com/magazine/archive/2003/11/the-bias-question/302825/.

Rather, Dan. “Stress Test: Getting Into College.” YouTube, uploaded by Dan Rather Reports, 27 Nov. 2011, youtube.com/watch?v=Lzyj4HELxgY.

Rosner, Jay. “Why the New SAT Isn’t as Transparent as the College Board Wants You to Believe.” Los Angeles Times, 29 Apr. 2016, latimes.com/opinion/op-ed/la-oe-0501-rosner-sat-transparency-20160501-story.html.

Santelices, Maria Veronica, and Mark Wilson. “Unfair Treatment? The Case of Freedle, the SAT, and the Standardization Approach to Differential Item Functioning.” Harvard Educational Review, vol. 80, no. 1, 2010, pp. 106–134. doi:10.17763/haer.80.1.j94675w001329270.

“The Perfect Score: Cheating on the SAT.” YouTube, uploaded by CBS News, 1 Jan. 2012, youtube.com/watch?v=dfqFEiP10_E.

Wainer, Howard. Uneducated Guesses: Using Evidence to Uncover Misguided Education Policies. Princeton University Press, 2011.

Woollen, Susan. “Test Bias: The SAT in the College Admissions Process.” What Kids Can Do, 2008, whatkidscando.org/featurestories/2008/03_scoring_well/pdf/24-4-woollen.pdf.

Zwick, Rebecca. Fair Game?: The Use of Standardized Admissions Tests in Higher Education. RoutledgeFalmer, 2002.