When Proficient Isn't Good

The deceptive nature of proficiency as a measure of student progress — and how to fix the funhouse mirror

By Leah Shafer, on January 4, 2016 2:26 PM

Animated graphics by Iman Rastegari

Every teacher wants to see her students improve. But measuring that improvement may be more difficult than it seems. Over the past 15 years, No Child Left Behind and other federal policies have given special prominence to one primary measure for assessing student progress: the so-called percent proficient measure. With this metric, educators can track the percentage of students in a class, school, or district who are “proficient” (scoring at or above a certain designated baseline), with hopes that this percentage will eventually rise to 100.

But HGSE Professor Andrew Ho has a warning for the teachers, administrators, and policymakers who rely on this measure. Looking at these percentages is like “viewing progress through a funhouse mirror,” Ho cautions. “If educators and policymakers have questions about growth and equity, their answers will be at best distorted and at worst just wrong.”

Fixing the Funhouse Mirror

Ho’s research has found three main problems with assessing students by “percent proficient.”

Arbitrary markers

First, he says, “these initial proficiency markers are arbitrary, determined by an overwrought, judgmental, and ultimately political process.”

The animation above illustrates the point. If we imagine the usual “bell curve” of students, some states can set a high standard, resulting in relatively low percentages of proficient students, while other states can set a low standard, resulting in relatively high percentages of proficient students. The difference between these percentages is arbitrary, says Ho.
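To make the point concrete, here is a minimal sketch that applies two different cutoffs to the same bell curve. The score scale (mean 250, standard deviation 50) and both cutoffs are hypothetical values chosen for illustration, not figures from Ho's research, and real score distributions are only roughly normal.

```python
from statistics import NormalDist

def percent_proficient(mean, sd, cutoff):
    """Percent of a normal score distribution at or above the cutoff."""
    return 100 * (1 - NormalDist(mean, sd).cdf(cutoff))

# Hypothetical scale: the same students, judged against two different standards.
students = {"mean": 250, "sd": 50}
strict = percent_proficient(cutoff=300, **students)   # cutoff one SD above the mean
lenient = percent_proficient(cutoff=200, **students)  # cutoff one SD below the mean

print(f"High standard: {strict:.1f}% proficient")
print(f"Low standard:  {lenient:.1f}% proficient")
```

Under these assumptions the identical group of students looks about 16 percent proficient under the high standard and about 84 percent proficient under the low one.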

Distorted perceptions of growth


Aside from distorting comparisons between states, percent proficient can distort perceptions of growth within a state, or district, or classroom. States or schools with 50 percent proficient — the top of the bell curve — will have more students performing right around the proficiency marker. If those students make progress (or if they regress), more of them will cross the proficiency line than if the cutoff were more extreme. When you chart those student scores, the number of students reaching proficiency appears to accelerate or decelerate rapidly — as if the class has made significant gains or losses. But the appearance of that rapid or meaningful change is an illusion, Ho says.   
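The effect can be sketched numerically. The example below assumes normally distributed scores in standard-deviation units and a hypothetical, identical improvement of 0.1 standard deviations for every student; only the location of the cutoff differs between the two cases.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # scores expressed in standard-deviation units

def percent_proficient(mean, cutoff):
    """Percent at or above the cutoff for normal scores with SD 1."""
    return 100 * (1 - NormalDist(mean, 1.0).cdf(cutoff))

GAIN = 0.1  # hypothetical: every student improves by 0.1 SD

# Cutoff placed so that 50 percent start out proficient (at the peak of the curve).
cut_mid = STD_NORMAL.inv_cdf(0.50)
change_mid = percent_proficient(GAIN, cut_mid) - percent_proficient(0.0, cut_mid)

# Cutoff placed so that only 10 percent start out proficient (in the tail).
cut_tail = STD_NORMAL.inv_cdf(0.90)
change_tail = percent_proficient(GAIN, cut_tail) - percent_proficient(0.0, cut_tail)

print(f"Starting at 50% proficient: +{change_mid:.1f} points")
print(f"Starting at 10% proficient: +{change_tail:.1f} points")
```

With these assumed numbers, the same underlying improvement shows up as roughly a four-point jump when the cutoff sits at the middle of the distribution and roughly a two-point jump when it sits in the tail.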

Misreading achievement gaps


The third problem, Ho explains, raises concerns about achievement gaps — for example, average differences between test scores of white or higher-income students and minority or poor students. When comparing two groups of students, whichever group has percentages closer to 50 percent will appear to progress or regress faster, leading to incorrect assumptions about changes in the achievement gap: another illusion.
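A short sketch shows how the choice of cutoff alone can make a gap seem to widen or narrow. It assumes two hypothetical groups with normally distributed scores one standard deviation apart, and it assumes both groups improve by the same 0.2 standard deviations, so the true gap in average scores never changes.

```python
from statistics import NormalDist

def percent_proficient(mean, cutoff):
    """Percent at or above the cutoff for normal scores with SD 1."""
    return 100 * (1 - NormalDist(mean, 1.0).cdf(cutoff))

def proficiency_gap(mean_a, mean_b, cutoff):
    """Gap between two groups, measured in percent-proficient points."""
    return percent_proficient(mean_a, cutoff) - percent_proficient(mean_b, cutoff)

MEAN_A, MEAN_B = 0.0, -1.0  # hypothetical group averages, in SD units
GAIN = 0.2                  # hypothetical: both groups improve equally

gap_change = {}
for cutoff in (0.0, -1.0):  # a higher standard, then a lower one
    before = proficiency_gap(MEAN_A, MEAN_B, cutoff)
    after = proficiency_gap(MEAN_A + GAIN, MEAN_B + GAIN, cutoff)
    gap_change[cutoff] = after - before
    print(f"cutoff {cutoff:+.1f}: gap {before:.1f} -> {after:.1f} points")
```

Under these assumptions the gap appears to widen by about 2.6 points with the higher cutoff and to narrow by about 3.6 points with the lower one, even though the difference in average scores is the same throughout.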

For Policymakers: Implications of Misinterpreted Scores

Ho began this research back in 2008. Why is it still significant now?

With the push toward “college- and career-ready standards,” Ho explains, many states have raised standards, making “proficient” a more difficult level to achieve and causing percentages of proficient students to fall. Under this system, achievement gaps often seem to widen. 

And the reverse can happen, too. “It’s actually much easier to show progress from 50 percent than it is from 10 percent or 90 percent,” says Ho, “and this isn’t sensible or fair. We shouldn’t be able to make achievement gaps seem smaller just by picking a different standard.”

Meanwhile, as standardized testing has expanded, teachers, not just superintendents and policy workers, are analyzing students’ scores more closely, raising the risk of more widespread misinterpretation.

The recent passage of the Every Student Succeeds Act (ESSA) may help. While ESSA mandates proficiency levels, it leaves states the option of reporting scores by using averages or other metrics. Still, Ho fears that the dominance of percent proficient may linger. “ESSA does nothing to disrupt the momentum of proficiency. It will be up to states to encourage a focus on growth for all students, not just those who happen to be close to the proficiency score.”

For Teachers: What Scores Say About Your Classroom

For most classroom purposes, Ho suggests, tests should be able to answer two simple questions: “Should I be worried?” and “If so, what should I do about it?” Any time a teacher has a class whose percent proficient is near 50, she should expect relatively big swings in percent proficient after a test. All it may take is a little extra studying and concentration (or a little less) to make most of the class “proficient” (or “not proficient”). These swings are not necessarily anything to worry (or get excited) about; they likely represent only a tiny change in overall student scores.

On the other hand, a teacher with a class closer to 90 or 10 percent proficient — a very high-achieving class or a very low-achieving one — should pay close attention to small swings. These small swings are actually important: they can indicate that large numbers of students near the middle of the distribution, far from the cutoff, are moving.
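One way to see why the same swing means different things is to work backward from percent proficient to the shift in scores it implies. The sketch below assumes normally distributed scores with an unchanged spread, which is a simplification, and the two-point swings are hypothetical.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()

def implied_shift(pct_before, pct_after):
    """Shift in the class average, in SD units, implied by a change in
    percent proficient, assuming normal scores with a fixed spread."""
    return STD_NORMAL.inv_cdf(pct_after / 100) - STD_NORMAL.inv_cdf(pct_before / 100)

near_middle = implied_shift(50, 52)  # two-point swing starting at 50 percent
near_top = implied_shift(90, 92)     # the same swing starting at 90 percent

print(f"50% -> 52% implies a shift of {near_middle:.3f} SD")
print(f"90% -> 92% implies a shift of {near_top:.3f} SD")
```

Under these assumptions, the two-point swing at 90 percent corresponds to a shift in average scores roughly two and a half times as large as the same swing at 50 percent.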

That is why, for most teachers, a better method is simply to compare average scores from one test to the next, as opposed to analyzing percentages. This practice gives a more accurate picture of student achievement, encourages teachers to expand their focus beyond the students who happen to be close to the cutoff, and improves identification of students who are falling behind. 
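As a rough illustration of the comparison, the sketch below uses an invented class of ten students and a hypothetical cutoff of 70. Every student gains exactly two points between tests, a deliberately simple assumption.

```python
from statistics import mean

CUTOFF = 70  # hypothetical proficiency cutoff

def percent_proficient(scores, cutoff=CUTOFF):
    """Percent of students scoring at or above the cutoff."""
    return 100 * sum(score >= cutoff for score in scores) / len(scores)

# Invented scores for illustration: several students sit just below the cutoff.
fall = [55, 62, 68, 69, 69, 71, 72, 80, 88, 95]
spring = [score + 2 for score in fall]  # every student gains two points

print(f"Average score:      {mean(fall):.1f} -> {mean(spring):.1f}")
print(f"Percent proficient: {percent_proficient(fall):.0f}% -> {percent_proficient(spring):.0f}%")
```

In this made-up class the average rises by two points, from 72.9 to 74.9, while percent proficient leaps from 50 to 80. The average describes the modest gain every student made; the percentage mostly reflects how many students happened to start just below the line.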

The Testing We Want

Most importantly, tests should also be able to answer the “what should I do about it” question. Ho contends that if tests “don’t give anyone any insight into what can be done about them, then they’re not much of an achievement.” Tests, he argues, should be faster, both in the time it takes to complete them and in the time it takes to receive scores, as well as more curriculum-relevant and more teacher-, student-, and parent-friendly.

“Start with the questions teachers and parents and students actually have,” he explains, “and then design the test around answering those questions.”

***

Faculty in this article

Andrew Ho

Andrew Ho is a psychometrician who studies the properties and consequences of test-based "accountability metrics." He is particularly interested in popular targets of current educational policies: proficiency, growth, value added, trends, achievement gaps, course completion, and college readiness.