Testing. Testing. 1-2-3.

Posted January 22, 2018
By Lory Hough

You’ve been writing about this issue for decades, but you’ve been holding back. Why the change?

As an academic, I try to evaluate the evidence dispassionately, and for many years, I wrote measured descriptions of the accumulating evidence. I presented the first evidence of score inflation — increases in scores much larger than actual improvements in learning — more than 25 years ago, and I and others have presented additional studies of score inflation, bad test preparation, cheating, and other negative effects ever since.

But I finally lost patience. Dispassionate explanations turned out to be easy to ignore. Many of the people with control over education have simply disregarded the accumulating evidence or asserted that it couldn’t apply to their system. Score inflation helped them ignore it; as long as the press and public didn’t become skeptical, it looked like student performance was improving substantially, even when it wasn’t.

Unfortunately, all too many social scientists have also downplayed the negative evidence. And while they continued to ignore it, the misuse of tests became ever more extreme, in some cases reaching truly absurd levels — for example, “evaluating” teachers based on the scores obtained by teachers in other schools or teaching other subjects to different students. This does real harm to schooling, to educators, and ultimately to kids.

I finally decided that it was time to try to make it harder to ignore the evidence. In this book, as you noted, I’m blunter: I used “honest adjectives.” However, I did more than that. I pulled together evidence about both the positive and negative effects to show by how much the negative outweighs the positive. And I offered both principles and concrete suggestions for doing better.

This is not an anti-testing book. Testing done right can be tremendously useful. I argue that a replacement for the current, failed system should include sensible testing. The problem isn’t testing; it’s the misuse and sometimes abuse of testing.

You’ve said that, regardless of test scores, parents should be asking, “What do you want to see when you walk through the door of your school?”

This should be the starting point in designing a system to replace our current, failed system. To design a productive accountability system, we first must decide what we want to see improved. The logic of test-based accountability was that if we held people accountable for just a few of the things we value in education, primarily test scores in a few subjects, the other important things would get better, or at least not get worse. That is nonsense. We have decades of research showing that if you measure only a few of the outcomes that matter, most of the others will not get better, and some will get worse. Teachers have limited time and resources.

In Charade, I suggest that we start by monitoring what I call the Big Three: student achievement (and not just the portion we can measure well with standardized tests), quality of instruction, and school climate. For example, if you want to see students engaged, motivated, and curious — these were among my most important criteria when I evaluated classes for my own children — holding people accountable for test scores won’t get you this. You have to give teachers the support they need to teach that way. In Charade, I explain this by giving real examples of both excellent and awful teaching, neither of which would necessarily have been picked up by test scores.

Can you picture a day when your next book will be titled, The Testing Turnaround: How Testing Actually Did Help Make Schools Better?

Nothing would please me more, but I’m afraid that day is far off.

There are several ways that testing can make schools better. First, testing is an invaluable tool for monitoring overall performance, provided accountability hasn’t inflated scores. For example, how do we know that the performance gap between minority and white students has been slowly narrowing while that between poor and well-off students has been widening? Standardized tests. How do we know that the mathematics performance of American students is mediocre by international standards? Again, standardized tests. Standardized tests allow more trustworthy comparisons among schools than measures like grades, precisely because they are standardized. And well-designed tests, used sensibly, can help guide teachers’ efforts to improve their instruction.

To capture those benefits, however, we have to end the damaging policies in place now and clear away the damage already done. To take just one example, one of the most disturbing negative effects of test-based accountability is that many young teachers have been trained specifically to use bad test prep — test prep that generates bogus gains in scores rather than true improvements in learning. Some have been told explicitly that doing so is “good instruction,” and some districts and states have been purveyors of this bad test prep. Some teachers have never seen anything else. It won’t be enough to stop giving these teachers incentives to cut corners. We will also need to retrain many of them. Undoing the damage and building a better approach will take a great deal of work and time.


Ed. Magazine