Evaluating Teachers: Value-Added Has No Value
The neoliberal politics and economics of the past thirty or so years have increasingly blamed government for society's ills. One corollary has been to promote narrow business models for evaluating government performance and to use those models to cut back government. Both tendencies have become more pervasive as the current economic crisis offers neoliberals an excuse to further blame, evaluate, and cut government activities.
Public school teachers have been a focus for applying this neoliberal ideology. Around the world, good teachers are being fired because of poorly conceived and designed evaluations that go by the name of merit pay for teachers, performance pay, or incentive pay. These approaches are too often based on a piece of statistical legerdemain that has been sweeping the globe called "value-added." In the U.S., the Obama administration's Race to the Top program mandates this approach. Value-added is a statistical set of procedures that purports to measure scientifically what has become the holy grail of education -- the impact of a teacher on student test scores.
Unfortunately, measuring value-added in practice is simply impossible, illogical, and unscientific, and you don't have to be a statistician to understand why. Value-added statistical models are supposed to separate the impact of one factor -- the teacher -- from the literally dozens of other factors that contribute to a student's performance on a test: access to a home computer, books and other resources in the home, technology access in the schools, effort at homework, parents' education, parents' support, the influence of previous teachers, peer effects, school climate, aspirations, access to health care, a better diet, a good night's sleep, and many, many others. Even if you had information on all these factors, believing some statistical model could sort out the relative influence of each is simply wishful thinking. Moreover, value-added models have data on only a very few factors -- in the U.S., data is usually available only on special education status, English proficiency, attendance, and eligibility for reduced-price lunch (an inadequate measure of poverty). "Controlling" for these factors and attributing the rest to the teacher makes no sense. The effect attributed to the teacher is always incorrect, since the many omitted factors, if included, would change the teacher impact measure -- and in either direction. Statisticians can, of course, control for the few factors they do have data on, and the analysis will always identify some teachers as meritorious. But the results are completely illegitimate: controlling for different factors leads to different teachers being selected as meritorious, and there is no basis for deciding which factors to control for. The upshot is that value-added models distribute awards to teachers randomly, not in accord with their performance.
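For readers who want to see the statistical point concretely, here is a minimal, hypothetical simulation (not from any real school data; all names, numbers, and effect sizes are invented for illustration). It generates test scores driven by a small true teacher effect plus a "home resources" factor that is correlated with teacher assignment, then fits two value-added-style regressions that differ only in which factors they control for. The rankings of "meritorious" teachers change with the choice of controls:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: 20 teachers, 30 students each.
n_teachers, per_class = 20, 30
n = n_teachers * per_class
teacher = np.repeat(np.arange(n_teachers), per_class)

# Assignment is not random: students of higher-numbered teachers tend to
# have more home resources (an often-unmeasured factor).
home_resources = rng.normal(teacher / n_teachers, 1.0)
prior_score = rng.normal(0.0, 1.0, n)

# Small true teacher effects, swamped by other influences.
true_teacher_effect = rng.normal(0.0, 0.2, n_teachers)
score = (true_teacher_effect[teacher]
         + 0.8 * home_resources
         + 0.5 * prior_score
         + rng.normal(0.0, 1.0, n))

def teacher_estimates(controls):
    """Least-squares fit of scores on teacher dummies plus chosen controls."""
    dummies = np.eye(n_teachers)[teacher]
    X = np.hstack([dummies] + [c.reshape(-1, 1) for c in controls])
    beta, *_ = np.linalg.lstsq(X, score, rcond=None)
    return beta[:n_teachers]  # the per-teacher "value-added" estimates

# Model A omits home resources (as real value-added models must,
# lacking the data); model B includes them.
est_a = teacher_estimates([prior_score])
est_b = teacher_estimates([prior_score, home_resources])

rank_a = np.argsort(est_a)[::-1]
rank_b = np.argsort(est_b)[::-1]
print("Top 5 'meritorious' teachers, model A:", rank_a[:5])
print("Top 5 'meritorious' teachers, model B:", rank_b[:5])
```

In this sketch, the omitted factor's influence is absorbed into the teacher estimates, so which teachers look "meritorious" depends on an arbitrary modeling choice -- exactly the instability described above.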
The State of Florida tried a value-added approach to merit pay for schools in the 1980s. It suffered the same problems as value-added merit-pay schemes for teachers. In Florida, school district statisticians found that their value-added models identified different schools as meritorious depending on which factors they controlled for, and they realized there was no right way to decide what to control for. These statisticians were embarrassed when it came time to award money to meritorious schools: there was no stable way to estimate which schools were meritorious, and no rational basis for explaining to schools why they had won or lost.
The spread of value-added schemes can be explained by their simplistic but attractive logic and by the fact that a set of technical experts and businesses lobby for these schemes as they become increasingly lucrative. I am not saying test scores are irrelevant to teacher assessment. While narrow approaches to achievement testing in some countries have gotten out of control, simple measures of a classroom's gain in test scores, taken as one piece of information among many about a teacher's performance, can be interpreted with knowledge of the local context as part of a professional peer evaluation system. But we can no more scientifically determine teachers' effects on test scores than we can legislators' impact on economic growth or poverty reduction. Sure, both have an impact, but the processes are too complicated for simplistic solutions. If value-added models are to be used, let us experiment with merit pay for legislators and others who advocate such models before we try to foist such schemes on teachers.
The opinions expressed in this blog are those of the author and do not necessarily reflect any official policies or positions of Education International.