The New York Times’ first big story on the Teacher Data Reports released last week contained what sounded like great news: After years of studies suggesting that the strongest teachers were clustered at the most affluent schools, top-rated teachers now seemed as likely to work on the Upper East Side as in the South Bronx.
Teachers with high scores on the city’s rating system could be found “in the poorest corners of the Bronx, like Tremont and Soundview, and in middle-class neighborhoods,” “in wealthy swaths of Manhattan, but also in immigrant enclaves,” and “in similar proportions in successful and struggling schools,” the Times reported.
Education analyst Michael Petrilli called the findings “jaw-dropping news” that “upends everything we thought we knew about teacher quality.”
Except it’s not really news at all. Value-added measurements like the ones used to generate the city’s Teacher Data Reports are designed precisely to control for differences in neighborhood, student makeup, and students’ past performance.
The adjustments mean that teachers are effectively ranked relative to other teachers of similar students. Teachers who teach similar students, then, are guaranteed to have a full range of scores, from high to low. And, unsurprisingly, teachers in the same school or neighborhood often teach similar students.
“I chuckled when I saw the first [Times story], since the headline pretty much has to be true: Effective and ineffective teachers will be found in all types of schools, given the way these measures are constructed,” said Sean Corcoran, a New York University economist who has studied the city’s Teacher Data Reports.
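A toy illustration of Corcoran’s point, written as a minimal Python sketch. It is not the city’s actual method: the teacher names, the two peer groups, and the raw scores are all hypothetical, and the percentile formula is a simplifying assumption. It shows only that when teachers are ranked against others who teach similar students, every group ends up with both a top-ranked and a bottom-ranked teacher, whatever the raw scores.

```python
from collections import defaultdict

def percentile_within_group(teachers):
    """Rank each teacher only against teachers in the same peer group.

    teachers: list of (name, peer_group, raw_score) tuples (all hypothetical).
    Returns {name: percentile}, where 0 is lowest and 100 is highest
    within that teacher's own group.
    """
    by_group = defaultdict(list)
    for name, group, score in teachers:
        by_group[group].append((score, name))

    percentiles = {}
    for members in by_group.values():
        members.sort()  # lowest raw score first
        n = len(members)
        for rank, (_, name) in enumerate(members):
            # A lone teacher has no peers, so place them in the middle.
            percentiles[name] = 100.0 * rank / (n - 1) if n > 1 else 50.0
    return percentiles

# Hypothetical data: raw scores differ sharply between the two groups.
teachers = [
    ("A", "high-poverty", 2.0),
    ("B", "high-poverty", 5.0),
    ("C", "high-poverty", 9.0),
    ("D", "affluent", 20.0),
    ("E", "affluent", 25.0),
    ("F", "affluent", 31.0),
]

ranks = percentile_within_group(teachers)
for name, pct in sorted(ranks.items()):
    print(name, pct)
```

In this sketch teachers C and F both land at the top of their groups, even though F’s raw score is more than three times C’s. The spread within each group comes from the ranking method, not from the schools themselves.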
The design stems from evaluators’ attempts to improve on the way teachers are judged. In the past, assessments of teacher quality tended to look only at students’ test scores: A teacher whose students scored higher was deemed stronger. But that design stacked the deck against teachers whose students started the school year with greater needs and lower scores.
The idea behind value-added measurements is that they look instead at how much growth students make in a year. Teachers are rewarded not when their students score highest, but when the students’ performance gains exceed the average gains made by similar students.
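The arithmetic behind that idea can be sketched in a few lines of Python. This is a deliberately simplified illustration, not the model used for the Teacher Data Reports: real value-added models rely on statistical regression with many more controls, and the student records, group labels, and field names below are hypothetical. Here each student’s gain is compared with the average gain of students in the same group, and a teacher’s score is the average of those differences.

```python
from collections import defaultdict
from statistics import mean

def value_added(students):
    """Average amount by which a teacher's students beat the typical gain
    of similar students.

    students: list of dicts with hypothetical keys
      "teacher", "group" (stand-in for 'similar students'), "pre", "post".
    """
    # Step 1: the expected gain is the mean gain of all students in the group.
    gains_by_group = defaultdict(list)
    for s in students:
        gains_by_group[s["group"]].append(s["post"] - s["pre"])
    expected = {g: mean(gains) for g, gains in gains_by_group.items()}

    # Step 2: credit each teacher with the gap between actual and expected gain.
    gaps_by_teacher = defaultdict(list)
    for s in students:
        gain = s["post"] - s["pre"]
        gaps_by_teacher[s["teacher"]].append(gain - expected[s["group"]])
    return {t: mean(gaps) for t, gaps in gaps_by_teacher.items()}

# Hypothetical records: two teachers per group, two students per teacher.
students = [
    {"teacher": "X", "group": "low-start", "pre": 40, "post": 50},
    {"teacher": "X", "group": "low-start", "pre": 42, "post": 54},
    {"teacher": "Y", "group": "low-start", "pre": 41, "post": 45},
    {"teacher": "Y", "group": "low-start", "pre": 39, "post": 45},
    {"teacher": "Z", "group": "high-start", "pre": 80, "post": 83},
    {"teacher": "Z", "group": "high-start", "pre": 82, "post": 87},
    {"teacher": "W", "group": "high-start", "pre": 81, "post": 82},
    {"teacher": "W", "group": "high-start", "pre": 79, "post": 82},
]

va = value_added(students)
for teacher, score in sorted(va.items()):
    print(teacher, score)
```

In this made-up data, teacher X’s students finish the year with far lower scores than teacher W’s, yet X comes out ahead of the group average and W behind it, because each is measured against the typical gain of similar students. Scores in one group are computed against a different baseline than scores in the other.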
But ratings explicitly designed to compare teachers who work with similar students cannot compare teachers who don’t. “This is just a difficult question that we still don’t know how to answer, this question of how to compare teachers who are in very different kinds of schools,” said Douglas Staiger, a Dartmouth College economist.
He added, “There are a lot of issues [on which] I disagree with critics of value-added. But this is a real issue that it’s not clear how best to handle.”
Conclusions that the best teachers are clustered at the most affluent schools usually stem from other data points, such as teachers’ years of experience, their SAT scores, and the competitiveness of the colleges they attended. A recent study of New York City teachers found that gaps in teacher qualifications that had favored affluent schools narrowed between 2000 and 2005, especially at elementary schools.
Value-added ratings, meanwhile, are most often used by researchers to examine differences within schools. The fact that many New York City schools show wide differences in teacher ratings is not unusual, said Robert Meyer, who directs the Value-Added Research Center at the University of Wisconsin, which produced the algorithm the city used to calculate the teacher ratings.
“In any given school, there are teachers who are higher-performing and teachers who are lower-performing,” Meyer said. “That would be totally in keeping with 30 years of work with these issues.”