testing takeaways

A decade of stagnation: Little progress on closely watched federal test, as big disparities persist

Scores on the exams known as the “nation’s report card” have barely budged over the last two years, new data show.

The minimal progress on the federal math and reading exams given to fourth and eighth graders will be a disappointment to officials who have hoped that their policies would boost students’ performance or help close yawning gaps between groups of students.

The 2017 results also mean that the U.S. has seen its test scores largely stagnate for a decade, after 10 years of substantial gains in math. The country’s “achievement gaps” between black and white students, and between low-income and affluent students, have also largely held steady over the last 10 years.

“I’m pleased that eighth-grade reading scores improved slightly but remain disappointed that only about one-third of America’s fourth- and eighth-grade students read at the NAEP Proficient level,” said former Michigan Governor John Engler, the chair of the National Assessment Governing Board, which oversees the tests. “We are seeing troubling gaps between the highest- and lowest-performing students. We must do better for all children.”

In an era when standardized testing is commonplace, the National Assessment of Educational Progress is the rare exam with low stakes for individual students and schools, but high stakes for politicians and policymakers. Some education leaders have staked their own reputations on NAEP results.

But score analyzers, beware: It’s difficult to draw conclusions about the benefits of specific policies based on the results. NCES, the federal agency that administers the tests, warns against it.

Some have also questioned what the transition to digital assessments means for the trends in individual state results, though NCES insists that extensive efforts have been made to account for this change.

Still, advocates on all sides will use the results to argue for their preferred changes to education policy. U.S. Secretary of Education Betsy DeVos has already praised the gains in one state, Florida, and highlighted the disappointing national results. “The report card is in, and the results are clear: We can and we must do better for America’s students,” she said.

What you should know about NAEP and this year’s scores

The National Assessment of Educational Progress is administered by the federal government to a sample of students across the country. The most closely watched tests are the fourth- and eighth-grade math and reading exams, since they show how scores are changing nationally, in individual states, and in a number of cities.

The 2017 results showed only tiny differences from 2015: a loss of 1 point in both subjects in fourth grade, and a gain of 1 point in both subjects in eighth grade. Only the grade eight reading improvement was statistically significant compared to the last test.

(One way to think about how big that is: the difference between the “basic” and the “proficient” benchmarks is about 35 points, depending on the test.)

Most students did not reach the test’s “proficient” benchmark, which is considered a high bar to clear. But some groups of students remain further behind.

Exam | Share of students scoring ‘proficient’ and above | Share scoring ‘basic’ and above
Fourth-grade math | 40% | 80%
Eighth-grade math | 34% | 70%
Fourth-grade reading | 37% | 68%
Eighth-grade reading | 36% | 76%

In eighth-grade math, the average black student scored just below the “basic” benchmark, while the average white student came several points shy of the higher “proficient” benchmark. Forty-four percent of white students were proficient, compared to 20 percent of Hispanic students and 13 percent of black students.

Gaps were similarly large between students who did and did not qualify for free or reduced-price lunch, a common proxy for poverty.

While test score gaps by race and poverty remained static, there was a notable increase in the difference in performance between the highest-achieving students and the lowest-achieving ones.

As expected, some states and cities saw their scores rise and fall modestly, though the vast majority held steady. Alaska, Louisiana, New Hampshire, South Carolina, and Vermont saw statistically significant declines on two or more tests, while Florida was the only state that made significant gains on multiple tests. None of the state improvements or drops were more than 6 points.

[Charts: Eighth-grade reading scores over time; fourth-grade math scores over time. Source: National Center for Education Statistics. Graphics by Sam Park.]

Louisiana, New Mexico, and Mississippi students continued to rank at or near the bottom, while Massachusetts, New Jersey, and New Hampshire students performed consistently well.

(Since state demographics vary, and NAEP results are highly correlated with race and poverty, researchers have tried to account for that to better isolate the performance of schools, and the resulting rankings differ significantly.)

The longer-run trends in NAEP are more positive than the latest results. Nationally, scores have improved substantially in math and modestly in reading since the early 1990s.

What’s the deal with the flat scores?

It’s unclear why scores are flat. NCES, which administers the exam, says the scores could be influenced by specific policies, resources available to schools, and demographics.

That’s a frustrating limitation for policymakers who want clear solutions. It’s also unlikely to stop the finger-pointing and policy prescriptions.

Critics of the reform efforts that prevailed under the Obama administration — the expansion of charter schools, introduction of the Common Core learning standards, and the creation of new teacher evaluation systems — will likely see the results as vindication, even as supporters use the latest data to argue that public schools need substantial change.

A handful of careful statistical analyses have tried to gauge how certain policies have affected NAEP scores in the past. For instance, one recent study found that states that made greater cuts in school funding in response to the Great Recession saw worse NAEP scores as a result; an older study found that an infusion of school funding led to greater NAEP gains.

Other research has found that when states introduce stringent school accountability systems, they do better on the NAEP math tests.

Are Children Learning

Memphis schools in most need of growth see gains, but vast majority of students still not on grade level

PHOTO: Laura Faith Kebede
Principal Melody Smith discusses how students at A.B. Hill Elementary grew significantly in test scores.

Three years after one elementary school joined Shelby County Schools’ flagship school improvement program, Principal Melody Smith says growth is proof their efforts are working.

“We came together, we battled, we cried, we fought tooth and nail, but in the end we kept our students in the center,” Smith told teachers as they reviewed the results a week before school began.

PHOTO: Laura Faith Kebede
Teachers at A.B. Hill Elementary discuss what makes an ideal school.

A.B. Hill Elementary School, which is part of the Innovation Zone, went from less than 5 percent of students reading on grade level last year to 15 percent in state test scores released Thursday. That jump earned the South Memphis school the state’s highest ranking in growth, but the scores also mean about 85 percent of students still don’t meet state requirements.

The iZone’s two dozen schools have been heralded for how much students have grown since 2012, especially when compared to the state-run Achievement School District, which heavily relies on private charter organizations to boost test scores, and scored the lowest in student growth.

But the challenge is far from over, and school leaders are looking for ways to improve faster.

State leaders generally look at three years of data before determining if academic strategies are working. And in the past three years, the state’s switch to online testing has been tumultuous, which has caused some district leaders and state lawmakers to question the results. But on national tests, Tennessee was held up as a model for student growth compared to surrounding states in a recent Stanford University study — even while the state is still in the bottom half of test scores nationwide.

PHOTO: Caroline Bauman
Antonio Burt became assistant superintendent in July over the Innovation Zone and other struggling schools within Shelby County Schools.

Only three schools in the iZone — Westhaven Elementary, Cherokee Elementary, and Ford Road Elementary — have more than 20 percent of students reading on grade level. By comparison, 16 schools surpassed that in science, five in math, and four in social studies.

“There was a lot of movement in our elementary schools,” said Antonio Burt, the district’s assistant superintendent for schools performing poorly on state tests. But “we’re going to need a laser light focus on our high schools and our middle schools.”

The district created the iZone to boost student achievement in schools performing the worst in the state, all of which are in impoverished neighborhoods. The state Legislature gave principals much more autonomy over which certified teachers they could hire, pumped about $600,000 per school into teacher pay incentives, and added more resources to combat the effects of poverty in the classroom, such as clothes and food closets.

Now, entering its seventh year, the iZone is still outshining the state-run district, and students are still showing more growth compared to their peers across the state who also performed poorly last year. Nine schools in the iZone got the state’s highest ranking for growth, compared to just five last year when the state switched to a new test. (Scroll to the bottom of this story to compare test scores and growth for iZone schools.)

Of the 23 schools in the iZone last year, seven were high schools. None of the high schools had more than a third of students on grade level or above in any subject. Four of them, Raleigh Egypt, Melrose, Mitchell, and Hamilton, saw significant growth in at least one subject. Last year was Raleigh Egypt’s first year in the iZone under Shari Meeks, who previously was principal at Oakhaven Middle School.

PHOTO: Laura Faith Kebede
Clothes closet at A.B. Hill Elementary School in Memphis.

Burt said “the first big thing” that will be done to combat low reading scores in middle and high schools will be to strengthen curriculum. Adding curriculum for younger students played a part in boosting test scores that contributed to growth, leaders said.

Also, new reading specialists will teach a separate class, on top of the normal English class, for students who are the furthest behind. Before, teachers were responsible for catching up those students, or specialists would take them out of class to work on reading skills.

At the district level, Burt said science, social studies, math, and English advisors will be working more directly with teachers. And principal coaches will have more say in how and where those advisors concentrate their efforts.

Inside the school, Smith, the principal at A.B. Hill Elementary, said having teachers practice more difficult lessons in front of each other helped spur more ideas on how to make the curriculum work for their students.

Teachers said collaboration with others was key to figuring out the best way to improve test scores there. It was common for teachers to invite each other to sit in on lessons and give feedback.

“We would debrief with each other all the time,” said Brenda Pollard, who taught fourth-grade English and social studies. Now she says the foundation has been laid for higher achievement.

“It can be done,” she said. “We’re living proof it can be done.”

Below is a table of how iZone schools fared on state tests. Fields labeled “4.9” were hidden in state data, but are likely below 5 percent.

tar heel trivia

New education research? A good chance it’s from North Carolina.

PHOTO: Creative Commons/Boston Public Library

Barbeque. Basketball rivalries. The Blue Ridge Mountains.

Education research?

It’s something else North Carolina is known for, at least among a subset of social scientists.

“North Carolina has really done something special,” says Amy Ellen Schwartz, a professor and the editor of Education Finance and Policy, an academic journal.

“If you look over the last 20 years and focus on the highest quality work, it’s disproportionately work that comes from North Carolina data,” says Dan Goldhaber, an education professor at the University of Washington at Bothell.

North Carolina students aren’t more interesting or easier to find. But a disproportionate share of education research — and therefore, a disproportionate amount of what we know about how certain policies work — comes out of the Tar Heel State.

That’s because North Carolina has kept track of things like student test scores, teacher demographics, and school accountability data since the ’90s, and has also made that information more accessible to researchers than anywhere else.

It works well for those looking for data. But it also underscores a troubling reality: We know much less about how policies play out in places where data is hard to access — and in some cases, may be kept under lock and key for political reasons. That leaves the public to take the best lessons it can from a state that’s home to just 3 percent of the country’s public school students.

“The problem is that what you really want to do is look at lots of places,” said Schwartz, a professor at the Maxwell School at Syracuse University. “You want to be able to leverage the natural experiments and understand the variation in a way that’s really hard to do in one place.”

Of course, researchers in many cases do work productively with local officials to obtain data. And although it appears that North Carolina is the most commonly studied state in education policy, it is by no means the subject of the majority of academic papers. For instance, seven studies published in Education Finance and Policy over the last two years focused on North Carolina, more than on any other state or district, though over 30 others focused on K-12 schooling in the U.S. used national data or data from elsewhere.

North Carolina’s popularity is tied to the fact that it is one of the few states where researchers can get student data (that has been anonymized) from a third party, in this case a research center established in 2000 that operates out of Duke University. In most states, the state education department or other state agency controls that information. Many states and districts lack the resources, streamlined systems, or staff capacity that North Carolina’s center has to meet researchers’ requests.

That center also separates policymakers and the keepers of the data — which may be crucial for ensuring information is made available.

“Not every place wants to open up their data and say, ‘Study what you want,’” said Schwartz. “The risk is that a researcher investigates something or casts it in a way that’s not positive for the school district.”

Goldhaber echoed this. “If you’re talking to somebody who’s involved with politics … they’re going to see everything through a political lens. And that when it comes to evaluating programs and policies, people often don’t see much upside,” he said.

In North Carolina, local researchers realized the importance of tracking students and schools over time, according to Duke’s Clara Muschkin, the faculty director of the data center.

When Goldhaber was studying schools there in the 1990s, he recalled, “There was a real belief that people ought to study these issues, and that was kind of pervasive under Gov. Jim Hunt.”

That extended to research that Hunt’s administration might not like. For instance, Goldhaber was interested in studying whether teachers who attained National Board certification were more effective in the classroom. Hunt was the founding board chair of the organization that awarded those certifications, and Goldhaber’s research had previously shown that certification types didn’t make much difference. But that didn’t stop the administration from providing that data to Goldhaber, who ultimately found North Carolina’s board certified teachers were particularly effective.

It’s impossible to say how often political concerns play a role in keeping data from researchers. When politics is involved, researchers themselves may not know, and if they do, they may not want to publicize it in hopes of eventually working out an agreement. (This reporter has heard frequent complaints about politics getting in the way of data access — but in most cases those are made off the record.)

A more subtle method of interference is when officials decide not to collect data in the first place that researchers might use to reach unflattering conclusions. California, Goldhaber said, is a particular culprit.

The largest state in the country has weakened, or declined to improve, its data systems since 2010, and the information that exists is not readily available to researchers. Governor Jerry Brown has argued that educational data is of little use to teachers and schools, and feeds into a test-focused mentality of schooling.

“You are not collecting data or devising standards for operating machines or establishing a credit score,” wrote Brown in a critique of the Obama administration’s Race to the Top program, which encouraged more data collection. “I sense a pervasive technocratic bias and an uncritical faith in the power of social science.”

Goldhaber has found it difficult to study the state’s education policies.

“There is just basic data that we could not get out of California,” he said, referring to a study he and colleagues are undertaking there.

Some places are becoming more cognizant of concerns about a lack of quality research about their schools. In Washington, D.C., the city council is considering funding an education research group and may make its data widely available to researchers. In California, some advocates and policymakers have pushed for improving its data systems, an idea the state’s likely next governor has backed.

In the meantime, those interested in key education questions, whether in California, D.C., or elsewhere, can always look to North Carolina for answers. That’s largely a good thing, says Goldhaber.

“The fact that we are learning things in North Carolina is tremendously useful for informing policy and practice in other states,” he said.