How well does an elementary school in Maryland stack up to one in New Jersey? Do California’s eighth graders make faster academic gains than their peers in Connecticut?

In 2010, then-Secretary of Education Arne Duncan made the case for common state tests that would allow parents and educators to find out — and predicted that the comparisons would lead to dramatic policy changes.

“For the first time, it will be possible for parents and school leaders to assess and compare in detail how students in their state are doing compared to students in other states,” Duncan said. “That transparency, and the honest dialogue it will create, will drive school reform to a whole new level.” It was a heady moment: Most states had signed on to at least one of the two cross-state testing groups, PARCC and Smarter Balanced.

Though their numbers have since dwindled substantially, the two groups still count over 20 members between them. But seven years later, it remains difficult to make detailed comparisons across states, as a potent mix of technical challenges, privacy concerns, and political calculations have kept the data relatively siloed. And there’s little evidence that the common tests have pushed states to compare notes or change course.

“This is one unkept promise [of] the common assessments,” said Mike Petrilli, president of the Fordham Institute, a conservative think tank that has backed the Common Core standards.

“I’ve been surprised that there haven’t been more attempts to compare PARCC and Smarter Balanced states,” said Chad Aldeman of Bellwether Education Partners.

What comparisons are available? PARCC publishes a PDF document with scores from different states, based on publicly available information. “We have more states than ever administering tests that will allow for comparability across states,” said Arthur Vanderveen, the CEO of New Meridian, the nonprofit that now manages PARCC. “That data is all public and available. I think the vision really has been realized.”

Smarter Balanced does not publish any data comparing states, though those scores could be collected from each participating state’s website.

The presentation of the data stands in contrast to the National Assessment of Educational Progress, a test taken by a sample of students nationwide. NAEP has an interactive site that allows users to compare state data. No such dashboards exist for Smarter Balanced or PARCC, though both tests could offer more granular comparisons of schools and students.

Tony Alpert, the head of Smarter Balanced, says a centralized website would be difficult to create and potentially confusing, since states report their results in slightly different ways.

“The notion of comparable is really complicated,” he said. Nitty-gritty issues like when a test is administered during the school year, or whether a state allows students who are learning English to use translation glossaries on the math exam, can make what seems like a black and white question — are scores comparable? — more gray, he said.

“Early on our states directed us not to provide a public website of the nature you describe, and [decided that] each state would be responsible for producing their results,” said Alpert.

Neither testing group publishes any growth scores across states — that is, how much students in one state are improving relative to students who took the test elsewhere. Many experts say growth scores are a better gauge of school quality, since they are less closely linked to student demographics. (A number of the states in both consortia do calculate growth, but only within their state.)

“I’m not sure why we would do that,” Alpert of Smarter Balanced said. States “haven’t requested that we create a common growth model across all states — and our work is directed by our members.”

That gets at a larger issue of who controls this data. For privacy reasons, student scores are the property not of the consortia but of individual states. PARCC and Smarter Balanced are also run by their participating states, which means there may be resistance to comparisons, especially ones that might be unflattering.

“The consortium doesn’t want to be in the business of ranking its members,” said Morgan Polikoff, a professor at the University of Southern California who has studied the PARCC and Smarter Balanced tests. “Except for the ones that are doing well, [states] don’t have any political incentive to want to release the results.”

As for PARCC, a testing expert who works directly with the consortium said PARCC has made it possible to compare growth across states; the results just haven’t been released.

“Those [growth scores] have been calculated, but it’s very surprising to me that they’re not interested in making them public,” said Scott Marion, the executive director of the Center for Assessment. This information would allow for comparisons of not just student proficiency across states, but how much students improved, on average, from one state to the next.

Vanderveen confirmed that states have the information needed to calculate growth across states.

But it’s unclear if any have done so or published the scores.

Chalkbeat asked all PARCC states. Colorado, Illinois and Maryland responded that they do not have such data; other states have not yet responded to public records requests.

Vanderveen said that states are more interested in whether students are meeting an absolute bar for performance than in making comparisons to other states. “A relative measure against how other students are performing in other states — and clearly states have decided — that is of less value,” he said.

The cross-state data could be a gold mine for researchers, who are often limited to single states where officials are most forthcoming with data. But both Polikoff and Andrew Ho, a professor at Harvard and testing expert, say they have seen little research that taps into the testing data across states, perhaps because getting state-by-state permission remains difficult.

The difficulty of making comparisons across states and districts led Ho and Stanford researcher Sean Reardon to create their own solution: an entirely separate database for comparing test scores, including growth, across districts in all 50 states. But it’s still not as detailed as the consortia exams.

“One of the promises of the Common Core data was that you might be able to do student-level [growth] models for schools across different states and our data cannot do that,” he said.