Behind the numbers

Why ‘personalized learning’ advocates like Mark Zuckerberg keep citing a 1984 study — and why it might not say much about schools today

PHOTO: TechCrunch/Creative Commons
Facebook founder Mark Zuckerberg.

Facebook founder Mark Zuckerberg made a bold statement in a recent essay: By giving students individual help, average students can be turned into exceptional ones.

“If a student is at the 50th percentile in their class and they receive effective one-on-one tutoring, they jump on average to the 98th percentile,” Zuckerberg wrote.

It’s a remarkable claim, one that strains the limits of belief. And for good reason: The results from the 1984 study underlying it have essentially never been seen in modern research on public schools.

Still, the results have become a popular talking point among those promoting the “personalized learning” approach that Zuckerberg’s philanthropy is advancing. One video created by the Chan Zuckerberg Initiative features an illustration of a 50 on a graph zooming upward to hit 98. The New Schools Venture Fund, another influential education group that backs personalized learning, cites the same work by Benjamin Bloom.

But a close look at the study raises questions about its relevance to modern education debates and the ability of new buzzed-about programs to achieve remotely similar results.

“If you’re really going to make these huge investments and huge pushes [based on this study], you might want to be absolutely sure that the analysis of that research is solid,” said Ben Riley, head of the group Deans for Impact and a skeptic of personalized learning.

Jim Shelton, who heads CZI’s education work, said in an interview that the organization relies on a great deal of other research but highlights Bloom to illustrate the best-case scenario for what schools might accomplish.

“It stands to reason that many kids that currently perform at levels that we consider average or even below average could be performing at levels that we would consider superlative,” he said.

Questions then and now about the meaning of Bloom’s work

The conclusions on the effects of tutoring from Bloom’s widely-cited paper are drawn from two studies conducted by University of Chicago graduate students.

One of those studies is available online, but reading the other requires some sleuthing. (We ended up paying for access through a service that compiles dissertations.)

In both studies, students were taught novel subject matter — probability or cartography — using different methods over the course of a few weeks. Some students were taught in a traditional lecture style, others received “mastery-based” teaching, and others received small group tutoring.

On a final test, students who were tutored one-on-one or in small groups came out far ahead, and in some cases the average tutored student beat 98 percent of those taught in the traditional way. Students who received the mastery-based teaching — which overlaps with modern conceptions of personalized learning — also did much better, though not as well as those tutored.

Jim Shelton of the Chan Zuckerberg Initiative in one of the organization’s videos, saying that the average student will move to the 98th percentile with one-on-one tutoring.

The applicability of these studies today is an open question. Combined, the studies focus on just three schools and a few hundred students. And since the work was done more than 30 years ago, traditional instruction itself may have changed substantially.

The papers include little information about those final tests, but it appears they were designed by the researchers, unlike a traditional standardized test. Researcher-created assessments on subjects that are totally new to students — like cartography and probability, in this case — tend to see students make the largest gains.

Bloom’s work also doesn’t focus on technology-based tutoring, a point personalized learning advocates usually acknowledge. “If it supports anything, it supports one-on-one human tutoring,” Riley said.

But what earned the most attention, then and now, is how big of an impact tutoring had on students. The difference between tutoring and traditional instruction after just three weeks was two standard deviations — to researchers, a truly incredible result. It means bringing students from average to exceptional.

“I’ve never seen a study in education that found effects in the range of two standard deviations, so it’s remarkable for that reason,” said Jon Guryan, a Northwestern professor who has done research on tutoring.

Another researcher, Robert Slavin of Johns Hopkins University, logged concerns about Bloom’s outsize claims as early as 1987. Focusing on such unusually large gains, he wrote, “is misleading out of context and potentially damaging to educational research,” since it could lead researchers to “belittle” more realistic results.

Guryan’s recent work, on tutoring of struggling students in Chicago, found what would normally be considered fairly large gains: about a quarter of a standard deviation on math standardized tests. Other recent research on intensive tutoring in public schools looks similar, in some cases showing even smaller effects. Meanwhile, studies on computer-based personalized learning have shown a range of effects — but none comes close to two standard deviations.
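To see how those effect sizes map onto the percentile language in Zuckerberg’s essay, here is a minimal sketch. It assumes test scores follow a normal distribution and that the student starts at exactly the 50th percentile; the function name is illustrative and does not come from any of the studies.

```python
from statistics import NormalDist


def percentile_after_gain(effect_size_sd: float) -> float:
    """Percentile reached by a student who starts at the 50th percentile
    and gains `effect_size_sd` standard deviations.

    Assumption: scores are normally distributed.
    """
    return 100 * NormalDist().cdf(effect_size_sd)


# Effect sizes reported in the article, in standard deviations.
for label, gain in [("Bloom, 1984", 2.0), ("Chicago tutoring study", 0.25)]:
    pct = percentile_after_gain(gain)
    print(f"{label}: {gain} SD -> roughly the {pct:.0f}th percentile")
```

Under that assumption, a two-standard-deviation gain lands at about the 98th percentile, while a quarter of a standard deviation moves an average student to roughly the 60th.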

Bror Saxberg, CZI’s vice president of learning science, acknowledged that Bloom’s findings are bigger than in other research. But he said human and computer tutoring can have a substantial impact, pointing to a 2011 overview of research where results come close to a full standard deviation. (This overview included studies in a variety of contexts, including outside K-12 education.)

In sum, a number of studies suggest that Bloom’s huge results are not plausible to expect in public schools today, and they have rarely been seen in other research. Meanwhile, Zuckerberg, Shelton, and CZI’s public statements imply that, with the right tools, students could see similar off-the-charts improvements.

Can ‘personalized learning’ drive huge gains? Advocates hope so.

Shelton analogized Bloom’s work to the human quest to run a four-minute mile: a crazy-seeming goal that was eventually attained by a small number of elite runners.

“Everyone said it was impossible to break the four-minute mile, until somebody broke the four-minute mile,” Shelton said. “Someone has broken the four-minute and its equivalent and we need to figure out how to do it and how to get a lot more people to be able to do it.”

Many others also see Bloom’s research less as a precise accounting of the results of tutoring and more as a call to action. Indeed, most of Bloom’s paper amounts to him pondering a question philanthropists are grappling with today: How can schools get the benefits of individual tutoring without the prohibitive expense of actually hiring each student their own tutor?

“If the takeaway from Bloom is that by doing tutoring and mastery you’re going to get two [standard deviation] gains — I don’t think that’s the right takeaway,” said Todd Rose, a Harvard professor who has argued that schools need greater customization. (CZI has funded some of Rose’s work.)

The value of the study, he says, is that “it speaks to a very different view of human potential than is embedded in our current system.”

Debbie Veney, a spokesperson for New Schools Venture Fund, which is supported by CZI, had a similar take: “[Bloom’s results] inspired and challenged many to figure how to achieve similar conditions in a more cost-effective way — which spawned many creative concepts and efforts to scale similar results.”

That’s in line with CZI’s sweeping ambitions — “empower every teacher everywhere,” as described in one CZI video — and deep pockets.

Zuckerberg and his wife Priscilla Chan have pledged to donate 99 percent of their Facebook shares — worth an estimated $45 billion in late 2015 — to CZI over their lifetime. The organization — which also focuses on criminal justice, immigration, and economic policy — is expected to give “hundreds of millions of dollars” per year to education causes.

The group has already supported a number of tech-based approaches to school, including the Summit learning platform, a computer program created by a charter network to help teachers personalize learning. CZI has also tried to broaden the definition of personalized learning, funding organizations that offer free eye exams and small-group, in-person tutoring.

A spokesperson pointed to other research CZI relies on, including psychological studies from Rose and others on how children learn and develop and the work of Stanford professor Carol Dweck, which suggests that people with a “growth mindset” are more likely to succeed.

But Sarah Reckhow, who studies education philanthropy at Michigan State University, suggests that CZI’s ambitious goals will meet the hard realities of the classroom and fall far short of Bloom’s results.

“I do think they’re setting themselves up to fail,” she said. “If you look at educational research, if you look at what will most definitely vary once you put something into practice … those effect sizes won’t be replicated, but also there will probably be some cases where it will not turn out well or there will be unintended consequences.”

Asked about his benchmarks for success, Shelton said it’s not clear yet what is possible.

“We’re at the beginning of our journey, not the end of our journey,” he said. “We are in the business of trying to figure out how to solve this problem that has never been solved before.”

Regents rundown

As elections approach, New York’s top education policymakers begin to outline legislative priorities

PHOTO: Creative Commons, courtesy JasonParis
Albany statehouse

New York’s top education policymakers are gearing up to discuss their legislative wishlist for next year’s session, just as the political balance of the state legislature could turn on its head.

The state’s Board of Regents will kick off the discussion Monday by reviewing last year’s priorities — everything from bullying prevention programs to expanding access to advanced coursework — and propose tweaks and additions.

They’ll also discuss what to prioritize in their overall funding request for education across the state (the board has not yet requested a specific dollar amount). Last year the board asked for a $1.6 billion increase, which is more than the $1 billion boost that was ultimately approved. But if the state Senate, which has been controlled by Republicans for years, flips to Democrats, it could reshape the annual budget dance just as it kicks into gear.

Also on the Regents agenda: a discussion of state test scores that were released late last month. However, state officials have repeatedly said the results do not offer much insight about whether student learning is improving across the state because of changes to the test that make results hard to compare to previous years.

Here’s what you should know in advance of the meeting.

Legislative chatter

Officials are set to discuss last year’s legislative priorities and how close they got to their goals.

One priority from that cycle, for instance, was to address the yawning gap in access to advanced coursework in different school districts across the state, a top concern of New York City Mayor Bill de Blasio as well. Among wealthy suburban school districts, students were roughly five times as likely to have access to six or more Advanced Placement or International Baccalaureate offerings as students in New York City, according to a report released earlier this year. (The city is also launching a pilot program to allow virtual classes in advanced subjects at 15 high schools in the Bronx, under the new teachers contract.)

The Regents requested $3 million in grants to help expand offerings among high-needs districts, and wound up with $500,000, according to state documents. (Though the board doesn’t have any formal power over the legislature, they can help sway the outcome as the state’s top education policymaking body.)

They’ll also discuss a slew of other priorities, including how to support new intervention plans for New York’s lowest-performing schools that were developed as part of the state’s compliance with the federal Every Student Succeeds Act.

And the Regents will talk about progress on their efforts to support English learners; they have previously asked for funding to translate Regents exams into Spanish so students can better demonstrate skills beyond their proficiency in English.

Other issues, beyond these priorities, may surface in discussions Monday as well.

The board isn’t expected to approve a full set of legislative goals until December, and it’s possible that a wave election could give Democrats control of the State Senate. Regents Chancellor Betty Rosa previously told Chalkbeat she hopes “the combination of the Assembly and the Senate will create leverage” in the budget process, a dynamic she hopes will lead to more funding.

Many of the Regents’ priorities — more support for vulnerable students, additional social services in schools, and other initiatives — would require significant additional investments.

Testing testing

State and local education officials have said it’s impossible to compare the newly released results on the state English and math exams to last year’s because the test was changed (it’s administered over just two days instead of three), but several lingering issues could surface.

In New York City, there are still significant score gaps between white and black students. Almost 67 percent of white students passed their English tests, close to double the percentage of black students. And almost 64 percent of white students passed math, compared to about a quarter of black students.

And even though Regents reduced the number of testing days, opposition to the exams continued, with about the same percentage of New York students deciding to opt out as did the previous year. In New York City, where most kids usually take the test, there was a slight uptick in students who sat out.

This comes after the state agreed to soften certain penalties for schools where opt-out rates remained consistently high.

Some Regents remain committed to computer-based testing, and the state hopes to eventually expand the practice to all students. Some are concerned about the nature of the exams, whether they are fair to English language learners, and whether the tests help perpetuate disparities.

State education officials have shown some interest in different approaches to testing. Regents decided not to apply for a federal waiver to pursue “innovative” exams — involving essays, projects, and tasks — but they did form a work group that is partially focusing on testing.

2018 SPF ratings

Fewer Denver schools earn top ratings as the district raises the bar for quality

PHOTO: Melanie Asmar/Chalkbeat
Students at Kepner Beacon Middle School work on an assignment.

The Denver school district raised its bar this year for what it deems a quality school — and the number of schools meeting that bar plummeted, according to ratings released Friday.

Even though districtwide elementary and middle school test scores rose last spring, just 88 of Denver Public Schools’ more than 200 schools this fall are rated blue or green, the top two ratings on the district’s five-color scale. That’s down from a record 122 blue and green schools last year, and lower than the 95 schools that earned those ratings in 2016.

Twenty schools this year are rated red, which is at the bottom of the scale. That’s twice as many as last year but not as many as earned a red rating in 2016.

Superintendent Tom Boasberg said the lower number of top ratings doesn’t signal that Denver schools are getting worse but rather that the district is ratcheting up its expectations — something it has been planning for years. Now, to get the district’s highest ratings, schools need to show that more of their students can read, write, and do math at grade level.

The ratings system, Boasberg said, “really expresses our shared aspirations as a community for the academic growth and performance of our students.”

“We think it’s very, very important for us to articulate clearly those shared aspirations, to establish goals for people to strive for, and to be very public about how all of us are doing,” he added. “I know that’s not easy. Any time … you set an aspirational goal, you don’t always achieve it. I don’t think the answer is to water down your aspirational goals.”

The ratings — known as the School Performance Framework, or SPF, ratings — matter for several reasons. Many parents use them to pick where to send their children to school, a decision that’s both easier and more crucial in a district that prizes school choice. If fewer parents pick a particular school, the school gets less funding, which means it could be forced to cut the teachers or programs that would make it a desirable choice in the first place.

The district also uses the ratings to determine which schools are struggling and in need of extra money or support — and which are so consistently low-performing that they should be closed. The school board is currently reevaluating its policy for when to close red-rated schools.

Denver’s ratings have long been controversial because some people think the way they are calculated presents an incomplete or unfair picture of a school’s quality. The ratings are overwhelmingly based on how students performed on state literacy and math tests the previous spring. This past spring marked the third year students in grades three through eight took a set of more rigorous exams known as CMAS.

Their performance on those tests has actually improved over time. The percentage of Denver students scoring on grade level in both literacy and math has inched to within a few points of the statewide average, narrowing what had been a wide chasm.

Denver students have also shown strong academic growth, a measurement that compares students with those with similar score histories. Strong growth indicates that Denver students, most of whom are black or Hispanic and come from low-income families, are making more progress in a year’s time than their academic peers across the state.

But because the district made it harder for schools to be rated blue, which means a school is “distinguished,” or green, which means it “meets expectations,” fewer schools earned top ratings. In fact, 37 percent of Denver’s 207 schools got lower ratings this year than last year.

Among them were some of the district’s large comprehensive high schools, including North High and South High. Both fell from a yellow rating, which means a school needs some improvement, to an orange rating, which means a school needs more improvement.

John F. Kennedy High went from orange to red, which means a school needs significant improvement. So did West Leadership Academy, one of two smaller schools that replaced comprehensive West High. The district has in the past closed schools with repeated red ratings.

However, it also targets low-rated schools for extra financial help, providing up to $1.7 million over five years to the schools officials deem most struggling.

Boasberg attributed the high school ratings slips to a poorer than expected showing on state tests. This was the first year Colorado ninth-graders took the PSAT test, a precursor to the popular college preparatory exam, and their growth scores were surprisingly low.

“The SPF reflects that,” Boasberg said, referring to the district’s rating system.

One high school principal expressed concern that the ratings put too much emphasis on test scores and not enough on graduation rates and whether a school’s graduates can go on to college without having to take remedial courses. Stacy Parrish, principal at High Tech Early College, said she believes those metrics provide a more accurate measure of high school quality.

“When we have an inaccurate assessment tool, we are at the mercy of the color we are given,” she said. “We need to be able to control our own narrative of what we’re doing in our schools. Because across the metro area, we are doing beautiful work.”

A smaller number of schools, 10 percent, earned higher ratings this year. They include the district’s most requested middle school, McAuliffe International, which went from green to blue.

The district raised its quality bar this year in several ways. For instance, it increased the percentage of elementary and middle school students who must score on grade level on the CMAS tests for a school to be rated blue or green. It used to be 40 percent. It’s now 50 percent.

If that doesn’t sound very high, consider this: Just 45 percent of students statewide scored at grade level on the CMAS literacy test this past spring, and only 42 percent of Denver students did. The percentages were even lower for the math test.

The percentage of students in kindergarten through third grade who must score at grade level on early literacy tests for a school to be rated blue or green increased, as well. That change came after parents and community leaders complained that last year’s ratings were inflated because the early literacy tests overstated students’ reading abilities.

The ratings of these schools were downgraded because of their academic gaps:
East High
Thomas Jefferson High
Northfield High
CEC Early College
Skinner Middle
DSST: College View Middle
Girls Athletic Leadership Middle
McKinley-Thatcher Elementary
Carson Elementary
University Prep — Arapahoe St.
Southmoor Elementary
Asbury Elementary
Rocky Mountain Prep Southwest
KIPP Northeast Elementary
Cowell Elementary
Lincoln Elementary
Brown International Academy
Montclair School of Academics and Enrichment
McMeen Elementary
John H. Amesse Elementary
Columbine Elementary
Munroe Elementary

There is another factor at play this year, too: an “academic gaps indicator” that measures how well certain groups of students are scoring on the tests compared with benchmarks and with their peers. The demographic groups include students of color, students from low-income families, students with disabilities, and students learning English as a second language.

If students in those groups aren’t meeting benchmarks or if the gaps between, say, students of color and white students at a particular school are too big, the school’s rating will be penalized. Schools must be rated blue or green on the academic gaps indicator to be blue or green overall.

This year, 22 schools that would have been green were downgraded a step to yellow because they scored poorly on the academic gaps indicator. (See box.) The 22 schools include the district’s biggest and most requested high school, East High.
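The downgrade rule described above can be sketched in a few lines. This is an illustration under stated assumptions, not the district’s actual formula: the function and variable names are hypothetical, and treating a would-be blue school the same way as a would-be green one is an assumption, since the article only describes green schools dropping to yellow.

```python
# The district's two top ratings.
TOP_RATINGS = {"blue", "green"}


def apply_gaps_rule(provisional_rating: str, gaps_rating: str) -> str:
    """Return the overall rating after applying the academic gaps indicator.

    Rule from the article: a school must be blue or green on the gaps
    indicator to be blue or green overall.
    Assumption: any top-rated school that fails the check becomes yellow.
    """
    if provisional_rating in TOP_RATINGS and gaps_rating not in TOP_RATINGS:
        return "yellow"
    return provisional_rating


# A school that would otherwise be green but scores yellow on gaps.
print(apply_gaps_rule("green", "yellow"))
```

Schools already rated yellow, orange or red pass through unchanged in this sketch, because the article describes the indicator only as a cap on the top two ratings.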

This is the second year the district has used the indicator. Last year, nine schools were downgraded from green to yellow. Three of them — Bromwell Elementary, Teller Elementary, and Hill Campus of Arts and Sciences, a middle school — made enough progress toward closing gaps or boosting the scores of students in those groups to move back up to green.

As for the district’s lowest-rated schools, the Denver school board is set on Monday to discuss changes to its school closure policy, which has hard and fast rules for when to shutter or replace struggling schools. Board members agreed this year not to use those rules, which rely heavily on school ratings and have been criticized as harsh and inflexible. Instead, the board talked about taking other evidence into account, though it hasn’t yet decided how that will work.

It’s also not clear if the board will consider closing any schools this year. Back in June, board member Lisa Flores, who proposed suspending the rules, said school closure wasn’t completely off the table, especially if a struggling school also has low enrollment.

Under the suspended rules, nine schools with successive years of low ratings could have been eligible for closure or replacement if they earned a red rating this year. Only two did: Lake Middle School, a district-run school, and Compass Academy, a charter middle school.

The seven other schools did better. They include the large, comprehensive Abraham Lincoln High, which earned an orange rating, and Math and Science Leadership Academy, an elementary school that jumped up to a green rating this year.

The district held a press conference about the ratings Friday at the Kepner campus in southwest Denver. That location is noteworthy because it represents one of the more controversial school improvement strategies deployed under Boasberg, who is stepping down as superintendent next week after 10 years at the helm of Denver Public Schools.

In 2014, the district began phasing out struggling Kepner Middle School. The district replaced it, grade by grade, with two new schools that share the building: Kepner Beacon, a district-run middle school, and STRIVE Prep – Kepner, a charter middle school.

Those two schools are green this year. Technically, so is Kepner Middle School, which doesn’t exist anymore. The last class of 132 Kepner Middle School eighth-graders moved on last spring, but their test scores were good enough to earn a green rating this fall.

“Turnaround is not an easy process and there’s lots of opposition,” Boasberg said. But he pointed out that in the case of Kepner, as the original middle school shrank, the students who remained there thrived. “What turnaround is ultimately about,” he said, “is, ‘How do we get better opportunities faster for the students we serve?’”
