Stop Evaluating Funding Streams…And Joel Rose & Jenee Henry Wood On NCLB-Style Accountability

Recently I mentioned that I have no big hot “take” on the question of schools being open or not during Covid’s Omicron surge because it’s largely a local issue contingent on local questions and conditions. Non-takes are a funny thing. Sometimes they generate more negative feedback than a “take,” even one lots of people disagree with. Devil you know or something.

Anyway, for today, here’s an actual take: Stop basing state and local policy decisions and advocacy on big broad studies that get headlines but may not have a lot of explanatory leverage for the question you’re thinking about.

Eric Lerum makes this point on teacher evaluations. Everyone is saying evaluation doesn’t work based on this recent study. But there is IMPACT in DC, heavily evaluated, and with positive effects. Meanwhile, there are other successful evaluation schemes (the recent eval paper linked above discusses this but that nuance eluded most of the coverage and social media chatter I saw). I have my criticisms of the thrust toward teacher evaluation but to conclude it just doesn’t work is absurd. Here’s Kate Walsh on the same.

We see versions of this again and again.

On reading, there was a lot of evidence that Reading First was helping with reading instruction in various places. That wasn’t pro-Bush hype, it was coming from lefty policy shops like the one Jack Jennings, a longtime Democratic Hill aide, ran. But as usually happens when you evaluate a funding stream, the overall results were OK but not a slam dunk, and now the CW is that it didn’t work. Investing in good, evidence-based reading instruction is one of the best things we can do to help kids get a solid start in school.

A similar thing happened with President Obama’s school turnaround initiative (SIG). There was evidence about what worked and didn’t work in turnarounds, but everyone seized on the top-line finding.

It’s weird that people have such a rooting interest in things like teacher evaluation or school turnarounds not working. That’s for another day.

This same issue has vexed Title I evaluations for years. It’s a huge funding stream, spent on many different things. Some effective, some not. The other day we talked about pre-K and why the difference in quality between providers is certainly a reason for the underwhelming results we see there.

In practice, often what you find are bigger differences within treatment groups than between them. SIG is a great example here. The real takeaway was that light touch turnaround initiatives tended to bounce off of low-performing schools, but more intensive interventions could have impact. That’s the action, not the top line “does it work” question when “it” is a bunch of money. Because politically the system loves light touch interventions (often more money) and hates the disruptive stuff (real change) it’s a result with tricky substantive and political implications. Especially given that politics meant – as we also saw with NCLB accountability – the inclusion of lots of light touch trap doors.

Ed tech is a mess here, too, in different ways. Too infrequently is the question asked, does it work for X student population in Y setting instead of just, “does it work?” Bart Epstein has been laboring in these vineyards for a while now.

And charter schools, whoo boy. There is a lot of variance in performance by state, urban, suburban, and rural status, and other factors and yet three decades in it’s still, “do they work?”

A few takeaways:

  1. Don’t evaluate broad funding streams; evaluate specific initiatives. That’s how you learn. In general the differences within specific initiatives will be greater than the differences between things at scale. So focus there wherever possible, and mandated evaluations should take this into account.
  2. Scale with fidelity is still a big challenge. Doesn’t work and doesn’t work at scale are two different issues – especially given the political dynamics around scale in the education sector. And yes, because everything works somewhere, contexts and conditions matter and complicate this even more.
  3. Do evaluate. Just because these broad evaluations don’t offer a great deal of leverage for policymakers, the response can’t be, “it’s just not knowable.” The issue is being thoughtful about what we can know or not know. Faddishness and nihilism are always a risk to be mitigated in this sector.
  4. As always, ignore horse race coverage or political coverage of various studies, there is almost always more nuance. After the pandemic, I shouldn’t have to tell anyone this.

While we’re here, a bonus take. Would you invite Joel Rose to your birthday?

Rose and Jenee Henry Wood celebrated the anniversary of No Child Left Behind by attacking it! That’s fine, debate is healthy and NCLB had its pluses and minuses both in 2001 when the deal came together and as it played out. Bellwether has worked with New Classrooms on a few occasions, for instance here. Wood and Rose raise important points about anchoring accountability in grade-level proficiency that policymakers must consider.

Yet from an equity and accountability standpoint we have to be thoughtful about how to ensure that policy doesn’t open trap doors for kids who then never catch up. School accountability systems often created a dynamic where schools were always getting there but there was no accountability for actually getting there at some point. That’s why NCLB put in place the floors (and in practice they were low floors) that everyone hated. The 100% thing you heard so much about was political mythology echoed by a credulous press. It’s not how the law worked.

On the other hand, those in the Joel and Jenee camp make two important points. First, just focusing on grade level proficiency creates an incentive for schools to get kids over the bar in math but not teach them the foundational skills and concepts they actually need later. The donut problem. And, second, that grade level proficiency creates perverse incentives more generally.

But moving away from grade level proficiency creates really bad incentives, too, with low-income, Black, Hispanic, ELL, and special education students most impacted. That’s the puzzle policy leaders have to solve. There are discussions and efforts underway to find ways through that balance equity and personalization but it’s far from straightforward when it’s time to actually put pen to paper about what’s going to count. It’s especially complicated to solve against a backdrop of a system that’s generally just allergic to accountability. And, in case you spent the last two years in a cave, it’s also a system that doesn’t default to putting the interests of kids first or thinking of them as the prime client.

I’ve said enough. (It’s a month until Mardi Gras). Happy February.
