“What Works” in Education Is Not Merely A Question of Effect Sizes

Here’s a pet peeve: A champion of some particular education intervention will point to some research study showing Intervention X led to positive outcomes for participating students. Ok, great. Assuming it’s a good study, we now have evidence that Intervention X “worked” in a given Situation Y.

That does not mean Intervention X will easily replicate to new Situation Z.

Anytime you hear someone attach the phrase “high-quality” to an intervention, they’re talking about this problem. These caveats pop up frequently in debates over particular reforms:

  • Small-school reform efforts produced long-term gains for students, but their backers largely abandoned them.
  • School integration efforts produced large gains for black students (with no harm to white students), but formal integration programs were and remain relatively small.
  • “High-performing” charter schools produce large gains for students, but there is wide variation in those results.
  • Teacher evaluation reforms produced gains in some cities, but the effects were smaller or non-existent when similar reforms were spread more broadly.

Some of these reflect implementation challenges. Others are more about politics (which is itself a particular type of implementation challenge). I am by no means the first person to make this point, but we can’t just say something “works” or “doesn’t work” without giving some consideration to where the policy worked, for whom it worked, what outcomes changed, and by how much it changed the status quo.

–Guest post by Chad Aldeman

  1. Jordan Posamentier

    Yep, there’s also the problem of thinking something works only if the study shows a big main effect size. See Greenberg (2017), available at https://www.tandfonline.com/doi/abs/10.1080/19345747.2016.1246632?journalCode=uree20 (concluding that “universal interventions are likely to be cost-effective, even if they produce outcomes in only certain segments of the entire population, and they should be analyzed in ways that allow us to capture these impacts, not merely by examining main effects.”).

