[Note: I have received some great critiques regarding this post, including the thought that I may be conflating some of my concepts. See my response at the end of the post.]
This TED Talk is about education data that strives to test students' ability to adapt to change and solve real, unfamiliar problems. Andreas Schleicher and his team have done incredible work. I am going to criticize their great work in this post, and I want to be clear that I see serious critique as the highest respect we can pay to work of any kind.
It means that we take the work so seriously that we are willing to devote our own energies to try to improve it. I'm not writing this preface as an apology, but mostly to make a point about critique that is often not recognized.
So, on with it...
A Bit of Background
I highly recommend that you watch the talk above--this post will still make sense, but won't hit home in the same way if you don't.
For those of you who can't listen to the audio/video of the talk:
Andreas Schleicher heads the Program for International Student Assessment (PISA) at the Organization for Economic Co-operation and Development (OECD). What it means is: He's designed a test, given to hundreds of thousands of 15-year-olds around the world (the most recent covered almost 70 nations), that offers unprecedented insight into how well national education systems are preparing their students for adult life. As The Atlantic puts it, the PISA test "measured not students’ retention of facts, but their readiness for 'knowledge worker' jobs—their ability to think critically and solve real-world problems."
The results of the PISA test, given every three years, are fed back to governments and schools so they can work on improving their ranking. And the data has inspired Schleicher to become a vocal advocate for the policy changes that, his research suggests, make for great schools.
It's very interesting and useful research; however, it's heavily overstated.
Epidemiology began as a methodology for studying and predicting the transmission of communicable diseases (primarily bacterial and viral). Because these diseases are passed through physical means, and because we understand the organic mechanism of action through which the bacterium or virus causes disease, we can draw valid conclusions and make predictions based on epidemiology.
To clarify: if we know that the cholera bacterium causes particular symptoms in all healthy humans, we can say that the physical relocation of cholera from [Non-immune Person A] to [Non-immune Person B] will cause the same symptoms. We can say this because humans are nearly genetically identical, though they may vary widely in terms of phenotypic expression.
When using epidemiological methodology, we need causality to make any sort of valid conclusion. No exceptions, and "close" doesn't count. If we don't have causality, the furthest we can state a claim is in the form of a testable hypothesis.
Communicable diseases are just about as close as we can get to showing genuine causality, as opposed to mere correlation, so we don't need any further investigatory steps to make valid conclusions/recommendations. Side note: for most of modern history, one of the only other things that we could really claim proper causality for was Newtonian physics...but Bohr, Einstein, Planck, etc. changed that.
We are able to draw valid conclusions from epidemiology, as the study of transmission of physical disease factors, because of the strength of the correlation coefficient, which is effectively 1 in the case of communicable disease.
What Does This Have To Do With Education Research?
In adopting epidemiological methodology, other sciences have failed to recognize the above fact about it.
Because correlation coefficients in other sciences are not 1, we cannot draw conclusions based on this sort of comparative/epidemiological methodology. Period. We can't even say, "Well, it's strong enough to make a compelling case for conclusion X." There are too many confounding factors. All we can say is that the correlation warrants the formation of a hypothesis, based on a proposed mechanism of action, that will then need to be tested.
Schleicher does a good job guarding against outright, cross-contextual conclusions, but the problem is that he's still looking at epidemiological factors and not making it to testing via proposed mechanisms of action. (Note: a proposed mechanism would NOT just be a context-appropriate blend of epidemiological factors.)
In nutrition, epidemiology is misused to draw conclusions (i.e., recommendations) about systems of eating based on cross-context/country studies. Unfortunately, these conclusions/recommendations are, by definition, invalid. Following an epidemiological study, the next step requires that we propose and test a mechanism of action in order to reach (or work towards) a conclusion. In the case of nutrition, this takes the form of biochemical lab tests based on an understanding of organic/biochemical mechanisms.
To clarify things, let's look at an example of each kind of study. Pay close attention to the kind of thing that each study is trying to accomplish:
Example of an epidemiological study: "Risk factors for subclinical atherosclerosis in diabetic and obese children."--this is the sort of study that you cannot use to make a conclusion/recommendation.
Goal of study: "We aimed to evaluate early signs of atherosclerosis and investigate for predisposing factors in children and adolescents affected by type 1 diabetes."
Example of study looking at a mechanism of action: "Mechanisms of β-Cell Death in Type 2 Diabetes"--this is the sort of study that tests a mechanism of action. A series of studies like this would allow us to reach a valid conclusion/recommendation.
Goal of study: "We will show that pathways regulating β-cell turnover are also implicated in β-cell insulin secretory function. Depending on the prevailing concentration and the intracellular pathways activated, some factors may be deleterious to β-cell mass while enhancing insulin secretion, protective to the β-cell while inhibiting function, or even protective to the β-cell while enhancing function."
Notice that the first (epidemiological) study, if combined with other studies, at best allows us to pose a hypothesis that would become the subject of the tests in the second (mechanism) study. Only after we have verified a mechanism of action can we then reach a conclusion/recommendation (which would then be subject to review, verification, etc.).
Note: The above two studies are not the best examples because they concern biologically distinct disease states, so we wouldn't stack them in the progression I present above, but it's close enough to make the point.
But What About Education?
An example of an education study using epidemiological methodology? The video above.
An example of an education study that looks for mechanisms of action? I've never seen one. I've tried to lay the groundwork for this in my thesis because I have not seen it anywhere in education research, nor in much social science research in general.
It's easy enough to come up with examples of mechanisms in the physical sciences if you're science literate: tidal forces, β-cell death, heat transfer, etc. But what would a mechanism of action look like in the social sciences?
An example of work that comes close to reaching for a mechanism of action in the social sciences is, perhaps oddly, Peircean Semiotics. Whether Peircean Semiotics is helpful or not, it is still a proposed mechanism of action. It represents a relatively easy-to-grasp example of the kind of thing I mean when I point towards mechanisms of action in the social sciences. It is a mechanism in part because it is not a nod towards particular populations or their features, but rather a look at certain rules that these populations are subject to.
Elsewhere, I received the critique that I'm conflating statistical methods with epidemiology, and that what I'm talking about actually has nothing to do with epidemiology. Here is my response:
I should be more clear: problematizing this through the lens of epidemiology is helpful because it points out the shortcomings in such a way that we have a very wide range of possible solutions immediately available to us. Statistics alone is too broad a scope for looking at the problem, because it is a tool that is used across all sorts of different applications. In other words, yes, both use statistical methods, but if we're interested in ameliorating a specific issue, we need a context-specific methodology that allows us to do so. Statistics-broadly does not give us any such context-specific method, but epidemiology provides an analogical, context-specific method that we can learn from.
Any practical recommendation we make based on data is not part of statistics-proper. We may make statistical inferences, and even this falls at the furthest reaches of what we might still call statistics; we have still not yet made practical recommendations for actual people to follow. Once we start (implicitly or explicitly) making practical recommendations, we're not only using statistical data, but we now require much more context-specific methods that are outside the realm of statistics. This is why it is helpful to look at this problem through the lens of epidemiology--epidemiology is a practice that often misuses statistics in ways similar to education research, but the best biochemists know how to make statistical data valuable in a methodologically valid manner.