If you had told me as a child that one day I would be an entomologist writing a scathing retort to a New England Journal of Medicine editorial, I probably wouldn’t have believed you. First, I’d have been confused because I didn’t know what an entomologist was at the time, and second, because I was instilled with a strong sense of deference to authority as a child. Authority wants what’s best for us, collectively, right? Authority makes reasoned, evidence-based decisions that help society be functional and productive, and since I cannot be an expert in all things, I should defer to the authority in a given area.1 It’s how living in a community, in a society, works best.
I think this is why I get so mad, so personally affronted, when I observe people in positions of authority who aren’t acting in ways that support the greater good, and instead are taking painfully obvious actions to maintain their own authority over a group.
In case you haven’t read it, I’m talking about this editorial.
If there ever was an authority I’d uncritically defer to, it is the New England Journal of Medicine.
Seriously, though. When it comes to contentious issues in medicine, NEJM is certainly regarded as an authority. But their recent editorial on data sharing, well, baby, you’re in my house now.2
There are many, many reasons that data shouldn’t be shared, and
most open science advocates are quick to acknowledge these issues- the editorial touches on some of these points. The big, obvious ones I see are confidentiality concerns and situations where releasing the data would otherwise present a hazard to the subjects under study.3 However, there is also a really important dynamic that’s often unacknowledged- the interplay between open science, privilege and power. Terry McGlynn explores this issue in more depth in his excellent blog post, but it can be summarized as such- the people in the most precarious positions- the students, the postdocs, the people working at small institutions who don’t have the resources to support many irons in the fire- are the ones that face the most risk when sharing data. The established scientists with large budgets at large research institutions, and the fame and clout to defend their research ‘territory’ (if you will), face disproportionately little risk in sharing data. Yet most outreach activities in the open science community target early-career scientists,4 and the most vocal cries against data sharing I’ve seen have come from the most established of the establishment. I take no issue with these arguments, and am actively working within the system to try to mitigate these risks and issues.5
So, this all being said, there are two main points that I take6 exception to in the NEJM editorial.
A second concern held by some is that a new class of research person will emerge — people who had nothing to do with the design and execution of the study but use another group’s data for their own ends, possibly stealing from the research productivity planned by the data gatherers, or even use the data to try to disprove what the original investigators had posited.
Read that again.
even use the data to try to disprove what the original investigators had posited.
Wow. So you mean to say that the data is only valid when used to support the collectors’ hypothesis? Do we need to do a little bit of a review of the scientific method here?
This statement irks me for several reasons- first of all, it assigns some sort of social value to the hypothesis. Everyone likes to be ‘right’, but hypotheses never are- they are either supported or not supported by the data (within the frequentist paradigm, at least). However, data supporting one hypothesis doesn’t mean that hypothesis is true- it just means it was the best hypothesis tested in the study.7 If another person comes along and uses the body of available data to formulate a new, better-supported hypothesis, this is not something to get sore over- this is a sign the scientific process is working. I know, scientists are people with egos, but if you really believed that your paper, your hypothesis, was the final answer, shouldn’t science, I dunno, stop?
But it doesn’t stop. I think if you want to be the final answer in science, then you don’t really want to be a scientist. You just want to win.
The second bit that gets to me is more personal:
There is concern among some front-line researchers that the system will be taken over by what some researchers have characterized as “research parasites.”
This issue of the Journal offers a product of data sharing that is exactly the opposite. The new investigators arrived on the scene with their own ideas and worked symbiotically, rather than parasitically, with the investigators holding the data, moving the field forward in a way that neither group could have done on its own.
So, Hi! I’m Christie and I am a research parasite. I’m a pretty productive scientist for my career stage- partly because of what I often jokingly refer to as niche partitioning- I function as the data analyst on most of the collaborative projects I’m on. I see this as a mutually beneficial, heck, symbiotic relationship- and I believe most of the collaborators and data creators don’t feel my role is parasitic, exploitative or derivative. Yet NEJM seems to think this sort of positive relationship is some sort of exception. It also belittles the science I do- for example, one of my recent papers used three separate data sets, produced by others, for applications other than the data creators intended- and not all of the data creators ended up being authors on the final manuscript (although some did, based on their contribution to the scientific ideas, analysis and writing in the final paper). This paper is an original contribution to the literature that builds on the work of others, bringing together what we know across several domains to create new knowledge. This is what I grew up believing science was about.
What, precisely, does the typical “research parasite” look like to the editors of NEJM, I wonder? Evil monsters, lurking in the shadows, taking fuzzy pictures of your poster presentation so they can copy your graph? That grad student who can smugly stats faster than you, so she can ANOVA the crap out of your RCBD and get into Nature? Certainly not human beings with lives and families, who are interested in how the world works, and want to use our existing body of knowledge to ask more meaningful questions. Nah, that would never happen.
This NEJM editorial is not just about data sharing. It is about the scientific establishment using its power to foster the culture of fear and competitiveness that keeps them in power. And I, for one, am not buying what they’re selling.
1. this approach still tends to work reasonably well in the sphere of hair grooming. I usually just say to my stylist “You’re the expert, just do something low maintenance and flattering” and, as long as I don’t go to SuperBudgetCuts, the outcome is usually better than if I’d attempted to micromanage the situation.
2. If you take data sharing advice from me, you are not officially obligated to also take medical advice from me. Not that kind of doctor. No. Stop. I don’t want to see your rash.
3. Bahlai et al 2012, “A comprehensive listing of exact locations of endangered species often poached for the alternative medicine industry” in the Journal of Hypothetical Examples is, indeed, my most under-appreciated paper.
4. #OSRRcourse. Guilty as charged, officer.
5. This is a rant for another day, but I see the problem as a near-total lack of incentives for open science practice within the traditional ways that scientists measure success. The establishment maintains control of these metrics, meaning that scientists who succeed by traditional metrics are the ones that gain power. Basically, a positive feedback loop of establishment power, corporate interests in the form of scientific publishing, and people rewarding only those who think most like them. This is wrong, and we need to rise up against it.
6. expletive deleted
7. “All models are wrong, some are useful”- G.E.P. Box, I believe. Models, mathematical formulations of hypotheses, are abstractions that can approach truth, but never really hit the truth asymptote, because nature isn’t neat and clean like that. And when you run a frequentist test of a hypothesis, you’re typically rejecting a null hypothesis rather than directly testing your own. So basically, you’re saying “Well, it’s not NOT my hypothesis, so my hypothesis is supported.”
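(A wee sketch for the statistically curious: the null-rejection logic above can be seen in miniature with a permutation test. Every number below is made up purely for illustration- this isn’t from any real study- and I’m using plain Python so there’s nothing fancy to install.)

```python
import random
from statistics import mean

# Hypothetical example: a frequentist permutation test. We never "prove"
# the research hypothesis -- we only ask how surprising the observed
# difference would be if the null (no group difference) were true.

random.seed(42)  # fixed seed so the sketch is reproducible

group_a = [12.1, 11.8, 12.5, 12.9, 11.6, 12.3]  # made-up measurements
group_b = [10.2, 10.9, 10.5, 11.1, 10.0, 10.7]  # made-up measurements

observed = mean(group_a) - mean(group_b)  # the difference we actually saw

# Under the null, group labels are arbitrary: shuffle them many times
# and count how often a difference at least this extreme shows up.
pooled = group_a + group_b
n_a = len(group_a)
n_perm = 10_000
n_extreme = 0
for _ in range(n_perm):
    random.shuffle(pooled)
    diff = mean(pooled[:n_a]) - mean(pooled[n_a:])
    if abs(diff) >= abs(observed):  # two-sided test
        n_extreme += 1

p_value = n_extreme / n_perm
print(f"observed difference: {observed:.2f}, p = {p_value:.4f}")
# A small p lets us reject the null; it does NOT certify that our
# favored hypothesis is "true" -- new data can always overturn it.
```

Note that nothing here says the favored hypothesis is right- only that “no difference” looks implausible given these (invented) data, which is exactly the “not NOT my hypothesis” logic.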