Authorship for sharing data? Social responsibility and power in science

May 5, 2016

Candice C. Morey and Richard D. Morey

Science is a social activity. It does not happen in a single-lab vacuum; rather, science happens across global networks of groups investigating similar topics. Our research is influenced by previous research and also by the work that we know other researchers in our network are currently pursuing. Our research benefits greatly from the generosity of our colleagues, most of whom are eager to explain their work, share published data upon request, and share experimental software and materials. We, likewise, participate in the same activities. Sharing software, analysis code, and data saves a tremendous amount of time: it prevents labs from continuously reinventing the same tasks and analyses, or from repeating ideas that do not work as envisioned. Sharing data also ensures that others can check and extend our analysis, instead of having to take us at our word about what the data “say”.

Note that we didn’t use the word “reciprocate” or anything like it in describing these relationships. That’s because these networks are not formed because of any quid pro quo established between our team and people who have cooperated with us previously. They arise because of the responsibility we all take on when publishing about our data. Asked whether the original authors of a data set should be given co-authorship on a new paper using their data, Chris Chambers replied:

@PsychScientists I don't think so. The paper + data are the science. W/o data the paper is just a brochure; w/o paper, data is purposeless.

— Chris Chambers (@chrisdc77) May 1, 2016

Our papers are for describing our research as clearly as possible. That job doesn’t end when the paper is published and its reference is enshrined in the corresponding author’s CV. With authorship, we accept an obligation to promote and defend that research forever. That’s why in their tract “On Being a Scientist”, the National Academy of Sciences said that keeping clear data records is a fundamental scientific obligation, and that

“…when a scientific paper or book is published, other researchers must have access to the data and research materials needed to support the conclusions stated in the publication if they are to verify and build on that research…Given the expectation that data will be accessible, researchers who refuse to share the evidentiary basis behind their conclusions, or the materials needed to replicate published experiments, fail to maintain the standards of science.” (p. 11)

Obligations to the data sharer

Suppose you use publicly shared data in your own, novel work. What are your obligations to the original corresponding author? Is citation a sufficient acknowledgement of their contribution, or should co-authorship be offered?

Surprisingly to us, many commenters (see discussions on Facebook and Twitter) feel that original authors should become co-authors on any subsequent paper in which their data were used, in acknowledgement of the work performed in collecting the original data. After all, that work was instrumental to the new work, in some cases even irreplaceable.

While we can see the superficial appeal of such arguments, we think they quickly collapse under scrutiny. Science is always an iterative endeavor. All of our research depends on crucial observations made before our own, and yet we do not routinely acknowledge with co-authorships all the previous work on which our work was founded.

Perhaps data re-use is a special case that should differ from re-use of previous theoretical or methodological ideas? One argument is that data sharing can require additional work (e.g., explaining the columns of a data set) that is not entailed in other cases of building on previous research. However, as the quote from “On Being a Scientist” above makes clear, clearly explaining the results of one’s research is an obligation of publishing. The extra work was implicitly agreed to when the paper was submitted for publication.

The second argument against data re-use being a special case demanding co-authorship is that offering co-authorship is not, by any means, standard practice. Meta-analysis is the re-use of published data in novel secondary analyses, and yet common practice is simply to cite the data included in the meta-analysis, not to promote the original authors to co-authors on meta-analytic papers. This holds regardless of whether the meta-analytic data were found in archived papers or gathered by soliciting the data from the corresponding authors.

Is there any harm in including the original author of a data set you want to re-use as a co-author on a new paper? Some commenters imagine that this is a “win-win” scenario that costs neither the original author nor the new author anything.

We think this is a mistake for several reasons. First, we doubt that it is healthy to view publication or authorship as a “win”. Second, as Chris Chambers pointed out, the pressure to add a corresponding author as your co-author as a condition of receiving the data at all creates a perverse incentive that rewards bad citizenship.

@PsychScientists @CandiceMorey < offering only a citation to those who do, we create a system that explicitly rewards lack of transparency.

— Chris Chambers (@chrisdc77) May 2, 2016

We should be using policy to promote adherence to scientific norms and good citizenship, such as the transparency that comes with unconditional release of one’s data and analysis.

With great responsibility comes great power

Finally, we should keep in mind that authorship is not a mere reward; authorship comes with responsibilities. All authors on a paper share responsibility for the content of a paper. A full consideration of these responsibilities provides strong reasons why it should not be standard practice for original authors to be included as co-authors on papers using their data.

Lead authors have a responsibility to ensure that all co-authors consent to the paper’s submission. This is a bedrock ethical principle in scientific publishing; failure to obtain consent of all authors could lead to retraction. This responsibility means that original authors, if included on every subsequent paper about the data set, have veto power over any interpretation or critique of the data set. Such limitations on the freedom of the authors using the data set should be unacceptable to both the original authors and the authors using the data set.

Coauthors have a responsibility to ensure the integrity of the derivative work. As co-author, the original author is also responsible for ensuring that the novel work is sound regardless of whether they have sufficient expertise to judge it. It would be awkward for the original author to be forced to accept co-authorship on an analysis that s/he is not sure is adequate.

For these reasons, we think that co-authorship is certainly not a “win-win”, but a dangerous proposition for both the new authors and the generators of the original data.

3 thoughts on “Authorship for sharing data? Social responsibility and power in science”

Jay

May 6, 2016 at 00:17

I like that very last point. Perhaps there is a solution in making your first drafts so unbearably inane that the other party simply opts out of coauthorship.

Pingback: All references are equal, but should some be treated as more equal than others? On the data-set authorship discussion. | …not that kind of psychologist

Tom Wallis

August 4, 2016 at 12:28

Great post. You raise a few new points I hadn’t considered, particularly the “veto power over any derivative work”.

I would like to raise an additional analogy regarding the comment of Julia DG:

“the very life of your paper relies on her data, so of course its a major contribution and she deserves authorship”

The same could be said for infrastructure (scientific or otherwise), or research grants leading to salaries. For example, running a new analysis on old data could require the use of a university’s computing resources, without which the new paper would also fail to live. Should those who procured the compute cluster also be co-authors?

It is for this reason that most publishing guidelines are very explicit: simply holding the grant or providing equipment to conduct research is not a sufficient criterion for authorship. Published data is simply another resource that should be accessible to the community.
