Manage your data - everyone!

Publishers

What are the actual requirements for journals in Physics and Chemistry in terms of data sharing? Overview of learned societies and general publishers.

On the problem of reproducibility, Tsuyoshi Miyakawa, editor-in-chief of Molecular Brain, reports having rejected 40 articles out of 181 in two years because of a lack of data supporting the conclusions. (1)

In 2019, a study entitled “Effect of impact factor and discipline on journal data sharing policies” reviewed the policies for data sharing in 447 journals coming from different disciplines related to medical science. Only 12 of them, i.e. 2.7%, had a strict policy making data sharing a condition for publication.

Two factors can influence the adoption of stricter standards for data sharing. The first is the impact factor: journals with a high impact factor are more likely to insist on data reproducibility and peer review (2). Discipline is the other factor. In a 2015 survey of 150 journals conducted by COPE, 30% of biomedical journals had a strongly proactive data sharing policy, compared with 10% of Physics and Chemistry journals, for example.
Physics and Chemistry journal publishers show great disparities in their requirements and ambitions for the publication and verification of data. See below for a summary of the current practices of learned societies and general publishers.

LEARNED SOCIETIES

American Chemical Society

Since February 2020, ACS has set up a platform to facilitate the deposit of NMR (Nuclear Magnetic Resonance) data according to FAIR principles. To date, only two organic chemistry journals, Journal of Organic Chemistry and Organic Letters, have participated in this pilot project by inviting authors to use the service to deposit their FID files. At this stage, data are not being stored on ACS servers, though this is one of the project’s objectives.
For each journal published by the ACS, instructions are available in the Publications Center. General recommendations on NMR data are published here. Crystallographic data must be verified using the CheckCIF tool. A list of file formats accepted for “supporting information” is also available.
For more detail about each journal's policy, go to the drop-down menu and read the information regarding “supporting information” and “data requirements”. Prerequisites vary greatly from one journal to another:

  • The Journal of the American Chemical Society devotes considerable space to this in its guide for authors.
  • The Journal of Organic Chemistry also provides a rather thorough user manual. It explains that: “The publishers insist that the presentation of yields over 95%, of isomeric ratios above 200:1 and enantiomeric excess of more than 99% are not considered realistic without detailed explanations.”
  • The Journal of Proteome Research requires authors to deposit raw data and associated metadata in repositories such as ProteomeXchange, with access details given in the manuscript (URL and even passwords if necessary). “Data remains confidential while the manuscript is under review, but will be opened once it is published.”
  • The journal ACS Nano provides author instructions with data requirements. Authors must provide solid proof for both the identity and the purity of new substances studied. Expected specifications are detailed for the different types of data.

Royal Society of Chemistry

The RSC displays a clear and detailed policy: “During the submission of a manuscript, authors must provide all the data necessary for understanding and verifying the research presented in the article.”
Instructions may differ depending on the type of data: X-ray crystallographic data must be deposited in a dedicated repository; chemical compounds and spectra can be included in the “supporting information”; so-called “additional” data, which is not compulsory but potentially useful for future research, can be deposited in a discipline-based or general repository.

Spectral data must be supplied in JCAMP-DX format. Guidelines facilitate substance description through a dozen criteria (yield, melting point, spectra, refractive index, etc.). Crystallographic data must, of course, be in CIF format and accompanied by a checkCIF report.
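
To illustrate the JCAMP-DX requirement, here is a minimal sketch of a check an author might run on a spectrum file before submission. It relies only on the basic JCAMP-DX convention that each labelled data record starts with `##LABEL=`. The function names, the sample content and the list of labels checked are hypothetical and are not taken from the RSC guidelines.

```python
def read_jcamp_header(text):
    """Collect '##LABEL=value' records from JCAMP-DX text into a dict."""
    header = {}
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("##") or "=" not in line:
            continue  # data rows, comments and blank lines are skipped
        label, value = line[2:].split("=", 1)
        # Labels are upper-cased so that lookups are case-insensitive.
        header[label.strip().upper()] = value.strip()
    return header


def missing_labels(header, required=("TITLE", "JCAMP-DX", "DATA TYPE", "END")):
    """Return the required labels (an assumed list) absent from the header."""
    return [label for label in required if label not in header]


# Hypothetical file content, for illustration only.
SAMPLE = """##TITLE=Hypothetical compound 1, 1H NMR
##JCAMP-DX=5.01
##DATA TYPE=NMR SPECTRUM
##XUNITS=HZ
##END=
"""

header = read_jcamp_header(SAMPLE)
print(header["TITLE"])
print(missing_labels(header))
```

A check of this kind only confirms that the file looks like JCAMP-DX; it does not validate the spectral data themselves.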

In addition to these general guidelines, the journals published by RSC can add their own recommendations, depending on the field covered. For example, the journal ChemComm encourages authors to provide a complete description of the conditions in which magnetic measurements of samples were taken (type of capsule used, etc.). The journal Energy and Environmental Science requires detailed information on the calibration protocol for solar converters.

American Institute of Physics

AIP publishes 32 journals, as well as proceedings and books. Authors are advised to deposit their data sets in publicly accessible repositories or to present them in the main manuscript so that readers have access to all data sets used to support the conclusions. AIP’s policy for research data is available here.
When the article is submitted, the author must provide a “data availability statement” (see: Data Availability Statement Templates).
The journals all follow the general recommendations of the AIP and do not adopt their own standards, unlike, for example, ACS journals.

Institute of Physics

The IOP publishes 96 journals and conference series. The IOP is a member of the Committee on Publication Ethics (COPE).
General policy: “IOP Publishing supports the principle of transparency and open data. The goal is reproducibility for which data, codes, and research materials supporting research articles must be available.” The policy is available here. Two aspects on standards and data availability are discussed in detail: Standard data policy and Data availability policy.
All of the IOP journals follow these recommendations. Authors should nevertheless check for any journal-specific policy in the “About the journal” section of each journal's website.
Supplementary materials: supplementary material files must not exceed 10 MB. It is recommended to deposit larger files in a data repository.
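
As a rough illustration of this size rule, the sketch below sorts candidate supplementary files into those that can be attached and those that should go to a repository. It assumes that the limit applies per file and that 1 MB means 1024 × 1024 bytes, neither of which is stated here. The function name and the file names are hypothetical, and the demonstration uses a tiny limit so that it runs on throwaway files.

```python
import os
import tempfile

LIMIT_BYTES = 10 * 1024 * 1024  # 10 MB, assuming 1 MB = 1024 * 1024 bytes


def split_by_size(paths, limit=LIMIT_BYTES):
    """Return (within_limit, too_large) lists of file paths."""
    within, too_large = [], []
    for path in paths:
        size = os.path.getsize(path)
        (within if size <= limit else too_large).append(path)
    return within, too_large


# Demonstration on temporary files, with a 20-byte limit instead of 10 MB.
with tempfile.TemporaryDirectory() as folder:
    small = os.path.join(folder, "table_s1.csv")
    large = os.path.join(folder, "raw_images.zip")
    with open(small, "wb") as handle:
        handle.write(b"x" * 10)
    with open(large, "wb") as handle:
        handle.write(b"x" * 30)
    ok, oversized = split_by_size([small, large], limit=20)
    print("attach:", [os.path.basename(p) for p in ok])
    print("deposit in a repository:", [os.path.basename(p) for p in oversized])
```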

Conditions for availability: IOP recommends depositing data in a discipline-based repository, but accepts general repositories. The IOP cites the following repositories: Figshare, Dryad, Harvard Dataverse, Zenodo. Note that IOP Publishing has developed a dedicated repository on Figshare (see: dedicated repository on the Figshare platform), used by the journals Environmental Research Letters, IOP SciNotes, JPhys Complexity, and Machine Learning: Science and Technology.

Under the policy of these journals, a data availability statement is compulsory.

An IOP Publishing journal published on behalf of another society or organization is authorized to establish its own research data policy. This is the case, for example, for the American Physical Society, which provides a data guide.

American Physical Society

APS publishes 15 journals including Physical Review A, B, C, D, E, Physical Review Letters, Reviews of Modern Physics and Physical Review Applied. The APS policy for data distribution covers only “Supplemental Material” archived by the publisher. This can include “multimedia files, raw or analyzed data tables, parameters used in or produced by calculations and computer codes as well as information about how the research was conducted (sample preparation, derivatives, etc.)”.
All files related to an article are stored in a single deposit and are assigned a URL for additional documents. The URL appears in the article’s list of references.

Society of Photo-Optical Instrumentation Engineers

SPIE, a learned society publishing 11 journals specialized in the study of light, states in its ethical provisions (SPIE Guidelines for Ethical Publishing) that “research results must be preserved in a form for analysis and examination.”
The organization does not publish large data sets at this stage. Authors can, however, refer to large data sets in their manuscripts with a link to the repository. The Advanced Photonics journal specifies that the total size of files attached as supplementary material must not exceed 100 MB.

American Geophysical Union

The American Geophysical Union publishes 22 journals (via Wiley) including Geochemistry, Geophysics, Geosystems, Reviews of Geophysics and JGR: Solid Earth.
This organization adopted a data publication policy as early as 1993. In 2015, the AGU declared: “Scientific earth and space data should be widely accessible in multiple formats and that long-term preservation of data is the full responsibility of the scientists and institutions that sponsor them.” The AGU stipulates that “any data necessary for understanding, assessing, reproducing and developing published research must be made available and accessible whenever possible”. The AGU journals follow FAIR data principles (see the guidelines for enabling FAIR data published by COPDESS, the Coalition for Publishing Data in the Earth and Space Sciences).

Authors are advised to identify and archive data related to their article in a repository widely used in the community, to preserve data for at least 5 years or to provide access using a clear process. If these conditions are not respected, the AGU reserves the right to refuse publication.
Authors are given very thorough instructions, with example templates for indicating the availability of data, types of recommended repositories, etc.

European Geosciences Union

EGU, which publishes 19 open access journals such as Geoscientific Model Development and Atmospheric Chemistry and Physics (ACP), was a signatory to the Berlin Declaration in 2003 and, in March 2020, endorsed the Open Access 2020 initiative. EGU has also identified several issues concerning big data and earth sciences.

GENERAL PUBLISHERS

The major publishers of scientific journals (Elsevier, Springer Nature, Taylor and Francis, etc.) as well as specialist publishers have introduced policies on access to and dissemination of data relating to articles published in their journals.

Springer Nature

Springer Nature (which publishes Nature Physics, Nature Chemistry, Nature Materials, Nature Photonics and La Rivista del Nuovo Cimento) requires, as part of its standard policy, a data availability statement for every article published. This statement must indicate what data are available, where they can be found and under what conditions, regardless of whether the data are original or re-used. Authors are strongly encouraged to deposit their data in a repository when they publish an article. The publisher also provides a list of recommended repositories for each discipline. For certain types of data (such as proteomics data, small molecule crystallography data or macromolecular structures), deposit in a public repository is mandatory, accompanied by a citation in the article including the permanent identifier of the dataset. Referees may request access to the data and code when evaluating the manuscript, if they find it necessary. When the data are too confidential to be shared publicly, the conditions of access to the data must nevertheless be stipulated in the article. Springer Nature has stated its objective of seeing all its publications adopt this standard policy. This is already the case for BMC, Nature and Springer Open publications. The process is still under way for the other Springer publications and for Palgrave Macmillan publications. Some journals may still have a less strict data policy than that described here.

Elsevier

“Our policy aims to encourage rather than force authors to share their research data”, the publisher states. Among potential repositories, Elsevier recommends its own platform, Mendeley Data, as well as about 60 other more specific repositories.
Elsevier systematically refers to a data availability statement, which authors are invited to provide. For the Journal of Computational Physics, co-submission to Data in Brief and to SoftwareX is also proposed. In chemistry, data can be highlighted in Chemical Data Collections.

Wiley

The publisher’s requirements vary greatly, from encouraging to requiring deposit, and even peer review of the data. Wiley has a partnership with Dryad, a repository which charges fees for depositing data. Some of its journals, in the medical or environmental science fields, cover the cost of deposit. Generally, a “data availability statement” is required from researchers, who can indicate whether their data is open in a repository, subject to embargo or available upon request. An FAQ gives more information about what is expected of authors regarding data. For details, the policy of each individual journal must be consulted.
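
The three situations a data availability statement can describe (open in a repository, under embargo, available on request) can be sketched as a small template function. This is only an illustration: the wording of the statements, the function name, its parameters and the example identifier are hypothetical, not official Wiley templates.

```python
def data_availability_statement(status, repository=None, identifier=None,
                                embargo_until=None):
    """Build an illustrative statement for one of three availability cases."""
    prefix = "The data that support the findings of this study"
    if status == "open":
        if not (repository and identifier):
            raise ValueError("open data needs a repository and an identifier")
        return f"{prefix} are openly available in {repository} at {identifier}."
    if status == "embargo":
        if not (repository and identifier and embargo_until):
            raise ValueError("embargoed data needs a repository, an identifier and an end date")
        return (f"{prefix} will be available in {repository} at {identifier} "
                f"after an embargo ending on {embargo_until}.")
    if status == "on_request":
        return f"{prefix} are available from the corresponding author upon reasonable request."
    raise ValueError(f"unknown status: {status!r}")


# Hypothetical identifier, for illustration only.
print(data_availability_statement(
    "open", repository="Dryad",
    identifier="https://doi.org/10.5061/dryad.example"))
print(data_availability_statement("on_request"))
```

In practice, each journal's own templates take precedence over wording of this kind.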

Angewandte Chemie International Edition and Chemistry - A European Journal both offer a fairly detailed user manual, organized by the type of data to supply (NMR, infrared, mass spectroscopy, catalysis, precise method used for energy conversion and storage, etc.).

By contrast, Polymers for Advanced Technologies provides little information on the subject, simply indicating that an article may not contain more than 5 graphics unless more are essential for the article to be understood; the others can be placed in “supplementary information”. Where possible, data should be made public, but this is not an obligation.

Taylor and Francis

T&F policies for data communication range from optional to required. As with Wiley, a “data availability statement” is requested from authors.
More extensive criteria exist for 6 Geoscience journals, where data deposits in suitable platforms are required. As with ACS, a partnership has been set up with Figshare to host “supplementary information”.
An examination of the policy terms of the chemistry journals shows two recurring features. On the one hand, the publisher urges researchers to provide a DOI for data sets associated with the article. On the other, it indicates that data is not peer reviewed, implying that it is the author’s responsibility to ensure the reliability of data.

EDP Sciences

EDP Sciences, a publisher that stresses its commitment to open science, presents its data sharing policy in the context of research funding: “EDP Sciences journals encourage authors to share and publish their data if it is legally and ethically possible.”
Authors are encouraged to deposit their data in an on-line repository so it is “available for human and machine reading, in order to contribute to the acceleration of scientific discovery”. They are also invited to deposit their data according to FAIR data principles and to provide a data availability statement. These recommendations apply to the following journals: Acta Acustica, Astronomy and Astrophysics, and Journal of Space Weather and Space Climate.

Cambridge University Press

Committed to the principles of transparency, the Journal of Fluid Mechanics, published by Cambridge University Press, sets out its research data requirements in its research transparency section: “All information required for reproducing the study must be supplied, either in the body of the document, or in repositories accessible to the public”.

A search engine for rating the transparency of journals for data

The Center for Open Science publishes an analysis grid on the transparency and openness of journals. This guide distinguishes 10 topics (Citation Standards, Data Transparency, Analysis Code Transparency, Research Materials Transparency, Design and Analysis Transparency, Study Preregistration, Analysis Plan Preregistration, Replication, Registered Reports and Publication Bias, Open Science Badges) and three levels of compliance. A search engine can be used to check how well a journal complies with these principles. It only takes two clicks to find that Nature Chemistry (Springer) scores 7 points, compared with only one point for Forensic Chemistry (Elsevier).
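
To make the scoring idea concrete, here is a minimal sketch that assumes a journal's score is simply the sum of the level reached on each of the 10 topics, with 0 meaning the topic is not addressed. That summation rule, the function name and the example ratings are assumptions made for illustration; the ratings do not describe any real journal.

```python
TOPICS = (
    "Citation Standards",
    "Data Transparency",
    "Analysis Code Transparency",
    "Research Materials Transparency",
    "Design and Analysis Transparency",
    "Study Preregistration",
    "Analysis Plan Preregistration",
    "Replication",
    "Registered Reports and Publication Bias",
    "Open Science Badges",
)
MAX_LEVEL = 3  # three levels, plus 0 for "not addressed" (assumption)


def transparency_score(levels):
    """Sum the level per topic; topics missing from `levels` count as 0."""
    unknown = set(levels) - set(TOPICS)
    if unknown:
        raise ValueError(f"unknown topics: {sorted(unknown)}")
    total = 0
    for topic in TOPICS:
        level = levels.get(topic, 0)
        if not 0 <= level <= MAX_LEVEL:
            raise ValueError(f"level out of range for {topic}: {level}")
        total += level
    return total


# Hypothetical ratings, for illustration only.
example_levels = {
    "Citation Standards": 1,
    "Data Transparency": 2,
    "Analysis Code Transparency": 2,
}
print(transparency_score(example_levels))  # prints 5
```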

  1. Miyakawa, T. No raw data, no science: another possible source of the reproducibility crisis. Mol Brain 13, 24 (2020). https://doi.org/10.1186/s13041-020-0552-2
  2. According to the authors of the study, this is because “journals with a high impact factor are likely to receive more attention from researchers and the media”, increasing pressure to comply with these measures.