Guest Post — The Door to Data Sharing is Slowly Creaking Open

Editor’s Note: Today’s post is by Simon Linacre. Simon is Head of Content, Brand & Press at Digital Science, which invests in, nurtures and supports innovative businesses and technologies that aim to make all parts of the research process more open, efficient and effective.

Following August’s announcement by the White House Office of Science and Technology Policy (OSTP) to extend public access to government-funded research in the US, progress towards open access, open data, and open science as a whole has perhaps never felt more tangible for those working in scholarly communications. But to what extent are the researchers at the center of this transition embracing the change, specifically around data sharing? And how does this vary across generations of researchers, nationalities, and funders?

As the recent Scholarly Kitchen post by Dylan Ruediger highlighted, data sharing practices were largely ignored in the commentary that followed the OSTP announcement. Since 2016, Figshare, a data and paper repository provider and a Digital Science company (full disclosure: Digital Science is my employer), together with Springer Nature, has distributed a survey to thousands of researchers globally to try to understand their attitudes to open data sharing and what they do with their own and with others’ data. The results of the 2022 survey were released on October 13th [DOI:10.6084/m9.figshare.21276984], with over 5,400 respondents, the largest number since 2019.

The headline from the 2022 survey is that researchers are moving towards open data… but slowly. It is apparent that huge challenges and misunderstandings remain for all researchers: knowledge of initiatives such as FAIR data is patchy, and incentives for data sharing are often misaligned with researchers’ motivations. Indeed, perhaps most worrying of all is the finding that data management activities as a whole appear to be at a relatively low level.

For publishers, and the wider scientific communications community, there are a number of specific takeaways that will have a direct influence on strategy development over the next few years as the impact of the OSTP memo plays out, and other initiatives such as Plan S become more embedded:

- In relation to open access (OA), respondents were most supportive of making research articles OA as a common scholarly practice (88% agreeing) and least supportive of making preprinting a common scholarly practice (58%); almost four in five agreed that making research data openly available should be common practice.
- The main circumstance that would motivate respondents to share their data openly is citation of their research papers, with two thirds selecting this option; increased impact and visibility of their papers, some form of public benefit, and journal/publisher mandates followed close behind.
- Data sharing builds trust in the quality of research. Almost two-thirds of respondents wrote that the primary way that they themselves benefited from other researchers sharing data was to validate findings.
- Just over half of respondents had at least some understanding of data management plans, but considering that nearly three-quarters of respondents are sharing their data during publication, it may be possible to make that data sharing activity more impactful with training and awareness efforts around data management planning.
- With nearly three-quarters (71%) of researchers still primarily keeping data on their personal hard drives, data loss due to insufficient infrastructure or use of that infrastructure remains a concern. A third of respondents use personal cloud storage. These results suggest that many researchers are underserved by data infrastructure.

For publishers in particular, the results of the survey point to a number of pressure points they can focus on to ease the challenge of open data sharing for their authors. By understanding authors’ motivations in this area, publishers can design initiatives that satisfy authors’ needs around citations, visibility, and meeting mandates, which in turn can encourage submissions to their journals as well as repeat authorship. Specifically on the OSTP memo, promoting the actions the guidance requires and the benefits it brings will be key. Clearly communicating the mandate and the pathways available will also be of utmost importance, given how little data management planning appears to be taking place.

Looking over these findings, one question is whether there could be any sample bias considering how strong the support appears to be for open science, as many people in academic publishing will have found the opposite to have often been the case. Data scientists at Springer Nature, who have analyzed the survey, agree there may be a bias, as the survey generates responses using convenience sampling, where participants are selected based on their availability and their willingness to take part. This is one reason the survey team adopted latent-class analysis to show trends within groups of respondents with differing attitudes towards open science. For example, the data shows that those who were actively disengaged were significantly more likely than advocates to indicate having made a data management plan as a direct result of a funder requirement (62% vs 45% respectively), which highlights the difference in what drives participation in open data practices. While there is likely to be some bias, then, it can be mitigated. Quantifying the level of bias would be very challenging, as it would require a collection of similar studies for comparison, which would be difficult to find.

Comparing the 2022 results with those of previous years reveals some interesting trends, particularly around FAIR data principles, which aim to ensure data are findable, accessible, interoperable and reusable. In the 2022 survey, 73% of respondents indicated an awareness of FAIR data principles, compared to 66% in 2021 and just 35% in 2018. The share of respondents reporting actual familiarity with the principles has also grown over the same period (from 15% in 2018 to 35% in 2022).

Looking to the future, it is interesting to dive deeper into researchers’ perceived incentives for sharing data. Overall, just 19% of respondents believed that researchers get sufficient credit for sharing data, while fully three-quarters indicated they receive too little credit. Those who report more ingrained habits of sharing their research data openly were more likely to agree that researchers get sufficient credit for sharing data; for example, 40% of those who share their data immediately on collection believe that researchers get sufficient credit. Even so, they are still in the minority. While causation shouldn’t be inferred, an interesting avenue to explore further would be to ask whether researchers who share their data get a better experience of credit, or whether those who are already unconvinced about credit just don’t share at all.

We have seen that the four main motivations for sharing data are citations, impact/visibility, public benefit, and mandates, and yet fewer than a fifth of respondents believe sufficient credit is given for data sharing. The recent OSTP memo supporting open data sharing may provide additional motivation for US researchers and those further afield, and in so doing start to accelerate the gradual trend towards open data sharing. Watch this space in 2023.
