Perspectives on tracking data reuse across biodata resources

Lookup NU author(s): Dr Phillip Lord

Licence

This work is licensed under a Creative Commons Attribution 4.0 International License (CC BY 4.0).


Abstract

© The Author(s) 2024. Published by Oxford University Press.

Motivation: Data reuse is a common and vital practice in molecular biology and enables the knowledge gathered over recent decades to drive discovery and innovation in the life sciences. Much of this knowledge has been collated into molecular biology databases, such as UniProtKB, and these resources derive enormous value from sharing data among themselves. However, quantifying and documenting this kind of data reuse remains a challenge.

Results: The article reports on a one-day virtual workshop hosted by the UniProt Consortium in March 2023, attended by representatives from biodata resources, experts in data management, and NIH program managers. Workshop discussions focused on strategies for tracking data reuse, best practices for reusing data, and the challenges associated with data reuse and tracking. Surveys and discussions showed that data reuse is widespread, but critical information for reproducibility is sometimes lacking. Challenges include costs of tracking data reuse, tensions between tracking data and open sharing, restrictive licenses, and difficulties in tracking commercial data use. Recommendations that emerged from the discussion include: development of standardized formats for documenting data reuse, education about the obstacles posed by restrictive licenses, and continued recognition by funding agencies that data management is a critical activity that requires dedicated resources.


Publication metadata

Author(s): Ross K, Bastian FB, Buys M, Cook CE, D'Eustachio P, Harrison M, Hermjakob H, Li D, Lord P, Natale DA, Peters B, Sternberg PW, Su AI, Thakur M, Thomas PD, Bateman A, Bateman A, Martin M-J, Orchard S, Magrane M, Ahmad S, Bowler-Barnett EH, Bye-A-Jee H, Denny P, Dogan T, Ebenezer T, Fan J, da Costa Gonzales LJ, Hussein A, Ignatchenko A, Insana G, Ishtiaq R, Joshi V, Jyothi D, Kandasaamy S, Lock A, Luciani A, Luo J, Lussi Y, Raposo P, Rice DL, Saidi R, Santos R, Speretta E, Stephenson J, Totoo P, Tyagi N, Vasudev P, Warner K, Zaru R, Wijerathne S, Ibrahim KT, Kim M, Marin J, Bridge AJ, Aimo L, Argoud-Puy G, Auchincloss AH, Axelsen KB, Bansal P, Baratin D, Batista Neto TM, Bolleman JT, Boutet E, Breuza L, Gil BC, Casals-Casas C, Coudert E, Cuche B, de Castro E, Estreicher A, Famiglietti ML, Feuermann M, Gasteiger E, Gehant S, Gos A, Gruaz N, Hulo C, Hyka-Nouspikel N, Jungo F, Kerhornou A, Le Mercier P, Lieberherr D, Masson P, Morgat A, Pedruzzi I, Pilbout S, Pourcel L, Poux S, Pozzato M, Pruess M, Redaschi N, Rivoire C, Sigrist CJA, Sundaram S, Sveshnikova A, Wu CH, Arighi CN, Chen C, Chen Y, Huang H, Laiho K, Lehvaslaiho M, McGarvey P, Natale DA, Ross K, Vinayaka CR, Wang Y, Zhang J

Publication type: Article

Publication status: Published

Journal: Bioinformatics Advances

Year: 2024

Volume: 4

Issue: 1

Online publication date: 25/04/2024

Acceptance date: 11/04/2024

Date deposited: 30/05/2024

ISSN (electronic): 2635-0041

Publisher: Oxford University Press

URL: https://doi.org/10.1093/bioadv/vbae057

DOI: 10.1093/bioadv/vbae057

Data Access Statement: The data underlying this article are available in the article and in its online supplementary material. Summaries of the survey results are available at: https://docs.google.com/forms/d/1j-VU2ifEKb9C-sW6l3ATB79dgHdRk5v_lESv2hawnso/viewanalytics (survey of data providers) and https://docs.google.com/forms/d/18WbJFutUd7qiZoEz-bOytFYXSfWFT61hVce0vjvIwIjk/viewanalytics (survey of users).


Funding

Funder name: National Institutes of Health

Funder reference: U24HG007822, U24HG007822-09S1
