Data Management

Guidelines for Data Management

These guidelines give recommendations on how to provide meaningful information about experiments, starting with the plant material used. Additionally, we have set up an ontology for grapevine organs, some of which are not present in general plant ontologies, along with recommendations for describing phenological stages. This will allow a more accurate and standardized description of grapevine biological samples and will support the grapevine research community in opening its data according to the FAIR principles.

Complementary information to support FAIR data management in the life sciences is available on the dedicated European collaborative portal, RDMkit (https://rdmkit.elixir-europe.org/), and on its page focusing on the plant sciences domain (https://rdmkit.elixir-europe.org/plant_sciences).

State of the Art

The grapevine research community generates large omics datasets to address major challenges in viticulture and oenology. These datasets are processed in the context of a single study trying to answer one biological question. The value of data from individual experiments is further enhanced when they are considered in a wider context through meta-analysis. Meta-analysis of these integrated datasets helps identify the mechanisms underlying interactions between plants, their environment and plant management techniques. However, interpreting and re-processing data requires high-quality metadata that provide appropriate context and, more generally, compliance with the FAIR principles.

There are several reasons behind developing specific guidelines for the grapevine research community:

Current guidelines and ontologies supporting data standardization cannot always be applied directly to grapevine, because it has specific traits that differ from those of model organisms. For that reason, the Integrape consortium (COST Action CA17111, funded by the Horizon 2020 framework of the European Union) has developed specifications for grapevine data.
Grapevine researchers generally have a background in biology, with widely varying levels of skill in computational biology. This also holds for the members of the grapevine research community who submit data to public archives such as ENA, even though these archives of the INSDC consortium are among the most cited tools in the community, both for data archival and for data reuse through the associated data portals.