Reference Gene Catalogue and Nomenclature Recommendations

The scope of this section is 1) to offer a unified repository of functionally characterized genes and described gene families, with reference to gene names, synonyms and publications, and 2) to remind the grapevine community of the nomenclature guidelines recommended for naming genes and assigning symbols.

To provide a high-quality and highly accessible annotation of grapevine genes, the International Grapevine Genome Project (IGGP) commissioned an international Super-Nomenclature Committee for Grape Gene Annotation (sNCGGA). The committee developed a standard nomenclature for locus identifiers and conventions for a grapevine gene naming system, focusing on: (i) providing a common annotation platform that enables community-based gene curation, and (ii) developing a gene nomenclature scheme that reflects the biological features of gene products and is consistent with that used in other organisms, in order to facilitate comparative analyses.
The recommendations are published in Grimplet et al., 2014.

Erratum:
 
In the manuscript, the link to the VIV catalog for species abbreviations is incorrect. Follow this one instead.

Highlights:

1- Check whether your gene has already been assigned a Full Name or Symbol in earlier work describing the whole gene family to which it belongs. You can still name it according to the phenotype or process in which it is involved and that you have demonstrated experimentally, but it is recommended to search for these earlier names and keep them as Synonyms.

2- A proper nomenclature needs to consider the level of confidence in the function assigned to the full name (e.g. experimental validation in the same or another species, a role proposed by phylogenetic analysis, a hypothetical function based on similarity to sequences from other species, etc.). See Figure 2 in Grimplet et al., 2014.

3- Use the convention for functional names and symbols proposed by the sNCGGA. See Figure 3 in Grimplet et al., 2014. Briefly:
  • Gene names based on complementation assays in mutants of other plant species are allowed. However, bear in mind that this does not necessarily mean the role is exactly the same in grapevine (there may be more roles or additional ‘evolved’ functions).
  • Gene names based on overexpression phenotypes in other plant species: same recommendations as in the previous point.
  • Gene names based on overexpression, knock-out or silencing phenotypes in grape: OK
  • Gene names based on a QTL/mapped trait: OK
  • Gene names based on phylogenetic trees: we recommend performing phylogenetic analyses using the complete gene family identified in grape and Arabidopsis (Maximum Likelihood is preferred over Neighbour Joining). A grape gene can be named after its Arabidopsis homologue if that relationship is closer than to any other homologue/paralogue (one-to-one orthologues should be considered: the Vitis gene is the closest among all Vitis genes to the Arabidopsis gene, and vice versa). Please choose the methods carefully (search for the best-fit model; consider only bootstrap values >70 or Bayesian probabilities >0.8). Letters can be added to the name (name1a, name1b…) if several grape genes show closest homology to one Arabidopsis gene.
  • Ordering according to chromosome position should be avoided, because it is invalidated each time the genome assembly changes or new members of the family are discovered.
  • The prefix Vvi should be used for Vitis vinifera (the Vv prefix was originally created for the bacterium Vibrio vulnificus). For other grape species use this list.
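
The phylogeny-based naming rule can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `propose_symbols`, the tuple layout and the placeholder loci and names are hypothetical, the tree building and the reciprocal closest-homologue check are assumed to have been done beforehand, and the order in which letters are handed out is simply the input order, which the guidelines do not prescribe.

```python
from collections import defaultdict
from string import ascii_lowercase

MIN_BOOTSTRAP = 70  # guideline: consider only bootstrap values >70


def propose_symbols(closest_hits, prefix="Vvi"):
    """Propose symbols from (grape_locus, arabidopsis_symbol, bootstrap) tuples.

    Each tuple is assumed to describe the closest Arabidopsis homologue of a
    grape gene, taken from a tree the caller has already built and checked.
    Input order decides which gene receives 'a', 'b', ...
    """
    groups = defaultdict(list)
    for grape_locus, ath_symbol, bootstrap in closest_hits:
        if bootstrap > MIN_BOOTSTRAP:  # weakly supported relations get no name
            groups[ath_symbol].append(grape_locus)

    symbols = {}
    for ath_symbol, loci in groups.items():
        if len(loci) == 1:
            # One grape gene per Arabidopsis gene: reuse the name as-is.
            symbols[loci[0]] = f"{prefix}{ath_symbol}"
        else:
            if len(loci) > len(ascii_lowercase):
                raise ValueError(f"too many grape genes for {ath_symbol}")
            # Several grape genes share one Arabidopsis homologue: add letters.
            for letter, grape_locus in zip(ascii_lowercase, loci):
                symbols[grape_locus] = f"{prefix}{ath_symbol}{letter}"
    return symbols


# Placeholder loci and names, not real genes.
hits = [
    ("locus_A", "NAME1", 95),
    ("locus_B", "NAME1", 88),
    ("locus_C", "NAME2", 99),
    ("locus_D", "NAME3", 55),  # below the support threshold, left unnamed
]
print(propose_symbols(hits))
```

Genes left without a proposed symbol would still need a name justified by another line of evidence, such as a phenotype or a mapped trait.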

4- There are 3 required identifiers for a gene:
  • Locus Identifier (Locus ID): the unique identifier of the gene in the genome. This identifier is not intended to be related to a physical position on the chromosome.
  • Full Name and Symbol: these describe the functional role of the protein encoded by the gene. The Symbol is a short abbreviation of the Full Name.
To deal with pre-existing naming schemes we propose adding synonyms. These correspond to other names that have been encountered in the literature; they can be symbols or full names. For example:

| | Locus ID (V1) | Locus ID (V.Cost) | Full name | Curation | Prefix + Symbol | Synonyms |
|---|---|---|---|---|---|---|
| Example 1 | VIT_04s0008g05210 | Vitvi04g00464 | (Vitis vinifera) ELONGATED HYPOCOTYL5 | validated (Loyola et al., 2016) | VviHY5 | bZIP10 (Liu et al., 2014) |
| Example 2 | VIT_03s0180g00200 | Vitvi03g00557 | (Vitis vinifera) Resveratrol Glycosyltransferase, putative | validated in V. labrusca (Hall & Luca, 2007) | VviRsGT1 | GT9 (Bönisch et al. 2014) |
| Description | Genome localization in V1 annotation | Genome localization in V3 annotation | Relative descriptive function | Level of curation | Vvi + concise (3-10 characters), descriptive of function when possible | Any known synonyms |
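
As a rough illustration of how the required identifiers could be held and sanity-checked in a script, the sketch below defines a small record type. Everything in it is an assumption made for illustration: the `GeneRecord` class and its field names are hypothetical, the two locus ID patterns are inferred only from the example rows above, and the 3-10 character range is applied to the part of the symbol after the Vvi prefix, which is one possible reading of the guideline.

```python
import re
from dataclasses import dataclass, field

# Patterns inferred only from the two example rows; treat them as assumptions.
V1_ID = re.compile(r"VIT_\d{2}s\d{4}g\d{5}")
VCOST_ID = re.compile(r"Vitvi\d{2}g\d{5}")
PREFIX = "Vvi"


@dataclass
class GeneRecord:
    locus_id_v1: str
    locus_id_vcost: str
    full_name: str
    symbol: str
    synonyms: list = field(default_factory=list)

    def problems(self):
        """Return a list of human-readable issues (empty if none were found)."""
        issues = []
        if not V1_ID.fullmatch(self.locus_id_v1):
            issues.append("unexpected V1 locus ID format")
        if not VCOST_ID.fullmatch(self.locus_id_vcost):
            issues.append("unexpected V.Cost locus ID format")
        if not self.full_name.strip():
            issues.append("missing full name")
        if not self.symbol.startswith(PREFIX):
            issues.append("symbol lacks the Vvi prefix")
        elif not 3 <= len(self.symbol) - len(PREFIX) <= 10:
            # Assumption: the length range applies to the part after the prefix.
            issues.append("symbol body is not 3-10 characters long")
        return issues


# Example 1 from the table above.
hy5 = GeneRecord(
    "VIT_04s0008g05210", "Vitvi04g00464", "ELONGATED HYPOCOTYL5", "VviHY5", ["bZIP10"]
)
print(hy5.problems())
```

A record that reports no problems is only well-formed; whether the name is justified still depends on the evidence behind it.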

The INTEGRAPE Cost Action has unified several independent efforts to build a catalogue of characterized genes and surveyed gene families. Their gene symbols were included in the latest 12X.2 assembly V.Cost gff annotation file (download: VCost.v3_INTEGRAPEv2.gff3) and will also be present in the annotation of the latest PN40024 40X assembly (under construction).
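
For readers who want to pull gene-level annotations out of a GFF3 file such as the one mentioned above, here is a minimal sketch that relies only on the general GFF3 layout (nine tab-separated columns, with `key=value` pairs in the last one). Which attribute of the INTEGRAPE file actually carries the gene symbol is not stated here, so the `Name` attribute, the sample line and the helper names `parse_attributes` and `gene_attributes` are all hypothetical.

```python
import io
from urllib.parse import unquote


def parse_attributes(column9):
    """Split a GFF3 attributes column ('key=value;key=value') into a dict."""
    attributes = {}
    for pair in column9.strip().split(";"):
        if "=" in pair:
            key, value = pair.split("=", 1)
            # GFF3 percent-encodes reserved characters inside attribute values.
            attributes[unquote(key)] = unquote(value)
    return attributes


def gene_attributes(handle):
    """Map the ID of each 'gene' feature to its attribute dict."""
    genes = {}
    for line in handle:
        if line.startswith("#") or not line.strip():
            continue  # skip directives, comments and blank lines
        columns = line.rstrip("\n").split("\t")
        if len(columns) != 9 or columns[2] != "gene":
            continue
        attributes = parse_attributes(columns[8])
        if "ID" in attributes:
            genes[attributes["ID"]] = attributes
    return genes


# Made-up line for illustration: ID, coordinates and 'Name' are all invented.
sample = io.StringIO(
    "##gff-version 3\n"
    "chr00\texample\tgene\t100\t900\t.\t+\t.\tID=gene0001;Name=VviEXAMPLE1\n"
)
genes = gene_attributes(sample)
print(genes["gene0001"])
```

With the real file, the same function could be called on an opened file handle; inspecting a few of the returned dictionaries would show which attribute holds the symbols.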

Note on Catalogue’s Structure. Each row in the catalogue corresponds to a unique V.Cost.v3 ID. Where there is recent evidence that a V.Cost.v3 gene model splits into two genes, a composite gene symbol is provided.

Access the full catalogue here (version 2.0; last update: 19 October 2021; version 3 will be released in December 2021):

(Excel)⇒ catalogue.INTEGRAPEv2.xlsx

Gene Cards. This app serves as a hub for gene functional associations, presented as information cards with references and all the information from the most recent version of the catalogue. It also includes an interactive organ expression viewer covering all available SRA runs.

Developed by David Navarro-Payá & José Tomás Matus (Coordinators of the Grape Gene Reference Catalogue Initiative).

Validation Levels 

| Validation | Level of experimental validation |
|---|---|
| Hypothetical: only based on similarity to other proteins (e.g. BLAST, Hidden Markov Models/PFAM search). | 1 |
| Putative: complete family identification (using phylogenetic trees and including other species) and/or expression data/validation (RNA-seq, qPCR, others). | 2 |
| Proposed: correlation experiments such as gene co-expression networks, correlation to metabolite profiles. | 3 |
| Candidate: QTL mapping. | 4 |
| Validated_other_sp.: If transcription factor: overexpression or negative dominant (in transient, stable experiments) in another species. | 5 |
| Validated: knock-out/loss-of-function, silencing, overexpression or negative dominant (transient, stable) in the same species. If enzyme: in vitro/recombinant enzyme characterization (e.g. E. coli/yeast) or overexpression of enzyme (transient, stable) in another species. | 6 |
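
When several lines of evidence exist for one gene, the numeric levels make it easy to keep track of the strongest one. The sketch below encodes the six levels as an enumeration; the names `ValidationLevel`, `strongest` and `parse_level` are hypothetical, and treating a higher number as stronger evidence is an assumption based on the order of the table.

```python
from enum import IntEnum


class ValidationLevel(IntEnum):
    HYPOTHETICAL = 1        # similarity to other proteins only
    PUTATIVE = 2            # family identification and/or expression data
    PROPOSED = 3            # correlation experiments (e.g. co-expression)
    CANDIDATE = 4           # QTL mapping
    VALIDATED_OTHER_SP = 5  # transcription factor tested in another species
    VALIDATED = 6           # functional validation (see table for enzymes)


def strongest(levels):
    """Return the highest level in a non-empty iterable of ValidationLevel."""
    return max(levels)


def parse_level(label):
    """Convert a label such as 'Validated_other_sp.' to a ValidationLevel."""
    key = label.strip().rstrip(".").upper()
    return ValidationLevel[key]


evidence = [parse_level("Putative"), parse_level("Candidate")]
print(strongest(evidence).name)
```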

IMPORTANT NOTICE: If you have recently characterized a gene or described a gene family (already published or accepted for publication), you can incorporate it into our catalogue.

Please fill out the following form. 

Make sure to consider the levels of evidence/validation for gene functions listed under Validation Levels above.

Provider

José Tomás Matus / David Navarro-Payá

Primary contact: tomas.matus (at) uv.es / davidnp7 (at) gmail.com