Introduction
Plant metabolome analysis is now of crucial importance for describing
responses to environmental conditions and thus understanding the associated
metabolic mechanisms. Standardised metabolomics protocols have been
proposed to characterise crop metabolism (Zheng, Johnson, Mandal, &
Wishart, 2021). The term “metabolomics” is used to refer to techniques
exploited to investigate “small” biological molecules (metabolites),
i.e., extractible molecules with limited molecular weight, usually less
than 500 atomic mass units (a.m.u.) (Roessner & Bowne, 2009). Two
mainstream mass spectrometry techniques can be used: gas chromatography
coupled to mass spectrometry (GC-MS) and liquid chromatography coupled
to mass spectrometry (LC-MS) (Allwood et al., 2011; Perez de Souza,
Alseekh, Naake, & Fernie, 2019). The output of metabolomics is a
dataset of metabolic features (m/z metabolite peaks with retention time
in LC-MS; analytes resulting from metabolite derivatisation in GC-MS)
that can be used for statistical analysis and to detect metabolic changes
between samples.
GC-MS metabolomics (also sometimes referred to as metabolic profiling)
has been used extensively in plants under many conditions, species, or
genetic backgrounds: a simple search of a literature database with the
query keywords ‘plant’, ‘metabolomics’ and ‘gc-ms’ returns 43,700
entries, and 13,500 entries when restricted to Arabidopsis. This
illustrates the extensive use of GC-MS in plant physiology and
molecular biology. In particular, this technique is very useful to obtain,
in a single sample analysis, a relative quantitation of the most important
metabolites of plant primary metabolism, such as amino acids, small
soluble sugars, polyamines, or organic acids. It has thus been used to
describe the response of C and N primary metabolism to major
environmental cues, for example, herbivores (Jansen et al., 2009),
CO₂ mole fraction (Högy, Keck, Niehaus, Franzaring, & Fangmeier, 2010;
Misra & Chen, 2015), drought (Bowne et al., 2012;
Sanchez, Schwabe, Erban, Udvardi, & Kopka, 2012), nutrient conditions
(Cui, Abadie, Carroll, Lamade, & Tcherkez, 2019; Cui, Davanture,
Lamade, Zivy, & Tcherkez, 2021; Cui, Davanture, Zivy, Lamade, &
Tcherkez, 2019), or abiotic stress combinations (Ghatak, Chaturvedi, &
Weckwerth, 2018; Nakabayashi & Saito, 2015; Shulaev, Cortes, Miller, &
Mittler, 2008).
To date, the vast majority of GC-MS analyses for metabolic profiling
utilise nominal mass acquisition (i.e. at a.m.u. resolution).
Accordingly, databases associated with GC-MS metabolomics such as the
Golm Metabolomics Database (GMD) provide spectral data at a.m.u.
resolution, and in a recent review of metabolomics resources, only
nominal mass databases are discussed for GC-MS analyses (Vinaixa et al.,
2016). In other words, to our knowledge, there is no directly
accessible, comprehensive, high resolution (exact mass, i.e. at 0.0001
a.m.u. resolution or better) GC-MS resource for plant
metabolomics. This lack of a curated, accessible resource
for GC-MS analyses has three origins: (i) the availability of
(affordable) exact mass GC-MS instruments is relatively recent, since
the implementation of the orbitrap technology took place in the 2000s
(Makarov, 2000; Makarov, Denisov, & Lange, 2009; Peterson, McAlister,
Quarmby, Griep-Raming, & Coon, 2010) and standard
practices for high resolution GC-MS analyses were described only in 2021
(Misra, 2021); (ii) many ordinary applications of GC-MS
metabolomics profiling do not require exact mass resolution since they
target common, well-known compounds; and (iii) whenever
high resolution is required, LC-MS can be used. High
resolution may be important in full scan LC-MS analyses, because
there is a limited number of fragments (mainly the parent ion and adducts)
and therefore, identification essentially relies on both exact mass and
isotopic pattern (De Vos et al., 2007; Kind & Fiehn, 2006). By
contrast, in GC-MS analyses, the fragmentation pattern along with the
retention index are used to identify analytes, with generally good
accuracy. Several tools have recently been proposed to automatically
annotate ions or fragments in mass spectra, in particular from LC-MS
spectral data (e.g. Doerfler et al., 2014; Gaquerel, Kuhl, &
Neumann, 2013; Matsuda et al., 2011; Qiu, Fine, Wherritt, Lei, &
Sumner, 2016).
However, there are circumstances where high mass resolution may be
desirable with GC-MS, since (i) several compounds with the same
retention time could generate fragments with the same nominal mass,
(ii) it could be useful to distinguish isotopic species
(isotopologues) using their mass difference (for example, the
mass shift is +1.003355 Da with ¹³C while it is
+1.006277 Da with ²H), and (iii) one may wish
to perform untargeted GC-MS analyses with broad chemical coverage. Here,
we describe an exact mass GC-MS method for high resolution routine plant
metabolic profiling and provide the associated curated database, checked
with authentic standards. This allows us to address aspects (i)
and (ii) directly. We also provide a list of common compounds
that have similar nominal-mass fragments and similar retention times but can
be distinguished easily with exact mass, avoiding quantification errors.
We also take advantage of sulphur isotopes at natural abundance to allow
the identification of S-containing fragments in datasets. Finally, we
applied our protocol and database to Arabidopsis leaves to show
how they can be used on real samples, allowing facile differentiation
of genetic accessions.
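
As an illustration of point (ii), the short Python sketch below uses only the two mass shifts quoted above (+1.003355 Da for ¹³C, +1.006277 Da for ²H) to estimate the gap between the two M+1 isotopologue peaks and the approximate resolving power (m/Δm) needed to separate them. The function names, the example m/z values and the 0.0005 Da matching tolerance are hypothetical choices made for illustration, not part of the method described here, and singly charged ions are assumed.

```python
# Mass shifts quoted in the text (Da): 12C -> 13C and 1H -> 2H substitution.
SHIFT_13C = 1.003355
SHIFT_2H = 1.006277


def isotopologue_gap(shift_a=SHIFT_13C, shift_b=SHIFT_2H):
    """Absolute gap (Da) between two M+1 isotopologue peaks (singly charged ions)."""
    return abs(shift_b - shift_a)


def required_resolving_power(mz, gap):
    """Approximate resolving power m/(delta m) needed to separate peaks `gap` apart at `mz`."""
    return mz / gap


def classify_shift(observed_shift, tolerance=0.0005):
    """Assign an observed M+1 shift to '13C' or '2H'.

    `tolerance` (Da) is an illustrative assumption; it must stay below half
    the gap between the two shifts for the assignment to be unambiguous.
    Returns None when neither reference shift lies within the tolerance.
    """
    candidates = {"13C": SHIFT_13C, "2H": SHIFT_2H}
    label, reference = min(
        candidates.items(), key=lambda item: abs(item[1] - observed_shift)
    )
    return label if abs(reference - observed_shift) <= tolerance else None


if __name__ == "__main__":
    gap = isotopologue_gap()
    print(f"Gap between 13C and 2H isotopologues: {gap:.6f} Da")
    # Example fragment m/z values chosen arbitrarily for illustration.
    for mz in (100.0, 300.0, 500.0):
        power = required_resolving_power(mz, gap)
        print(f"m/z {mz:.0f}: resolving power of about {power:,.0f} needed")
```

The gap is about 0.0029 Da, so at nominal (1 a.m.u.) resolution both isotopologues fall in the same M+1 signal, whereas separating them for a fragment near m/z 300 calls for a resolving power on the order of 10⁵.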