Exact mass GC-MS database for routine analysis
The present curated database contains 336 compounds, 234 of which were
identified and quantified in Arabidopsis leaves. This might appear
relatively small compared with the total estimated number of small
metabolites (several thousand) in plants. However, this compares well
with most targeted routine GC-MS analyses for metabolic profiling, which
yield a list of about 80-100 metabolites in the vast majority of cases
(for example, 162 metabolites in Cui, Davanture, et al., 2019, and 178
in Cui et al., 2021, found in leaves). Of course, the
ability of instruments and software to extract a proper dataset from
raw data using the database depends on the quality of analyses. Indeed,
despite the considerable dynamic range of modern instruments
(here, 6 orders of magnitude in peak height), precise quantification can
only be carried out when analytes are not too concentrated (inadequate
peak shapes do not allow peak extraction by software such as Tracefinder®)
(Kaufmann & Walker, 2017). This can be challenging when some
metabolites are present in high amounts (e.g. sucrose or proline) while
others are present in trace amounts or generate a weak signal (e.g.
salicylamide) (Fig. 6). It should be noted that raw data can also be
processed via untargeted peak searching, which provides a
much more powerful way to assess the diversity of molecules present
in extracts (Perez de Souza et al., 2019). However, this has two
drawbacks: (i) processing time is very long (at least 20 times
slower with Tracefinder®), and (ii) many peaks would appear as
unidentified, with only the m/z value and retention time (and thus
post-hoc identification is required using exact mass and, potentially,
co-occurring fragments). Therefore, for routine analyses, it is probably
more convenient to rely on targeted analyses with the database we
propose here.
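
To illustrate the post-hoc identification step mentioned above, the following minimal sketch matches an unidentified feature (m/z and retention time) against database entries within a mass tolerance in ppm and a retention time window. It is only an illustration under stated assumptions: the names Entry, ppm_error and match_feature, the tolerance values, and the three database entries are hypothetical placeholders and are not taken from the database or from Tracefinder®.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    """One database record (hypothetical layout)."""
    name: str
    mz: float  # exact m/z of the reference ion
    rt: float  # retention time, in minutes


def ppm_error(observed: float, reference: float) -> float:
    """Relative mass error of an observed m/z, in parts per million."""
    return (observed - reference) / reference * 1e6


def match_feature(mz, rt, database, ppm_tol=5.0, rt_tol=0.1):
    """Return entries within both tolerances, best mass match first.

    ppm_tol and rt_tol are assumed values chosen for illustration only.
    """
    hits = [
        e for e in database
        if abs(ppm_error(mz, e.mz)) <= ppm_tol and abs(rt - e.rt) <= rt_tol
    ]
    return sorted(hits, key=lambda e: abs(ppm_error(mz, e.mz)))


# Placeholder entries: these values are invented, not real compounds.
database = [
    Entry("compound_A", 200.0000, 10.00),
    Entry("compound_B", 200.0100, 10.05),
    Entry("compound_C", 350.0000, 15.00),
]

if __name__ == "__main__":
    # compound_A lies within 5 ppm; compound_B is about 48 ppm away.
    for hit in match_feature(200.0004, 10.02, database):
        print(hit.name, round(ppm_error(200.0004, hit.mz), 2))
```

In such a scheme, a feature with no entry inside both windows stays unidentified, which corresponds to drawback (ii) above; co-occurring fragments could be added as a further matching criterion.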