Metabolite correlation networks
Correlations between metabolites can help to unravel the biological basis of variation caused by different environments or genetic backgrounds (Ursem et al. 2008). To understand the correlation between metabolite contents within the RIL sub-populations, and how their interaction is influenced by the nutritional maternal environment, pairwise Spearman correlation analysis was performed between the metabolites. For each environmental condition, correlation analysis was performed on all 118 detected metabolites and a correlation heatmap was generated (Figure S2, Table S5). The results showed that most of the unknown metabolites are highly correlated with annotated metabolites such as amino acids and organic acids, including TCA cycle intermediates. Only known metabolites that showed significant correlations (FDR ≤ 0.05) were selected for constructing correlation networks (Figure 5, Table 2). With the network approach, the correlation between metabolites within each sub-population as a result of similar genetic regulation can be visualised, while differences in metabolic patterns between the maternal environments could provide more insight into the influence of environment and G×E on the regulation of metabolites. Correlation networks have often been used in metabolomics studies (Morgenthal, Weckwerth & Steuer 2006; Steuer et al. 2003) to complement multivariate approaches that have been described previously (Graffelman & van Eeuwijk 2005). In our study, the correlation network for the HP maternal environment contains 395 significant correlations (edges) between 56 metabolites (nodes). The HP network had a higher density (0.256) than the LN network, which had 238 edges and 51 nodes (Table 2).
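The selection of edges described above (pairwise Spearman correlation followed by an FDR cut-off of 0.05) can be sketched as follows. This is a minimal illustration, not the pipeline used in the study: the source does not state which FDR procedure was applied, so Benjamini-Hochberg is an assumption here, and the function names and the toy p-values are hypothetical. In practice the p-value for each pair would come from a test on the correlation coefficient (for example `scipy.stats.spearmanr`, which returns both the coefficient and a p-value).

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next sorted value is tied with this one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def spearman_rho(x, y):
    """Spearman correlation, computed as the Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Toy p-values for four metabolite pairs (illustrative only, not study data).
toy_pvalues = {
    ("glycine", "malate"): 0.0001,
    ("alanine", "serine"): 0.004,
    ("fumarate", "succinate"): 0.03,
    ("threonine", "malate"): 0.2,
}
pairs = list(toy_pvalues)
fdr = dict(zip(pairs, benjamini_hochberg([toy_pvalues[p] for p in pairs])))

# Keep only pairs passing the FDR threshold; these become network edges.
edges = [p for p in pairs if fdr[p] <= 0.05]
```

With these toy values, three of the four pairs pass the threshold and would be drawn as edges; the fourth pair would be left out of the network.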
In general, the network for the HP environment showed higher values for several attributes, including the range of node degree, the number of nodes and edges, network density and the average number of neighbours, reflecting more metabolite connections and correlations (Table 2). This higher connectivity in the network could be related to the overall higher metabolic levels under this specific condition. In our study, dry seed metabolites were more highly connected under the HP condition than under LN, which indicates that the regulatory mechanisms under HP conditions induce several changes in metabolism. These metabolic changes could help plants to cope with sub-optimal growing conditions and may result in acclimation of the plant (Hochberg et al. 2013).
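The network attributes compared above can be computed directly from an edge list. The sketch below assumes the standard density definition for an undirected network without self-loops, 2E / (N(N - 1)); the source does not state the formula, but this definition reproduces the reported HP density from the reported node and edge counts. The function names and the toy edge list are hypothetical.

```python
from collections import defaultdict


def density(n_nodes, n_edges):
    """Density of an undirected network without self-loops (assumed definition)."""
    return 2 * n_edges / (n_nodes * (n_nodes - 1))


def network_attributes(edges):
    """Summarise an undirected edge list; only nodes with at least one edge count."""
    neighbours = defaultdict(set)
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)
    degrees = [len(v) for v in neighbours.values()]
    n_nodes = len(neighbours)
    n_edges = sum(degrees) // 2  # each edge contributes to two node degrees
    return {
        "nodes": n_nodes,
        "edges": n_edges,
        "density": density(n_nodes, n_edges),
        "average_neighbours": sum(degrees) / n_nodes,
        "degree_range": (min(degrees), max(degrees)),
    }


# Check against the counts reported for the two networks.
hp_density = density(56, 395)  # rounds to 0.256, matching the reported value
ln_density = density(51, 238)  # derived here from the reported counts

# Toy edge list (illustrative only, not study data).
toy = network_attributes([
    ("alanine", "glycine"),
    ("glycine", "serine"),
    ("serine", "alanine"),
    ("glycine", "malate"),
])
```

Because density normalises the edge count by the number of possible node pairs, the comparison between HP and LN is not driven by the difference in node number alone.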
The most highly connected metabolites in each condition can be found in Table S6. Under LN, mainly amino acids are highly correlated with each other and thus could be predominantly involved in metabolic changes due to LN conditions (Figure 5A). Under the HP maternal condition, however, in addition to amino acids such as alanine, glycine, serine and threonine, some of the TCA cycle intermediates, including malate, fumarate and succinate, are also highly connected (Figure 5B). In both environments we observed strong correlations between metabolites within the same category, such as amino acids. Such consistent correlations across both environments suggest that these metabolites are mainly under genetic control and are not much influenced by the environment or G×E interactions. Under HP conditions, glycine showed a strong correlation with malate, one of the TCA cycle intermediates (R = 0.6, FDR = 0.00021), whereas this correlation was absent from the LN network. Such different network topologies indicate a strong environmental effect on the correlation between these metabolites. These examples show that correlation networks, and the differences between them, may provide important information for understanding the molecular basis of metabolic changes (Schauer et al. 2006).
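The distinction drawn above, between correlations present in both networks and correlations specific to one environment, amounts to a set comparison of undirected edges. The sketch below is only an illustration: apart from the glycine-malate edge, which the text reports for HP but not LN, the edge lists are toy values and the helper name is hypothetical.

```python
def undirected(edges):
    """Represent each edge as a frozenset so that (a, b) equals (b, a)."""
    return {frozenset(edge) for edge in edges}


# Toy edge lists (illustrative only); glycine-malate is placed in HP alone,
# as described in the text.
hp_edges = undirected([
    ("glycine", "malate"),
    ("alanine", "serine"),
    ("glycine", "threonine"),
])
ln_edges = undirected([
    ("serine", "alanine"),
    ("glycine", "threonine"),
])

shared = hp_edges & ln_edges   # present in both maternal environments
hp_only = hp_edges - ln_edges  # present under HP but not under LN
ln_only = ln_edges - hp_edges  # present under LN but not under HP
```

Edges in `shared` correspond to the correlations interpreted as being mainly under genetic control, while `hp_only` and `ln_only` hold the environment-specific correlations.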