2.1 An overview of the main concepts of CBM
Biological systems such as cellular metabolism are constrained by physicochemical laws, genetics, and the extracellular environment [34]. The most fundamental constraints on metabolism are the mass balance equations for each intracellular metabolite, which are derived from the stoichiometry of the biochemical reactions. Genome-scale network reconstructions are assembled from all known metabolic reactions within the system of interest and are further refined with additional information, such as gene-protein-reaction (GPR) associations [35]. A valuable manual protocol describes how to generate a high-quality genome-scale metabolic reconstruction from genome sequencing data and how to curate the model with empirical information [12]. A genome-scale network reconstruction can then be converted into a mathematical format. This mathematical representation of the reconstructed network, together with further details such as GPR associations, is called a genome-scale metabolic model (GEM). It enables the quantitative and qualitative analysis of GEMs with computational approaches such as constraint-based modeling (CBM) [36].
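As a minimal sketch of what converting a reconstruction into a mathematical format can look like, the example below builds a matrix of stoichiometric coefficients (one row per metabolite, one column per reaction) from a small reaction list and evaluates GPR rules against a set of available genes. The reactions, gene names, and the nested-tuple encoding of GPR rules are hypothetical and chosen only for illustration; real reconstructions use dedicated formats and tools.

```python
# Hypothetical three-reaction reconstruction. Negative coefficients are
# consumed metabolites, positive coefficients are produced metabolites.
# GPR rules are encoded as a gene name or a nested ("and"/"or", ...) tuple;
# this encoding is an assumption made for this sketch.
REACTIONS = {
    "R1": {"stoich": {"A": -1, "B": 1}, "gpr": ("or", "g1", "g2")},   # isozymes
    "R2": {"stoich": {"B": -1, "C": 1}, "gpr": ("and", "g3", "g4")},  # complex
    "R3": {"stoich": {"A": -1, "C": 1}, "gpr": "g5"},                 # one gene
}


def build_stoichiometric_matrix(reactions):
    """Return (metabolite ids, reaction ids, matrix as a list of rows)."""
    metabolites = sorted({m for r in reactions.values() for m in r["stoich"]})
    reaction_ids = list(reactions)
    matrix = [
        [reactions[r]["stoich"].get(m, 0) for r in reaction_ids]
        for m in metabolites
    ]
    return metabolites, reaction_ids, matrix


def gpr_is_active(rule, present_genes):
    """Evaluate a GPR rule: 'and' needs every operand, 'or' needs at least one."""
    if isinstance(rule, str):
        return rule in present_genes
    operator, *operands = rule
    results = [gpr_is_active(operand, present_genes) for operand in operands]
    if operator == "and":
        return all(results)
    if operator == "or":
        return any(results)
    raise ValueError(f"unknown GPR operator: {operator!r}")


if __name__ == "__main__":
    metabolites, reaction_ids, matrix = build_stoichiometric_matrix(REACTIONS)
    print("columns:", reaction_ids)
    for metabolite, row in zip(metabolites, matrix):
        print(metabolite, row)
    genes = {"g2", "g3", "g5"}
    for reaction_id, reaction in REACTIONS.items():
        print(reaction_id, "active:", gpr_is_active(reaction["gpr"], genes))
```

Keeping the GPR rules next to the stoichiometry makes it possible to translate a gene-level change, such as a deletion, into the set of reactions that can no longer carry flux.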
Metabolic flux analysis (MFA) and flux balance analysis (FBA) are the two main CBM methods that aim to determine the reaction fluxes (fluxomics) within the metabolic network (Figure 1). Both methods use a stoichiometric matrix (S) of size m × n to calculate the metabolic flux distribution. In S, each of the m rows represents a metabolite and each of the n columns represents a metabolic reaction. Under the steady-state condition, the mass balance equation is therefore S·v = 0, where the vector v contains the metabolic fluxes, some of which are known and some unknown. MFA is a data-driven method that determines reaction fluxes from experimental measurements. While MFA is useful for small-scale networks, FBA is suited to analyzing large-scale networks such as genome-scale metabolic networks [37]. FBA is an optimization method that searches the solution space and optimizes one or more objective functions, such as growth rate or metabolite production, via linear programming [11]. FBA calculates one or more optimal flux distributions in the GEM, each representing a ‘state’ of the metabolic network that relates to the physiological function generated by the network [38]. However, mass balance constraints alone do not define a unique solution: the system is typically underdetermined, so multiple optimal solutions (i.e., flux vectors) can be obtained. Additional constraints, such as flux capacity, thermodynamic feasibility, and gene expression, are therefore imposed to shrink the solution space [39]. Moreover, MFA can be combined with FBA to determine internal metabolic fluxes and thereby increase predictive power [40]. Variants such as dynamic FBA and dynamic MFA can also be used, depending on the aim of the research [41, 42]. Beyond FBA and MFA, other CBM approaches can be used for rational strain design and to increase product yield.
These FBA-based methods aim to identify gene deletion and addition targets and up- and down-regulation targets, to integrate data, and to suggest appropriate strategies for increasing productivity [43]. These computational methods can also be used in fermentation optimization. For example, up- and down-regulation targets have been used to identify enzyme activators and inhibitors for enhancing the production bound in a regulatory-defined medium (RDM) [44]. The COBRA Toolbox in MATLAB and COBRApy in Python are two platforms for applying FBA and related algorithms to GEMs [45, 46].
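The linear programming formulation of FBA and a simple deletion screen can be illustrated on a toy network. The sketch below is not the COBRA Toolbox or COBRApy workflow; it calls SciPy's general-purpose `linprog` solver directly, and the network, reaction names, and flux bounds are hypothetical values chosen for illustration.

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical toy network (not taken from any published model).
# Rows are metabolites A, B, C; columns follow the order of REACTIONS.
REACTIONS = ["uptake", "A_to_B", "A_to_C", "C_to_B", "C_export", "biomass"]
S = np.array([
    [1, -1, -1,  0,  0,  0],   # A
    [0,  1,  0,  1,  0, -1],   # B
    [0,  0,  1, -1, -1,  0],   # C
], dtype=float)

# Assumed flux capacity constraints (lower, upper), one pair per reaction.
BOUNDS = [(0, 10), (0, 6), (0, 1000), (0, 3), (0, 1000), (0, 1000)]


def run_fba(S, bounds, objective):
    """Maximize the flux through `objective` subject to S·v = 0 and the bounds."""
    c = np.zeros(S.shape[1])
    c[REACTIONS.index(objective)] = -1.0   # linprog minimizes, so negate
    result = linprog(c, A_eq=S, b_eq=np.zeros(S.shape[0]), bounds=bounds)
    if not result.success:
        raise RuntimeError(result.message)
    return -result.fun, result.x


def knock_out(bounds, reaction):
    """Return a copy of the bounds with one reaction forced to carry no flux."""
    new_bounds = list(bounds)
    new_bounds[REACTIONS.index(reaction)] = (0, 0)
    return new_bounds


if __name__ == "__main__":
    wild_type, fluxes = run_fba(S, BOUNDS, "biomass")
    print(f"wild type biomass flux: {wild_type:.2f}")
    for name, value in zip(REACTIONS, fluxes):
        print(f"  {name}: {value:.2f}")
    for target in ("A_to_B", "C_to_B"):
        mutant, _ = run_fba(S, knock_out(BOUNDS, target), "biomass")
        print(f"biomass flux after deleting {target}: {mutant:.2f}")
```

In this toy network the two routes into B are capped at 6 and 3 flux units, so the biomass flux is limited to 9 while the uptake reaction can carry 10. The surplus can leave through `C_export` in varying amounts, which gives several flux vectors with the same optimal objective value, a small instance of the multiple optimal solutions discussed above.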
Another efficient approach to increasing the predictive power of CBM models is the integration of omics data with GEMs. Omics data can be used both to narrow the solution space in FBA and to evaluate and validate model predictions [6]. Integrating omics data with GEMs yields context-specific models that provide the basis for studying metabolism under different conditions [47, 48]. Because CBM assumes that the system is at steady state, substrate concentrations, time, and kinetic parameters are not taken into account when calculating the metabolic fluxes. Therefore, the fluxes predicted by CBMs are less accurate than those obtained by solving the ordinary differential equations (ODEs) of kinetic models. As the solution space becomes tighter, the FBA solutions approach the kinetic model solution. Thus, integrating omics data can help overcome the limitations of CBM relative to kinetic models. However, data integration remains a major challenge, and existing methods do not perform at the expected level [49].
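How additional data can narrow the solution space may be sketched by computing the feasible range of a flux before and after tightening the bounds. The example below assumes a hypothetical rule in which a relative expression level between 0 and 1 scales a reaction's upper bound; this rule, the expression values, and the toy network are illustrative assumptions and do not reproduce any specific integration method from the cited works. SciPy's `linprog` is used as a generic linear programming solver.

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical toy network: rows are metabolites A, B, C.
REACTIONS = ["uptake", "A_to_B", "A_to_C", "B_export", "C_export"]
S = np.array([
    [1, -1, -1,  0,  0],   # A
    [0,  1,  0, -1,  0],   # B
    [0,  0,  1,  0, -1],   # C
], dtype=float)

# Assumed default capacity for every reaction before any data are used.
BASE_BOUNDS = {name: (0.0, 10.0) for name in REACTIONS}

# Hypothetical relative expression levels; unmeasured reactions are omitted.
EXPRESSION = {"A_to_B": 0.3, "A_to_C": 0.5}


def apply_expression(bounds, expression):
    """Scale the upper bound of each measured reaction by its expression level."""
    scaled = dict(bounds)
    for name, level in expression.items():
        lower, upper = bounds[name]
        scaled[name] = (lower, upper * level)
    return scaled


def flux_range(bounds, reaction):
    """Return the (min, max) steady-state flux that `reaction` can carry."""
    c = np.zeros(len(REACTIONS))
    c[REACTIONS.index(reaction)] = 1.0
    limits = [bounds[name] for name in REACTIONS]
    b_eq = np.zeros(S.shape[0])
    low = linprog(c, A_eq=S, b_eq=b_eq, bounds=limits)     # minimize the flux
    high = linprog(-c, A_eq=S, b_eq=b_eq, bounds=limits)   # maximize the flux
    if not (low.success and high.success):
        raise RuntimeError("linear program did not solve")
    return low.fun, -high.fun


if __name__ == "__main__":
    constrained = apply_expression(BASE_BOUNDS, EXPRESSION)
    for name in REACTIONS:
        before = flux_range(BASE_BOUNDS, name)
        after = flux_range(constrained, name)
        print(f"{name}: before {before}, after {after}")
```

Although only two reactions receive expression-derived bounds, the steady-state constraint propagates them: the uptake flux, which has no measurement of its own, can no longer exceed the combined capacity of the two downstream branches.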