5. Challenges and perspectives
It is now clear that fermentation optimization and control are needed to achieve more efficient and economical bioprocesses. Accordingly, different holistic strategies have been developed to obtain specifically tailored fermentation parameters. In this review, we briefly summarized the applications of two important modeling methods for analyzing and optimizing fermentation parameters: (a) constraint-based modeling (CBM) and (b) machine learning (ML). The former is a mechanistic method that aims to provide insight into cellular metabolism. The latter is a data-driven approach in which an algorithm learns from previous data and makes predictions with minimal human intervention. ML can be used to analyze fermentation parameters directly or to draw inferences from high-dimensional omics datasets. Furthermore, recent investigations have shown that ML can be integrated with CBM to improve predictive power and yield more biological insight. However, despite the advances in CBM, ML, and combined CBM-ML applications in fermentation analysis and optimization, several limitations remain.
Here we review some of the major challenges in developing CBM and ML models. CBM estimates the metabolic flux distribution under the assumption of an optimal steady state, which holds well only under idealized conditions and over limited time scales. Moreover, it does not account for metabolite concentrations, enzyme kinetics, or regulation. Therefore, CBM cannot always predict metabolic fluxes accurately [11]. Kinetic models of metabolism, which can overcome these shortcomings, have been successfully developed to analyze the effects of genetic and environmental factors at the genome scale [30]. However, the difficult and computationally expensive methods they require have limited the performance of kinetic models in describing the intracellular environment [123]. An alternative to kinetic models is to use omics and multi-omics datasets to generate context-specific metabolic models and derive meaningful insights [49, 124]. To this end, several algorithms have been developed with different assumptions and predictive capabilities [125, 126]. Nevertheless, omics datasets also have limitations, such as the heterogeneity of individual omics layers, the need for intensive analysis, differences in representation formats [124], a lack of mechanistic knowledge, and inefficient genome-scale integration tools. Thus, the construction of context-specific GEMs remains challenging. For this reason, many researchers prefer to build data-driven ML models to analyze metabolic networks [7]. Moreover, thanks to the large number of fermentation studies and advances in measurement tools, ML methods have been widely used to optimize fermentation media and conditions. Nevertheless, ML models are black-box models that rely on previously collected experimental datasets and do not provide sufficient information on the underlying mechanisms [6].
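To make the steady-state assumption discussed above concrete, the following is a minimal sketch of a flux balance calculation, one common form of CBM. It maximizes a single objective flux subject to S v = 0 and flux bounds. The four-reaction network, the bounds, and all variable names are hypothetical and chosen only for illustration; the linear program is solved with `scipy.optimize.linprog`, which is an assumption about tooling rather than a method used in the studies reviewed here.

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical toy network (illustrative only, not a published model).
# Rows are metabolites A and B; columns are reactions v1..v4:
#   v1:   -> A   substrate uptake
#   v2: A -> B   conversion
#   v3: A ->     by-product secretion
#   v4: B ->     biomass/product drain (the objective)
S = np.array([
    [1.0, -1.0, -1.0,  0.0],   # mass balance for A
    [0.0,  1.0,  0.0, -1.0],   # mass balance for B
])

# Assumed flux bounds (arbitrary units); uptake is capped at 10.
bounds = [(0.0, 10.0), (0.0, 1000.0), (0.0, 1000.0), (0.0, 1000.0)]

# linprog minimizes, so the objective flux v4 gets a coefficient of -1.
c = np.array([0.0, 0.0, 0.0, -1.0])

# Steady state: S @ v = 0 for every internal metabolite.
result = linprog(c, A_eq=S, b_eq=np.zeros(S.shape[0]),
                 bounds=bounds, method="highs")

fluxes = result.x
print("status:", result.message)
print("flux distribution:", fluxes)
```

In this toy case the objective flux is limited only by the uptake bound. Nothing in the formulation refers to metabolite concentrations, enzyme kinetics, or regulation, which is exactly the limitation noted above.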
As described above, both CBM and ML are powerful tools for analyzing metabolic networks and fermentation parameters. Moreover, the complementary capabilities of the two methods make combined CBM-ML approaches promising. The interaction between ML and CBM is reciprocal: ML can be applied to the input datasets of CBM to increase the predictive power of the metabolic model, and, conversely, CBM can generate an additional layer of omics data, called fluxomics, that improves the interpretability of a data-driven ML model [23]. However, to take full advantage of both CBM and ML, several challenges need to be addressed. First, CBM-derived fluxomic data require several preprocessing steps before they can be integrated with multi-omic data into a biologically meaningful dataset. This integration is hampered by the heterogeneity and high dimensionality of the datasets [111]. Second, despite the advances in genome-scale metabolic reconstructions, appropriate high-throughput data are available for only a small group of microorganisms [19]. Third, the results of integrated models, although very accurate, are not necessarily transferable to large-scale industrial fermentations. Finally, the limitations of the individual ML and CBM methods are still not fully addressed in the integrated models.
Recent studies point to a promising future for improving fermentation processes and obtaining valuable biological products through the methods reviewed in this article. In our view, overcoming these challenges requires enhancing our biological knowledge in parallel with developing novel mathematical and computational tools. This is possible through collaboration among chemical engineers, biologists, physicists, and computer engineers. For example, integrating high-quality metabolic regulatory networks can improve the predictive power of CBM models. Moreover, further developments in high-throughput techniques and more effective methods are needed to integrate omics data with GEMs.