5. Challenges and perspectives
It is now clear that fermentation optimization and control are needed to
achieve more efficient and economical bioprocesses. Therefore, different
holistic strategies have been developed to obtain specifically tailored
fermentation parameters. In this review, we briefly summarized the
applications of two important modeling methods for analyzing and
optimizing fermentation parameters: (a) constraint-based modeling (CBM)
and (b) machine learning (ML). The former is a mechanistic method that
aims to gain more insight into the metabolism of the biological system. The
latter is a data-driven approach in which a specified algorithm can
learn from previous data and make predictions with minimal human
intervention. ML can be used to analyze fermentation parameters directly
or to draw inferences from high-dimensional omics datasets. Furthermore, recent
investigations showed that ML can be integrated with CBM to improve
predictive power and gain further biological insight. However, despite the
advances in CBM, ML, and CBM-ML applications in fermentation analysis
and optimization, several limitations remain to be discussed.
Here we outline some of the major challenges in developing CBM and ML
models. CBM estimates the metabolic flux distribution under the
assumptions of optimality and steady state, which hold well only over
limited, idealized time scales. Moreover, it does not account for
metabolite concentrations, enzyme kinetics, or regulation. Therefore, CBM
cannot always predict metabolic fluxes accurately [11]. Kinetic models of
metabolism that can overcome these shortcomings have been successfully
developed to analyze the effect of genetic and environmental factors at
genome scales [30]. However, the difficult and computationally expensive
methods they require have limited the performance of kinetic models in an
intracellular environment [123]. An alternative approach to kinetic models is
using omics and multi-omics datasets to generate context-specific
metabolic models and derive meaningful insights [49, 124]. To this
end, several algorithms have been developed with different assumptions
and predictive capabilities [125, 126]. Nevertheless, omics datasets
also have some limitations, such as heterogeneity of individual omics,
the necessity of intensive analysis, differences in representation
formats [124], lack of mechanistic knowledge, and inefficient
genome-scale integration tools. Thus, the construction of
context-specific GEMs remains challenging. Therefore, many researchers
prefer to build data-driven ML models in order to analyze metabolic
networks [7]. Moreover, thanks to the large number of
fermentation studies and advances in measurement tools, ML methods have
been widely used to optimize fermentation media and conditions.
Nevertheless, ML models are black-box models that rely on previously
collected experimental datasets and do not provide sufficient information
on the underlying mechanism [6].
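The steady-state and optimality assumptions behind CBM can be made concrete with a small flux balance calculation. The sketch below is illustrative only: the five-reaction network, its stoichiometric matrix `S`, the uptake bound of 10, and the choice of objective are hypothetical and are not taken from any of the reviewed studies. It uses `scipy.optimize.linprog` to maximize one flux subject to S v = 0 and flux bounds.

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical toy network (assumption, for illustration only):
#   v0:   -> A   substrate uptake, capped at 10 flux units
#   v1: A -> B
#   v2: A -> C
#   v3: B ->     product drain (the objective to maximize)
#   v4: C ->     by-product secretion
# Rows are metabolites A, B, C; columns are reactions v0..v4.
S = np.array([
    [1.0, -1.0, -1.0,  0.0,  0.0],  # A
    [0.0,  1.0,  0.0, -1.0,  0.0],  # B
    [0.0,  0.0,  1.0,  0.0, -1.0],  # C
])

# Irreversible reactions; only the uptake flux has an upper bound.
bounds = [(0, 10), (0, None), (0, None), (0, None), (0, None)]

# linprog minimizes, so the objective is negated to maximize v3.
c = np.zeros(S.shape[1])
c[3] = -1.0

# Steady state: S @ v = 0 for every internal metabolite.
res = linprog(c, A_eq=S, b_eq=np.zeros(S.shape[0]),
              bounds=bounds, method="highs")
fluxes = res.x
print("optimal fluxes:", fluxes)
```

Note what the formulation leaves out: there are no metabolite concentrations, rate laws, or regulatory terms anywhere in the problem, which is exactly the limitation discussed above. Only stoichiometry, bounds, and the assumed objective determine the predicted fluxes.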
As described above, both CBM and ML can be used as powerful tools for
analyzing metabolic networks and fermentation parameters. Moreover, the
remarkable capabilities of each method have shown promise for the
construction of combined CBM-ML methods. The collaboration between ML
and CBM is a reciprocal process. On the one hand, ML can be applied to
the CBM input datasets to increase the predictive power of the
metabolic model. Conversely, CBM is a practical tool for the generation
of a new layer of omics data called fluxomics to improve the
interpretability of a data-driven ML model [23]. However, to take
full advantage of both CBM and ML, several challenges need to be
addressed. First, CBM-derived fluxomic data require several
preprocessing steps before they can be integrated with multi-omic data
to obtain suitable biological data. This goal is hampered by the
heterogeneity and high dimensionality of the datasets [111]. Second,
despite the advances in
genome-scale metabolic reconstructions, appropriate high-throughput data
are only available for a small group of microorganisms [19]. Third,
the results of the integrated models, although very accurate, are not
necessarily appropriate for large-scale industrial fermentations.
Finally, the limitations of the individual ML and CBM methods are still
not fully addressed in the integrated models.
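As one minimal example of the preprocessing mentioned in the first challenge, heterogeneous data blocks are often brought to a common scale before being joined into a single feature matrix for an ML model. The sketch below assumes a simple per-column z-score followed by concatenation. The function names `zscore_columns` and `combine_blocks` and all numbers are hypothetical, and real pipelines typically also need filtering, imputation, and dimensionality reduction.

```python
from statistics import mean, pstdev

def zscore_columns(rows):
    """Scale each column to zero mean and unit (population) variance."""
    scaled_cols = []
    for col in zip(*rows):
        mu, sigma = mean(col), pstdev(col)
        # A constant column carries no information; map it to zeros
        # instead of dividing by zero.
        scaled_cols.append(
            [0.0 if sigma == 0 else (x - mu) / sigma for x in col]
        )
    return [list(r) for r in zip(*scaled_cols)]

def combine_blocks(flux_rows, omics_rows):
    """Standardize two data blocks separately, then join them per sample."""
    if len(flux_rows) != len(omics_rows):
        raise ValueError("both blocks must describe the same samples")
    flux_scaled = zscore_columns(flux_rows)
    omics_scaled = zscore_columns(omics_rows)
    return [f + o for f, o in zip(flux_scaled, omics_scaled)]

# Hypothetical data: three fermentation conditions (rows).
flux_rows = [[10.0, 0.0], [8.0, 2.0], [6.0, 4.0]]   # CBM-derived fluxes
omics_rows = [[100.0], [300.0], [200.0]]            # e.g. one transcript level

features = combine_blocks(flux_rows, omics_rows)
print(features)
```

Scaling each block separately keeps a layer with large raw values (here the transcript counts) from dominating a layer with small ones (the fluxes). It does not by itself address the dimensionality problem noted above.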
Recent studies point to a promising future for improving fermentation
processes and obtaining valuable biological products through the methods
reviewed in this article. In our view, overcoming these challenges
requires deepening our biological knowledge in parallel with the
development of novel mathematical and computational tools. This is
possible through collaboration between chemical engineers, biologists,
physicists, and computer engineers. For example, the integration of
high-quality metabolic regulatory networks can improve the predictive
power of CBM models. Moreover, further developments in high-throughput
techniques and more effective methods are needed to integrate omics data
with GEMs.