Real-world Problems

Based on the results reported in Figures 4, 5, and 6, we formulate the hypothesis that qNEHVI as a MOBO strategy is highly sample-efficient, i.e. able to arrive at the PF rapidly within few evaluations, and is superior at maximizing hypervolume (HV) as a performance metric. In comparison, we found that U-NSGA-III provides a more consistent search, owing to its evolutionary heuristics as opposed to the stochastic QMC sampling in qNEHVI, and furthermore maintains a larger pool of near-Pareto samples, a property not reflected by the HV performance metric. We also report that smaller batch sizes are generally better for both strategies on the two-objective jobs used.
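As a point of reference for how the HV performance metric is evaluated, the sketch below computes the dominated hypervolume of a small two-objective front with pymoo's HV indicator; the front and reference point are illustrative values only, not data from our experiments.

\begin{verbatim}
import numpy as np
from pymoo.indicators.hv import HV

# Illustrative non-dominated front for a two-objective minimisation problem.
F = np.array([[0.2, 0.8],
              [0.5, 0.5],
              [0.8, 0.2]])

# The reference point must be dominated by every point on the front;
# HV is the volume enclosed between the front and this point.
ind = HV(ref_point=np.array([1.1, 1.1]))
print(ind(F))
\end{verbatim}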
To test this hypothesis, we repeated our experiments on real-world multi-objective datasets. \cite{Yeh_2008,MacLeod_2022} An unavoidable issue in empirically benchmarking optimisation strategies on real-world problems is that some surrogate model must be used in lieu of a true black box in which new data would be experimentally validated. Alternatively, a candidate selection problem can be used, where the optimiser is limited to proposing new candidates from a pre-labelled dataset until the ‘pool’ of samples is eventually exhausted (see the sketch below). \cite{Janet_2020,Hanaoka_2022,Gopakumar_2018,Liang_2021} The benefit of this approach over surrogate-based methods is that only real data from the black box is used, rather than data extrapolated from a model approximating its behaviour. However, the candidate selection approach assumes that the existing dataset contains all data points necessary to perfectly represent the search space and the true PF. It is generally not possible to prove that this is the case, unless the exact function mapping the inputs to the outputs of the black box is known, or the dataset contains all possible combinations of input/output pairs and is therefore a complete representation of the problem, as in inverse design.
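To make the candidate selection setup concrete, the following sketch shows a generic pool-based loop; the \texttt{acquisition} callable and the array names are hypothetical placeholders, not identifiers from the cited works.

\begin{verbatim}
import numpy as np

def pool_based_search(X_pool, Y_pool, acquisition, budget):
    """Propose candidates only from a pre-labelled pool until exhausted.

    X_pool, Y_pool : full pre-labelled dataset (inputs, measured objectives)
    acquisition    : scores remaining candidates given observations so far
    """
    remaining = list(range(len(X_pool)))
    observed = []
    for _ in range(min(budget, len(remaining))):
        # Only real, pre-measured data is ever 'evaluated'; no model
        # extrapolation beyond the dataset occurs.
        scores = acquisition(X_pool[remaining],
                             X_pool[observed], Y_pool[observed])
        observed.append(remaining.pop(int(np.argmax(scores))))
    return X_pool[observed], Y_pool[observed]
\end{verbatim}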
Here, due to the relatively small size of the datasets ($\sim 10^2$ data points), the candidate selection method was not implemented. Instead, we relied on training an appropriate regressor to model each dataset. The two real-world benchmarks used in this paper are presented in Table 2. Materials datasets with constraints are hard to find in the available HTE literature, aside from simple combinatorial setups whose components must sum to 100\%. \cite{Erps_2021} Another example is that of Cao et al., \cite{Cao_2021} which included complex constraints in the form of solubility, although we were unable to obtain their full dataset and solubility classifier.
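For illustration, the surrogate setup can be as simple as fitting one regressor per objective and treating the fitted models as the black box during benchmarking. The sketch below uses scikit-learn Gaussian process regressors on synthetic stand-in data; the regressor and kernel choice are assumptions made for illustration rather than the exact models used in this work.

\begin{verbatim}
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

# Synthetic stand-in for a small (~10^2 points) real-world dataset.
rng = np.random.default_rng(0)
X = rng.random((100, 4))
Y = np.column_stack([X.sum(axis=1), (1.0 - X).prod(axis=1)])

# One independent surrogate per objective (kernel choice is illustrative).
models = [GaussianProcessRegressor(kernel=Matern(nu=2.5),
                                   normalize_y=True).fit(X, Y[:, i])
          for i in range(Y.shape[1])]

def black_box(x):
    """Surrogate standing in for the real experiment during benchmarking."""
    x = np.atleast_2d(x)
    return np.column_stack([m.predict(x) for m in models])

print(black_box(rng.random((3, 4))))
\end{verbatim}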