To our knowledge, this is the most extensive computational validation set for studying low-energy molecular conformers in terms of the number of compounds, geometries, and computational methods. We provide all data and analysis scripts as open data and open source to allow future reuse.[cite GitHub repo]
By considering a large number of diverse organic molecules with many poses per molecule, we seek to sample a wide variety of conformer energy preferences (e.g., intramolecular hydrogen and halogen bonding and electrostatic interactions). While using optimized low-energy conformers may underestimate the degree of correlation for high-energy structures,\cite{Sharapa_2018} we believe the current metric is a difficult but useful comparison. Even with high-energy geometries excluded, many computational predictions rely on Boltzmann-weighted averages over multiple thermally accessible conformers, from NMR prediction to understanding the effects of dipole moments on solvent viscosity.\cite{Vo_2019}
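As an illustration of why accurate relative conformer energies matter, the following minimal sketch computes a Boltzmann-weighted average of a property over a set of conformers. The function names, the use of kcal/mol, and the default temperature of 298.15 K are assumptions made for this example and are not taken from the analysis scripts.

```python
import math

# Gas constant in kcal/(mol K); energies are assumed to be in kcal/mol.
R_KCAL_PER_MOL_K = 0.0019872041


def boltzmann_weights(energies_kcal, temperature_k=298.15):
    """Return normalized Boltzmann populations for a list of conformer energies."""
    rt = R_KCAL_PER_MOL_K * temperature_k
    # Shift by the minimum energy so the exponentials stay numerically stable.
    e_min = min(energies_kcal)
    factors = [math.exp(-(e - e_min) / rt) for e in energies_kcal]
    total = sum(factors)
    return [f / total for f in factors]


def boltzmann_average(values, energies_kcal, temperature_k=298.15):
    """Average a per-conformer property using Boltzmann populations as weights."""
    weights = boltzmann_weights(energies_kcal, temperature_k)
    return sum(w * v for w, v in zip(weights, values))
```

Because the weights depend exponentially on the energy differences, an error of about 1 kcal/mol in a relative energy changes the corresponding population ratio by roughly a factor of five at room temperature.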
Comparison of single points vs. DLPNO-CCSD(T)
For comparison, we considered a wide variety of currently available computational methods:
- Common classical organic force fields: MMFF94,\cite{Halgren_1996,Halgren:1996kn,Halgren:1996ew,Halgren:1996hj,Halgren:1996ux} UFF,\cite{Rappe_1992} GAFF\cite{Wang_2004}
- Semiempirical wave function: PM7\cite{Stewart_2012}
- Density functional tight binding: GFN0,\cite{Pracht_2019} GFN1,\cite{Grimme_2017} GFN2\cite{Bannwarth_2018}
- Low-cost density functional approximations: PBEh-3c,\cite{Grimme_2015} B97-3c\cite{Brandenburg_2018}
- Dispersion-corrected density functionals: B3LYP,\cite{Lee_1988,Becke_1988,Stephens_1994,Vosko_1980} PBE,\cite{Perdew_1997,Perdew_1996} ωB97X-D\cite{Chai_2008} (using the def2-TZVP basis set\cite{Weigend_2005,Weigend_2006})
- Møller-Plesset RI-MP2\cite{Kossmann_2010} (cc-pVTZ basis set\cite{Dunning_1989,Kendall_1992})
For the dispersion-corrected B3LYP and PBE functionals, we considered both the commonly used double-zeta def2-SVP and triple-zeta def2-TZVP basis sets to understand the effects of basis set size.
[ table of correlations R^2]
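A correlation of this kind could be computed along the following lines. This is a minimal sketch that assumes R^2 is taken as the squared Pearson correlation between one method's relative conformer energies and the DLPNO-CCSD(T) reference energies for the same conformers. The function name and that definition are assumptions for illustration, not a description of the released analysis scripts.

```python
def r_squared(method_energies, reference_energies):
    """Squared Pearson correlation between two equal-length energy lists."""
    n = len(method_energies)
    if n != len(reference_energies) or n < 2:
        raise ValueError("need two equal-length lists with at least two entries")

    mean_x = sum(method_energies) / n
    mean_y = sum(reference_energies) / n

    # Sums of squared deviations and of cross deviations from the means.
    sxx = sum((x - mean_x) ** 2 for x in method_energies)
    syy = sum((y - mean_y) ** 2 for y in reference_energies)
    sxy = sum(
        (x - mean_x) * (y - mean_y)
        for x, y in zip(method_energies, reference_energies)
    )

    if sxx == 0.0 or syy == 0.0:
        raise ValueError("correlation is undefined when one list is constant")

    # Squaring discards the sign, so anticorrelated data also give values near 1.
    return (sxy * sxy) / (sxx * syy)
```

Since the squared correlation is insensitive to sign and to linear scaling, it would typically be read alongside a rank-based or error-based measure when judging conformer ordering.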