Improving protein tertiary structure prediction by deep learning and distance prediction in CASP14
  • Jian Liu,
  • Tianqi Wu,
  • Zhiye Guo,
  • Jie Hou,
  • Jianlin Cheng
Jian Liu
University of Missouri

Corresponding Author:[email protected]

Tianqi Wu
University of Missouri - Columbia
Zhiye Guo
University of Missouri
Jie Hou
Saint Louis University
Jianlin Cheng
University of Missouri


Substantial progress in protein structure prediction has been made since CASP13 by using deep learning and residue-residue distance prediction. Inspired by these advances, we improve our CASP14 MULTICOM protein structure prediction system in three main aspects: (1) a new deep-learning-based protein inter-residue distance predictor (DeepDist) to improve template-free (ab initio) tertiary structure prediction, (2) an enhanced template-based tertiary structure prediction method, and (3) distance-based model quality assessment methods empowered by deep learning. In the 2020 CASP14 experiment, the MULTICOM predictor ranked 7th out of 146 predictors in protein tertiary structure prediction and 3rd out of 136 predictors in inter-domain structure prediction. The MULTICOM results demonstrate that template-free modeling based on deep learning and residue-residue distance prediction can predict the correct topology for almost all template-based modeling targets and a majority of hard targets (template-free targets or targets whose templates cannot be recognized), which is a significant improvement over the CASP13 MULTICOM predictor. The performance of template-free tertiary structure prediction largely depends on the accuracy of the distance predictions, which is closely related to the quality of the multiple sequence alignments. The structural model quality assessment works reasonably well on targets for which a sufficient number of good models can be predicted, but it may perform poorly when only a few good models are predicted for a hard target and the distribution of model quality scores is highly skewed.
22 Mar 2021 Submitted to PROTEINS: Structure, Function, and Bioinformatics
24 Mar 2021 Submission Checks Completed
24 Mar 2021 Assigned to Editor
02 Apr 2021 Reviewer(s) Assigned
14 Apr 2021 Review(s) Completed, Editorial Evaluation Pending
16 Apr 2021 Editorial Decision: Revise Major
16 May 2021 1st Revision Received
19 May 2021 Assigned to Editor
19 May 2021 Submission Checks Completed
19 May 2021 Reviewer(s) Assigned
13 Jun 2021 Review(s) Completed, Editorial Evaluation Pending
14 Jun 2021 Editorial Decision: Revise Major
21 Jun 2021 2nd Revision Received
23 Jun 2021 Submission Checks Completed
23 Jun 2021 Assigned to Editor
23 Jun 2021 Reviewer(s) Assigned
05 Jul 2021 Review(s) Completed, Editorial Evaluation Pending
12 Jul 2021 Editorial Decision: Accept
27 Jul 2021 Published in Proteins: Structure, Function, and Bioinformatics. DOI: 10.1002/prot.26186