Single-sequence protein structure prediction using supervised transformer protein language models

Published in Nature Computational Science, 2022

In this article, we introduce trRosettaX-Single, a deep learning-based method for single-sequence protein structure prediction built on a supervised transformer protein language model. Benchmark tests show that our method outperforms AlphaFold2 and RoseTTAFold on orphan proteins. On human-designed proteins, trRosettaX-Single is competitive with AlphaFold2 and outperforms RoseTTAFold. trRosettaX-Single also produces much more accurate contact predictions than SPOT-Contact-LM on all independent test sets. Finally, as a demonstration, we apply trRosettaX-Single to protein design/hallucination and missense mutation analysis.

[Download paper here] [Supporting Information] [Web server]

This work was featured by Nature Computational Science and selected as a Research Highlight by Nature Methods.

Reference: Wenkai Wang, Zhenling Peng, Jianyi Yang*, Single-sequence protein structure prediction using supervised transformer protein language models. Nature Computational Science, 2: 804-814 (2022).