Genetics, energetics and allostery during a billion years of hydrophobic protein core evolution

Protein folding is driven by the burial of hydrophobic amino acids in a tightly packed core that excludes water. The genetics, biophysics and evolution of hydrophobic cores are not well understood, in part because of a lack of systematic experimental data on sequence combinations that do, and do not, constitute stable and functional cores. Here we randomize protein hydrophobic cores and evaluate their stability and function at scale. The data show that vast numbers of amino acid combinations can constitute stable protein cores but that these alternative cores frequently disrupt protein function because of allosteric effects. These strong allosteric effects are not due to complicated, highly epistatic fitness landscapes but rather to the pervasive nature of allostery, with many individually small energy changes combining to disrupt function. Indeed, both protein stability and ligand binding can be accurately predicted over very large evolutionary distances using additive energy models with a small contribution from pairwise energetic couplings. As a result, energy models trained on one protein can accurately predict core stability across hundreds of millions of years of protein evolution, with only rare energetic couplings that we experimentally identify limiting the transplantation of cores between highly diverged proteins. Our results reveal the simple energetic architecture of protein hydrophobic cores and suggest that allostery is a major constraint on sequence evolution.

Overall design: We built combinatorial libraries in the hydrophobic cores of three small protein domains (FYN-SH3, CI-2A and CspA) using a reduced alphabet consisting of the amino acids F, L, M, V and I, encoded by the DTS degenerate codon.
In the sparse_DTS_core_mutagenesis experiment, we bottlenecked and pooled the libraries and sparsely measured the intracellular abundance of protein variants in yeast cells using abundancePCA, a protein complementation assay that couples cell growth rate to the intracellular abundance of the query protein under methotrexate selection. For the SH3 domain of the human FYN kinase, we selected a few query core amino acid combinations that are severely deleterious for abundance fitness in FYN and designed a suppressor "permissivity" library by introducing non-core mutations found in SH3 domains that naturally carry those core combinations (FYN-SH3_core_permissivity experiment). Also for FYN-SH3, we assessed the impact of core reconfiguration on function by measuring binding to its short linear motif ligand PRD1super using bindingPCA, a protein complementation assay that couples cell growth rate to the intracellular binding of the query variant to an interacting partner under methotrexate selection (FYN-SH3_core_DTS_binding experiment).
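
The reduced alphabet follows directly from the DTS codon under IUPAC nucleotide codes (D = A, G or T; S = C or G). The sketch below is illustrative only and is not part of the deposited analysis; the function names are hypothetical, and the number of randomized core positions passed to `library_size` is an assumed example value, since the record does not state it.

```python
from itertools import product

# IUPAC ambiguity codes needed for the DTS codon.
IUPAC = {"D": "AGT", "T": "T", "S": "CG"}

# Standard genetic code entries for the six codons DTS can produce.
CODON_TO_AA = {
    "ATC": "I", "ATG": "M",
    "GTC": "V", "GTG": "V",
    "TTC": "F", "TTG": "L",
}


def expand_degenerate(codon):
    """List every concrete codon matched by a degenerate codon."""
    return ["".join(bases) for bases in product(*(IUPAC[b] for b in codon))]


def encoded_amino_acids(codon):
    """Map each concrete codon to the amino acid it encodes."""
    return {c: CODON_TO_AA[c] for c in expand_degenerate(codon)}


def library_size(n_positions, alphabet_size=5):
    """Protein-level combinations for n randomized positions.

    n_positions is a hypothetical input; the record does not give it.
    """
    return alphabet_size ** n_positions


codons = encoded_amino_acids("DTS")
alphabet = sorted(set(codons.values()))
print(codons)
print(alphabet)
```

Because two of the six codons (GTC and GTG) encode valine, V is expected at twice the DNA-level frequency of the other four residues in an unbiased DTS library. The combinatorial space grows as 5^n at the protein level, which is consistent with the libraries being bottlenecked and measured sparsely.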

Identifier
Source: https://data.blue-cloud.org/search-details?step=~0128C74A723777F8CE3472B1EA9A2060109EFEA36D0
Metadata Access: https://data.blue-cloud.org/api/collections/8C74A723777F8CE3472B1EA9A2060109EFEA36D0
Provenance
Instrument: NextSeq 500; NextSeq 2000; ILLUMINA
Publisher: Blue-Cloud Data Discovery & Access service; ELIXIR-ENA
Publication Year: 2024
OpenAccess: true
Contact: blue-cloud-support(at)maris.nl
Representation
Discipline: Marine Science