Materials and Methods
Protein expression, purification and sample preparation. A synthetic gene encoding the HIV-1 protease monomer lacking the last 4 residues, mHIV-1-PR1-95, was a kind gift from Dr. Celia Schiffer, University of Massachusetts Medical School, and was cloned into a pET11a vector. The protein was expressed in Escherichia coli Rosetta (DE3) cells upon induction with 0.2 mM IPTG. For the synthesis of isotope-labeled protein, Spectra9 LB media (Euriso-top, France – cod. CGM-3030-CN-1, 1 L) enriched with 15N and 13C was used. Cells obtained from 0.4 L of culture were lysed by sonication at 4 °C in extraction buffer: 20 mM Tris/HCl, 1 mM EDTA and 10 mM DTT, pH 8. The protein was refolded as described previously26. For spectroscopic measurements, the protein was dialyzed against 20 mM sodium phosphate, pH 6.0.
Fluorescence and CD experiments. Fluorescence experiments were performed with a Varian Eclipse fluorimeter on 4 µM protein in 20 mM sodium phosphate at pH 6.0 and 25 °C by adding different concentrations of denaturant. CD measurements were conducted at 230 nm and a protein concentration of 15 μM in 20 mM sodium phosphate, pH 6, containing different amounts of denaturant, at 25 °C using a JASCO J810 spectropolarimeter and a 1 mm path length. A total of 120 data points were recorded over 1 minute and averaged. The actual urea and GdmCl concentrations were confirmed by refractive index measurements. For the temperature transition, CD measurements were conducted at 205 nm and a protein concentration of 10 μM in 20 mM sodium phosphate, pH 6. The temperature was increased in 1 °C steps from 3 to 20 °C and in 2 °C steps from 20 to 90 °C using a Peltier control unit. To account for the slow refolding kinetics, each point was allowed to equilibrate for 5 minutes prior to detection.
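The refractive index check of the denaturant concentrations mentioned above can be sketched as follows. The text does not state which calibration was used; the sketch assumes the widely used empirical polynomials attributed to Pace, which relate molarity to the refractive index difference ΔN between the denaturant solution and the matching buffer. The function names are hypothetical.

```python
def urea_molarity(delta_n: float) -> float:
    """Urea molarity from the refractive index difference (solution minus buffer).

    Assumed calibration: the commonly used empirical cubic polynomial for urea.
    """
    return 117.66 * delta_n + 29.753 * delta_n**2 + 185.56 * delta_n**3


def gdmcl_molarity(delta_n: float) -> float:
    """GdmCl molarity from the refractive index difference (solution minus buffer).

    Assumed calibration: the commonly used empirical cubic polynomial for GdmCl.
    """
    return 57.147 * delta_n + 38.68 * delta_n**2 - 91.60 * delta_n**3


if __name__ == "__main__":
    # Hypothetical readings, for illustration only.
    for dn in (0.02, 0.05, 0.07):
        print(f"dN = {dn:.3f}: urea {urea_molarity(dn):.2f} M, "
              f"GdmCl {gdmcl_molarity(dn):.2f} M")
```

Both readings are differences against the denaturant-free buffer, so the buffer contribution cancels.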
NMR experiments. Backbone assignment and R1, R2 and hetNOE relaxation experiments. All NMR spectra were recorded either on an Agilent DD2 800 MHz or a Varian INOVA 750 MHz spectrometer using a room-temperature probe and standard pulse programs from the Vnmrj BioPack. For assignment, we prepared 11 different aliquots of 15N-13C-labelled ~200 μM protein solution in 20 mM sodium phosphate, pH 6.0, 10% D2O (v/v) and 125 μM DSS (2,2-dimethyl-2-silanepentane-5-sulfonic acid) containing 4, 6 and 8 M urea, 0.75, 1, 2 and 4 M GdmCl, or 9% (v/v), 25% (v/v) and 45% (v/v) acetic acid, respectively, and one containing no extra additives. For relaxation experiments, identical samples were prepared containing a 15N-labelled ~200 μM protein solution. The backbone nuclei were assigned using HSQC35, HNCA, HNCO36, HN(CA)CO37, HNCOCA38, HNCACB39, CBCACONH40, HNN41 and 15N-edited NOESY-HSQC42 spectra recorded at 25 °C for the samples containing 4 and 8 M urea, and using only HSQC, HNCA, HNCO, HNCOCA, HNN and 15N-edited NOESY-HSQC spectra for 1 M GdmCl and 25% (v/v) acetic acid. For the remaining samples, only the HSQC, HNCA, HNCOCA and HNCO spectra were used for backbone assignment. The assignment was completed for 95% of all non-proline residues for samples containing acetic acid, 96% for samples containing GdmCl, 97% for samples containing urea and 97% for cold denatured protein.
To analyze the T1 and T2 relaxation times and heteronuclear NOEs (hetNOEs), five series of spectra were recorded on 15N-labelled protein in 20 mM sodium phosphate, pH 6.0, 10% D2O (v/v) and 125 μM DSS, also containing 4 or 8 M urea, 1 M GdmCl or 25% (v/v) acetic acid, at 25 °C43. We chose 8 different delay times (0 ms, 100 ms, 200 ms, 300 ms, 500 ms, 700 ms, 900 ms and 1200 ms) for recording T1 and 9 different delay times (10 ms, 50 ms, 90 ms, 130 ms, 170 ms, 190 ms, 210 ms, 230 ms and 250 ms) for recording T2 relaxation times. For the hetNOE, a relaxation delay of 8 s was used.
PFG NMR diffusion experiments. The above-described protein samples were used to record sets of 60 bipolar pulse-pair stimulated echo experiments with varying gradient strength, using a WATERGATE scheme for water suppression44. As an internal reference, 0.5% (v/v) dioxane was added to all samples to correct for viscosity effects of the solvent. All spectra were obtained at 25 °C using 32 transients on a 750 MHz Varian INOVA spectrometer.
2-D and 3-D NMR spectra processing. The X-carrier frequency was determined by referencing to internal DSS. The DSS frequency was obtained from a 1D 1H spectrum recorded immediately before the remaining experiments. Indirect referencing was used in the 15N and 13C dimensions by use of conversion factors45. The spectra were processed using nmrPipe46 and qMDD47. Spectrometer frequencies and carrier frequencies in ppm were entered with 4 decimals. Zero-filling to the nearest power of 2 was used. The processed spectra were assigned and analyzed in CcpNmr Analysis48. The assigned HSQC spectra were further used to extract the relaxation decays from the series of spectra recorded to determine the T1 and T2 relaxation times. Relaxation decay curves were fitted to single exponentials and relaxation times were determined using the relax software49,50. The values of R1, R2 and the hetNOE recorded at 17.6 T were used to derive the spectral density function at three frequencies (0, ωH and ωN) by reduced spectral density mapping using relax49,50.
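As an illustration of the single-exponential analysis of a relaxation decay, the following minimal sketch fits I(t) = I0·exp(−R·t) to peak intensities at the T1 delay times listed above. It is not the relax implementation: for simplicity it uses a log-linear least-squares fit in place of a nonlinear fit, and the function name and the synthetic intensities are hypothetical.

```python
import math


def fit_single_exponential(delays_s, intensities):
    """Fit I(t) = I0 * exp(-R * t) by linear least squares on ln(I).

    Simplification: a log-linear fit, not a nonlinear fit of the raw
    intensities. Requires strictly positive intensities.
    Returns (I0, R) with R in s^-1.
    """
    if len(delays_s) != len(intensities) or len(delays_s) < 2:
        raise ValueError("need at least two (delay, intensity) pairs")
    log_i = [math.log(value) for value in intensities]
    n = len(delays_s)
    mean_t = sum(delays_s) / n
    mean_y = sum(log_i) / n
    sxx = sum((t - mean_t) ** 2 for t in delays_s)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(delays_s, log_i))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_t
    return math.exp(intercept), -slope


# T1 delay times from the text, converted to seconds.
T1_DELAYS_S = [0.0, 0.1, 0.2, 0.3, 0.5, 0.7, 0.9, 1.2]

# Synthetic, noise-free intensities for one hypothetical residue (R1 = 1.8 s^-1).
synthetic = [1000.0 * math.exp(-1.8 * t) for t in T1_DELAYS_S]

i0, r1 = fit_single_exponential(T1_DELAYS_S, synthetic)
print(f"I0 = {i0:.1f}, R1 = {r1:.3f} s^-1, T1 = {1.0 / r1:.3f} s")
```

With noisy data, a log-linear fit weights the late, low-intensity points more heavily than a nonlinear fit would, so it is best regarded as a quick check or as a source of starting values.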
DOSY processing. Each set of 60 1D 1H spectra was separately processed and analyzed using The DOSY Toolbox51 and MATLAB52. Spectra were phased in zero order and smoothed using a 10 Hz Lorentzian, which efficiently removed most visible noise. The MATLAB function msbackadj was used for baseline correction rather than the internal DOSY Toolbox baseline correction routine.
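To make the role of the internal dioxane reference concrete, the sketch below estimates a hydrodynamic radius from the relative gradient-dependent signal decay of protein and dioxane. It is not the DOSY Toolbox procedure. It assumes a Stejskal-Tanner type decay, I(g) = I0·exp(−k·g²), with k proportional to the diffusion coefficient when all pulse parameters are identical, and a hydrodynamic radius of 2.12 Å for dioxane, a commonly used literature value that is not given in the text. All names and numbers are hypothetical.

```python
import math

DIOXANE_RH_ANGSTROM = 2.12  # assumption: commonly used literature value


def gradient_decay_rate(gradients, intensities):
    """Return k from a log-linear fit of I(g) = I0 * exp(-k * g**2)."""
    xs = [g * g for g in gradients]
    ys = [math.log(value) for value in intensities]  # needs positive intensities
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return -sxy / sxx


def hydrodynamic_radius(k_protein, k_reference, rh_reference=DIOXANE_RH_ANGSTROM):
    """Rh scales inversely with the diffusion coefficient, hence with k.

    Both species are measured in the same sample, so the solvent
    viscosity cancels in the ratio.
    """
    return rh_reference * k_reference / k_protein


# 60 gradient strengths in arbitrary units, with synthetic noise-free decays.
gradients = [0.02 + 0.98 * i / 59 for i in range(60)]
protein = [100.0 * math.exp(-0.5 * g * g) for g in gradients]
dioxane = [100.0 * math.exp(-5.0 * g * g) for g in gradients]

k_prot = gradient_decay_rate(gradients, protein)
k_diox = gradient_decay_rate(gradients, dioxane)
rh = hydrodynamic_radius(k_prot, k_diox)
print(f"Rh = {rh:.2f} Angstrom")
```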
Analysis of the chemical shifts. Secondary chemical shifts associated with different atoms were systematized using the formula (Δ(δCα)+Δ(δC’)-0.5*Δ(δN))7.
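The combination above can be written out as a short sketch. It assumes that each secondary shift Δδ is the observed shift minus a random-coil reference value for the same residue type, and it implements the expression exactly as printed. The function names and the example values are hypothetical.

```python
def secondary_shift(observed_ppm: float, random_coil_ppm: float) -> float:
    """Secondary chemical shift: observed minus random-coil reference (ppm)."""
    return observed_ppm - random_coil_ppm


def combined_secondary_shift(d_ca: float, d_co: float, d_n: float) -> float:
    """Combine Calpha, C' and N secondary shifts as in the expression above.

    The 15N term enters with a negative sign and half weight.
    """
    return d_ca + d_co - 0.5 * d_n


# Hypothetical secondary shifts (ppm) for a single residue.
example = combined_secondary_shift(d_ca=0.4, d_co=0.3, d_n=-0.6)
print(f"combined secondary shift: {example:.2f} ppm")
```

Positive Cα and C' secondary shifts and negative 15N secondary shifts all increase the combined value, so the three nuclei reinforce one another when they report on the same conformational preference.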
Fit of dynamics parameters. The R2 parameters were fitted with the function described in Eq. 3 in the Supplementary Materials of ref.53. The fit was done by nonlinear least squares using the Levenberg-Marquardt algorithm. To avoid overfitting, we performed fits with different numbers of exponentials and chose the minimum number of exponentials that gave a χ2 lower than 5.
NMR data have been deposited at the BioMagResBank with the accession number: 25255.
Molecular dynamics simulations. The mHIV-1-PR1-95 system was described with the Amber 99SBdisp force field54 in TIP4P-D water and simulated with Gromacs 2020.455. The protein was prepared in a dodecahedral box of 571 nm3 with 19160 water molecules and 4 Cl- ions to neutralize the charge. A preliminary simulation of 50 ns at 700 K and constant volume was carried out, followed by 100 ns at 300 K and 1 atm. From the latter simulation, 110 conformations were extracted to act as starting conformations of the production run. A replica-exchange simulation was then performed with 110 replicas whose temperatures ranged from 300 K to 500 K, for a total of 68 µs.
Once the first 30 ns were removed, the replica at 300 K was analyzed to validate the simulation against the NMR data. Secondary chemical shifts were calculated for each conformation with Sparta+56 and averaged over all of them. To calculate secondary chemical shifts, we used Bax’s reference values56.
To predict the R1 relaxation parameters qualitatively, we extracted 50 conformations from the 300 K trajectory, using each of them as the starting point of a 1 ns simulation at fixed temperature. The root mean square fluctuations around each of the 50 average conformations were calculated and then averaged together. The experimental R2 values were compared to the solvent-accessible surface area of each residue, averaged over the full 300 K trajectory.
The clustering of the 300 K trajectory was performed with a tailor-made Python code that uses the fraction q of common contacts as the underlying metric, normalized to the larger of the numbers of contacts of the two structures. Two residues are defined to be in contact if their centers of mass are closer than 0.65 nm. In the calculation of q, only pairs of residues separated by at least 3 other residues along the chain were considered.
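The contact metric described above can be sketched as follows. This is not the authors' code. It assumes that "at least 3 other residues" between two residues corresponds to a sequence separation |i − j| ≥ 4, and that two structures without any contacts are assigned q = 1; both choices, and all names, are hypothetical.

```python
import math
from itertools import combinations

CUTOFF_NM = 0.65        # contact distance between residue centers of mass
MIN_SEQ_SEPARATION = 4  # assumption: at least 3 residues in between


def contact_set(residue_coms, cutoff=CUTOFF_NM, min_sep=MIN_SEQ_SEPARATION):
    """Return the set of residue index pairs (i, j), i < j, that are in contact.

    residue_coms: list of (x, y, z) centers of mass in nm, one per residue.
    """
    contacts = set()
    for i, j in combinations(range(len(residue_coms)), 2):
        if j - i >= min_sep and math.dist(residue_coms[i], residue_coms[j]) < cutoff:
            contacts.add((i, j))
    return contacts


def common_contact_fraction(contacts_a, contacts_b):
    """Fraction q of common contacts, normalized to the larger contact set."""
    largest = max(len(contacts_a), len(contacts_b))
    if largest == 0:
        return 1.0  # assumption: two contact-free structures count as identical
    return len(contacts_a & contacts_b) / largest


# Toy centers of mass (nm) for a six-residue chain, for illustration only.
toy = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0),
       (3.0, 0.0, 0.0), (0.5, 0.0, 0.0), (0.2, 0.3, 0.0)]
toy_contacts = contact_set(toy)

# Two hypothetical contact sets sharing two contacts.
set_a = {(0, 4), (1, 6), (2, 8)}
set_b = {(0, 4), (2, 8), (3, 9), (5, 10)}
q = common_contact_fraction(set_a, set_b)
print(sorted(toy_contacts), q)
```

A clustering algorithm can then use q as a similarity measure or 1 − q as a dissimilarity between pairs of frames.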