Molecular basis for the regulation of membrane proteins through preferential lipid solvation

Visualization of lipid orientations around CLC monomers and dimers from CGMD simulations

A depiction of lipid orientations around CLC-ec1 is provided in Fig. 1 calculated from CGMD MARTINI 2:1 POPE/POPG simulation data acquired in a previous study25. Time-average lipid coordinates were computed using the trajectory data and ‘Mean Lipid Coords’ tool provided in MOSAICS38. In this case, the lipids are mapped onto a grid using a stamping radius of 2.3 Å and a lattice spacing of 4 Å. The resulting coordinates, which are provided for each CG bead and at every lattice point, are visualized as arrows connecting the phosphate bead and the last bead on chain A. Therefore, arrows represent the average orientation when a lipid is at a lattice point at a user-defined lattice spacing and do not reflect the actual lipid packing density in the membrane. Due to the isotropic nature of the lipids, the average coordinates of beads on chain A are identical to those on chain B. In Fig. 1, a surface is also shown around the protein for reference, where the coordinates are also averaged over the trajectory, and only a single layer of lattice points surrounding the protein is shown for clarity.

CLC-ec1 purification, fluorophore labeling and reconstitution into lipid bilayers

The protein was purified, labeled and reconstituted following established methods25,30. CLC-ec1 on the background of C85A/H234C was used in the studies to provide subunit-specific labeling. A C-terminal hexahistidine tag is present for purification and is left intact in these studies. Protein was labeled with Cy5-maleimide (sulfo-Cy5, GE Biosciences), with a labeling efficiency of ≈0.7 per subunit. CLC-ec1-Cy5 was reconstituted into POPC or 1:4 DLPC/POPC membranes30, where dialysis was carried out at 4 °C for ~3 days, unless otherwise noted. Chloride transport activity was measured using the ion electrode-based chloride dump assay54, where transport was quantified by measuring the initial rate (kinit) through linear fitting of the first 10 s of transport following addition of valinomycin/carbonyl cyanide-p-trifluoromethoxyphenylhydrazone (FCCP). A biological replicate is defined as an independent sample that has been separately purified, labeled and reconstituted.

Freeze-thaw fusion of membranes

Fusion of 2:1 POPE/POPG, POPC and 1:4 DLPC/POPC membranes was monitored by Förster Resonance Energy Transfer in membranes (Extended Data Fig. 1c,d) detected by doping vesicles with 0.02% donor 18:1 PE-NBD (1,2-dioleoyl-sn-glycero-3-phos-phoethanolamine-N-(7-nitro-2-1,3-benzoxadiazol-4-yl)), 0.1% acceptor 18:1 PE-RhB (1,2-dioleoyl-sn-glycero-3-phosphoethanolamine-N-(lissa-mine rhodamine B sulfonyl)) or both following the procedures described previously36. Measurements were conducted using a Fluorolog 3–22 Fluorometer (Horiba). The ratiometric FRET signal is normalized between the negative control baseline FRET signal of the 0.02% NBD and 0.1% RhB combined samples before fusion (set to 0) and the positive control endpoint signal from the 0.02% NBD/0.1% RhB co-labeled proteoliposomes samples (set to 1.0).

Single-molecule photobleaching analysis

After dialysis, liposomes were freeze-thawed five times by cycling through −80 °C and room temperature. The fused membranes were then supplemented with 0.02% NaN3 and incubated at room temperature, in the dark, for 4–8 days. The samples were extruded through a 0.4 μm Avestin extruder 21x before examination by single-molecule total internal reflection fluorescence microscopy. Lanes on the slide were coated with 1 mg ml−1 polylysine and dilute glutaraldehyde to immobilize protein molecules and reduce lateral diffusion. Photobleaching probability distributions were measured and analyzed as described previously30,35.

All-atom MD simulations

Starting CLC-ec1 configurations are taken from the Protein Data Bank (PDB) entry 1OTS55. A single protomer is extracted for simulation and the first 30 residues are removed since these were shown in a previous study to be highly flexible and adopt multiple conformations56. Following truncation, the N and C termini are capped with acetyl and methylamine groups, respectively, and E113 (chain A and B) and D417 (chain A) were protonated, and all other residues were left in default protonation states (GLU−1/ASP−1, LYS+1/ARG+1, HIS0) in line with previous continuum electrostatics calculations predicting pKa states57. The protein was inserted into a pre-equilibrated lipid bilayer using the GRIFFIN tool58. The system is then buffered using a TIP3P water model, 150 mM NaCl ions and the net charge neutralized (see Supplementary Table 5 for details on molecular composition). Initial equilibrations of the systems are performed using NAMD 2.12 (ref. 59) in a multistaged protocol where structural restraints are gradually reduced over approximately 100 ns of simulation time (Supplementary Table 6). Production runs are then performed using the Anton 2 special computing cluster60 and the CHARMM36m force field61. For these simulations, the equations of motion are integrated with the Anton Multigrator scheme62 using a timestep of 2.5 fs. Likewise, the temperature and pressure are both maintained at 310 K and 1 bar, respectively, using a Langevin thermostat and a semi-isotropic Nose-Hoover barostat.

In some cases, cumulative force field inaccuracies can lead to deviations where important secondary and tertiary structural elements can be degraded after multiple microseconds of simulation time. Therefore, we introduce a term that biases the configurational sampling of backbone dihedral angles to favor a selected set of reference values, specifically those observed in the crystal structure (PDB ID: 1OTS). These terms do not preclude deviations of individual dihedral angles, as local structural fluctuations are allowed and are observed in the simulations. Yet collectively, large-scale structural changes are avoided since the bias limits the accumulation of many angles changing all at once. These restraints have been used in previous simulations on Anton 2 (refs. 37,63,64) and target the protein’s phi and psi angles using the following energy function:

$$U\left(_\right)=K\mathop\limits_^^\left(\frac_-\left(_-180\right)\right)\right)}\right)$$

(1)

where \(_\) is the angle computed at time t, \(_\) is the reference angle taken from the experimental structure, and K is the force constant chosen to be 1 kBT in our simulations, where kB is the Boltzmann constant and T is the temperature. Select contacts at the binding interface are also enforced using harmonic restraints (Supplementary Table 7) to ensure maximal integrity of the dimerization interface.

Noise-corrected projections of the lipid residence time

Projections of the time average lipid residence time are made with the MOSAICS v1.1.0 toolset (https://github.com/MOSAICS-NIH/MOSAICS) by computing Voronoi tessellations of the trajectory snapshots38. MOSAICS approximates Voronoi cells using a grid-based method that assigns lattice points to lipids that are closer to the lattice point than any other lipid. The method uses the lipid’s carbonyl carbons to define the vertices and has been shown to produce accurate tessellations in area per lipid calculations, provided the lattice spacing is small. Here we use a grid-point spacing of 0.7 Å, which is sufficient for mapping the lipid dwell time. We note that the dwell time estimates computed from tessellation data are subject to noise caused by internal fluctuations of the lipid atoms. For example, the Voronoi cell of a bound lipid molecule may appear to pop in and out of a binding pocket due to the motions of the lipid’s ester groups, which as noted are used to construct the Voronoi cells. In this case, the lipid will appear to bind to the protein multiple times, and the average dwell time underestimated. This problem is circumvented by consolidating the fragmented binding events into a single event. However, care must be taken to ensure that uncorrelated events are not included in the merge. This is accomplished by limiting the gap size between neighboring events that are mended to the timescale describing the lipid’s internal motions. The resulting gap size is determined from analysis of the lipid molecules’ mean square displacement (MSD) after an elapsed time τ. Figure 3b shows a plot of the MSD versus τ where the diffusion equation describes the linear region. Before this region, the lipid displacement deviates from the diffusion equation because the internal motions now account for a large portion of the overall movement. Still, the probability of a true lipid displacement event is negligible for this small timescale and only becomes significant when larger durations that exceed 50 ns are considered. We, therefore, minimize noise in dwell time estimates by merging all binding events that originate from a single lipid molecule and that are separated by no more than 50 ns.

Computation of residence time in, and lipid exchange probabilities between, solvation shells

Characterizations of the lipid dynamics in the protein’s solvation shells are made using the ‘Solvation Shells’ tool of MOSAICS38. This tool assigns the lipid molecules, represented in Voronoi tessellation data, to a solvation shell based on the lipid’s neighbors. These assignments are made using the following set of heuristics:

1.

If the lipid shares a border with the protein, it is a first shell lipid.

2.

If the lipid is not a first shell lipid but shares a border with one, then it is a second shell lipid.

3.

If the lipid is not a first or second shell lipid but shares a border with one, then it is a third shell lipid.

Following these rules, bordering lipids are determined by measuring the shortest distance between Voronoi cells and requiring this distance be within 3 Å. Lipids are then assigned a solvation shell number for each trajectory snapshot (Fig. 4b). Given this approach, the shell assignments will be susceptible to the same errors encountered when computing spatially resolved mean residence times. Namely, the internal vibrations of the lipid molecules lead to spurious transitions between shells that should be ignored in any serious analysis of the lipid dynamics. We thus perform noise filtering of the shell assignments with a width of 50 ns as determined by the time needed to observe lipid-lipid displacement events that initiate diffusion. In this case, the filtered shell number assumes the value most frequently observed within the window. The resulting shell assignments can be seen in Fig. 4b and Supplementary Video 2, and these assignments are used when reporting the number of lipids solvating each interface (Extended Data Fig. 2d,e).

With the lipid molecules sorted into shells, the dynamics in the first two shells are characterized. Specifically, we compute the lipid dwell time in each shell, as well as the probability that DLPC enters one of the shells when exchanges occur with that shell’s next neighbor. For example, the first shell exchanges with the second shell, and the second shell exchanges with the third. For lipid exchanges, we select pairs of lipids using a criterion designed to detect both direct and indirect exchanges. A direct exchange is defined as one in which the incoming lipid populates the same space previously occupied by the outgoing lipid. In contrast, indirect exchanges occur when the outgoing transition is accompanied by lateral movement along the protein interface by the remaining lipids in the same shell. In this latter case, the incoming lipid can enter the shell at a separate location, ultimately occupying a space within the shell that is distinct from where the outgoing lipid was positioned. To detect both cases, we select exchange partners to minimize the time separating when one lipid leaves the solvation shell and when the other enters.

Applying this approach, we find that lipid swaps are typically fast, occurring within 10 ns or less (Extended Data Fig. 2f), and that both direct and indirect exchanges are common. To distinguish between each type, we compute the distance separating exchange partners, or more precisely, the space occupied by each partner when it is in the solvation shell of interest. Exchanges occur over a range of distances, with many partners populating the same space and others residing in regions that are several nanometers apart (Extended Data Fig. 2g). We note that the exchange distances are extended for the nondimerization interface, as evident from the emergence of a peak centered at 6 nm. This peak corresponds to 40% of the sampled exchanges and represents events where the two lipids solvate different faces on the nondimerization interface (Extended Data Fig. 2h). These exchanges are not correlated and are removed from the analysis by splitting the nondimerization interface into two equally sized sections, ND1 and ND2, with each having about 15 lipids on average (Extended Data Fig. 2d). The analysis is then performed on each section separately, thereby reducing the number of uncorrelated events substantially.

With the exchange partners identified, we group them by the leaving lipid type and compute the probability that the incoming lipid is DL (Fig. 4i and Extended Data Fig. 2i). We then compute the probability that DL enters the first shell given that PO leaves or the probability that DL enters when DL leaves. We note that the probability that PO enters the same shell, given a specific leaving type, is computed as one minus the corresponding probability for DL (Extended Data Fig. 2j). The expected random exchange probability corresponds to the bulk lipid mole fraction, that is, 0.3 for DLPC or 0.7 for POPC. The difference between these bulk concentrations and the observed exchange probability thus highlights the exchange preferences in the solvation shells around the protein (Fig. 4i and Extended Data Fig. 2i).

Error estimates are reported for these quantities by splitting the trajectory into three equally sized segments and computing averages over each. The s.e.m. is then computed as:

where n is the number of measurements, in this case three, and σx is their s.d.

Calculations of diffusion coefficients

Diffusion coefficients (D) are computed for each lipid species using the Einstein relation:

$$\left\langle \limits^-\limits^}_\right)}^\right\rangle =2\tau$$

(3)

where n = 2 and is the dimensionality in which diffusion occurs, τ is the elapsed time, and \(\langle \limits^-\limits^}_\right)}^\rangle\) is the MSD; \(\limits^}_\) is then a position vector computed for the lipid’s geometric center and \(\mathop\limits^\) is this vector after τ. Of note, we only use heavy atoms, that is, excluding hydrogen atoms, when computing r and \(\limits^}_\). Moreover, to make the most of our data, we let the initial time, from which we take \(\limits^}_\), vary. Likewise, the MSD for each timepoint is averaged over the individual lipids. Before any calculations, the molecular system is unwrapped to remove periodic boundary effects. This step is performed using the PBC XY tool of MOSAICS, which accounts for fluctuations in volume that, if not removed, lead to exaggerated motions of the lipid molecules65.

Estimations of the DLPC lipid enrichment factor from exchange rates and lipid dwell times

Here we develop a relationship between the lipid’s mean residence time, their exchange probability, and the DLPC lipid enrichment factor. The DLPC enrichment factor is defined as:

$$\% E=\frac\left(\left(\frac}}_}}}}}}_}}}}\right)-R\right)100 \%$$

(4)

where ρ is the lipid density for either DLPC or POPC as indicated by the subscript and R is the corresponding ratio in the bulk, that is, \(}_}}}}_}}}]}_=\frac\). In practice, ρ is computed as the total amount of time the lipids of the desired type are present. This can be computed as:

$$}=_}}}}\times \bar$$

(5)

where nlipid-visit is the total number of times the lipid visits the interface and \(\bar\) is the average dwell time. Moreover, nlipid-visit is related to the lipid exchange probability by:

$$_}}}}\approx_}}}}\times P$$

(6)

where nlipid-exchange is the total number of lipid exchanges and P is the exchange probability. Plugging these results into our expression for enrichment, we get the desired result:

$$\% E=\frac\left(\frac_}}* }_}}}_}}* }_}}}-R\right)100 \%$$

(7)

Branching from equation (7), we can gauge the effects of the dwell times or exchange probabilities separately. For example, if we assume the average dwell time for POPC and DLPC is equal, then equation (7) simplifies to:

$$\% E=\frac\left(\frac_}}}_}}}-R\right)100 \%$$

(8)

Similarly, if we assume each exchange probability to be equal to the corresponding lipid concentration, then we get:

$$\% E=\left(\frac}_}}}}_}}}-1\right)100 \%$$

(9)

Equations (7)–(9) are used to estimate the overall and individual contributions to the DLPC enrichment factor from kinetics data. Estimates of each are provided for the first two solvation shells in Supplementary Table 4, where s.e.m. is computed by splitting the trajectory into three parts (n = 3) and analyzing each separately.

Contact analysis

Ordinary contacts between the protein and lipids are counted using a distance cutoff of 3.5 Å. These contacts were measured between the protein and lipids while including the full set of atoms from each. For counting the number of hydrogen bonds, we use a convention that considers the angle between the donor, acceptor, and hydrogen atoms, as well as the distance between the donor and acceptor atoms. Hydrogen bonds thus require the angle to be less than 30° and the distance below 3.5 Å (Extended Data Fig. 3a). Donor atom types on the protein include the backbone and side-chain nitrogen and oxygens. Since POPC and DLPC molecules contain only acceptor atoms, that being the carbonyl and phosphate oxygens, no acceptor types are required for the protein. Salt bridges are then counted as select hydrogen bonds between the lipid and charged amino acids Lys, Arg and His (Extended Data Fig. 3b). Salt bridges between choline and charged residues Glu and Asp are weak and rarely formed (Extended Data Fig. 3c) and are excluded in the main analysis (Fig. 4f).

Calculation of free-energy changes from variations in lipid composition

Considering the lipid dynamics observed in our all-atom MD simulations of monomeric CLC-ec1, we seek to demonstrate that the redistribution effects indeed explain the shift in the monomer-dimer equilibrium constant that is experimentally measured upon addition of DL lipids to PO membranes. To establish this interpretation based on simulation data, we must calculate the change in the free energy of dimerization when the lipid composition of the membrane is altered and compare the result with experimental data. In principle, this problem can be approached in the following two ways: a calculation might be designed to simulate the association and dissociation of CLC-ec1 monomers, in different membrane conditions, deducing the dimerization free-energy in each case; or alternatively, a series of simulations might be designed to calculate, separately for the dimer and dissociated monomers, the free-energy gain or loss upon a change in the lipid composition of the membrane, deriving from the results the change in dimerization free-energy (Fig. 5b). While the first approach seems more intuitive, it is also more prone to error, as the mechanism of monomer–monomer recognition is unknown at the molecular level. Thus, we follow the second approach; to do so, we adapt a simulation methodology designed to evaluate differentials in solvation energetics for small molecules, known as FEP40.

Here we use this ‘alchemical’ approach to transform not one solute, but the total composition of the lipid bilayer solvent. To do so, we gradually morph randomly-selected POPC lipids into DLPC, for both the dissociated and associated states of CLC-ec1, as well as for a protein-free membrane, which we use as a reference. Convergence of the calculated free energy changes requires extensive sampling; therefore, we purposely carried out these calculations using CGMD simulations and the MARTINI force field to facilitate this. Since the ultimate transition from 0 → 30 % DLPC involves hundreds of molecules, we split the computation into multiple segments such that DLPC lipids are gradually added to the system (Supplementary Table 8), which also allows comparison with the experimental titration curves. Four transformation steps are carried out, starting with 0 → 1%, 1 → 10%, 10 → 20%, and finally 20 → 30% DLPC. Of course, the FEP methodology is built on the assumption that the phase space dimensions are unaffected by a given transformation, that is, the number of atoms does not change. In cases where one or more atoms are deleted, this limitation is circumvented by introducing dummy atoms, whose through space interactions are muted (that is, set to zero). For our simulations, we create a modified DLPC molecule, referred to as DLPC*, that is identical to the original DLPC but with a dummy atom attached to the end of each alkyl chain. These added beads are given standard bonding and angle bending parameters (Supplementary Table 9), and their Lennard-Jones terms are set to nil. MD simulations containing DLPC* are indistinguishable to those containing DLPC, as indicated by established metrics like membrane thickness and leaflet interdigitation (Extended Data Fig. 7). Notably, the dummy molecules preserve the spatial distribution of the lipids, as can be seen by examination of the DLPC enrichment factor.

Given that DLPC* and DLPC are indistinguishable, we transform POPC molecules into DLPC* in our FEP simulations by transforming chain length, saturation and energy parameters (Fig. 5a). These simulations are carried out using Hamiltonian replica exchange66 where the scaling parameter λ is varied for each replica, thus improving sampling. To enable error estimates, we perform two independent simulations for each transformation, capturing the forward and backward paths. That is, we perform the following two simulations: one where λ is varied from 0 to 1 as we move across the replicas, and another where this distribution is reversed. The estimated free energy cost for each transformation as a function of simulation time is obtained using the GROMACS implementation of the Bennett acceptance ratio67 (Extended Data Fig. 8a,b). We note that these distributions are slow to converge, requiring 6 μs for the 1 → 10%, 10 → 20%, and 20 → 30% transformations and 40 μs for the 0 → 1% condition. In the case of the 20 → 30% transformation, it is evident that the systems are approaching convergence, although not entirely there compared to the other transformations. Still, when the uncertainty in these computations is propagated into the solvation free energy calculations (Fig. 5c–e), the errors show that the quantities are sufficiently converged to not change the conclusions in a meaningful way.

FEP simulations are based on the CLC-ec1 constructs simulated in umbrella sampling simulations described next, specifically a dimer in both associated and dissociated states; an additional system is constructed that contains no protein for reference. In cases where a protein is present, starting structures are taken from initial steps of the umbrella sampling simulations, that is, from the outside windows where the monomers are initially associated or separated by 5.5 nm. The protein is then converted into a CG model using the Martinize tool and the Martini 2.2 force field parameters68. The membrane and solvent use the Martini 2.0 force field69 and are built around the protein, if present, using the INSANE script70. The net charge is neutralized, water is added and the system is buffered to 0.150 M with Na+ and Cl− ions. An energy minimization step is then performed, and the system is equilibrated at constant pressure and temperature for 10 ns. This results in a box with approximate dimensions 25.2 nm × 15.1 nm × 9.6 nm. The coordinates of this system are used to create the initial structure for each transformation by modifying select residue types in the topology file (see Supplementary Table 8 for details). Each system is then equilibrated for 1 μs at constant temperature and pressure, thereby randomizing the distribution of DLPC* molecules, which were selected in sequence. This equilibration step is followed by a slow growth simulation where the transformation is performed over the course of 20 μs. This, along with our FEP simulations, restricts the distance between monomers and the protein orientation, following the protocol used in umbrella sampling simulations. The FEP simulations are then carried out using Hamiltonian replica exchange such that λ is varied for each replica.

It is important to note that the free energy cost of carrying out the transformations is, in each case, hundreds of kilocalories per mole. Yet the important quantity, which is how the cost differs between the associated and dissociated states, is a small fraction of this ΔG (Extended Data Fig. 8a,b). The calculations described here are thus exceedingly challenging, as relatively small statistical errors in the calculated values can completely obscure the effect we seek to evaluate. Long sampling times and judicial design of the alchemical transformations are crucial to achieve complete mixing of all the lipid components and thus minimize these errors (Extended Data Fig. 4b). Our simulations thus use a set of λ values derived by restricting the relative entropy between neighboring replicas to a value of 1 kBT or less (Extended Data Fig. 8c). This requirement ensures substantial overlap between the probability distributions of neighboring replicas and results in 38 replicas for the 0 → 1% transformation and 93 for the remaining (Supplementary Table 10). These λ distributions are determined by performing a short simulation with uniform spacing and measuring the relative entropy between neighboring replicas. Since the λ spacing is uniform, each replica can be thought of as a bin spanning the λ space. These bins are then assigned a density proportional to the relative entropy. The λ space is then split into blocks, and the replicas are redistributed over each based on the density summed over the bins within the block. Like the initial trial, the replicas in each block are distributed with equal λ spacing. Tweaks are then made as needed, following another short simulation with this refined distribution. It is noted that the slow growth simulation described previously provides equilibrium structures for launching the trial simulations described here, as well as the FEP simulations with the optimized λ distribution. Production runs are then carried out with GROMACS 2018.8 (ref. 71) using an integration step of 20 fs and exchange moves between neighboring replicas are attempted every 1 ns. This results in an average exchange rate of approximately 40%. The pressure and temperature are set to 1 bar and 303 K using the semi-isotropic Berendsen barostat72 along with the velocity-rescaling algorithm73. The electrostatics are treated with the reaction-field method. Trajectories are collected using 6 μs for each replica for the 1 → 10%, 10 → 20% and 20 → 30% DLPC conditions and 40 μs for the 0 → 1% condition. A total of 16.58 ms of simulation time is acquired.

Finally, it is worth noting that the molecular systems simulated here contain a lipid-to-protein ratio that is approximately 1,000 times smaller compared to the experimental conditions. Thus, to mimic the experimental degree of protein dilution, our simulation systems would have to be 1000 times larger, making the calculations unfeasible at the present time. The implication of this size discrepancy is that the number of DL lipids available to solvate each protein molecule is much greater under experimental conditions, and so the inhibitory effect of DL is necessarily more pronounced in experiment (Extended Data Fig. 5). To compare with simulation, therefore, we must factor out this size effect. A simple way to do so is to compare calculated and experimental free-energy values in reference to the largest DL molar fraction, that is the condition in which the DL enrichment is not limited by the number of DL lipids available, in either protein state; if we do this by comparing changes relative to the 20% DL condition, there is clear agreement between calculated and measured shifts in the dimerization free-energy (Fig. 5e).

Computations of the change in lipid solvation free energy, ΔΔG

ΔG values are reported for each transformation as the average over the forward and backward paths such that:

$$\Delta _=\frac\left(}}_}}+}}_}}\right)$$

(10)

where the subscripts f and b refer to the forward and backwards paths, whereas x is a placeholder for the associated state (a), dissociated state (d) or a membrane absent of any protein (m). Similarly, 0 → 1 indicates that the transformation is made from 0% to 1% DLPC. It then reasons that the free energy change taken over the full transformation, that is, from zero to thirty percent DLPC, is given as the sum of the individual steps:

$$\Delta _=\Delta _+\Delta _+\Delta _+\Delta _$$

(11)

Using equation (11), the solvation energy is computed as:

$$}}_}}^}}=\Delta _}}-\Delta _$$

(12)

where the subscript a/d indicates the corresponding energy for the associated or dissociated state. Similarly, \(\Delta \Delta _\) is computed as:

$$\Delta \Delta _=}}_-}}_$$

(13)

It is noted that, while the examples shown here focus on the complete transformation, the equations are easily adaptable to partial transformations, such as 0 → 1%, 0 → 10% or 0 → 20%. Moreover, we note that the s.e.m. is not computed for these free energies, as we only have two independent samples. However, we do include the individual data points in Fig. 5, as differences between the forward and backward paths quantify the simulation convergence. This property emerges since the free energy is a state function. Consequently, both paths should provide similar results within sampling variability.

Umbrella sampling simulations

The membrane’s contribution to the association free energy of the CLC-ec1 dimer is computed using umbrella sampling simulations. These simulations span multiple states of the association reaction for three different molecular systems, including a native dimer in POPC, a non-native dimer in POPC, and a native dimer in a 3:7 mixture of DLPC and POPC. In each case, multiple copies of the molecular system, referred to as windows, are simulated, where a biasing potential is applied to target a specific intersubunit distance that varies for each window. These biasing potentials are used to improve sampling along the reaction coordinate, which is the intersubunit distance of the protomers, thus providing an improved estimate of the free energy profile that is recovered in postanalysis using methods like the weighted histogram analysis method74. Since extensive simulation time is required to ensure proper sampling of all windows, we use a CG representation of the molecular systems using the MARTINI force field in the current study.

The first two molecular systems are constructed by inserting CLC-ec1 dimers into a pre-equilibrated membrane patch containing pure POPC; the mixed membrane system is then considered. The initial membrane patch is constructed by placing POPC molecules inside a box with dimensions 25 nm × 25 nm × 10 nm. Following this step, the bilayer is hydrated and buffered with 0.150 M NaCl. A simulation is then performed for 20 μs to allow the lipid molecules to redistribute away from their initial configuration. The equilibrated membrane patch is used to construct a series of CG models, each of which is used for a window in the first two umbrella sampling simulations. These models include a CLC-ec1 dimer whose protein orientation is varied, giving rise to a native and non-native dimer, where the dimerization interface of one protomer faces the other for the native dimer but not for the non-native (Extended Data Fig. 6a,b)75. Models for the native dimer are built using coordinates from the CLC-ec1 crystal structure (PDB ID: 1OTS)55, while the non-native dimer is created using the HDOCK server76. In both cases, the protein is truncated, removing the first 30 residues since these were shown to be highly flexible in a previous study56. The protein is then converted to a CG representation using the Martinize script, and the protomers are translated along the x axis in increments of 0.05 nm to achieve an intersubunit distance ranging from 0.0 nm to 5.5 nm. These translations result in a total of 111 windows for both the native and non-native dimers.

Given the proper orientation and separating distance, each protomer is embedded into the pre-equilibrated membrane patch, and the resulting molecular system is subjected to a pair of energy minimization steps followed by a four-stage equilibration that restrains the backbone beads of the protein while the lipids equilibrate around the protein. Production runs are then performed, where each window is simulated for 10 μs. The production runs use the velocity-rescale algorithm73 and the semi-isotropic Parrinello-Rahman barostat77 to maintain the temperature and pressure of 310 K and 1 bar, respectively. Notably, our simulations introduce a restraint that maintains the distance between the protomers and a second that fixes each protomer’s orientation relative to the other. A bumper potential is also added that penalizes the formation of interprotomer contacts.

Because each simulated window varies the distance separating the CLC-ec1 protomers, this distance defines the reaction coordinate used when describing the free energy of association. We thus define this separating distance by introducing two sets of atoms from each protomer and computing the geometric center of each set. These centers are shown as G1 and G2 in Extended Data Fig. 6a,b, and the specific atoms that make the sets are provided in Supplementary Table 11. Next, a single point is computed for each protomer as the center of points G1 and G2. This point defines the center of each protomer and the distance between them is restrained at their initial distance taken when the simulation is started, which, due to the translations described previously, ranges from 3.4 nm to 8.9 nm for the native dimer and 3.8 nm to 9.3 nm for the non-native dimer. The restraints described here consist of harmonic potentials with a force constant of 2,000 kJ mol−1 nm−2 and are introduced during the simulation using PLUMED78.

In addition to the separating distance just described, the protein orientation is also maintained by introducing a second set of restraints designed to prevent rotation around the z axis while maintaining fluctuations in the protein tilt angle. Typically, these restraints do not affect the sampling of the internal configurational space. However, they are introduced here to maintain either the native or non-native dimers throughout the simulations. The restraints work by fixing the y-component of the centers G1 and G2, that is, these centers are restrained to the XZ plane. An additional center is then computed as the center of the two protomer’s geometric centers, that is, this new center defines a point directly between the two subunits and is restrained to a line by holding fixed the x and z components. Both restraints described here are implemented using the upper- and lower-WALLS options available in PLUMED and use a force constant of 2,000 kJ mol−1 nm−2.

With the intersubunit distance and the protein orientation fixed, an additional potential, referred to as a bumper potential, is added to our simulations that penalizes the formation of interprotomer contacts. This bumper is needed since we are primarily interested in the membrane’s contribution to the association energy. The bumper aids in this task by providing a visual cue in the free

Comments (0)

No login
gif