Atomic packing has been an important metric for characterizing protein structures since 1974, when it was observed that the average packing density within proteins' interiors is roughly equivalent to that of small organic molecule crystals [1]. Although numerous methods had been developed to calculate the packing and interactions of amino acid residues within proteins, the use of packing density as a criterion for evaluating model protein structures was developed explicitly in 1990 [2].
To date, several approaches have been used to measure atomic packing in protein structures. The Voronoi procedure is a widely used method in which a unique volume is assigned to each atom in order to study variations in protein packing [3–7]. Another well-established method for analyzing packing interactions in proteins is based on calculating the occluded molecular surface [8]. Other methods for analyzing protein packing have also been reported [9–15].
Packing is an important aspect of protein structures. Compact packing of amino acid residues is known to affect both the thermal stability and the folding rate of proteins [16–25]. Protein stability is of significant interest to the biotechnological, pharmaceutical, and food industries. The effects of packing on protein stability have been studied so extensively that modeling programs now incorporate packing as a parameter when predicting protein stability after mutations [26]. Moreover, hydrogen bonds, which increase the packing density in the protein interior, are known for their indirect contribution to protein stability [27]. Recent studies indicate that the major determinants of protein stability include packing and van der Waals interactions [28–30].
As the structural comparison shown in Figs. 1 and 2 exemplifies, proteins do not exhibit uniform packing throughout their structure [31]. Localized packing defects appear as cavities, and their presence can compromise the stability of the protein [32]. Additionally, the distribution of these cavities is highly heterogeneous across different proteins [33]. Despite the presence of occasional cavities, the interior of spherical proteins remains tightly packed. The Voronoi volumes of surface atoms, modeled with solvent surrounding the protein, are approximately 7% larger than those of interior atoms [34,35], indicating that packing is less dense on the protein surface.
Experimental studies have shown that mutations in protein cores that replace small residues with larger ones generally destabilize the protein, which suggests that little empty space is available to accommodate additional atoms [23,36]. This can be explained by the arrangement of α-helical and β-sheet secondary structures in globular proteins: these elements organize so that non-polar side chains interlock like jigsaw puzzle pieces, creating densely packed cores. As a result of this tight packing, van der Waals forces are considerably stronger in the interior [27]. However, because atomic overlaps are energetically unfavorable, the density of protein cores cannot exceed an upper limit [37]. The rigidity of protein cores has also been shown to correlate strongly with packing density [38,39]. Furthermore, studies have shown that the interior of proteins evolves slowly, in contrast to the surface, which evolves more rapidly [40,41]. Solvent accessibility has become the de facto structural measure in protein evolution studies. However, more recent work has called the central role of solvent accessibility into question and has identified packing as another important factor [42]. The two packing measures most frequently employed in evolutionary studies are the contact number and the weighted contact number. For a given amino acid, the contact number is the count of other residues within its local structural neighborhood. In contrast, the weighted contact number considers all other residues in the protein, weighting each by the inverse square of its distance to the amino acid under examination [43,44].
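The two measures described above can be illustrated with a minimal sketch. It assumes each residue is represented by a single point (for example its Cα atom) and uses a hypothetical 10 Å cutoff for the local neighborhood; neither choice is taken from the cited studies, and the function names are illustrative only.

```python
import math

def contact_number(points, i, cutoff=10.0):
    """Count residues other than i whose point lies within `cutoff` (Å) of residue i.

    The 10 Å default is an assumed, illustrative neighborhood radius.
    """
    return sum(
        1
        for j, p in enumerate(points)
        if j != i and math.dist(points[i], p) <= cutoff
    )

def weighted_contact_number(points, i):
    """Sum 1/d^2 over every other residue j, where d is the distance from i to j.

    Assumes no two residues share identical coordinates (d > 0).
    """
    return sum(
        1.0 / math.dist(points[i], p) ** 2
        for j, p in enumerate(points)
        if j != i
    )

# Toy coordinates (Å), one hypothetical point per residue.
points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (20.0, 0.0, 0.0)]
print(contact_number(points, 0))
print(weighted_contact_number(points, 0))
```

In this toy case the distant fourth residue is excluded from the contact number by the cutoff but still contributes a small term to the weighted contact number, which is the practical difference between the two measures.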
Atomic packing is affected by a combination of different factors. A statistical analysis of the radius of gyration for 3769 protein domains across four major classes (α, β, α/β, and α+β) revealed that each class exhibits a characteristic radius of gyration, reflecting its specific level of structural compactness. For example, α-helical proteins exhibit the highest radius of gyration across the protein size range considered, indicating less compact packing than in β and α+β proteins. In contrast, α/β proteins display the lowest radius of gyration, characteristic of the most compact packing among the classes [45].
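For reference, the radius of gyration used as a compactness measure above can be computed as the root-mean-square distance of the atoms from their center of mass. The sketch below is a generic illustration, not the procedure of the cited analysis; whether that study weighted atoms by mass is not stated here, so mass weighting is offered as an optional assumption.

```python
import math

def radius_of_gyration(coords, masses=None):
    """Root-mean-square distance of atoms from their (mass-weighted) center.

    coords: list of (x, y, z) tuples in Å.
    masses: optional list of atomic masses; equal weights are assumed if omitted.
    """
    if masses is None:
        masses = [1.0] * len(coords)
    total = sum(masses)
    # Mass-weighted center of the coordinates.
    center = [
        sum(m * c[k] for c, m in zip(coords, masses)) / total
        for k in range(3)
    ]
    # Mass-weighted mean squared distance from that center.
    msd = sum(m * math.dist(c, center) ** 2 for c, m in zip(coords, masses)) / total
    return math.sqrt(msd)

# Two equally weighted points 2 Å apart.
print(radius_of_gyration([(-1.0, 0.0, 0.0), (1.0, 0.0, 0.0)]))
```

For chains of similar length, a smaller value indicates that the atoms lie closer to the center, which is the sense in which the text equates a low radius of gyration with compact packing.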
Another study showed that for proteins with a molecular weight below 20 kDa, the average density shows a positive deviation that becomes more pronounced as molecular weight decreases, indicating that smaller proteins are more densely packed than larger ones, which tend to be more loosely packed [46]. Additionally, an analysis of 152 non-homologous proteins demonstrated that variations in protein packing are influenced by a complex interplay of protein size, secondary structure, and amino acid composition. That analysis showed that helices appear to be more efficiently packed than strands and that large proteins are expected to have increased overall packing [47].
In this communication we approach the problem of characterizing and analyzing protein density not through average statistical or structural properties, but by building and directly comparing individual density profiles created for an extended set of more than 21,000 proteins. The essence of our approach is the following. For each atom of each structure we calculate the density (in Da/Å³) inside a sphere centered on that atom. If, for example, a given protein structure contains 5000 atoms, we calculate 5000 density values, one for each atom. These density values are then used to calculate a histogram of their distribution, which is characteristic of the protein structure under examination. Having collected the density distributions, we can quantify and analyze their similarities and differences using established metrics such as the Euclidean distance, calculated between any given pair of distributions. By performing an all-to-all comparison of those distributions, we can quantitatively characterize the structural and functional patterns present in them, as well as evolutionary relationships between diverse families of proteins. In the following paragraphs we present details of this method and of the structural, functional, and evolutionary results obtained from its application.
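The procedure outlined above can be sketched in a few lines. This is an illustration under stated assumptions rather than the implementation used in this work: the sphere radius (8 Å), the histogram range and bin count, the inclusion of the central atom's own mass, and the normalization of the histogram to fractions are all hypothetical choices, and the function names are illustrative.

```python
import math

def local_densities(coords, masses, radius=8.0):
    """Density (Da/Å^3) inside a sphere of `radius` Å centered on each atom.

    The 8 Å default is an assumed value. The central atom's own mass is
    included in the sum (also an assumption).
    """
    volume = 4.0 / 3.0 * math.pi * radius ** 3
    densities = []
    for center in coords:
        mass_inside = sum(
            m for c, m in zip(coords, masses) if math.dist(center, c) <= radius
        )
        densities.append(mass_inside / volume)
    return densities

def histogram(values, lo, hi, nbins):
    """Fraction of all values falling in each of `nbins` equal bins on [lo, hi]."""
    counts = [0] * nbins
    width = (hi - lo) / nbins
    for v in values:
        if lo <= v <= hi:
            # Clamp so that v == hi (or rounding) lands in the last bin.
            counts[min(int((v - lo) / width), nbins - 1)] += 1
    n = len(values)
    return [c / n for c in counts] if n else [0.0] * nbins

def euclidean_distance(p, q):
    """Euclidean distance between two histograms with the same binning."""
    return math.dist(p, q)

# Toy example: three carbon-like atoms (12 Da), one of them isolated.
coords = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (100.0, 0.0, 0.0)]
masses = [12.0, 12.0, 12.0]
profile = histogram(local_densities(coords, masses, radius=2.0), 0.0, 1.0, 20)
print(profile)
```

Because every structure is reduced to a histogram on the same bins, an all-to-all comparison amounts to calling `euclidean_distance` on each pair of profiles. The double loop in `local_densities` scales quadratically with the number of atoms; a spatial index would be a natural replacement for large structures.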