When I iterate through and process the downloaded proteins, my goal is to keep the coordinates of the CA, C, O, and N atoms only. Upon loading some structures, such as this one, I see that some residues (in this case ASP11) have duplicated coordinates corresponding to multiple rotational conformations of the side chain:
In the raw PDB file, this is encoded in the alternate-location indicator, a single letter directly in front of the residue name. The alpha carbon of residue 11 is, for example, tagged both as AASP and as BASP.
ATOM     82  CA AASP C  11      14.656   6.921 -15.720  0.42 37.13           C
ANISOU   82  CA AASP C  11     4826   5151   4130    771   -213   -129       C
ATOM     83  CA BASP C  11      14.654   6.917 -15.721  0.58 37.11           C
ANISOU   83  CA BASP C  11     4824   5150   4128    771   -214   -130       C
Would it be possible to select and handle rotamers within proteinshake directly? In my current application, I can get away with selecting any conformation in these simple cases, but this may not always be applicable.
Best, and thank you!
Lucas
Good catch and great issue. Programmatically there would be multiple ways to deal with this, but to keep things simple I would probably remove those in the preprocessing (unless there is a good reason not to?). I will keep it in mind for the upcoming release.
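For anyone who needs a workaround in the meantime, here is a minimal sketch of one of those ways. It is not proteinshake's preprocessing, and the function name `select_backbone` is made up for illustration. It assumes standard fixed-column PDB `ATOM` records (alternate-location indicator in column 17, occupancy in columns 55-60) and keeps, for each backbone atom, the alternate location with the highest occupancy:

```python
BACKBONE = {"N", "CA", "C", "O"}


def select_backbone(pdb_lines):
    """Return {(chain, resseq, icode, atom_name): (x, y, z)} for backbone atoms.

    Hypothetical helper: when an atom appears with several alternate
    locations (e.g. AASP / BASP), only the highest-occupancy copy is kept.
    Ties keep the first copy encountered.
    """
    best = {}  # atom key -> (occupancy, xyz)
    for line in pdb_lines:
        if not line.startswith("ATOM"):
            continue  # skips ANISOU, HETATM, TER, ...
        name = line[12:16].strip()
        if name not in BACKBONE:
            continue  # side-chain atoms are dropped
        # The altloc letter (line[16]) is deliberately left out of the key,
        # so all alternate locations of one atom compete for the same slot.
        key = (line[21], int(line[22:26]), line[26].strip(), name)
        occupancy = float(line[54:60])
        xyz = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
        if key not in best or occupancy > best[key][0]:
            best[key] = (occupancy, xyz)
    return {key: xyz for key, (occupancy, xyz) in best.items()}
```

Called on the lines of the file quoted above, this would keep the `BASP` copy of the CA atom of residue 11 (occupancy 0.58) and discard the `AASP` copy (0.42). Since the choice is made per atom, it does not guarantee that all atoms of a residue come from the same conformer if their occupancies disagree.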
@timkucera Beyond rotamers, some structures also contain multiple states (e.g. apo/holo states for ligand-binding proteins, or other complex proteins, e.g. https://www.rcsb.org/structure/5are).
Hi!
Upon working with `ProteinFamilyDataset`, I ran into some issues due to multiple rotamers being present in the loaded PDB files. First, and just in case, my code to load the data: