Handling of multiple rotamers in PDB files #270

Open
lucasmiranda42 opened this issue Mar 27, 2024 · 2 comments

lucasmiranda42 commented Mar 27, 2024

Hi!

Upon working with ProteinFamilyDataset, I ran into some issues due to multiple rotamers being present in the loaded PDB files.

First, and just in case, my code to load the data:

from proteinshake.tasks.pfam_task import ProteinFamilyDataset
pfam_dataset = ProteinFamilyDataset(only_single_chain=True, use_precomputed=True)

When I iterate through and process the downloaded proteins, my goal is to keep only the coordinates of the CA, C, O, and N atoms. Upon loading some structures, such as this one, I see that some residues (in this case ASP11) have duplicated coordinates corresponding to multiple rotational conformations of the side chain:

image

In the raw PDB file, this is encoded in the alternate location indicator (column 17), which sits directly before the residue name. The alpha carbon of residue 11 is, for example, tagged both as AASP and as BASP.

ATOM     82  CA AASP C  11      14.656   6.921 -15.720  0.42 37.13           C  
ANISOU   82  CA AASP C  11     4826   5151   4130    771   -213   -129       C  
ATOM     83  CA BASP C  11      14.654   6.917 -15.721  0.58 37.11           C  
ANISOU   83  CA BASP C  11     4824   5150   4128    771   -214   -130       C  
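
For illustration, here is a minimal sketch of how such duplicated records could be resolved outside proteinshake, assuming standard fixed-column PDB formatting (alternate location indicator in column 17, occupancy in columns 55-60). The helper `select_altloc` and the `BACKBONE` set are hypothetical names, not part of the proteinshake API. The sketch keeps, for each backbone atom, the alternate location with the highest occupancy. It decides per atom, so it could in principle mix conformers if occupancies differ within a residue.

```python
# Sketch only: plain fixed-column parsing, not part of proteinshake.
BACKBONE = {"N", "CA", "C", "O"}

# The four records quoted above (ANISOU lines are skipped by the parser).
EXAMPLE = [
    "ATOM     82  CA AASP C  11      14.656   6.921 -15.720  0.42 37.13           C",
    "ANISOU   82  CA AASP C  11     4826   5151   4130    771   -213   -129       C",
    "ATOM     83  CA BASP C  11      14.654   6.917 -15.721  0.58 37.11           C",
    "ANISOU   83  CA BASP C  11     4824   5150   4128    771   -214   -130       C",
]


def select_altloc(pdb_lines, keep=BACKBONE):
    """Return one atom per (chain, residue number, insertion code, atom name),
    choosing the alternate location with the highest occupancy."""
    best = {}
    for line in pdb_lines:
        # Only coordinate records carry x/y/z; ANISOU and others are ignored.
        if not line.startswith(("ATOM  ", "HETATM")):
            continue
        name = line[12:16].strip()
        if name not in keep:
            continue
        key = (line[21], int(line[22:26]), line[26], name)
        atom = {
            "chain": line[21],
            "resseq": int(line[22:26]),
            "resname": line[17:20],
            "name": name,
            "altloc": line[16].strip(),  # empty string when there is no altloc
            "occupancy": float(line[54:60]),
            "xyz": (float(line[30:38]), float(line[38:46]), float(line[46:54])),
        }
        # Replace the stored atom only if this one has a higher occupancy.
        if key not in best or atom["occupancy"] > best[key]["occupancy"]:
            best[key] = atom
    return list(best.values())


for atom in select_altloc(EXAMPLE):
    print(atom["resname"], atom["resseq"], atom["name"], atom["altloc"], atom["xyz"])
```

On the excerpt above, this retains the B conformation of the ASP11 alpha carbon, since its occupancy (0.58) is higher than that of the A conformation (0.42).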

Would it be possible to select and handle rotamers within proteinshake directly? In my current application, I can get away with selecting any conformation in these simple cases, but this may not always be applicable.

Best, and thank you!
Lucas

@timkucera
Collaborator

Good catch and great issue. Programmatically, there are multiple ways to deal with this, but to keep things simple I would probably remove the alternate conformations during preprocessing (unless there is a good reason not to?). I will keep it in mind for the upcoming release.
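
One way such a preprocessing step might look is sketched below. This is an assumption for illustration, not how proteinshake implements or will implement it. The hypothetical helper `strip_altlocs` drops every ATOM, HETATM, and ANISOU record whose alternate location indicator (column 17) is not blank or `A`, then blanks that column in the records it keeps. Keeping `A` unconditionally is the simplest policy, but it is not always the highest-occupancy conformation.

```python
# Sketch only: a line-level filter over raw PDB text, not proteinshake code.
ALTLOC_RECORDS = ("ATOM  ", "HETATM", "ANISOU")

SAMPLE = [
    "ATOM     82  CA AASP C  11      14.656   6.921 -15.720  0.42 37.13           C",
    "ANISOU   82  CA AASP C  11     4826   5151   4130    771   -213   -129       C",
    "ATOM     83  CA BASP C  11      14.654   6.917 -15.721  0.58 37.11           C",
    "ANISOU   83  CA BASP C  11     4824   5150   4128    771   -214   -130       C",
]


def strip_altlocs(pdb_lines, keep=("", "A")):
    """Yield PDB lines, keeping only the alternate locations listed in `keep`
    and clearing the altloc column of the records that remain."""
    for line in pdb_lines:
        if line.startswith(ALTLOC_RECORDS):
            altloc = line[16:17].strip()
            if altloc not in keep:
                continue  # drop the other conformations
            line = line[:16] + " " + line[17:]  # blank out column 17
        yield line


for line in strip_altlocs(SAMPLE):
    print(line)
```

All other record types pass through unchanged, so the output remains a valid PDB text with a single conformation per atom.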

@pjhartout
Collaborator

@timkucera beyond rotamers, some structures have multiple states (e.g. apo/holo states for ligand-binding proteins, or other complex proteins, e.g. https://www.rcsb.org/structure/5are).
