Effect of -etbb value on conformer generation rate #140

Open
SeanR22 opened this issue Jun 28, 2021 · 5 comments
Labels
discussion Discuss new ideas or implementations

Comments

@SeanR22

SeanR22 commented Jun 28, 2021

I generated conformers for a high-complexity, ~120-amino-acid protein, varying the backbone energy cutoff -etbb (values shown in the figure legend) while keeping the sidechain energy cutoff constant at -etss 5000. I also used -xp 0 0 1 1 1.

I measured the rate at which idpconfgen build could generate conformers that met a certain energy cutoff (shown on the x axis).
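A minimal sketch of how such a rate could be computed, assuming the final energy of every conformer and the wall-clock time of the run have already been collected. The function name, the sample energies, and the 60-second duration are hypothetical and are not part of idpconfgen.

```python
from typing import Dict, Iterable, List


def generation_rates(
    energies: Iterable[float],
    elapsed_seconds: float,
    cutoffs: Iterable[float],
) -> Dict[float, float]:
    """Return, for each cutoff, conformers per second with energy <= cutoff."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    values: List[float] = list(energies)
    return {
        cutoff: sum(e <= cutoff for e in values) / elapsed_seconds
        for cutoff in cutoffs
    }


# Hypothetical final energies (kcal/mol) from one 60-second run.
run_energies = [850.0, 1200.0, 2400.0, 990.0, 3100.0, 1750.0]
rates = generation_rates(
    run_energies,
    elapsed_seconds=60.0,
    cutoffs=[1000.0, 2000.0, 3000.0],
)
for cutoff, rate in rates.items():
    print(f"<= {cutoff:.0f} kcal/mol: {rate:.4f} conformers/s")
```

Repeating this for each -etbb setting gives one curve per setting, which is the kind of comparison the figure makes.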

Interestingly, the -etbb value seems only to limit the rate at which low-energy conformers can be generated, likely because the build gets stuck cycling and has a hard time finding a final solution.

[Figure: low -etbb values slow the generation rate]

So, it begs the question... Would it be more efficient to generate a bunch of random conformers from chunks then screen the energy of each conformer at the end of each run rather than constantly measuring the energy as the chunks are added? Is the -etbb value necessary or does it just make the process more convoluted and less efficient? There may be some upper -etbb value where the process becomes less efficient at finding low energy structures but I haven't found it yet.

From the runs I have done so far it seems that what really limits the ability to find low energy solutions is the particular arrangement of the amino acids in the sequence, its length and the ability to find low energy solutions from the PDB and not the input cutoff values.

@SeanR22
Author

SeanR22 commented Jun 28, 2021

To add to the above comment: even though it may seem counterintuitive, if you are aiming to keep only conformers of 1000 kcal/mol or less, you should set -etbb 3000 and -etss 1000. The particular amount by which -etbb should be set above -etss will be sequence dependent.

@joaomcteixeira joaomcteixeira added the discussion Discuss new ideas or implementations label Jun 28, 2021
@joaomcteixeira
Member

Hi @SeanR22

Would it be more efficient to generate a bunch of random conformers from chunks then screen the energy of each conformer at the end of each run rather than constantly measuring the energy as the chunks are added?

Definitely not. Although you don't see it, before each chunk is added idpconfgen screens dozens, even hundreds, of possibilities. If we didn't constrain the process to a feasible energy cutoff, we would not be able to get out of clashes. I would suggest the other way around: set -etbb to a higher value and then filter for only those conformers with lower energy. That could be a strategy.
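The "build permissively, then filter" strategy can be sketched as a post-processing step. This assumes the all-atom energy of each finished conformer is available from somewhere; the `Conformer` record, the file names, and the energies below are hypothetical and do not reflect idpconfgen's output format.

```python
from typing import Iterable, List, NamedTuple


class Conformer(NamedTuple):
    name: str            # hypothetical output file name
    total_energy: float  # hypothetical all-atom energy, kcal/mol


def keep_low_energy(
    conformers: Iterable[Conformer],
    max_energy: float,
) -> List[Conformer]:
    """Keep conformers at or below max_energy, lowest energy first."""
    kept = [c for c in conformers if c.total_energy <= max_energy]
    return sorted(kept, key=lambda c: c.total_energy)


# Hypothetical pool built with a permissive backbone cutoff.
pool = [
    Conformer("conf_1.pdb", 2800.0),
    Conformer("conf_2.pdb", 950.0),
    Conformer("conf_3.pdb", 1400.0),
    Conformer("conf_4.pdb", 620.0),
]
kept = keep_low_energy(pool, max_energy=1000.0)
print([c.name for c in kept])
```

The filter only discards finished conformers; the energy screening during chunk addition still happens inside the build, which is what keeps it out of clashes.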

I don't think -etbb is unnecessary. It allows more flexibility, because backbone construction is a process independent of side-chain construction.

There may be some upper -etbb value where the process becomes less efficient at finding low energy structures but I haven't found it yet.

That may be. However, I would like to keep in focus what you suggested last time: how high can we set the bb cutoff such that a minimization afterwards is enough to relax the conformer without provoking pronounced deviations from its original structure?

From the runs I have done so far it seems that what really limits the ability to find low energy solutions is the particular arrangement of the amino acids in the sequence, its length and the ability to find low energy solutions from the PDB and not the input cutoff values.

Absolutely. Some sequence chunks are very rare or, put another way, sample very similar conformations despite the number of counts in the database (!!!). So, if you are sampling one of those regions at a position where there's a clash, idpconfgen will have a hard time finding a way out of it. That's why some sequences build very fast and others do not.

To add to the above comment: even though it may seem counterintuitive, if you are aiming to keep only conformers of 1000 kcal/mol or less, you should set -etbb 3000 and -etss 1000. The particular amount by which -etbb should be set above -etss will be sequence dependent.

You are right, but note the trick as well. The quality of the backbone will be 3000, while the quality of the whole conformer, considering sidechains, will be 1000. The protocol does not improve the quality of the backbone. It simply states that FASPR (currently) was able to pack the sidechains such that the Lennard-Jones contributions have a positive impact (negative energy) of (at least) 2000.
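The arithmetic behind that statement, as a small sketch: if the backbone alone has a given energy and the whole conformer must end at or below the -etss value, the sidechains have to contribute no more than the difference. The function name is hypothetical.

```python
def required_sidechain_contribution(backbone_energy: float, etss: float) -> float:
    """Largest sidechain contribution that still keeps the total <= etss.

    total = backbone_energy + sidechain_contribution <= etss
    """
    return etss - backbone_energy


# A backbone sitting right at -etbb 3000 with -etss 1000: the sidechains
# must lower the energy by at least 2000.
print(required_sidechain_contribution(3000.0, 1000.0))  # -2000.0

# A backbone that happened to build at 1500 needs less help.
print(required_sidechain_contribution(1500.0, 1000.0))  # -500.0
```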

Always keep in mind that in the current implementation the whole LJ profile is calculated and the individual energies are summed. There is actually no hard limit for a clash.
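To illustrate what summing a whole Lennard-Jones profile means, here is a generic 12-6 sketch. It is not idpconfgen's energy function: the single `epsilon` and `sigma` values are placeholder assumptions (a real force field uses per-atom-type parameters and typically skips bonded pairs), and the coordinates are made up.

```python
import itertools
import math
from typing import Sequence, Tuple

Point = Tuple[float, float, float]


def lj_pair(r: float, epsilon: float = 0.2, sigma: float = 3.4) -> float:
    """Generic 12-6 Lennard-Jones energy for one pair at distance r."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)


def total_lj(coords: Sequence[Point], epsilon: float = 0.2, sigma: float = 3.4) -> float:
    """Sum the pair energy over every unique pair of points."""
    return sum(
        lj_pair(math.dist(a, b), epsilon, sigma)
        for a, b in itertools.combinations(coords, 2)
    )


# Made-up coordinates in angstroms: the second set has one pair only 2.0 apart.
relaxed = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
clashing = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (5.8, 0.0, 0.0)]

print(total_lj(relaxed))   # negative: every pair is attractive
print(total_lj(clashing))  # large and positive: one close contact dominates
```

In a summed scheme like this, a close contact is never rejected on distance alone. It only raises the total, so a conformer is accepted or discarded purely by comparing that total with the energy threshold.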

Also, while -etbb defines the energy threshold for the backbone only, -etss defines the cutoff for the whole conformer (all atoms). Is this clear from the documentation? Do you think there should be another parameter for the energy of the sidechains alone? I don't know if that makes sense at all.

In my experience, -etbb 3000 is good for a ~150-residue protein. That is exactly what you have found.

Thanks so much Sean for putting forward so many experiments.
Cheers!

@SeanR22
Author

SeanR22 commented Jun 29, 2021

Thanks @joaomcteixeira

Yes, how the energy thresholds work is clear to me. It makes sense that there should be a separate energy threshold for building the backbone and then a total energy threshold at the end after sidechains are built.

I still don't understand how allowing the user to set the backbone threshold helps obtain a better result but we can discuss further. The comments made in my previous post are solely based on the observations from my runs and not the inner workings of the code. I'm obviously missing something here.

I do wonder if the -etbb value needs to have a dynamic quality to speed up the build process when the user sets it to a value that is too low. I think any user will expect that setting this threshold to a particular value should improve the ability to obtain structures below that value. This is why the opposite result was so unexpected for me!
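One way such a dynamic threshold could behave, as a toy sketch only. Nothing here exists in idpconfgen: `propose_energy` stands in for "build one candidate backbone and return its energy", and `patience`, `step`, and `max_cutoff` are hypothetical knobs.

```python
from typing import Callable, Optional, Tuple


def build_with_relaxing_cutoff(
    propose_energy: Callable[[], float],
    initial_cutoff: float,
    max_cutoff: float,
    patience: int = 50,
    step: float = 1.25,
    max_attempts: int = 10_000,
) -> Optional[Tuple[float, float, int]]:
    """Accept the first candidate at or below a cutoff that relaxes when stuck.

    Returns (energy, cutoff_in_effect, attempts_used), or None when no
    candidate is accepted within max_attempts.
    """
    cutoff = initial_cutoff
    failures = 0
    for attempt in range(1, max_attempts + 1):
        energy = propose_energy()
        if energy <= cutoff:
            return energy, cutoff, attempt
        failures += 1
        # After `patience` consecutive rejections, loosen the cutoff,
        # but never beyond max_cutoff.
        if failures >= patience and cutoff < max_cutoff:
            cutoff = min(cutoff * step, max_cutoff)
            failures = 0
    return None


# Deterministic stand-in: every candidate comes out at 1500 kcal/mol.
result = build_with_relaxing_cutoff(
    lambda: 1500.0,
    initial_cutoff=1000.0,
    max_cutoff=3000.0,
    patience=2,
    step=1.5,
)
print(result)  # (1500.0, 1500.0, 3)
```

The sketch returns the cutoff in effect at acceptance because a relaxing threshold can hand back conformers above the value the user asked for, so the final value would need to be reported.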

Cheers!

@joaomcteixeira
Member

I do wonder if the -etbb value needs to have a dynamic quality to speed up the build process when the user sets it to a value that is too low.

This is an interesting point, but I can't think of all the implications right away. Let me finalize some implementations I am working on, and then we will return to these discussions. Thanks so much @SeanR22 !!

@SeanR22
Author

SeanR22 commented Jun 29, 2021

Got it @joaomcteixeira - I'll leave you alone for a bit! ;)
