Questionable results #3

Open
cahuparo opened this issue Jan 3, 2023 · 1 comment


@cahuparo

cahuparo commented Jan 3, 2023

Hi there,

I really like this concept. However, I am trying to make sense of my results. After plotting PC1 and PC2 for the Ceratocystis fimbriata proteome, I observed clustering with very unexpected taxa and a prediction of SYMBIONT. This is concerning, but I am happy to try something else.

I used both v7 and v10 databases and got the same result.

Any suggestion would be appreciated!

Best,

Camilo

[Attached image: Screen Shot 2023-01-03 at 8 46 20 AM]

@darcyabjones
Member

Hi Camilo,

Sorry for the late reply, I've been away for a while.
James would be the best person to talk to about interpreting the results etc.; the concept and model design were theirs.
Send them an email ([email protected]), as they don't really use GitHub.

As a quick response, yes, I agree that it's odd for this to be clustering with the biotrophs.
From a technical point of view, the only thing that could artificially bias the results is a particularly large genome/proteome, as the CAZyme counts weren't scaled before the PCA and many of the biotrophs used have bigger genomes. That's assuming that I haven't made a mistake somewhere on the software side, but I think it's all ironed out now. Otherwise the model really just summarises data, so what you're seeing is a genuine similarity to the biotrophs in terms of CAZyme content.
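
To check whether proteome size alone could be driving the placement, one option is to compare counts per 1,000 proteins instead of raw counts. The snippet below is only a rough sketch of that idea and is not part of the tool: the function name `scale_counts`, the family names, and all numbers are made up for illustration.

```python
def scale_counts(counts, n_proteins):
    """Convert raw CAZyme family counts to counts per 1,000 proteins.

    counts     -- dict mapping CAZyme family name to raw count
    n_proteins -- total number of proteins in the proteome
    """
    if n_proteins <= 0:
        raise ValueError("n_proteins must be positive")
    return {family: 1000.0 * n / n_proteins for family, n in counts.items()}


# Hypothetical proteomes: the second has twice as many proteins and
# twice the raw CAZyme counts, so the raw profiles differ.
small = {"GH5": 10, "AA9": 4}
large = {"GH5": 20, "AA9": 8}

scaled_small = scale_counts(small, n_proteins=8000)
scaled_large = scale_counts(large, n_proteins=16000)

# After scaling, the two profiles are identical, which means the raw
# difference between them was purely a size effect.
print(scaled_small)
print(scaled_large)
```

If the scaled profile of your proteome still resembles the biotrophs, size is unlikely to be the explanation.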

My initial thought is that there may be an expansion (or depletion) of some CAZyme families more commonly associated with biotrophy. I'd look at the PCA loadings and the CAZyme counts of those species and yours to see what is causing them to be placed close together.
James should be able to help you do this (I don't work for them anymore).
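
As a rough illustration of that comparison: a PC score is a weighted sum of the centred counts, so the difference between two species on one PC splits into one term per family, `loading * (count_a - count_b)`, and the centring cancels out. The sketch below assumes the PCA was run on centred but unscaled counts, as described above. The function name, family names, loadings, and counts are all hypothetical, and nothing here is the tool's own API.

```python
def pc_difference_contributions(loadings, counts_a, counts_b):
    """Rank CAZyme families by their contribution to (score_a - score_b)
    on a single principal component.

    loadings -- dict mapping family name to its loading on that PC
    counts_a -- dict of raw family counts for species A
    counts_b -- dict of raw family counts for species B
    Families missing from a species are treated as a count of 0.
    """
    contributions = {
        family: weight * (counts_a.get(family, 0) - counts_b.get(family, 0))
        for family, weight in loadings.items()
    }
    # Largest absolute contribution first.
    return sorted(contributions.items(), key=lambda kv: abs(kv[1]), reverse=True)


# Hypothetical PC1 loadings and counts for two species.
pc1_loadings = {"GH5": 0.5, "AA9": -0.25, "CE1": 0.0}
my_species = {"GH5": 12, "AA9": 4, "CE1": 7}
reference = {"GH5": 10, "AA9": 12, "CE1": 3}

ranked = pc_difference_contributions(pc1_loadings, my_species, reference)
for family, contribution in ranked:
    print(family, contribution)
```

Running this against a species you expected to cluster with shows which families pull your proteome away from it. Running it against the neighbouring biotrophs shows where the profiles agree.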

All the best,
Darcy
