Per gene copy number signal. #128

Closed
sohrabsa opened this issue Mar 26, 2024 · 5 comments · Fixed by #132
Labels
enhancement New feature or request

Comments

@sohrabsa

Description of feature

I've noticed that, after running cnv.tl.infercnv, the shape of adata.obsm['X_cnv'] sometimes does not match the shape of adata.X.
Would it be possible to get gene-specific copy number values, or otherwise a mapping from genes to the segments that make up the columns of X_cnv?

@sohrabsa sohrabsa added the enhancement New feature or request label Mar 26, 2024
@grst
Member

grst commented Mar 28, 2024

Hi, this is currently not possible. Since infercnvpy aggregates the expression of several genes in a sliding window, the information about individual genes is lost.

In principle, it would be possible to store a mapping of which "bins" a given gene falls into. There's a PR open for that (#58), but it has gone stale.
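
As a rough illustration of what such a mapping could look like, here is a minimal sketch. It assumes windows of `window_size` consecutive genes that advance by `step` genes and never run past the last gene. The function name `gene_to_windows` is hypothetical and not part of infercnvpy, and the real binning (which works per chromosome) may differ.

```python
def gene_to_windows(n_genes, window_size, step=1):
    """Map each gene index to the indices of the windows that contain it.

    Hypothetical helper: window k is assumed to cover genes
    [k * step, k * step + window_size).
    """
    mapping = {g: [] for g in range(n_genes)}
    starts = range(0, n_genes - window_size + 1, step)
    for k, start in enumerate(starts):
        for g in range(start, start + window_size):
            mapping[g].append(k)
    return mapping


mapping = gene_to_windows(n_genes=6, window_size=3, step=1)
for gene, windows in mapping.items():
    print(gene, windows)
```

With these toy settings the genes in the middle of the array fall into three windows each, while the genes at either end fall into only one.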

@sohrabsa
Author

Okay, got it, thanks. Could you please confirm that the columns of X_cnv are bins of size window_size?

@grst
Member

grst commented Apr 1, 2024

That's correct, but note that the bins overlap, i.e. each gene occurs in multiple bins.
It's described here in more detail: https://infercnvpy.readthedocs.io/en/latest/infercnv.html
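
To see why overlapping bins give a different number of columns than genes, here is a toy sketch. It uses a plain (unweighted) mean over each window and keeps every `step`-th window; both are simplifying assumptions made for illustration, and infercnvpy's actual weighting and per-chromosome handling are described in the linked documentation.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

# Toy expression matrix: 2 cells x 10 genes (values are arbitrary).
x = np.arange(20, dtype=float).reshape(2, 10)

window_size = 4  # genes per window (assumed)
step = 2         # keep every 2nd window (assumed)

# Shape (n_cells, n_genes - window_size + 1, window_size): one slice per window.
windows = sliding_window_view(x, window_shape=window_size, axis=1)

# Average within each window, then subsample the windows.
smoothed = windows.mean(axis=-1)[:, ::step]

print(x.shape)         # genes as columns
print(smoothed.shape)  # windows as columns
```

Here the 10 gene columns collapse to 4 window columns, and consecutive windows share two genes.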

@knadia07

Hi, is there a way to pull out the groups of genes within selected chromosome coordinates that have high CNV values? We want to know which gene sets are most likely to be altered.

@grantn5
Contributor

grantn5 commented Jun 11, 2024

Hi, I am working on a PR, building on #58, that will allow you to extract the per-gene copy number by averaging across all the bins a given gene appears in. I will hopefully have something ready in the next two weeks!
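
The averaging idea can be sketched as follows. This is not the code from the PR. It assumes window k covers genes `[k * step, k * step + window_size)`, and the function name `per_gene_from_windows` is hypothetical.

```python
import numpy as np


def per_gene_from_windows(window_values, n_genes, window_size, step=1):
    """Average window-level values back onto genes (hypothetical helper).

    window_values: array of shape (n_cells, n_windows).
    Returns an array of shape (n_cells, n_genes); genes not covered by
    any window come out as NaN.
    """
    n_cells, n_windows = window_values.shape
    gene_sum = np.zeros((n_cells, n_genes))
    gene_count = np.zeros(n_genes)
    for k in range(n_windows):
        start = k * step
        stop = min(start + window_size, n_genes)
        # Add this window's value to every gene it covers.
        gene_sum[:, start:stop] += window_values[:, [k]]
        gene_count[start:stop] += 1
    with np.errstate(invalid="ignore", divide="ignore"):
        return gene_sum / gene_count


# One cell, two overlapping windows of three genes over four genes.
window_values = np.array([[1.0, 3.0]])
per_gene = per_gene_from_windows(window_values, n_genes=4, window_size=3)
print(per_gene)
```

Genes 1 and 2 sit in both windows and get the mean of the two values, while genes 0 and 3 each inherit the value of the single window that covers them.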

Additionally, while refactoring I noticed that the current implementation uses np.convolve with its default mode, full. This means the pyramid array is slid over the entire gene array, including the edges where the two only partially overlap, which does not match the documentation for the function. I have therefore changed the mode to valid, which does match what the documentation suggests. Is this the correct interpretation?

Edit: PR is here: #132
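
The difference between the two modes can be checked on a toy array. The sketch below assumes a symmetric, normalised pyramid kernel of length `window_size`; the exact kernel used in the library may be built differently.

```python
import numpy as np

window_size = 5
r = np.arange(1, window_size + 1)
# Symmetric pyramid weights [1, 2, 3, 2, 1], normalised to sum to 1 (assumed kernel).
pyramid = np.minimum(r, r[::-1]).astype(float)
pyramid /= pyramid.sum()

x = np.arange(10, dtype=float)  # stand-in for one cell's gene values

full = np.convolve(x, pyramid, mode="full")    # includes partial overlaps at the edges
valid = np.convolve(x, pyramid, mode="valid")  # only positions with complete overlap

print(len(x), len(full), len(valid))
```

For an input of length N and a kernel of length M, full returns N + M - 1 values and valid returns N - M + 1, so here the lengths are 14 and 6. The valid output equals the central part of the full output, `full[M - 1 : N]`.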

4 participants