Add a new scRNA workflow for standard analysis using Scanpy #556
The Barcodes file is missing.
keys: "uns/rank_genes_groups"
pl_umap_marker_genes:
  path: test-data/pl_umap_marker_genes.png
  compare: sim_size
There are better asserts available for images nowadays. Maybe you can add them as well.
Also, if you only compare sim_size, you don't need to have the files in the repo, do you?
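To make the distinction concrete: a size-only comparison accepts any file whose byte count falls within a tolerance, whereas an image-aware assertion inspects properties of the image itself. The following Python sketch illustrates the idea with a PNG's width and height. It is not Galaxy's implementation, and the function names and tolerance values are hypothetical.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def size_is_similar(actual: bytes, expected_size: int, delta: int) -> bool:
    """Size-only check: passes when the byte counts differ by at most delta."""
    return abs(len(actual) - expected_size) <= delta


def png_dimensions(data: bytes) -> tuple[int, int]:
    """Read (width, height) from the IHDR chunk that opens every PNG file."""
    if data[:8] != PNG_SIGNATURE or data[12:16] != b"IHDR":
        raise ValueError("not a PNG file")
    return struct.unpack(">II", data[16:24])


def _chunk(kind: bytes, payload: bytes) -> bytes:
    """Build one PNG chunk: length, type, payload, CRC."""
    crc = zlib.crc32(kind + payload) & 0xFFFFFFFF
    return struct.pack(">I", len(payload)) + kind + payload + struct.pack(">I", crc)


def make_gray_png(width: int, height: int) -> bytes:
    """Create a tiny all-black 8-bit grayscale PNG, for demonstration only."""
    ihdr = struct.pack(">IIBBBBB", width, height, 8, 0, 0, 0, 0)
    rows = b"".join(b"\x00" + b"\x00" * width for _ in range(height))
    return (
        PNG_SIGNATURE
        + _chunk(b"IHDR", ihdr)
        + _chunk(b"IDAT", zlib.compress(rows))
        + _chunk(b"IEND", b"")
    )


reference = make_gray_png(4, 3)
not_an_image = b"x" * len(reference)

# The size-only check cannot tell the two apart ...
assert size_is_similar(reference, len(reference), delta=10)
assert size_is_similar(not_an_image, len(reference), delta=10)

# ... while the dimension check only succeeds for the real image.
assert png_dimensions(reference) == (4, 3)
```

A byte string of the right length that is not an image passes `size_is_similar` but makes `png_dimensions` raise `ValueError`, which is the kind of regression an image-aware assertion is meant to catch.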
@lldelisle can you please review this?
Looks great. Thanks.
  orcid: 0000-0002-2799-424X
- name: Mehmet Tekman
  orcid: 0000-0002-4181-2676
- name: "B\xE9r\xE9nice Batut"
Suggested change:
-  - name: "B\xE9r\xE9nice Batut"
+  - name: "Bérénice Batut"
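The escaped and unescaped spellings denote the same string; the escapes only make the file harder to read. The YAML escape `\xE9` here and the JSON escape `\u00e9` that appears in the workflow file both stand for the character é. A minimal Python sketch of this, using JSON serialization as the example (the dictionary is illustrative and not taken from the workflow file):

```python
import json

name = "Bérénice Batut"

# "\xe9" and "\u00e9" are both escapes for the single character "é" (U+00E9).
assert "B\xe9r\xe9nice Batut" == "B\u00e9r\u00e9nice Batut" == name

# By default json.dumps escapes every non-ASCII character ...
escaped = json.dumps({"name": name})
# ... while ensure_ascii=False writes the character itself.
readable = json.dumps({"name": name}, ensure_ascii=False)

# Both forms parse back to the same value.
assert json.loads(escaped) == json.loads(readable) == {"name": name}
```

When writing the readable form to disk from Python, the file should be opened with `encoding="utf-8"` so the character is stored consistently.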
## Inputs dataset

- The workflow needs 4 files as input
- A singl-cell count matrix file in Matrix Market Exchange format
Suggested change:
-  - A singl-cell count matrix file in Matrix Market Exchange format
+  - A single-cell count matrix file in Matrix Market Exchange format
- A singl-cell count matrix file in Matrix Market Exchange format
- A cell barcodes file with a single barcode in each line. The barcodes should correspond to the cells in the matrix file
- A genes/feature tabular file with gene ids and gene symbols
I think you forgot to describe the fourth file.
As well as the 3 input values
Good catch. But maybe we do not need a parameters file because @mvdbeek suggested using individual parameters instead of a file. We will have only 3 input files.
Yes indeed, but it would be good to describe the input values in the README.
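As an illustration of the consistency the README asks for between the three input files, here is a minimal Python sketch that compares the matrix dimensions with the number of barcodes and genes. It assumes the matrix stores genes as rows and cells as columns (as in Cell Ranger output) and that neither text file has a header row. The function names are hypothetical and not part of the workflow.

```python
import io
from typing import TextIO


def matrix_shape(mtx: TextIO) -> tuple[int, int]:
    """Return (rows, columns) from the size line of a Matrix Market file."""
    for line in mtx:
        if line.startswith("%") or not line.strip():
            continue  # skip the banner, comments, and blank lines
        rows, cols, _nonzeros = line.split()
        return int(rows), int(cols)
    raise ValueError("no size line found")


def count_lines(handle: TextIO) -> int:
    """Count non-empty lines (one barcode or one gene per line)."""
    return sum(1 for line in handle if line.strip())


def check_inputs(mtx: TextIO, barcodes: TextIO, genes: TextIO) -> None:
    """Raise ValueError if the files disagree (assumes genes x cells layout)."""
    n_genes, n_cells = matrix_shape(mtx)
    if count_lines(barcodes) != n_cells:
        raise ValueError("barcodes do not match the matrix columns")
    if count_lines(genes) != n_genes:
        raise ValueError("genes do not match the matrix rows")


# Toy inputs: 3 genes, 2 cells, 4 non-zero counts.
demo_mtx = io.StringIO(
    "%%MatrixMarket matrix coordinate integer general\n"
    "3 2 4\n1 1 5\n2 1 1\n3 2 7\n1 2 2\n"
)
demo_barcodes = io.StringIO("AAACATAC-1\nAAACATTG-1\n")
demo_genes = io.StringIO("ENSG01\tGeneA\nENSG02\tGeneB\nENSG03\tGeneC\n")

check_inputs(demo_mtx, demo_barcodes, demo_genes)  # passes silently
```

A check like this only verifies counts, not that the barcodes are listed in the same order as the matrix columns, which still has to be guaranteed by whatever produced the files.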
{
  "class": "Person",
  "identifier": "0000-0001-9852-1987",
  "name": "B\u00e9r\u00e9nice Batut"
Suggested change:
-  "name": "B\u00e9r\u00e9nice Batut"
+  "name": "Bérénice Batut"
@mvdbeek Have you/we written down the naming convention for workflows?
@@ -0,0 +1,17 @@
version: 1.2
workflows:
  - name: Standard-scRNA-seq-with-Scanpy
Suggested change:
-  - name: Standard-scRNA-seq-with-Scanpy
+  - name: main
Annotate louvain clusters with these cell types: CD4+ T, CD14+, B, CD8+ T, FCGR3A+,
NK, Dendritic, Megakaryocytes
outputs:
initial_anndata_general_info:
Can you make all of these human readable please, no underscores. They are part of the primary output, so it should look nice.
I am renaming all the outputs. Is that still a problem in that case?
Yes. I don't think you need to rename the history items (but I don't mind if you do); workflow outputs should primarily be explored from the invocation view, not the history.
"name": "Workflow Params"
}
],
"label": "Workflow Params",
This shouldn't be a file, but individual options.
"format-version": "0.1",
"license": "CC-BY-4.0",
"release": "0.1",
"name": "Standard scRNA-seq with Scanpy",
Suggested change:
-  "name": "Standard scRNA-seq with Scanpy",
+  "name": "scRNA-seq with Scanpy",
I will rename it to "Preprocessing and Clustering of single-cell RNA-seq data with Scanpy".
@@ -0,0 +1,3096 @@
{
  "a_galaxy_workflow": "true",
  "annotation": "Standard scRNA-seq workflow with Scanpy and Anndata. Based on the 3k PBMC clustering tutorial from Scanpy. Important workflow parameters can be read from a tabular file.",
Please don't make users write a parameter file; that's quite bad for UX, validation, etc.
I don't think we've agreed on anything yet. I would prefer to use |
@@ -0,0 +1,23 @@
# Standard scRNA-seq Workflow using Scanpy and Anndata
What is even "standard"? Is clustering the thing you do here?
Preprocessing and clustering. I will rename it accordingly.
Follows mostly the 3k PBMC clustering tutorial. It uses a workflow parameters file for some important parameters. All the plots automatically use the highly ranked genes.