Pipeline for Automated Read ANalysis Of iCLIP Data
PARANOiD is a versatile software pipeline for the fully automated analysis of iCLIP and iCLIP2 data. It contains all steps necessary for preprocessing and the determination of cross-link positions, as well as several optional steps to detect specific characteristics, e.g. defined distances between cross-link events or binding motifs. Cross-link sites are reported as WIG files that can be easily visualized, e.g. with IGV, for which a config file is provided. Additionally, results are offered as statistical plots for a quick overview and as standardized bioinformatics file formats or TSV files that can be used for further analyses.
nextflow PARANOiD.nf --reads \<reads.fastq\> --reference \<reference_sequence.fasta\> --barcodes \<barcodes.tsv\>
Reads generated by iCLIP experiments. Can be provided as one or more files. If more than one file is provided, glob patterns (wildcards) can be used within quotation marks.
Format: FASTQ
--reads reads_file.fastq
--reads "reads_{1,2}.fastq"
--reads "*.fastq"
File containing the reference to which the reads will be mapped.
Format: FASTA
--reference reference_file.fasta
Barcode sequences are used to assign reads to their experiment. The file is provided as a TSV file (tab-separated values).
The first column contains the experiment name and the second the nucleotide sequence representing the barcode of the experiment.
One experiment is described per line and the columns are separated by a tab.
Experiment names should follow this scheme:
<experiment_name>_rep_<replicate-number>
Example:
experiment1_rep_1 GCATTG
experiment1_rep_2 CAGTAA
experiment1_rep_3 GGCCTA
experiment2_rep_1 AATCCG
experiment2_rep_2 CCGTTA
experiment2_rep_3 GTCATT
--barcodes barcode_file.tsv
File containing annotations for the provided reference. Advised when working with splicing-capable organisms. Required for the RNA subtype analysis.
Formats: GFF, GTF
--annotation annotation_file.gff
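Optional parameters such as the annotation are simply appended to the basic command. A hypothetical invocation, reusing the example file names from above as placeholders, might look like this:
nextflow PARANOiD.nf --reads reads_file.fastq --reference reference_file.fasta --barcodes barcode_file.tsv --annotation annotation_file.gff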
This section lists all tools and the versions required by PARANOiD. However, Docker containers are provided for every step and it is highly recommended to use them instead. This way, only a Nextflow installation and the corresponding container software are required. Currently supported are Docker, Podman and Singularity.
These tools are essential to run PARANOiD:
Tool | Version |
---|---|
Nextflow | 23.04.1.5866 |
These tools are only required when running PARANOiD without the provided containers. However, running PARANOiD this way is not recommended.
Tool | Version | Note |
---|---|---|
FastQC | 0.11.9 | |
Cutadapt | 4.2 | |
Trim_Galore | 0.6.7 | |
fastx_toolkit | 0.0.14 | |
umi_tools | 1.1.4 | |
Python | 3.11 | |
Samtools | 1.16.1 | |
Bamtools | 2.5.2 | |
wigToBigWig | 2.9 | |
bigWigToBedGraph | | |
Bowtie2 | 2.5.1 | Only when using --domain pro or running the transcript analysis |
STAR | 2.7.10b | Only when using --domain eu |
subread | 2.0.3 | |
pureCLIP | 1.3.1 | Only when using peak calling |
multiqc | 1.13 | |
pysam | 0.19.1 | |
R | 4.0.3 | |
optparse | | R package |
wig | | R package |
reshape2 | | R package |
ggplot2 | | R package |
numpy | | Python package |
biopython | | Python package |
gff3sort.pl | 1.0.0 | Only when providing an annotation file
bgzip | 1.16 | Only when providing an annotation file |
tabix | 1.16 | Only when providing an annotation file |
meme | 5.4.1 | Only when using motif detection |
Containers offer an environment that is separated from your current working environment while providing all tools and dependencies in the correct versions necessary to execute a specific task. This greatly reduces the amount of software a user needs to install and lessens the potential for errors caused by installing wrong versions, contributing to the reproducibility of the results. Therefore, it is highly recommended to use containers when running PARANOiD. Currently supported are Docker, Podman and Singularity. They can be used by adding one of the following to the end of the command:
-profile docker
-profile podman
-profile singularity
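For example, a hypothetical full run using the Docker profile (again with the placeholder file names from above) appends the profile to the end of the basic command:
nextflow PARANOiD.nf --reads "*.fastq" --reference reference_file.fasta --barcodes barcode_file.tsv -profile docker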
For more detailed descriptions of input files, parameters and implemented analyses, please visit the Read the Docs manual.