-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathindex.qmd
212 lines (174 loc) · 12 KB
/
index.qmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
---
toc-depth: 2
---
![](img/banner-min.png)
<div style="text-align: justify">
On this website you can find documentation about software commonly used in bioinformatic data analyses as well as tutorials about various bioinformatic subjects. On this webpage you can find software organized by topic and for each topic you find a list of commonly used software tools.
If you are working at the University of Amsterdam (UvA) Institute for Biodiversity and Ecosystem Dynamics (IBED) and want to know more about what computational resources are available, please also visit the [computational support teams website](https://ibed.uva.nl/facilities/computational-facilities/ibed-computational-support-team/ibed-computational-support-team.html) and our website with more [computation resources](https://computational-resources-uva-ibed-cs-general-20079b9f20f96949e05.gitlab.io/)
Please, be aware that this page is a work in progress and will be slowly updated over time. If you want to add additional information or feel that something is missing feel free to send an email to [n.dombrowski\@uva.nl](mailto:[email protected]){.email}.
## Useful tutorials
The Carpentries teaches workshops around the world on the foundational skills to work effectively and reproducibly with data and code and they are an excellent resource to check out if you want to get started with bioinformatics
- [The software carpentries](https://software-carpentry.org/lessons/) provides tutorials on:
- Bash
- Git
- Python
- R
- [Data carpentries](https://datacarpentry.org/lessons/) provides domain-specific tutorials, such as for ecology or genomics
- [Library carpentries](https://librarycarpentry.org/lessons/) contain some useful tutorials if you want to transform data frames, map data to each other and work effectively with data
Next, to the carpentries you will find a list of tutorials for more specific topics below.
### Getting started with bash
- [A tutorial on using bash and an HPC](https://ndombrowski.github.io/cli_workshop/)
- [Version control with git](https://github.com/fkariminejadasl/ml-notebooks/blob/main/tutorial/git.md)
- [A tutorial on using AWK](https://ndombrowski.github.io/AWK_tutorial/), a command line tool for filtering tables, extracting patterns, etc... If you want to follow this tutorial then you can download the required input files from [here](https://github.com/ndombrowski/AWK_tutorial/tree/main/1_Inputfiles)
### Using R
- [An R cookbook](https://ndombrowski.github.io/R_cookbook/) including some [example files](https://github.com/ndombrowski/R_cookbook/tree/main/data) if you want to code along
- [Tutorial on data manipulation with dplyr](https://ndombrowski.github.io/Tidyverse_tutorial/)
- [Tutorial on data visualization with ggplot2](https://ndombrowski.github.io/Ggplot_tutorial/)
### Bioinformatic workflows
- [From sequence file to OTU table with Qiime](source/Qiime/3_evelyn_tutorial_notes.qmd)
- [Analysing an OTU table with R](source/Qiime/OTU_table_analysis.qmd)
- [Assembling a metagenome](https://ndombrowski.github.io/Assembly_tutorial/)
- [Metagenomic binning](https://ndombrowski.github.io/Binning_tutorial//)
- [Annotating microbial genomes](https://github.com/ndombrowski/Annotation_workflow)
- [How to do generate a species tree](https://ndombrowski.github.io/Phylogeny_tutorial/)
- [Accessing data from NCBI](source/core_tools/ncbi.qmd)
## Bioinformatic tools A-Z
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Bioinformatic Tools A-Z</title>
<style>
/* Basic styling for accordion */
.accordion {
background-color: #eee;
color: #444;
cursor: pointer;
padding: 3px 15px;
width: 100%;
border: none;
text-align: left;
outline: none;
font-size: 15px;
transition: background-color 0.4s;
}
.active, .accordion:hover {
background-color: #f4f4f4;
}
.panel {
padding: 0 40px;
background-color: white;
overflow: hidden;
display: block; /* Start with panels uncollapsed */
}
</style>
</head>
<body>
<button class="accordion active">A-D</button>
<div class="panel">
<ul>
<li><a href="source/metagenomics/atlas.qmd">ATLAS</a>: A metagenomic pipeline for QC, assembly binning and annotation</li>
<li><a href="source/metagenomics/augustus.qmd">Augustus</a>: A program that predicts genes in eukaryotic genomic sequences</li>
<li><a href="source/nanopore/autocycler.qmd">Autocycler</a>: A tool for generating consensus long-read assemblies for microbial genomes</li>
<li><a href="source/metagenomics/bakta.qmd">Bakta</a>: A tool for the rapid & standardized annotation of bacterial genomes and plasmids from both isolates and MAGs</li>
<li><a href="source/phylogenomics/bmge.qmd">BMGE</a>: A program to select regions in a multiple sequence alignment that are suited for phylogenetic inference</li>
<li><a href="source/core_tools/bowtie.qmd">Bowtie2</a>: A tool for aligning sequencing reads to genomes and other reference sequences</li>
<li><a href="source/core_tools/busco.qmd">BUSCO</a>: Quality assessment of (meta)genomes, transcriptomes and proteomes</li>
<li><a href="source/metagenomics/checkm2.qmd">CheckM2</a>: A tool to assess the quality of a genome assembly</li>
<li><a href="source/nanopore/chopper.qmd">Chopper</a>: A tool for quality filtering of long read data</li>
<li><a href="source/metagenomics/deeploc.qmd">DeepLoc</a>: A tool to predict the subcellular localization(s) of eukaryotic proteins</li>
<li><a href="source/core_tools/diamond.qmd">Diamond</a>: A sequence aligner for protein and translated DNA searches</li>
<li><a href="source/metagenomics/dnaapler.qmd">Dnaapler</a>: A tool to re-orient a genome, for example at dnaA</li>
<li><a href="source/metatranscriptomics/DSEq2.qmd">DeSeq2</a>: Analyse gene expression data in R</li>
</ul>
</div>
<button class="accordion active">E-H</button>
<div class="panel">
<ul>
<li><a href="source/metagenomics/fama_readme.qmd">FAMA</a>: A fast pipeline for functional and taxonomic analysis of metagenomic sequences</li>
<li><a href="source/metatranscriptomics/fastp.qmd">FastP</a>: A tool for fast all-in-one preprocessing of FastQ files</li>
<li><a href="source/metagenomics/fastqc_readme.qmd">FastQC</a>: A quality control tool for read sequencing data</li>
<li><a href="source/core_tools/featurecounts.qmd">FeatureCounts</a>: A read summarization program that counts mapped reads for genomic features</li>
<li><a href="source/nanopore/filtlong.qmd">Filtlong</a>: A tool for filtering long reads</li>
<li><a href="source/metagenomics/flye.qmd">Flye</a>: A de novo assembler for single-molecule sequencing reads</li>
<li><a href="source/metagenomics/gtdb.qmd">GTDB_tk</a>: A software toolkit for assigning objective taxonomic classifications to bacterial and archaeal genomes</li>
<li><a href="source/core_tools/hmmer.qmd">HMMER</a>: A tool for searching sequence databases for sequence homologs, and for making sequence alignments</li>
</ul>
</div>
<button class="accordion active">I-L</button>
<div class="panel">
<ul>
<li><a href="source/phylogenomics/iqtree2.qmd">IQ-TREE</a>: A tool for phylogenomic inferences</li>
<li><a href="source/metagenomics/interproscan_readme.qmd">Interproscan</a>: A tool to scan protein and nucleic sequences against InterPro signatures</li>
<li><a href="source/ITSx/itsx_readme.qmd">ITSx</a>: A tool to extract ITS1 and ITS2 subregions from ITS sequences</li>
<li><a href="source/classification/kraken2.qmd">Kraken2</a>: A taxonomic sequence classifier using kmers</li>
</ul>
</div>
<button class="accordion active">M-P</button>
<div class="panel">
<ul>
<li><a href="source/phylogenomics/mafft.qmd">Mafft</a>: A multiple sequence alignment program</li>
<li><a href="source/metagenomics/metabolic.qmd">METABOLIC</a>: A tool to predict functional trait profiles in genome datasets</li>
<li><a href="source/metagenomics/metacerberus.qmd">MetaCerberus</a>: A tool for functional assignment</li>
<li><a href="source/metagenomics/motus.qmd">MOTUS</a>: A tool to estimate microbial abundances in Illumina and Nanopore sequencing data</li>
<li><a href="source/classification/minimap2.qmd">Minimap2</a>: A program to align DNA or mRNA sequences against a reference database</li>
<li><a href="source/metagenomics/multiqc.qmd">MultiQC</a>: A program to summarize analysis reports</li>
<li><a href="source/nanopore/nanoclass.qmd">NanoClass2</a>: A taxonomic meta-classifier for long-read 16S/18S rRNA gene sequencing data</li>
<li><a href="source/nanopore/nanoITS.qmd">NanoITS</a>: A taxonomic meta-classifier for long-read ITS operon sequencing data</li>
<li><a href="source/nanopore/nanophase_how_to.qmd">Nanophase</a>: A pipeline to generate MAGs using Nanopore long and Illumina short reads from metagenomes</li>
<li><a href="source/nanopore/nanoplot_readme.qmd">NanoPlot</a>: Plotting tool for long read sequencing data</li>
<li><a href="source/nanopore/nanoqc_readme.qmd">NanoQC</a>: A quality control tool for long read sequencing data</li>
<li><a href="source/nanopore/ngspeciesid.qmd">NGSpeciesID</a>: A tool for clustering and consensus forming of long-read amplicon sequencing data </li>
<li><a href="source/nanopore/porechop_readme.qmd">Porechop</a>: A tool for finding and removing adapters from Nanopore reads</li>
<li><a href="source/core_tools/prokka.qmd">Prokka</a>: A tool to annotate bacterial, archaeal and viral genomes</li>
<li><a href="source/metagenomics/pseudofinder.qmd">Pseudofinder</a>: A tool that detects pseudogene candidates from annotated genbank files of bacterial and archaeal genomes</li>
<li><a href="source/metagenomics/pycirclize.qmd">pyCirclize</a>: A tool for circular visualization, i.e. genome plots, in python</li>
</ul>
</div>
<button class="accordion active">Q-T</button>
<div class="panel">
<ul>
<li><a href="source/metagenomics/quast.qmd">QUAST</a>: A Quality Assessment Tool for Genome Assemblies</li>
<li><a href="source/metatranscriptomics/ribodetector.qmd">Ribodetector</a>: Detect and remove rRNA sequences from metagenomic, metatranscriptomic, and ncRNA sequencing data</li>
<li><a href="source/core_tools/rsem.qmd">RSEM</a>: A software package for estimating gene and isoform expression levels from RNA-Seq data</li>
<li><a href="source/core_tools/samtools.qmd">Samtools</a>: A tool to manipulating alignments in SAM/BAM format</li>
<li><a href="source/core_tools/seqkit.qmd">SeqKit</a>: A tool for FASTA/Q file manipulation</li>
<li><a href="source/metagenomics/signalp6.qmd">SignalP6</a>: A tool to predict the presence of signal peptides</li>
<li><a href="source/metatranscriptomics/sortmerna.qmd">SortMerNa</a>: A tool to filter ribosomal RNAs in metatranscriptomic data</li>
<li><a href="source/metatranscriptomics/star.qmd">Star</a>: An ultrafast universal RNA-seq aligner</li>
<li><a href="source/metagenomics/trycycler.qmd">Trycyler</a>: A tool for generating consensus long-read assemblies for bacterial genomes</li>
<li><a href="source/metatranscriptomics/trinity.qmd">Trinity</a>: A tool to assemble transcript sequences from Illumina RNA-Seq data</li>
</ul>
</div>
<button class="accordion active">U-Z</button>
<div class="panel">
<ul>
<!-- Add tools starting with U-Z if available -->
</ul>
</div>
<script>
var acc = document.getElementsByClassName("accordion");
var i;
for (i = 0; i < acc.length; i++) {
acc[i].addEventListener("click", function() {
this.classList.toggle("active");
var panel = this.nextElementSibling;
if (panel.style.display === "block") {
panel.style.display = "none";
} else {
panel.style.display = "block";
}
});
}
</script>
</body>
</html>
## Bioinformatic toolbox
- [For-loops-in-bash](source/core_tools/bash-for-loops.qmd): How can I scale my research and run software not only on one but multiple files?
## Useful databases A-Z
- [COG](source/core_tools/cog.qmd)
- [KOfam](source/core_tools/kegg_db.qmd)
- [NCBI-nr](source/core_tools/ncbi_nr.qmd)
- [Pfam](source/core_tools/pfam_db.qmd)
- [PGAP database](source/core_tools/pgap_db.qmd)
- [TIGRFAM](source/core_tools/tigrfam_db.qmd)