The tutorials include how to retrieve some of the more important results for each case: comparing relationships among genomes, finding clades, and evaluating genomes for completeness and contamination. To retrieve more detailed information it is helpful to understand MiGA's directory structure. The following sub-directories are included for each project under the project directory:
.
├── daemon
│ └── daemon.json
├── data
│ ├── 01.raw_reads
│ ├── 02.trimmed_reads
│ ├── 03.read_quality
│ ├── 04.trimmed_fasta
│ ├── 05.assembly
│ ├── 06.cds
│ ├── 07.annotation
│ │ ├── 01.function
│ │ │ ├── 01.essential
│ │ │ └── 02.ssu
│ │ ├── 02.taxonomy
│ │ │ └── 01.mytaxa
│ │ └── 03.qa
│ │ ├── 01.checkm
│ │ └── 02.mytaxa_scan
│ ├── 08.mapping
│ │ ├── 01.read-ctg
│ │ └── 02.read-gene
│ ├── 09.distances
│ │ ├── 01.haai
│ │ ├── 02.aai
│ │ ├── 03.ani
│ │ ├── 04.ssu
│ │ └── 05.taxonomy
│ ├── 10.clades
│ │ ├── 01.find
│ │ ├── 02.ani
│ │ ├── 03.ogs
│ │ ├── 04.phylogeny
│ │ │ ├── 01.essential
│ │ │ └── 02.core
│ │ └── 05.metadata
│ └── 90.stats
├── metadata
└── miga.project.json
The sub-directories under data and beginning with 01 thorugh 10 correspond to MiGA's sequential steps in processing data beginning with raw paired fastq files loaded into 01.raw_reads. Not all steps need to be done. In most of the tutorials, we submitted assembled genomes in fasta format; they were loaded into 05.assembly and processing began there. In the genome assembly exercise, we began by loading already trimmed and quality filtered data into 04.trimmed. The genome projects ended with step 09.distances. Step 10.clades was performed only for the clade project. Each of these sub-directories beginning with 01 through 10 contain results for the associated data processing step.
Some of the results are in text format (sometimes compressed), e.g. tables, sequence files, gff files, nwk files. Others are in pdf format or Rdata format. The information in each file is explained in the MiGA workflow section of the MiGA manual. If you are using MiGA on a cluster, you can download the files individually using FileZilla or a similar program.
For projects including classification, Fang Yuan has written a script to summarize the results. For each classified genome it determines the closest reference genome, the distances to it, and writes the results to the file summary.csv. To use this script, move into the project/data/09.distance/05.taxonomy directory and run the following commands:
wget https://github.com/jfq3/Miscellaneous-scripts/raw/master/miga_sumdb.sh
chmod 750 miga_sumdb.sh
./miga_sumdb.sh
For the miscellaneous genomes project in these tutorials, this produces:
name | closest | haai | aai | ani |
---|---|---|---|---|
Acidobacterium_capsulatum | Silvibacterium_bohemicum_GCA_001006305 | 99.9508952380952 | 60.0831465538606 | |
Acinetobacter_baumanii | Pseudomonas_pharmafabricae_GCA_002835605 | 98.7412788461538 | 48.5834215277602 | |
Bacillus_anthracis | Bacillus_flexus_NBRC_15715_GCA_001591565 | 98.9199428571429 | 59.5616830477613 | |
Bacillus_cereus | Bacillus_flexus_NBRC_15715_GCA_001591565 | 98.9702788461539 | 59.9991400823551 | |
Bifidobacterium_bifidum | Bifidobacterium_scardovii_JCM_12489___DSM_13734_GCA_000770985 | 99.7717980769231 | 68.9492871800872 | |
Campylobacter_jejuni | Campylobacter_helveticus_GCA_900176295 | 99.1678666666667 | 66.2667775679713 | |
Cytophaga_hutchinsonii | Nafulsella_turpanensis_ZLM_10_GCA_000346615 | 99.9811470588235 | 50.1443577165562 | |
Gemmatimonas_aurantiaca | Gemmatimonas_phototrophica_NZ_CP011454 | 99.9633431372549 | 68.505494616812 | |
Lacunisphaera_limnophila | Oleiharenicola_lentus_GCA_004118375 | 99.9875247524753 | 68.0251176025193 |
John Quensen has written a python script to summarize the completeness, contamination, and quality of each genome in a project and write it to a tab-delimited file that my be viewed as is or loaded into a spread-sheet program like Excel. It takes two arguments: the path to the MiGA project and the name of the output file. Download and run the script by entering the following commands:
wget https://github.com/jfq3/Miscellaneous-scripts/raw/master/miga_completeness.py
chmod u+x
python ./miga-completeness.py /path/to/miga/project completness_summary.txt
For the miscellaneous genomes project in these tutorials, this produces:
Completeness Contamination Quality Genome
94.6 0.9 90.1 P_putida
94.6 0.9 90.1 P_stutzeri
95.5 0.9 91.0 P_alcaligenes
94.6 0.9 90.1 P_syringae
93.7 0.9 89.2 P_fluorescens
94.6 0.9 90.1 P_mendocina
MiGA was originally provided as a web-based program, and the results were delivered via a web browser. You can view MiGA results generated on a cluster in the same manner if you first install the Docker version of MiGA on your computer. Steps are as follows:
-
If you have not already done so, install the Docker Version appropriate to your system (Windows, Mac OS or Linux).
-
If you have not already done so, install MiGA-Web:
-
Open a terminal and enter the following, one line at a time:
docker pull miga/miga:1.0.2.0 docker run -p 9090:3000 -it -v C:/miga:/root/miga-data -v db_volume:/miga/db --name miga miga/miga:1.0.2.0 /bin/bash cd miga/ export SECRET_KEY_BASE=`bundle exec rake secret` bundle exec rails server -e production -b 0.0.0.0 -p 3000 Puma
The docker run line creates the MiGA project directory as C:/miga on your computer. You may use a different directory name if you wish.
-
Enter control C and close the terminal.
-
Open a new terminal and enter:
docker stop miga
-
Close the terminal.
-
-
Compress the results on the cluster:
-
Using tar:
cd project_directory tar czf project_name.tar.gz .
-
Using miga archive. With MiGA running on the cluster:
cd project_directory miga archive -o project_name.tar.gz -P .
The miga archive option has the advantage that unnecessary files are not included in the archive. Thus the archive file is smaller and the MiGA results take up less drive space on your computer.
-
-
Download the compressed file to the project directory on your computer and decompress it. The installation commands above created the project directory as C:/miga-web.
tar xzf project_name.tar.gz
-
Start Miga on your computer.
-
Start Docker desktop.
-
Open a terminal and enter, one line at a time:
docker start miga docker exec -it miga /bin/bash cd miga/ export SECRET_KEY_BASE=`bundle exec rake secret` bundle exec rails server -e production -b 0.0.0.0 -p 3000 Puma
-
Leave the terminal running until you are finished browsing the results (see below).
-
Open a web browser, enter "localhost:9090" as the URL and log into MiGA.
-
-
Link the project to MiGA.
- Go to the Admin console page.
- Click on "Link existing projects"
- Under the name of the project, click either "Link publicly" or "Link privately." The choice only affects where the project will be listed.
-
Open the project and browse the results. Get explanations for each item; the "Learn more" links take you directly to the appropriate section of the MiGA manual. The download links copy the item to your downloads directory.
-
After you have finished, end MiGA as above after you installed it:
-
Enter control C into the terminal and close it.
-
Open a new terminal and enter:
docker stop miga
-
Close the terminal.
-