0.7.0
Pre-release
massie released this 13 Mar 21:09
ADAM IS BETA SOFTWARE AND SHOULD NOT BE USED IN PRODUCTION ENVIRONMENTS.
We are working very hard toward a 1.0 release that is production ready.
This release includes a new implementation of BQSR (base quality score recalibration) that is markedly faster and more robust. In performance tests, we ran BQSR on a 238GB whole genome at 60x coverage in just 45 minutes using 100 EC2 instances (m2.4xlarge), with GATK concordance >99%.
Maven Dependencies
<dependency>
  <groupId>edu.berkeley.cs.amplab.adam</groupId>
  <artifactId>adam-core</artifactId>
  <version>0.7.0</version>
</dependency>
<dependency>
  <groupId>edu.berkeley.cs.amplab.adam</groupId>
  <artifactId>adam-core</artifactId>
  <version>0.7.0</version>
  <type>test-jar</type>
  <scope>test</scope>
</dependency>
NEW FEATURES
- Added ability to load and merge multiple ADAM files into a single RDD.
- Pairwise, quantitative ADAM file comparisons: the CompareAdam command has been extended to calculate metrics on pairs of ADAM files which contain the same reads processed in two different ways (e.g. two different implementations of a pre-processing pipeline). This can be used to compare different pipelines based on their read-by-read concordance across a number of fields: position, alignment, mapping and base quality scores, and can be extended to support new metrics or aggregations.
- Added FASTA import, and RDD convenience functions for remapping contig IDs. This allows for reference sequences to be imported into an efficient record where bases are stored as a list of enums. Additionally, convenience values are calculated. This feature was introduced in PR #79 and is a breaking change.
- Added helper functions for properly generating VCF headers for VCF export. This streamlines the process of converting ADAM variant calls to the legacy VCF format. This was added in PR #85.
- Added functions to ADAMVariantContext that allow a variant context to be built directly from genotypes. Previously, this operation could only be done at the RDD level. This was introduced in PR #88.
- Added API functions and CLI tools for merging multiple ADAM files. This code performs a smart merge and ensures that there are no collisions between reference IDs or read group IDs. These features were added in PR #73.
- Added the ADAMRod model and the Reads2Rods transformation; this is a pileup generation function that takes better advantage of locality for data that is already sorted. This was introduced in PR #36.
- ISSUE 101: Added the ability to call, from the command line, plugins that are not defined in the main ADAM jar but are included in the classpath.
- ISSUE 83: Added the ability to perform a "region join" on RDDs of ADAMRecords.
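
To illustrate the idea behind merging files without reference ID collisions, here is a minimal sketch in plain Python. It is not ADAM's implementation or API: the `merge_datasets` function and the in-memory dataset layout (a name-to-ID table plus a list of records) are hypothetical, and the notes above do not describe how ADAM actually resolves collisions. The sketch assumes that a reference is identified by its name, so the same name gets one merged ID even when its local ID differs between inputs.

```python
from typing import Dict, List, Tuple

# Hypothetical stand-in for one input file:
# (reference name -> local ID, list of (read name, local reference ID)).
Dataset = Tuple[Dict[str, int], List[Tuple[str, int]]]


def merge_datasets(datasets: List[Dataset]) -> Dataset:
    """Merge datasets, reassigning reference IDs so that no two names share an ID."""
    merged_ids: Dict[str, int] = {}
    merged_records: List[Tuple[str, int]] = []
    for name_to_id, records in datasets:
        # Translate this dataset's local IDs into IDs in the merged table.
        local_to_merged: Dict[int, int] = {}
        for name, local_id in name_to_id.items():
            if name not in merged_ids:
                merged_ids[name] = len(merged_ids)  # next unused ID
            local_to_merged[local_id] = merged_ids[name]
        for read_name, local_id in records:
            merged_records.append((read_name, local_to_merged[local_id]))
    return merged_ids, merged_records


# Local ID 0 means chr1 in the first input but chr2 in the second.
first = ({"chr1": 0, "chr2": 1}, [("read1", 0), ("read2", 1)])
second = ({"chr2": 0, "chr3": 1}, [("read3", 0), ("read4", 1)])
ids, records = merge_datasets([first, second])
print(ids)      # {'chr1': 0, 'chr2': 1, 'chr3': 2}
print(records)  # [('read1', 0), ('read2', 1), ('read3', 1), ('read4', 2)]
```

In this sketch `read3` ends up on merged ID 1 (chr2) rather than 0, which is the collision a naive concatenation would introduce. Read group IDs could be remapped the same way.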
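
As a rough illustration of what a "region join" computes, the following plain-Python sketch pairs every region on one side with every overlapping region on the other. It is an assumption-laden toy, not ADAM's API: the `Region` type, the `region_join` function, and the half-open `[start, end)` coordinate convention are all hypothetical, and the real feature operates on Spark RDDs of ADAMRecords rather than in-memory lists.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, NamedTuple, Tuple


class Region(NamedTuple):
    """A half-open interval [start, end) on a named contig (assumed convention)."""
    contig: str
    start: int
    end: int


def overlaps(a: Region, b: Region) -> bool:
    """True when both regions are on the same contig and share at least one base."""
    return a.contig == b.contig and a.start < b.end and b.start < a.end


def region_join(left: Iterable[Region],
                right: Iterable[Region]) -> Iterator[Tuple[Region, Region]]:
    """Yield every (left, right) pair whose regions overlap."""
    # Bucket the right side by contig so each left region is only
    # compared against candidates on its own contig.
    by_contig: Dict[str, List[Region]] = defaultdict(list)
    for r in right:
        by_contig[r.contig].append(r)
    for l in left:
        for r in by_contig.get(l.contig, ()):
            if overlaps(l, r):
                yield (l, r)


reads = [Region("chr1", 100, 150), Region("chr1", 400, 450), Region("chr2", 100, 150)]
targets = [Region("chr1", 120, 200), Region("chr2", 500, 600)]
pairs = list(region_join(reads, targets))
print(pairs)
```

Here only the first read overlaps a target, so `pairs` holds a single tuple. The per-contig scan is quadratic in the worst case; it is kept simple for clarity and is not meant to reflect how a distributed join would be partitioned.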
Read the full CHANGES.txt file
The adam jar file below is a self-executing jar file that is built to run with Spark 0.9.0 and Hadoop 2.2.0.