
0.7.0

Pre-release
@massie massie released this 13 Mar 21:09

ADAM IS BETA SOFTWARE AND SHOULD NOT BE USED IN PRODUCTION ENVIRONMENTS.
We are working very hard toward a 1.0 release that is production ready.

This release includes a new implementation of BQSR (base quality score recalibration) that is markedly faster and more robust. In performance tests, we ran BQSR on a 238 GB whole genome at 60x coverage in just 45 minutes using 100 EC2 instances (m2.4xlarge), with GATK concordance above 99%.

Maven Dependencies

<dependency>
   <groupId>edu.berkeley.cs.amplab.adam</groupId>
   <artifactId>adam-core</artifactId>
   <version>0.7.0</version>
</dependency>
<dependency>
   <groupId>edu.berkeley.cs.amplab.adam</groupId>
   <artifactId>adam-core</artifactId>
   <version>0.7.0</version>
   <type>test-jar</type>
   <scope>test</scope>
</dependency>

NEW FEATURES

  • Added ability to load and merge multiple ADAM files into a single RDD.
  • Pairwise, quantitative ADAM file comparisons: the CompareAdam command has been extended to calculate metrics on pairs of ADAM files which contain the same reads processed in two different ways (e.g. two different implementations of a pre-processing pipeline). This can be used to compare different pipelines based on their read-by-read concordance across a number of fields: position, alignment, mapping and base quality scores, and can be extended to support new metrics or aggregations.
  • Added FASTA import, and RDD convenience functions for remapping contig IDs. This allows for reference sequences to be imported into an efficient record where bases are stored as a list of enums. Additionally, convenience values are calculated. This feature was introduced in PR #79 and is a breaking change.
  • Added helper functions for properly generating VCF headers for VCF export. This streamlines the process of converting ADAM Variant Calls to the legacy VCF format. This was added in PR #85.
  • Added functions to the ADAMVariantContext which allow a variant context to be built directly from genotypes. Previously, this operation could only be done at the RDD level. This was introduced in PR #88.
  • Added API functions and CLI tools for merging multiple ADAM files. This code performs a smart merge and ensures that there are no collisions between reference IDs or read group IDs. These features were added in PR #73.
  • Added ADAMRod model and Reads2Rods transformation; this is a pileup generation function that better takes advantage of locality for data that is already sorted. This was introduced in PR #36.
  • ISSUE 101: Added the ability to call, from the command line, plugins that are not defined in the main ADAM jar but are included on the classpath.
  • ISSUE 83: Added the ability to perform a "region join" on RDDs of ADAMRecords.
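
For readers unfamiliar with the term, a "region join" pairs records from two datasets whenever their genomic regions overlap. The snippet below is a minimal, in-memory sketch of that idea in plain Python. It is not ADAM's implementation and does not use the ADAM or Spark APIs; the names Region, overlaps and region_join are hypothetical, and half-open coordinates (start inclusive, end exclusive) are an assumption made for illustration.

```python
from collections import defaultdict
from typing import Iterable, Iterator, NamedTuple, Tuple


class Region(NamedTuple):
    """Hypothetical genomic interval; start is inclusive, end is exclusive."""
    contig: str
    start: int
    end: int


def overlaps(a: Region, b: Region) -> bool:
    """Two half-open intervals overlap if they share a contig and at least one base."""
    return a.contig == b.contig and a.start < b.end and b.start < a.end


def region_join(left: Iterable[Region],
                right: Iterable[Region]) -> Iterator[Tuple[Region, Region]]:
    """Yield every (left, right) pair whose regions overlap.

    The right side is grouped by contig first, so each left region is only
    compared against right regions on the same contig.
    """
    by_contig = defaultdict(list)
    for r in right:
        by_contig[r.contig].append(r)
    for l in left:
        for r in by_contig.get(l.contig, ()):
            if overlaps(l, r):
                yield (l, r)
```

With reads at chr1:100-150, chr1:300-350 and chr2:100-150 joined against targets at chr1:140-200 and chr2:150-400, only the first read pairs with the first target; the chr2 read ends exactly where the chr2 target begins, so under the half-open assumption they do not overlap. A distributed version over RDDs would need a partitioning strategy that this sketch does not attempt to show.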

Read the full CHANGES.txt file

The adam jar file below is a self-executing jar file that is built to run with Spark 0.9.0 and Hadoop 2.2.0.