Working with BioRad data

Caleb Lareau edited this page Jul 28, 2021 · 4 revisions

Debarcoding BioRad dscATAC and dsciATAC data

Update July 28, 2021

Thanks to the nice engineering from @ghuls (originally discussed in the issues), a more parsimonious and efficient way to pre-process fastq files for bap is available here: https://github.com/aertslab/single_cell_toolkit.

Method for reproducing Lareau*, Duarte*, Chew*, et al. 2019, Nature Biotechnology

Depending on the version of the technology, the bap-barcode module extracts the corresponding barcode sequences per read (also correcting for sequencing errors) and then appends the sequence to the read name. Examples for the various technologies are available in the tests folder:

bap-barcode v1.0 -a fastq_br/biorad_v1_R1.fastq.gz -b fastq_br/biorad_v1_R2.fastq.gz --nmismatches 1
bap-barcode v2.1 -a fastq_br/biorad_v2_R1.fastq.gz -b fastq_br/biorad_v2_R2.fastq.gz --nmismatches 1
bap-barcode v2.1-multi -a fastq_br/biorad_v2-multi_R1.fastq.gz -b fastq_br/biorad_v2-multi_R2.fastq.gz --nmismatches 1

We note that v2.1-multi corresponds to the multiplexed Tn5 (dsciATAC) protocol. A full description of valid barcodes and their corresponding sequences is available here.
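As a quick sanity check on the debarcoded fastq files, the reads per barcode can be tallied directly from the read names. The sketch below is illustrative and is not part of bap: `count_barcodes` is a hypothetical helper, and it assumes the barcode is the first `_`-delimited token of each read name, which should be verified against your own output before relying on it.

```shell
# Tally reads per barcode from FASTQ text on stdin.
# ASSUMPTION: the barcode is the first "_"-delimited token of the read name;
# check a few read names from your own debarcoded output first.
count_barcodes() {
  awk 'NR % 4 == 1 {
         name = substr($1, 2)        # drop the leading "@"
         split(name, parts, "_")     # parts[1] is the assumed barcode
         n[parts[1]]++
       }
       END { for (bc in n) print n[bc] "\t" bc }' | sort -k1,1nr -k2,2
}
```

For example, `gzip -dc debarcode-c001_1.fastq.gz | count_barcodes | head` would list the most abundant barcodes first.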

After debarcoding, align the reads with bwa:

bwa mem /path/to/hg19.fa debarcode-c001_1.fastq.gz debarcode-c001_2.fastq.gz | samtools view -bS - | samtools sort -@ 4 - -o debarcode.bam
samtools index debarcode.bam

Finally, to append the barcode as a SAM tag, use the bap-reanno module.
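To make the idea concrete, the sketch below shows conceptually what copying a barcode from the read name into an optional SAM tag looks like. It is a plain awk stand-in, not bap-reanno's implementation or interface: `add_barcode_tag` is a hypothetical helper, the `XB` tag name is a placeholder, and the barcode is assumed to be the first `_`-delimited token of the read name. See `bap-reanno --help` for the module's actual options.

```shell
# Copy the barcode from each read name into an optional SAM tag
# (SAM text on stdin, SAM text on stdout).
# ASSUMPTIONS: the barcode is the first "_"-delimited token of QNAME and the
# tag name defaults to XB; both are placeholders, not bap-reanno's behavior.
add_barcode_tag() {
  awk -v tag="${1:-XB}" 'BEGIN { FS = OFS = "\t" }
       /^@/ { print; next }               # pass header lines through unchanged
       {
         split($1, parts, "_")            # parts[1] is the assumed barcode
         print $0, tag ":Z:" parts[1]     # append e.g. XB:Z:<barcode>
       }'
}
```

With samtools available, this could be used as `samtools view -h debarcode.bam | add_barcode_tag XB | samtools view -b -o debarcode.tagged.bam -`, where the output filename is only an example.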

Help / user options

bap-barcode --help
bap-reanno --help