Working with BioRad data
Thanks to engineering work by @ghuls (originally discussed in the issues), a more parsimonious and efficient way of pre-processing fastq files for bap is available here: https://github.com/aertslab/single_cell_toolkit.
Depending on the version of the technology, the bap-barcode module extracts the corresponding barcode sequences per read (also correcting for sequencing errors) and then appends the sequence to the read name. Examples for the various technologies are available in the tests folder:
bap-barcode v1.0 -a fastq_br/biorad_v1_R1.fastq.gz -b fastq_br/biorad_v1_R2.fastq.gz --nmismatches 1
bap-barcode v2.1 -a fastq_br/biorad_v2_R1.fastq.gz -b fastq_br/biorad_v2_R2.fastq.gz --nmismatches 1
bap-barcode v2.1-multi -a fastq_br/biorad_v2-multi_R1.fastq.gz -b fastq_br/biorad_v2-multi_R2.fastq.gz --nmismatches 1
We note that v2.1-multi corresponds to the multiplexed Tn5 (dsciATAC) protocol. A full description of valid barcodes and their corresponding sequences is available here.
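To make the error-correction step more concrete, the sketch below shows one common way to rescue a barcode that contains a sequencing error: compare the observed sequence against a list of valid barcodes and accept it only if exactly one valid barcode lies within the allowed number of mismatches (presumably what `--nmismatches 1` controls). This is an illustration under stated assumptions, not bap-barcode's actual implementation; the function name `correct_barcode` and the toy whitelist are hypothetical.

```shell
# Hypothetical illustration only; NOT the code used by bap-barcode.
# Usage: correct_barcode OBSERVED MAX_MISMATCHES < whitelist.txt
# Reads one valid barcode per line on stdin. Prints the single valid barcode
# within MAX_MISMATCHES of OBSERVED; exits non-zero if none or several match.
correct_barcode() {
  awk -v obs="$1" -v max="$2" '
    length($1) == length(obs) {
      # Count positions where the candidate differs from the observed sequence
      d = 0
      for (i = 1; i <= length(obs); i++) {
        if (substr($1, i, 1) != substr(obs, i, 1)) { d++ }
      }
      if (d <= max) { hit = $1; n++ }
    }
    END {
      # Only an unambiguous match counts as a successful correction
      if (n == 1) { print hit } else { exit 1 }
    }
  '
}

# Example with a toy whitelist: one mismatch relative to AAAAAA is corrected
printf 'AAAAAA\nCCCCCC\nGGGGGG\n' | correct_barcode AAAATA 1
```

Rejecting ambiguous matches (two valid barcodes equally close) is a design choice in this sketch; it trades a few lost reads for fewer misassigned ones.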
After de-barcoding, align the reads with bwa:
bwa mem /path/to/hg19.fa debarcode-c001_1.fastq.gz debarcode-c001_2.fastq.gz | samtools view -bS - | samtools sort -@ 4 - -o debarcode.bam
samtools index debarcode.bam
Finally, to append the barcode as a SAM tag, use the bap-reanno module. Both modules list their options with --help:
bap-barcode --help
bap-reanno --help
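As a rough picture of what moving the barcode from the read name into the alignment record involves, the sketch below copies a barcode out of the read name and adds it as an optional field on each SAM record. It assumes, purely for illustration, that the barcode is the part of the read name before the first `_` and that the tag is called `XB`; check both against your own de-barcoded files and the bap-reanno help output. The function `add_barcode_tag` and the file name `tagged.bam` are hypothetical, and this is not a replacement for bap-reanno.

```shell
# Hypothetical illustration only; NOT the code used by bap-reanno.
# Usage: add_barcode_tag [SEPARATOR] [TAG] < in.sam > out.sam
# Assumes the barcode is the read-name prefix before SEPARATOR (default "_")
# and writes it as a string-typed optional field TAG:Z:<barcode> (default XB).
add_barcode_tag() {
  awk -v sep="${1:-_}" -v tag="${2:-XB}" '
    BEGIN { FS = OFS = "\t" }
    /^@/ { print; next }   # header lines pass through unchanged
    {
      i = index($1, sep)
      if (i > 1) {
        print $0, tag ":Z:" substr($1, 1, i - 1)
      } else {
        print              # no separator found: leave the record untouched
      }
    }
  '
}

# Possible use on the aligned file (requires samtools):
#   samtools view -h debarcode.bam | add_barcode_tag | samtools view -b -o tagged.bam -
```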
Please raise an issue here