Skip to content
taejoon edited this page Dec 9, 2014 · 12 revisions

Table of Contents

Mapping single-end reads with 'bwa mem'

BWA="$HOME/src.HTseq/bwa/0.7.10/bwa"
SAM2HIT="$HOME/git/HTseq-toolbox/sam/sam-to-sam_hit.py"

THREADS=12

DB="$WORK/PSEAE.db/bwadb/PSEAE_PA14_genome"
DBNAME=$(basename $DB)
DBNAME=${DBNAME/.fa/}

for FQ in $(ls ../fastq.tx/*fastq)
do
  SAM=$(basename $FQ)
  SAM=${SAM/.called.fastq/_called}
  SAM=$SAM"."$DBNAME".bwa_mem.sam"

  $BWA mem -t $THREADS $DB $FQ > $SAM
  $SAM2HIT $SAM
done

Filter best hits

SAM2BEST="$HOME/git/HTseq-toolbox/sam/unpaired_sam-to-sam_best.py"

for SAM in $(ls *sam_hit)
do
  $SAM2BEST $SAM
done

Make t_base_log

TLOG="$HOME/git/HTseq-toolbox/sam/sam-to-t_base_log.py"

for SAM in $(ls *sam_hit_best)
do
  $TLOG $SAM
done
  • Output t_base_log file format
Target	Pos	Count+	Count-	Quadruple	Border	A  T  G  C
#Target: PSEAE_PA14_NC_008463
PSEAE_PA14|NC_008463	0	2	5	7	0	0  7  0  0
PSEAE_PA14|NC_008463	1	2	5	7	0	0  7  0  0
PSEAE_PA14|NC_008463	2	2	5	7	0	0  7  0  0

Calculate gene_tpm with t_base_log and GFF

Clone this wiki locally