Thanks for using the tool. You bring up a couple of interesting points.
For your first point, I can confirm that your assumption is correct. When deciding whether to annotate a UTR, the code looks for other genes in the vicinity and orders them by start base. If the --no-strand-overlap option is given, this search is strand-agnostic (i.e. the next closest gene is allowed to be on the opposing strand). However, if the gene already overlaps another gene (as in your highlighted case), the restriction has no effect: the code looks to truncate to the 5' end of the next gene, which it assumes will occur at a lower base (for reverse strand) than the existing gene's 3' end (see
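To make the behaviour described above concrete, here is a toy model in Python. It is only a sketch of the rule as explained in this reply, not the actual peaks2utr code: the `Gene` class, the `lowest_utr_base` function and all coordinates are made up for illustration, and it only covers a reverse-strand gene.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Gene:
    """Hypothetical minimal gene record (1-based, inclusive coordinates)."""
    name: str
    start: int   # lowest base
    end: int     # highest base
    strand: str  # "+" or "-"


def lowest_utr_base(gene: Gene, peak_start: int, neighbours: List[Gene],
                    no_strand_overlap: bool) -> int:
    """Toy rule for a reverse-strand gene whose 3' end is gene.start.

    The UTR would like to reach down to peak_start, but is truncated at the
    nearest neighbour lying wholly below the gene's 3' end. Neighbours on
    the opposing strand only count when no_strand_overlap is True.
    """
    candidates = [
        n for n in neighbours
        if (no_strand_overlap or n.strand == gene.strand)
        # Assumption from the reply: the neighbour's boundary is expected at
        # a lower base than the gene's 3' end, so a neighbour that already
        # overlaps the gene never qualifies.
        and n.end < gene.start
    ]
    if not candidates:
        return peak_start
    nearest = max(candidates, key=lambda n: n.end)
    return max(peak_start, nearest.end + 1)


gene = Gene("geneA", 1000, 2000, "-")
downstream = Gene("geneB", 600, 800, "+")    # wholly below geneA
overlapping = Gene("geneC", 700, 1200, "+")  # already overlaps geneA

print(lowest_utr_base(gene, 500, [downstream], True))    # truncated at 801
print(lowest_utr_base(gene, 500, [overlapping], True))   # not truncated: 500
```

In this toy model the already-overlapping neighbour is never selected as a truncation point, which matches the observation that an existing overlap is not restricted.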
As for your second point, a truncation of the UTR mid-peak would usually only occur if either
MACS2 had determined that that is where the peak ends (see the .cache/forward_peaks.broadPeak file for the forward stranded peaks explicitly called by MACS2)
or there were a significant number of reads with polyA tails which terminated at that point (see the definition of SPAT algorithm in the paper https://doi.org/10.1093/bioinformatics/btad112). If you check the attributes for that UTR annotation in the GFF3/GTF output, and see colour=4, then this is likely what has happened.
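Both checks above can be scripted. The following is a rough sketch, assuming the standard tab-separated broadPeak layout (chrom, start, end in the first three columns, 0-based half-open) and GFF3 output with `key=value` attributes in column 9. The helper names are hypothetical, and the feature-type test (`"UTR" in type`) is an assumption about how the UTR rows are labelled.

```python
from typing import Dict, List, Tuple


def peaks_overlapping(broadpeak_text: str, chrom: str,
                      start: int, end: int) -> List[Tuple[str, int, int]]:
    """Return peaks overlapping [start, end) on chrom (BED-style coordinates)."""
    hits = []
    for line in broadpeak_text.splitlines():
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split("\t")
        p_chrom, p_start, p_end = fields[0], int(fields[1]), int(fields[2])
        if p_chrom == chrom and p_start < end and start < p_end:
            hits.append((p_chrom, p_start, p_end))
    return hits


def parse_gff3_attributes(column9: str) -> Dict[str, str]:
    """Split a GFF3 attribute column ('k1=v1;k2=v2') into a dict."""
    attrs = {}
    for item in column9.strip().split(";"):
        if "=" in item:
            key, value = item.split("=", 1)
            attrs[key.strip()] = value.strip()
    return attrs


def utrs_with_colour(gff3_text: str,
                     colour: str = "4") -> List[Tuple[str, int, int, str]]:
    """Return (seqid, start, end, strand) for UTR rows carrying colour=<colour>."""
    rows = []
    for line in gff3_text.splitlines():
        if not line or line.startswith("#"):
            continue
        f = line.split("\t")
        if len(f) != 9 or "UTR" not in f[2]:  # assumed feature naming
            continue
        if parse_gff3_attributes(f[8]).get("colour") == colour:
            rows.append((f[0], int(f[3]), int(f[4]), f[6]))
    return rows
```

For example, passing the contents of .cache/forward_peaks.broadPeak and the UTR's region to `peaks_overlapping` shows where MACS2 ended the peak, while `utrs_with_colour` on the GFF3 output lists the annotations flagged with colour=4.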
Hello,
First, thanks for the tool!
I have a question about the new gene overlap implementation in 1.4.0:
If a gene already overlaps with an antisense gene, does it ignore this restriction?
It looks like that's the case since e.g. Zcchc17 extends into Fabp3.
Additionally, can you think of a reason why peaks2utr would only extend halfway through a peak? I compared long reads (shortened to 500bp to ensure clear peaks) vs 10X data and while long reads worked well, the UTRs are only extended about halfway through the peaks (see below). I think the long read data is more reliable so it'd be nice to use long reads to generate references for 10X data as long as the extensions are complete.
Thanks again!
Jesse
Versions:
python/3.9.6
bedtools/2.30
peaks2utr/1.4.0
code:
peaks2utr --max-distance 10000 --extend-utr --no-strand-overlap -p 64 -o $out $gtf $bam