I have a small problem during the base recalibration step: it drops some of the reads that fall outside the interval regions I am providing. Our panel currently sequences about 150 genes. As part of our internal QC we check that GC content does not exceed a certain threshold. However, some of the regions in our interval file exceed this threshold, so we sequence bases in the adjacent region as a "proxy" to check for sequencing issues. We use these adjacent regions only as a QC check and do not want to call any variants in them.
However, the way BaseRecal is currently set up, both training the model and applying the scores use this interval file, which causes reads that fall outside these regions to be dropped. Ideally I do not want to remove any reads at any point in the pipeline. Is there an easier way to remove an argument using the config file? For now I have manually modified the code to remove the interval argument, which works as expected.
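As a toy illustration of why reads go missing, the sketch below models interval-restricted processing as a simple overlap filter: a read is kept only if it overlaps at least one target interval. This is a simplified model, not the pipeline's or GATK's actual code; the `Interval` type, the `keep_reads` helper, the `pad` argument and all coordinates are hypothetical.

```python
from typing import List, NamedTuple


class Interval(NamedTuple):
    chrom: str
    start: int  # 0-based, inclusive
    end: int    # 0-based, exclusive


def overlaps(a: Interval, b: Interval) -> bool:
    """True if the two half-open intervals share at least one base."""
    return a.chrom == b.chrom and a.start < b.end and b.start < a.end


def keep_reads(reads: List[Interval], targets: List[Interval], pad: int = 0) -> List[Interval]:
    """Keep only reads that overlap a target, optionally widened by `pad` bases."""
    padded = [Interval(t.chrom, max(0, t.start - pad), t.end + pad) for t in targets]
    return [r for r in reads if any(overlaps(r, t) for t in padded)]


# Hypothetical panel target and three reads (modelled as alignment spans).
targets = [Interval("chr1", 1000, 1200)]
reads = [
    Interval("chr1", 1050, 1200),  # fully inside the target
    Interval("chr1", 1190, 1340),  # straddles the target edge
    Interval("chr1", 1400, 1550),  # adjacent "proxy" region used only for QC
]

kept = keep_reads(reads, targets)
dropped = [r for r in reads if r not in kept]
print(f"{len(kept)} kept, {len(dropped)} dropped")
```

In this toy model the read in the adjacent proxy region is the one that disappears. Calling `keep_reads(reads, targets, pad=250)` retains all three, but any read beyond the padding would still be lost.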
I could also change the intervals to include these regions and add extra padding, but ideally I don't want any reads ever dropped, in case we have to investigate them.
Note that this step is also not recommended by the Broad Institute team;
please see:
Hey! You can run sarek with --no_intervals to skip this entirely. To run variant calling with intervals afterwards, you could restart with --step variant_calling.
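A rough sketch of that two-run approach, written as a small Python dry run that only prints the command lines. The file names (`samplesheet.csv`, `recalibrated.csv`, `panel.bed`), the output directory and the `sarek_cmd` helper are placeholders, and which CSV the second run should take as input is an assumption to check against the sarek documentation. Selecting a variant caller is left out.

```python
import shlex
from typing import List


def sarek_cmd(input_csv: str, *extra: str) -> List[str]:
    """Build a sarek command line; paths are hypothetical placeholders."""
    return [
        "nextflow", "run", "nf-core/sarek",
        "--input", input_csv,
        "--outdir", "results",  # placeholder output directory
        *extra,
    ]


# Run 1: preprocessing without intervals, so no reads are filtered by region.
first = sarek_cmd("samplesheet.csv", "--no_intervals")

# Run 2: restart at variant calling, this time restricted to the panel intervals.
# "recalibrated.csv" stands in for whatever samplesheet describes the run-1 output.
second = sarek_cmd(
    "recalibrated.csv",
    "--step", "variant_calling",
    "--intervals", "panel.bed",
)

for cmd in (first, second):
    print(shlex.join(cmd))
```

Printing the commands rather than executing them makes it easy to review the flags before launching anything.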
Thanks for getting back to me. It is still recommended to use the intervals to train the base recalibration model, so I would like to train the model on the intervals and then apply the scores to all the reads. I am guessing I have to run it with intervals up to the base recalibrator step (I am not sure whether it does both training and ApplyBQSR), restart it at base recalibration (I am not sure how I would start strictly at the ApplyBQSR step in this instance), and then restart again at variant calling.
How does restarting from a previous run work? Do we treat the runs as separate commands/analyses and run them in isolation, or is there a way to let sarek know about the steps already completed?