IndelRealigner and RealignerTargetCreator for amplicon sequencing
5.1 years ago
kulvait ▴ 260

Hi,

I am trying to incorporate an indel realignment step into my processing pipeline. I use the myeloid amplicon panel from Illumina, which contains ~1000 amplicons covering ~100 kb of the genome in total. Before the indel realignment step I clean my BAMs so that no reads remain outside my amplicons.

Unfortunately, when I construct the intervals file from a BAM file and some whole-genome indel databases, I get an intervals file that covers the whole genome. I do not understand why it constructs intervals in areas where there is zero coverage.

There is no documentation of the intervals file. Since I have a relatively small genomic area, IndelRealigner can afford to do more work there than in a whole-genome project. I guess I could pool the intervals from all my BAM files to create a list of all possible indels (including those present in all my files) and then run RealignerTargetCreator with this file.

Does anybody know the correct format of the intervals file? I mean, if there might be two indels, chr1:2-3 and chr1:2-4, should I have an intervals file with

chr1:2-3
chr1:2-4


or

 chr1:2-4


or even

chr1:2
chr1:3
chr1:4


Thanks, Vojtěch.

amplicon-seq NGS indel • 1.8k views
5.1 years ago

The -L flag also accepts a BED file with tab-separated chromosome, begin, and end fields. I would also suggest using the -ip flag (interval padding) to add extra space upstream and downstream, to make sure the region isn't too small.
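
To make the BED layout and the effect of padding concrete, here is a minimal Python sketch. It is only an illustration, not GATK's own implementation: the amplicon coordinates, the padding of 50, and the function names `pad_interval` and `to_bed_line` are all hypothetical. It assumes the usual BED convention of 0-based, half-open coordinates.

```python
def pad_interval(chrom, start, end, padding):
    """Extend a 0-based, half-open BED interval by `padding` bases on each side.

    The start is clamped at 0; the end is NOT clamped to the chromosome
    length, because this sketch has no access to a reference dictionary.
    """
    return chrom, max(0, start - padding), end + padding


def to_bed_line(chrom, start, end):
    """Format one interval as a tab-separated BED line: chromosome, begin, end."""
    return f"{chrom}\t{start}\t{end}"


# Hypothetical amplicon coordinates, for illustration only.
amplicons = [("chr1", 100, 250), ("chr1", 20, 180), ("chr2", 5000, 5150)]

# Text that could be saved as a BED file and passed to -L.
bed_text = "\n".join(to_bed_line(*amplicon) for amplicon in amplicons)

# Roughly what an interval padding of 50 would cover for each amplicon.
padded = [pad_interval(*amplicon, padding=50) for amplicon in amplicons]
```

In practice you would not pad the file yourself; you would save something like `bed_text` to a file and let -ip apply the padding.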


Thank you, it did the trick.

5.1 years ago
Zaag ▴ 800

You could restrict the construction of the intervals with the -L option to your amplicon intervals.

It should be chr1:2-4
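
If you do pool candidate intervals from several BAM files, overlapping entries can be collapsed into one before writing the list. The following is a minimal Python sketch of that merge, assuming `chr:start-end` strings with 1-based inclusive coordinates. The function names `parse_interval` and `merge_intervals` are made up for this example, and merging directly adjacent positions is a choice of this sketch, not something GATK is confirmed to require.

```python
import re
from collections import defaultdict


def parse_interval(text):
    """Parse 'chr1:2-4' or 'chr1:2' into (chrom, start, end), 1-based inclusive."""
    match = re.fullmatch(r"(\S+):(\d+)(?:-(\d+))?", text.strip())
    if match is None:
        raise ValueError(f"not an interval: {text!r}")
    chrom, start, end = match.group(1), int(match.group(2)), match.group(3)
    return chrom, start, int(end) if end is not None else start


def merge_intervals(texts):
    """Merge overlapping or directly adjacent intervals on each chromosome."""
    by_chrom = defaultdict(list)
    for text in texts:
        chrom, start, end = parse_interval(text)
        by_chrom[chrom].append((start, end))

    merged = []
    # Chromosomes are emitted in plain string order, not reference order.
    for chrom in sorted(by_chrom):
        spans = sorted(by_chrom[chrom])
        cur_start, cur_end = spans[0]
        for start, end in spans[1:]:
            if start <= cur_end + 1:  # overlaps or touches the current span
                cur_end = max(cur_end, end)
            else:
                merged.append(f"{chrom}:{cur_start}-{cur_end}")
                cur_start, cur_end = start, end
        merged.append(f"{chrom}:{cur_start}-{cur_end}")
    return merged


print(merge_intervals(["chr1:2-3", "chr1:2-4"]))  # prints ['chr1:2-4']
```

With this rule, the three single positions chr1:2, chr1:3 and chr1:4 from the question also collapse to chr1:2-4.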


Thank you. Would it then be wise to just add all amplicon coordinates to the intervals file?