8.2 years ago
Nicolas Rosewick
Hi,
I have a set of HTS data enriched for specific microsatellite (MS) regions. My idea is to build a bwa index from reference sequences for these microsatellites, with one sequence per possible repeat count.
For example, for one of the MS regions the reference sequences would be:
seq1: [ Upstream flanking region ][ Repeat ][ Downstream flanking region ]
seq2: [ Upstream flanking region ][ Repeat ][ Repeat ][ Downstream flanking region ]
seq3: [ Upstream flanking region ][ Repeat ][ Repeat ][ Repeat ][ Downstream flanking region ]
seq4: [ Upstream flanking region ][ Repeat ][ Repeat ][ Repeat ][ Repeat ][ Downstream flanking region ]
...
seqN: [ Upstream flanking region ][ Repeat ]...[ Repeat ][ Downstream flanking region ]
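To make the layout above concrete, here is a minimal Python sketch that builds such reference sequences and writes them as FASTA text. The function names, the `MS1` naming scheme, and the toy flank and repeat strings are only illustrative assumptions, not part of any existing tool.

```python
def build_variants(upstream, repeat, downstream, max_repeats, name="MS1"):
    """Return {record_name: sequence} for 1..max_repeats copies of the repeat unit.

    All names here are hypothetical; the record name encodes the repeat count
    so it can be recovered later from the reference name of an alignment.
    """
    variants = {}
    for n in range(1, max_repeats + 1):
        variants[f"{name}_rep{n}"] = upstream + repeat * n + downstream
    return variants


def to_fasta(variants):
    """Format the variants as FASTA text, one record per repeat count."""
    return "".join(f">{rec}\n{seq}\n" for rec, seq in variants.items())


if __name__ == "__main__":
    # Toy flanks and a dinucleotide repeat, for illustration only.
    print(to_fasta(build_variants("ACGT", "CA", "TTGG", 3)), end="")
```

The resulting FASTA file could then be indexed with `bwa index`. Keep in mind that the flanks must be long enough for a read to span the repeat and anchor on both sides, otherwise a read cannot distinguish between neighbouring variants.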
I would then align the reads (paired-end, 2x100 bp) against this index and count how many reads map to each MS variant. Will bwa handle this alignment without reporting multiple alignments for a single read?
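The counting step could look roughly like the sketch below, which tallies reads per reference sequence from SAM-formatted text. It relies only on the standard SAM columns (FLAG in column 2, RNAME in column 3, MAPQ in column 5) and the standard flag bits for unmapped (0x4), secondary (0x100) and supplementary (0x800) records. The function name and the `min_mapq` cutoff are my own assumptions; the cutoff is there on the assumption that ambiguously placed reads receive a low mapping quality, which should be checked for the aligner actually used.

```python
from collections import Counter


def count_reads_per_variant(sam_lines, min_mapq=1):
    """Count primary, mapped alignments per reference name (RNAME).

    min_mapq is a hypothetical cutoff meant to discard reads that the
    aligner could not place unambiguously. Each mate of a pair is counted
    as a separate read.
    """
    counts = Counter()
    for line in sam_lines:
        if line.startswith("@"):          # skip header lines
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:              # not a complete SAM record
            continue
        flag = int(fields[1])
        rname = fields[2]
        mapq = int(fields[4])
        # Skip unmapped (0x4), secondary (0x100) and supplementary (0x800).
        if flag & 0x4 or flag & 0x100 or flag & 0x800:
            continue
        if mapq < min_mapq:
            continue
        counts[rname] += 1
    return counts
```

It could be fed the lines of a SAM file, for example `count_reads_per_variant(open("aln.sam"))`, where `aln.sam` is a placeholder file name.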
Edit: the repeats can be mono-, di-, or trinucleotides.
Maybe someone has another idea, aligner, or method for computing this?
Thanks