Duplication BWA aligner
0
0
Entering edit mode
4.1 years ago
Mick ▴ 30

Hi guys,

I have the following two reads:

@read1
TTTTCCTTCTCTTTCTCCTTCTCTTTTTCTGGCAGAAGTTCTAACTCTGGTATTAGCTGACAGATATTTGGAGGTTCTTCTGG
+
GGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGG
@read1-mut
TTTTTCCTTCTCTTTCTCCTTCTCTTTTTCTGGCAGAAGTTCCTTCTCTTTTTCTGGCAGAAGTTCTAACTCTGGTATTAGCTGACAGATATTTGGAGGTTCTTCTGG
+
GGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGG

the second one has a 24 base pair long duplication at position 17: TCCTTCTCTTTTTCTGGCAGAAGT

How can I get the BWA aligner (option bwa+samse) to align the read with the duplication properly to the human reference genome GRCh37/hg19? I fiddled around with the gap open -O and extend -E options but I couldn't seem to make it work. When I set -n to a high number like 100 I get the desired CIGAR-String: 16M24I68M but it seems error prone to me to just allow such a high number of mismatches.

What is the best way to deal with this? thanks

alignment • 538 views
ADD COMMENT
0
Entering edit mode

Try bwa mem which is recommended for that read length. It should take care of that. What kind of CIGAR are you expecting?

ADD REPLY
0
Entering edit mode

Hi ATpoint, thanks for the answer. I tried bwa mem, initially it didn't perform the alignment either, giving me soft and hard clipped alignments. 40S68M as primary alignment and 42M66H as supplementary alignment.

I tried some more options and eventually I found that bwa mem -A 10 -E 1 does the trick and give me 16M24I68M. Not sure if this is the best option but I guess I'll go with it for now.

ADD REPLY
0
Entering edit mode

Yes of course it gives clipped alignments, there is something in the read that is not in the reference. Therefore the question => what would be your desired output? These clipped and supplementary alignments are the starting point that structural variant callers use in order to identify variants. You cannot force a structural variant to be perfectly matched to the reference.

ADD REPLY

Login before adding your answer.

Traffic: 1593 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6