BWA-mem - Different alignment numbers to same reference sequence
21 months ago
Steven ▴ 20

This is my first post on here.

I am using BWA-mem in quite an unusual situation: I am aligning small sequences of around 150bp to small reference sequences of variable length. When I give a reference file containing multiple similar reference sequences, my overall alignment rates increase considerably.

ref1.fasta:

>seq1
CAGGCTCTGCTCTTCATAATCATACCTTTGTGACTCAGGATGCTGT

>seq2
CAGGCTCTGCTCTTAATATCTGGCCGTCGTATTCCACCTCTGCGACTCATGATGCTGT (100,000 aligned)

>seq3
CAGGCTCTGCTCTTCATAATTTCTATCTTGCCCACCCTACTCGACACAGAGCAAAAATCCAACACTCCCAATATTGCCGTGGCTTCGACCTCTTGCTCAGATTTTCTTGTTACCTTTGTGACTCAGGATGCTGT

>seq4
CAGGCTCTGCTCTTCATAACCCTCCCTGCGAGTCCTTAAGTCTGACTCGGATCCTTAAACAACCTTTTCTTACCTTTGTGACTCAGGATGCTGT

ref2.fasta:

>seq2
CAGGCTCTGCTCTTAATATCTGGCCGTCGTATTCCACCTCTGCGACTCATGATGCTGT (25,000 aligned)

More of the reads in my FASTQ files align against ref1.fasta than against ref2.fasta (for seq2, 100,000 versus 25,000), but the alignments against ref1.fasta contain a far greater number of deletions and mismatches.
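To put numbers on the difference described above, the two SAM outputs can be summarised per reference sequence. The following is a minimal sketch, assuming uncompressed SAM text and that the aligner wrote the `NM:i:` edit-distance tag; the function name `summarize_sam` is made up for illustration and is not part of BWA.

```python
from collections import defaultdict


def summarize_sam(lines):
    """Tally primary mapped reads, MAPQ-0 reads and mean edit distance per reference.

    `lines` is any iterable of SAM text lines (for example an open file handle).
    """
    stats = defaultdict(lambda: {"mapped": 0, "mapq0": 0, "nm_sum": 0})
    for line in lines:
        # Skip blank lines and header lines (@HD, @SQ, @PG, ...).
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])
        # 0x4 = unmapped, 0x100 = secondary, 0x800 = supplementary.
        if flag & (0x4 | 0x100 | 0x800):
            continue
        entry = stats[fields[2]]  # column 3 (RNAME) is the reference name
        entry["mapped"] += 1
        if int(fields[4]) == 0:   # column 5 is the mapping quality
            entry["mapq0"] += 1
        # Optional tags start at column 12; NM holds the edit distance.
        for tag in fields[11:]:
            if tag.startswith("NM:i:"):
                entry["nm_sum"] += int(tag[5:])
                break
    return {
        ref: {
            "mapped": e["mapped"],
            "mapq0": e["mapq0"],
            "mean_nm": e["nm_sum"] / e["mapped"],
        }
        for ref, e in stats.items()
    }
```

Running this on the SAM file from each reference and comparing the `seq2` entries would show whether the extra alignments in the ref1.fasta run carry a higher edit distance or a mapping quality of 0.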

I realize this is not what BWA-mem was really designed to do, but I would be really grateful if you could help explain this behavior. Could it have something to do with the initial seeding of the alignment?

Many thanks, Steve W.

BWA-mem • 854 views

Are those 75k reads longer than seq2?


Likely, since the OP says "small sequences of around 150bp".


Yes, my reads are around double the length of seq2, with a maximum of 150bp.
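Because BWA-MEM is a local aligner, a read that is longer than its reference can only be aligned in part, with the overhang soft-clipped. A quick way to see which references are shorter than the reads is to tabulate their lengths. This is a rough sketch, assuming a plain FASTA file without inline annotations on the sequence lines; both function names and the default `read_length` of 150 are illustrative choices, not anything defined by BWA.

```python
def reference_lengths(fasta_text):
    """Return {sequence name: length} for plain FASTA text."""
    lengths = {}
    name = None
    for raw in fasta_text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first whitespace-delimited token of the header as the name.
            parts = line[1:].split()
            name = parts[0] if parts else ""
            lengths[name] = 0
        elif name is not None:
            lengths[name] += len(line)
    return lengths


def shorter_than_read(lengths, read_length=150):
    """Names of references that cannot contain a full read of `read_length`."""
    return sorted(name for name, n in lengths.items() if n < read_length)
```

Any reference returned by `shorter_than_read` cannot hold a full-length read, so alignments to it are necessarily partial.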


Thank you. I think I came across this one when I was searching for an answer. I'm fairly sure the seeding has something to do with this strange behavior, but I have yet to pinpoint the exact cause.
