BWA mapping with selected mismatches allowed?


7.9 years ago

lim24m • 0

This problem was presented to me by a biologist: I have a .fastq file with about 1 million short DNA reads. The DNA templates that I sent for sequencing were designed something like this: ... ...GGTATNNNNNNNNNATGT... ... where the N's are a randomized sequence of 9 nucleotide bases (A, T, G or C).

I now have to align the reads, either de novo or to a reference genome (we have the reference genome), while ignoring the 9 randomized bases.

How should I go about this? What tools can I use? Are there any existing DNA/RNA alignment tools out there that can do this for me?

Thank you!

RNA-Seq alignment • 1.9k views



Do these randomized nucleotides have a meaning? Something UMI like? Do you still need them downstream?

7.9 years ago by WouterDeCoster


Are the nine nucleotides barcodes to identify samples?
Maybe you want to trim or split your sequences to remove those nucleotides?
Concerning the title of the question: bwa aln has a parameter that limits how many differences are allowed per read (see -n, the maximum edit distance), but bwa mem has no equivalent option.
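The trimming idea above could be sketched roughly as follows in Python, before handing the reads to an aligner. This is only a minimal sketch, not a tested pipeline: the flanks GGTAT and ATGT are taken from the template in the question, while the function names (`strip_random_region`, `process_fastq`), the choice to append the random bases to the read name, and the assumptions that reads are in forward orientation and stored as plain 4-line FASTQ records are all mine.

```python
import re

# Flanks come from the template in the question (GGTAT + 9 random bases + ATGT).
# Assumption: reads are in forward orientation; reverse-complement reads are not handled.
PATTERN = re.compile(r"(?<=GGTAT)[ACGTN]{9}(?=ATGT)")


def strip_random_region(seq, qual, mask=False):
    """Return (new_seq, new_qual, random_bases), or None if the flanks are not found."""
    match = PATTERN.search(seq)
    if match is None:
        return None
    start, end = match.span()
    random_bases = match.group(0)
    if mask:
        # Keep the read length and hide the random bases behind N's.
        return seq[:start] + "N" * (end - start) + seq[end:], qual, random_bases
    # Otherwise cut the random bases (and their qualities) out of the read.
    return seq[:start] + seq[end:], qual[:start] + qual[end:], random_bases


def process_fastq(lines, mask=False):
    """Yield (name, seq, qual) for each 4-line FASTQ record that contains the flanks.

    Assumption: simple 4-line records (no wrapped sequences). Reads without
    the flanks are skipped. The random bases are appended to the read name
    (a hypothetical naming convention) so they stay available downstream.
    """
    it = iter(lines)
    for header in it:
        if not header.strip():
            continue  # tolerate blank lines between or after records
        seq = next(it).strip()
        next(it)  # '+' separator line
        qual = next(it).strip()
        result = strip_random_region(seq, qual, mask=mask)
        if result is None:
            continue
        new_seq, new_qual, random_bases = result
        name = header.strip().split()[0]
        yield f"{name}_{random_bases}", new_seq, new_qual
```

To use it, something like `with open("reads.fastq") as fh: for name, seq, qual in process_fastq(fh): ...` would write the processed records to a new file, which can then be aligned as usual. Masking with `mask=True` keeps the read length but leaves N's that the aligner will still treat as differences, so cutting the bases out may be the better fit when the random region interrupts a sequence that is contiguous in the reference.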

7.9 years ago by dschika
