BWA Mismatches And Seed Length
11.5 years ago
GPR ▴ 390

Hello,

I am setting up BWA to align my reads and want to allow a reasonable number of mismatches without compromising alignment quality. My ultimate goal is to identify SNPs. I have started by running BWA with the default parameters:

bwa aln genome.fa *.fastq > *.sai

My question is: what are the most appropriate mismatch/max diff values (-n, -k?) in this case?

Also, my reads are 75 bp paired-end. Do I need to pass any parameter to indicate this? I have searched publications and forums, as well as the BWA manual, for examples, but there isn't much out there.

Any help will be appreciated.

G.

11.5 years ago
Irsan ★ 7.8k

Maybe you already know this, but for paired-end alignment you should do something like this:

[prompt]$ bwa aln hg19.fa forward.fastq > forward.sai
[prompt]$ bwa aln hg19.fa reverse.fastq > reverse.sai
[prompt]$ bwa sampe hg19.fa forward.sai reverse.sai forward.fastq reverse.fastq > my_alignment.sam

Where "sampe" stands for sam paired-end.
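As far as I know, `bwa aln` takes a single FASTQ file per invocation, and a shell redirect cannot fan out to several files, so a command of the form `bwa aln genome.fa *.fastq > *.sai` will not give one .sai per input. A minimal sketch of a loop that does, assuming the inputs end in `.fastq`; the function name `align_each` and the `BWA` override variable are made up for illustration:

```shell
#!/bin/sh
# Hypothetical helper: run `bwa aln` once per FASTQ file, one .sai per input.
# BWA can be overridden (e.g. BWA=echo) to dry-run the loop without bwa installed.
BWA="${BWA:-bwa}"

align_each() {
    ref="$1"
    shift
    for fq in "$@"; do
        # Strip a trailing .fastq and append .sai: sample_1.fastq -> sample_1.sai
        sai="${fq%.fastq}.sai"
        "$BWA" aln "$ref" "$fq" > "$sai"
    done
}

# Usage (not executed here):
#   align_each hg19.fa forward.fastq reverse.fastq
```

The resulting forward.sai and reverse.sai can then be passed to `bwa sampe` as shown above.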

You might benefit from reading Heng Li's presentation on BWA alignment to see what he has to say about BWA and SNP calling, or from reading Workflow Or Tutorial For Snp Calling?.

11.5 years ago

I think you should use the default values for most of the BWA parameters. They are well tested and most people use them. The maximum edit distance (number of differences) allowed in a read is chosen automatically from the read length when -n is left at its default, so you may not have to worry about that.
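To get a feel for what "automatically decided" means for 75 bp reads, here is a rough sketch. It assumes the cutoff comes from a Poisson model using the 2% uniform base error rate and the default `-n 0.04` described in the BWA manual: the smallest number of differences k such that fewer than 4% of reads are expected to exceed it. The function name `max_diff` is my own, and this is my reading of the behaviour rather than an official formula, so check it against the max_diff values that `bwa aln` reports when it runs.

```shell
#!/bin/sh
# Hypothetical helper: estimate the default max edit distance for a read length.
# Usage: max_diff READ_LENGTH [ERR] [THRES]
# Assumes ERR=0.02 (base error rate) and THRES=0.04 (default -n) unless given.
max_diff() {
    awk -v l="$1" -v err="${2:-0.02}" -v thres="${3:-0.04}" 'BEGIN {
        lambda = l * err          # expected number of errors per read
        term = exp(-lambda)       # Poisson P(X = 0)
        sum = term
        for (k = 1; k < 1000; k++) {
            term *= lambda / k    # Poisson P(X = k) from P(X = k-1)
            sum += term           # cumulative P(X <= k)
            if (1.0 - sum < thres) { print k; exit }
        }
        print 2                   # fallback if the loop never converges
    }'
}

max_diff 75    # prints 4
max_diff 100   # prints 5
```

Under these assumptions a 75 bp read is allowed up to 4 differences, which is why leaving -n alone is usually reasonable for SNP calling at this read length.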

This is useful. Thanks!
