Question: Paired-End Reads Alignment For Variant Calling?
newDNASeqer (United States) wrote, 5.3 years ago:

I'm trying to do variant calling (SNPs, indels) from exome-sequencing data, and the sequencing was done with paired-end reads. I would like to use BWA for mapping/alignment, followed by Picard and GATK to do the variant calling.

The question now is how to do the sequence alignment with BWA. Should I pass both paired-end read files to a single run to generate one SAM file, like this:

bwa mem -M -v 1 -t 4 human_genome_ref.fasta read_For.fastq.gz read_Rev.fastq.gz > read_PE.sam

Is this okay? Or should I map the forward and reverse reads to the reference separately?

Thanks a lot for your reply.

Ashutosh Pandey (Philadelphia) wrote, 5.3 years ago:

From the BWA site:

"BWA is a software package for mapping low-divergent sequences against a large reference genome, such as the human genome. It consists of three algorithms: BWA-backtrack, BWA-SW and BWA-MEM. The first algorithm is designed for Illumina sequence reads up to 100bp, while the rest two for longer sequences ranged from 70bp to 1Mbp. BWA-MEM and BWA-SW share similar features such as long-read support and split alignment, but BWA-MEM, which is the latest, is generally recommended for high-quality queries as it is faster and more accurate. BWA-MEM also has better performance than BWA-backtrack for 70-100bp Illumina reads."

So it really depends on how long the reads are.

If the reads are longer than 100 bp, try:

bwa mem ref.fa read1.fq read2.fq > aln-pe.sam

If the reads are shorter than 100 bp, try:

bwa aln ref.fa read1.fq > aln_sa1.sai

bwa aln ref.fa read2.fq > aln_sa2.sai

bwa sampe ref.fa aln_sa1.sai aln_sa2.sai read1.fq read2.fq > aln-pe.sam
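The rule of thumb above can be sketched as a small shell helper. This is only an illustration: the function names (`read_length`, `choose_aligner`, `print_alignment_commands`) are made up for this example, and the default cutoff of 100 bp simply mirrors the threshold used in this answer. Since the BWA description quoted above says BWA-MEM also performs better for 70-100 bp Illumina reads, a lower cutoff such as 70 can be passed instead. The helper only prints the commands; it does not run BWA.

```shell
#!/usr/bin/env bash
# Illustrative helpers only: function names and the default cutoff are assumptions.

# Print the most common read length among the first 1000 records of a FASTQ file.
read_length() {
    local fq="$1"
    case "$fq" in
        *.gz) gzip -cd "$fq" ;;   # gzip-compressed input
        *)    cat "$fq" ;;        # plain-text input
    esac | head -n 4000 | awk '
        NR % 4 == 2 { n[length($0)]++ }   # line 2 of each 4-line record is the sequence
        END {
            max = 0
            for (len in n) if (n[len] > max) { max = n[len]; best = len }
            print best
        }'
}

# Print "mem" for reads at or above the cutoff (default 100 bp), otherwise "aln".
choose_aligner() {
    local len="$1" cutoff="${2:-100}"
    if [ "$len" -ge "$cutoff" ]; then
        echo mem
    else
        echo aln
    fi
}

# Print (but do not run) the BWA commands for one pair of FASTQ files.
print_alignment_commands() {
    local ref="$1" r1="$2" r2="$3" cutoff="${4:-100}" len
    len=$(read_length "$r1")
    if [ "$(choose_aligner "$len" "$cutoff")" = mem ]; then
        echo "bwa mem $ref $r1 $r2 > aln-pe.sam"
    else
        echo "bwa aln $ref $r1 > aln_sa1.sai"
        echo "bwa aln $ref $r2 > aln_sa2.sai"
        echo "bwa sampe $ref aln_sa1.sai aln_sa2.sai $r1 $r2 > aln-pe.sam"
    fi
}
```

For example, `print_alignment_commands ref.fa read1.fq read2.fq 70` would apply a 70 bp cutoff instead of the default. Either way, the reference has to be indexed once with `bwa index ref.fa` before aligning.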


Thanks for your reply. I was reading the BWA manual and thought that producing the aligned .sam file would require two steps if I use "bwa aln": first, I need to align each read file separately to generate .sai files, and second, I need to combine the two .sai files to generate the .sam file. Is this strongly recommended over "bwa mem" even if my reads are shorter than 100 bp?

written 5.3 years ago by newDNASeqer

Yup, that's the way the software was written. BWA-MEM does the same task in one step, but that doesn't mean it is better for reads shorter than 100 bp. If your reads are shorter than 100 bp, use the two-step process: bwa aln followed by bwa sampe.

written 5.3 years ago by Ashutosh Pandey

I think the recommendation re: use of bwa mem vs. bwa aln/sampe is now to use mem for anything over 70 bp (which would now be the vast majority of Illumina runs, even after trimming). Not sure if anyone has assessed this on a human data set yet...

written 4.5 years ago by Chris Fields

Yup, you are right. That comment is pretty old, from when BWA-MEM was still in its beta phase, I think. I think everyone should use BWA-MEM for alignment now, given that almost all sequencers produce reads longer than 75 bp.

written 4.5 years ago by Ashutosh Pandey
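Following the conclusion of this thread (BWA-MEM for essentially all current Illumina read lengths), here is a minimal sketch of a paired-end alignment step prepared for Picard and GATK. It rests on several assumptions: `make_read_group` and `align_pe` are hypothetical helper names, the sample and read-group IDs are placeholders, the reference has already been indexed with `bwa index`, and samtools 1.3 or newer is installed. The read group is added because GATK expects one in its input BAM, and `-M` is kept from the original question for Picard compatibility.

```shell
#!/usr/bin/env bash
# Sketch only: helper names, IDs, thread count and file names are placeholders.

# Build a minimal @RG header line for bwa mem -R.
make_read_group() {
    local sample="$1" id="$2"
    # '\\t' prints a literal backslash-t, which bwa mem -R converts to a tab
    printf '@RG\\tID:%s\\tSM:%s\\tPL:ILLUMINA\n' "$id" "$sample"
}

# Align one pair of FASTQ files and write a coordinate-sorted, indexed BAM.
align_pe() {
    local ref="$1" r1="$2" r2="$3" sample="$4" out="$5"
    bwa mem -M -t 4 -R "$(make_read_group "$sample" "$sample.1")" "$ref" "$r1" "$r2" \
        | samtools sort -o "$out" -
    samtools index "$out"
}

# Example call (not executed here):
# align_pe human_genome_ref.fasta read_For.fastq.gz read_Rev.fastq.gz sample1 sample1.sorted.bam
```

Both FASTQ files go into a single `bwa mem` call, so the mates are aligned as pairs rather than mapped separately.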