An unusual bwa alignment problem has been bugging me for about a week, and I have not been able to troubleshoot it. If anyone can help me resolve this, I'll ask Santa to get you 3 kegs of beer!
I work with two plant species -- A & B -- and both have reference genomes. Ref seq sizes are ~11 and 14 Gbp for A and B, respectively. I indexed both references using
~/bwa-0.7.16a/bin/bwa index A_ref_seq.fa
and the output looked normal.
For both species, I also have a few hundred individuals that were sequenced on Illumina, as we want to do multi-sample SNP calling. The number of reads in the fastq files varies, of course, but all files have an average Q score > 30 and read lengths of 90-95 bp. I have not seen anything unusual in the fq files for individuals from either species.
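A quick way to double-check read counts and read lengths per file is a small awk helper. This is a minimal sketch, assuming uncompressed fastq files with plain four-line records; `fastq_summary` is a hypothetical helper name, and it only looks at lengths, not at Q scores.

```shell
# fastq_summary FILE
# Print read count and min/mean/max read length for one fastq file.
# Assumption: uncompressed, 4 lines per record (no wrapped sequence lines).
fastq_summary() {
    awk 'NR % 4 == 2 {
             # Line 2 of every record is the sequence itself
             len = length($0)
             total += len
             n++
             if (min == "" || len < min) min = len
             if (len > max) max = len
         }
         END {
             if (n == 0) { print "reads=0"; exit }
             printf "reads=%d min_len=%d mean_len=%.1f max_len=%d\n", n, min, total / n, max
         }' "$1"
}

# Example usage (file name as in the bwa mem command for species A):
#   fastq_summary A_Ind1.fq
```

Running it on one file from A and one from B gives a like-for-like comparison of the inputs.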
A couple of weeks ago, I aligned all individuals from A with A_ref_seq:
~/bwa-0.7.16a/bin/bwa mem -t 16 A_ref_seq.fa A_Ind1.fq > A_Ind1.sam
This went smoothly. bwa took 4-5 minutes on average to write the sam file for each individual fq file. After this, I converted sam to bam, sorted the bam, and ran mpileup for SNP calling (version 1.6 of both samtools and bcftools). Everything ran fine, no problems at all.
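For reference, the post-alignment steps look roughly like the sketch below. The options, thread count, and output file names are illustrative assumptions, not a copy of what was run, and the sketch handles a single sample even though the real calling is multi-sample. `run` and `sam_to_calls` are hypothetical helper names; setting `DRY_RUN=1` only prints the commands.

```shell
# run CMD [ARGS...]
# Execute the command, or just print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# sam_to_calls REF SAMPLE
# $1 = reference fasta, $2 = sample prefix (expects SAMPLE.sam to exist).
# Output names (SAMPLE.bam, SAMPLE.sorted.bam, ...) are assumed for illustration.
sam_to_calls() {
    # sam -> bam
    run samtools view -b -o "$2.bam" "$2.sam"
    # coordinate-sort the bam
    run samtools sort -@ 4 -o "$2.sorted.bam" "$2.bam"
    # pileup against the reference, written as uncompressed BCF
    run bcftools mpileup -f "$1" -Ou -o "$2.bcf" "$2.sorted.bam"
    # multiallelic caller, variant sites only, bgzipped VCF output
    run bcftools call -mv -Oz -o "$2.vcf.gz" "$2.bcf"
}

# Example usage:
#   DRY_RUN=1; sam_to_calls A_ref_seq.fa A_Ind1
```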
However, when I ran bwa mem on individuals of species B last week, bwa took almost 2 hours for a fq file with roughly 1.3 million reads! I am using the same software version, the same number of threads (16), and the same memory (24 GB), and yet the run time is far higher. The sam outputs from B individuals are similar in size to those from A individuals, so it is not as if bwa is writing junk during the hours it takes to produce one sam file.
What could be the reason? And what can I do to make this go faster?
For troubleshooting, I built a new index and created new fastq files, but the problem persists. I even moved the working directory to a different storage area ("scratch" vs. home directory), but this did not help. If I run the pipeline for species A right now, it runs fast and works as expected, which indicates that hardware/software changes on our server are not causing this.
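One way to narrow this down further would be to time bwa on a fixed-size subsample of reads from each species, so the comparison does not depend on file size. A minimal sketch, assuming uncompressed four-line fastq records; `first_reads` is a hypothetical helper, and the B file names in the usage comments (`B_Ind1.fq`, `B_ref_seq.fa`, `B_sub.fq`) are assumed by analogy with the A names.

```shell
# first_reads FASTQ N
# Print the first N reads of FASTQ (4 lines per read).
# Note: this takes the head of the file, not a random sample.
first_reads() {
    head -n "$(( $2 * 4 ))" "$1"
}

# Example usage (file names for species B are assumptions):
#   first_reads B_Ind1.fq 100000 > B_sub.fq
#   time ~/bwa-0.7.16a/bin/bwa mem -t 16 B_ref_seq.fa B_sub.fq > B_sub.sam
# Repeating the same two commands with A_Ind1.fq and A_ref_seq.fa gives
# a directly comparable timing for species A.
```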
I would really appreciate any suggestions to improve this. Thank you.