Question: Error while running BWA mem
3
4.3 years ago by
ravi.uhdnis170
United States
ravi.uhdnis170 wrote:

Hi, I am running BWA mem to align my paired-end (PE) reads against the human reference genome (GRCh38). It runs, but I am getting what looks like an error message. Please have a look at the command and the resulting output:

/usr/local/bwa-0.7.12/bwa mem -t 14 \
    -M /san/illumina_two/rsindhu_sge/Human_ref_genomes/GRCh38/FINAL/GRCh38.fa \
    Read1.fastq.gz Read2.fastq.gz > Read12.bwa.sam

Part of the output:

[M::bwa_idx_load_from_disk] read 0 ALT contigs
[M::process] read 1429588 sequences (140000077 bp)...
[M::process] read 1431216 sequences (140000054 bp)...
[M::mem_pestat] # candidate unique pairs for (FF, FR, RF, RR): (19, 558476, 545, 5)
[M::mem_pestat] analyzing insert size distribution for orientation FF...
[M::mem_pestat] (25, 50, 75) percentile: (126, 180, 401)
[M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 951)
[M::mem_pestat] mean and std.dev: (243.78, 179.56)
[M::mem_pestat] low and high boundaries for proper pairs: (1, 1226)
[M::mem_pestat] analyzing insert size distribution for orientation FR...
[M::mem_pestat] (25, 50, 75) percentile: (334, 387, 448)
[M::mem_pestat] low and high boundaries for computing mean and std.dev: (106, 676)
[M::mem_pestat] mean and std.dev: (391.91, 87.32)
[M::mem_pestat] low and high boundaries for proper pairs: (1, 790)
[M::mem_pestat] analyzing insert size distribution for orientation RF...
[M::mem_pestat] (25, 50, 75) percentile: (24, 49, 80)
[M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 192)
[M::mem_pestat] mean and std.dev: (54.23, 40.99)
[M::mem_pestat] low and high boundaries for proper pairs: (1, 248)
[M::mem_pestat] skip orientation RR as there are not enough pairs
[M::mem_pestat] skip orientation FF
[M::mem_pestat] skip orientation RF
[M::mem_process_seqs] Processed 1429588 reads in 1170.454 CPU sec, 83.775 real sec
[M::process] read 1429172 sequences (140000054 bp)...
[M::mem_pestat] # candidate unique pairs for (FF, FR, RF, RR): (15, 558767, 537, 10)
[M::mem_pestat] analyzing insert size distribution for orientation FF...
[M::mem_pestat] (25, 50, 75) percentile: (100, 207, 290)
[M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 670)
[M::mem_pestat] mean and std.dev: (212.33, 136.90)
[M::mem_pestat] low and high boundaries for proper pairs: (1, 860)
[M::mem_pestat] analyzing insert size distribution for orientation FR...
[M::mem_pestat] (25, 50, 75) percentile: (332, 385, 446)
[M::mem_pestat] low and high boundaries for computing mean and std.dev: (104, 674)
[M::mem_pestat] mean and std.dev: (390.32, 87.22)
[M::mem_pestat] low and high boundaries for proper pairs: (1, 788)
[M::mem_pestat] analyzing insert size distribution for orientation RF...
[M::mem_pestat] (25, 50, 75) percentile: (24, 46, 86)
[M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 210)
[M::mem_pestat] mean and std.dev: (56.51, 44.37)
[M::mem_pestat] low and high boundaries for proper pairs: (1, 272)
[M::mem_pestat] analyzing insert size distribution for orientation RR...
[M::mem_pestat] (25, 50, 75) percentile: (379, 1210, 2961)
[M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 8125)
[M::mem_pestat] mean and std.dev: (1898.40, 2191.92)
[M::mem_pestat] low and high boundaries for proper pairs: (1, 10707)
[M::mem_pestat] skip orientation FF
[M::mem_pestat] skip orientation RF
[M::mem_pestat] skip orientation RR
..........

Also, the resulting .sam file is much smaller than the one from an earlier run that did not show these messages (34G earlier, 14G now). Please share your suggestions. Many thanks, Ravi.

modified 13 months ago by RamRS24k • written 4.3 years ago by ravi.uhdnis170
2

What error message? All I see is the typical bwa status information.

written 4.3 years ago by Devon Ryan92k

Thank you for the responses, Devon and Pierre. I assumed these were error messages because the output looked different from a BWA run I did a few days ago. Anyway, thanks for clarifying that these are normal run log messages. For context, this is the alignment of the 2nd of 8 lanes for one sample of NGS data (PE, human, ~30X coverage). The two input files (Read1.fastq.gz and Read2.fastq.gz) are 3.1G each, and the output .sam file is 14G.

The lane 1 input files were the same size (3.1G each), but the output .sam file was 34G, so I suspected that something was wrong with this run. I still have to find out why the output SAM files for these two lanes differ so much in size, so I'll re-align my lane 1 data. I am new to this field, so I am asking for help to clear up my doubts. Thank you, Ravi.

written 4.3 years ago by ravi.uhdnis170
1

Presumably the lane with a larger SAM file either had a higher mapping rate or more multimapped and/or chimeric alignments.

written 4.3 years ago by Devon Ryan92k

Please let me know where and how I can check the mapping rate and the number of multimapped or chimeric alignments. Thanks.

written 4.3 years ago by ravi.uhdnis170

Give the number of reads in the input FASTQ and the number of lines in the output SAM.
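Both numbers can be obtained with standard tools. A minimal sketch, assuming the FASTQ files are gzipped with the usual 4 lines per record and that SAM header lines start with "@"; the function names are made up for illustration:

```shell
# Count reads in a gzipped FASTQ file (assumes 4 lines per record).
count_fastq_reads() {
    gzip -cd "$1" | awk 'END { print int(NR / 4) }'
}

# Count alignment records in a SAM file, skipping "@" header lines.
count_sam_records() {
    awk '!/^@/ { n++ } END { print n + 0 }' "$1"
}
```

For example, count_fastq_reads Read1.fastq.gz and count_sam_records Read12.bwa.sam would give the two numbers for the run in the question.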

written 4.3 years ago by lh331k

Hi, here is the information:

Lane 1 read count and information

Read1.fastq.gz: 56649567
Read2.fastq.gz: 56649567

 

samtools flagstat lane1.sam

113465731 + 0 in total (QC-passed reads + QC-failed reads)
166597 + 0 secondary
0 + 0 supplementary
0 + 0 duplicates
111551346 + 0 mapped (98.31%:nan%)
113299134 + 0 paired in sequencing
56649567 + 0 read1
56649567 + 0 read2
108344712 + 0 properly paired (95.63%:nan%)
110066734 + 0 with itself and mate mapped
1318015 + 0 singletons (1.16%:nan%)
1151298 + 0 with mate mapped to a different chr
741592 + 0 with mate mapped to a different chr (mapQ>=5)

Lane 2 read count and information

Read1.fastq.gz: 56174670
Read2.fastq.gz: 56174670

 

samtools flagstat lane2.sam

[W::sam_read1] parse error at line 46009700
[bam_flagstat_core] Truncated file? Continue anyway.
46009673 + 0 in total (QC-passed reads + QC-failed reads)
67271 + 0 secondary
0 + 0 supplementary
0 + 0 duplicates
45284112 + 0 mapped (98.42%:nan%)
45942402 + 0 paired in sequencing
22971201 + 0 read1
22971201 + 0 read2
44011320 + 0 properly paired (95.80%:nan%)
44697132 + 0 with itself and mate mapped
519709 + 0 singletons (1.13%:nan%)
457348 + 0 with mate mapped to a different chr
292473 + 0 with mate mapped to a different chr (mapQ>=5)

Since flagstat reported that this file may be incomplete or truncated, I reran the alignment. The resulting file is similar in size to the lane 1 output (approx. 34G). Here is its flagstat output:

112514714 + 0 in total (QC-passed reads + QC-failed reads)
165374 + 0 secondary
0 + 0 supplementary
0 + 0 duplicates
110712711 + 0 mapped (98.40%:nan%)
112349340 + 0 paired in sequencing
56174670 + 0 read1
56174670 + 0 read2
107595288 + 0 properly paired (95.77%:nan%)
109259740 + 0 with itself and mate mapped
1287597 + 0 singletons (1.15%:nan%)
1104610 + 0 with mate mapped to a different chr
701402 + 0 with mate mapped to a different chr (mapQ>=5)

So I finally have answers to both of my questions. First, the messages were not errors but regular run log messages. Second, the BWA run was incomplete, so the truncated .sam file was much smaller (14G instead of 34G). Thank you all for your comments. Regards, Ravi.
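The flagstat numbers above also allow a quick completeness check. This is a sketch under one assumption, which the lane 1 and rerun outputs are consistent with: bwa mem writes exactly one primary record per input read, so "in total" minus "secondary" minus "supplementary" should equal twice the number of read pairs. The function names are hypothetical:

```shell
# expected_primary PAIRS: primary records expected for paired-end input.
expected_primary() {
    echo $(( 2 * $1 ))
}

# observed_primary TOTAL SECONDARY SUPPLEMENTARY: primary records in the SAM,
# computed from the corresponding samtools flagstat lines.
observed_primary() {
    echo $(( $1 - $2 - $3 ))
}

# sam_looks_complete PAIRS TOTAL SECONDARY SUPPLEMENTARY:
# succeeds only when observed and expected primary counts match.
sam_looks_complete() {
    [ "$(observed_primary "$2" "$3" "$4")" -eq "$(expected_primary "$1")" ]
}
```

With the lane 2 numbers, sam_looks_complete 56174670 46009673 67271 0 fails for the truncated file (45942402 primary records against 112349340 expected), while sam_looks_complete 56174670 112514714 165374 0 succeeds for the rerun.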

modified 13 months ago by RamRS24k • written 4.3 years ago by ravi.uhdnis170
1

More recent versions of BWA print total CPU and wall-clock time at the end of the run. If you don't see it, the output is very likely to be incomplete.
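If stderr was saved to a file during the run, this check can be scripted. A minimal sketch; the "Real time:" pattern is an assumption about the wording of that closing line and may need adjusting for your BWA version, and the function name is made up for illustration:

```shell
# bwa_log_complete LOGFILE: succeed if the saved stderr log contains the
# closing timing line. The 'Real time:' pattern is an assumed wording.
bwa_log_complete() {
    grep -q 'Real time:' "$1"
}
```

A wrapper script could then call bwa_log_complete on the saved log and refuse to continue to downstream steps when it fails.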

written 4.3 years ago by lh331k

Yes, I noticed it in a few runs. Thank you for your comment, I appreciate it. Best wishes, Ravi.

written 4.3 years ago by ravi.uhdnis170

I ran into the same issue. How did you deal with this error? Is there not enough memory, is the reference incomplete, or is there some other reason? Thanks!

written 20 months ago by whaiyu0620

Hi,

I have this problem too. Did you find a solution?

Best Regards, Mohammad

written 17 months ago by modarzi80
6
4.3 years ago by
France/Nantes/Institut du Thorax - INSERM UMR1087
Pierre Lindenbaum123k wrote:

These are just log messages, not errors.

modified 4.3 years ago • written 4.3 years ago by Pierre Lindenbaum123k

Alright, but I don't understand why the final file is so much smaller. Thanks for the comment.
 

written 4.3 years ago by ravi.uhdnis170
1

We have no clue what you're comparing this to, so no one can give you an answer.

written 4.3 years ago by Devon Ryan92k

Is there a command to not print these log messages in the SAM file when aligning?

written 24 months ago by elisa_peripolli0

You can redirect stderr to /dev/null:

bwa mem ... > foo.sam 2> /dev/null

Granted, then you won't see any actual error messages...
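One way around that trade-off is to send stderr to a log file instead of /dev/null, so the SAM stays clean and any real errors can still be read later. The sketch below illustrates only the redirection; fake_aligner is a hypothetical stand-in for bwa, and run_with_log is a made-up helper name:

```shell
# Hypothetical stand-in for an aligner: one SAM-like record on stdout,
# status and error lines on stderr.
fake_aligner() {
    printf 'read1\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII\n'
    echo '[M::process] read 1 sequences (4 bp)...' >&2
    echo '[E::fake_aligner] example error message' >&2
}

# run_with_log OUT LOG CMD [ARGS...]: stdout goes to OUT, stderr to LOG.
run_with_log() {
    out=$1
    log=$2
    shift 2
    "$@" > "$out" 2> "$log"
}
```

With bwa itself, the equivalent would be the command above with a file name in place of /dev/null, for example 2> foo.log (the log name is arbitrary).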

written 13 months ago by Devon Ryan92k

Or simply put an if (bwa_verbose >= 3) in front of every printf in the bwamem_pair.c source code and recompile, then use -v 2 in the command. I prefer that solution, as it keeps bwa from being so talky but still prints errors. I have a version of bwamem_pair.c with these modifications if you want it.

modified 13 months ago • written 13 months ago by ATpoint24k