BWA MEM 0.7.12 error: [gzread] <fd:4>: invalid distance code
8.4 years ago
gsr9999 ▴ 300

Dear BioStars Leaders,

I have been running BWA MEM 0.7.12 on a handful of paired-end FASTQ files, and it has worked well. I have also included BWA in a bash- and Python-based pipeline.

When I run BWA on the paired-end FASTQ files for one specific lane of one specific sample, I get the following error, and I am wondering what the issue might be.

[gzread] <fd:4>: invalid distance code

Here is the command that I ran on a Linux server:

$bwa mem \
  -t 18 \
  -M \
  -R "@RG\tID:development_run_070_WES-VAL3_L002\tSM:Sample_13016\tPL:IlluminaNextSeq500\tLB:Lib1\tPU:Unit1" \
  /home/hg19/ucsc.hg19.fasta S4_L002_R1_001.fastq.gz S4_L002_R2_001.fastq.gz > bwaAlignReads.sam 2> bwa.stderr.log

I am curious to hear whether others have seen a similar error, and it would be great if anyone could suggest possible solutions.

Here are the last 30 lines of the bwa mem stderr log that I saved:

[M::process] read 1314136 sequences (180000260 bp)...
[M::mem_pestat] # candidate unique pairs for (FF, FR, RF, RR): (2, 531348, 12, 3)
[M::mem_pestat] skip orientation FF as there are not enough pairs
[M::mem_pestat] analyzing insert size distribution for orientation FR...
[M::mem_pestat] (25, 50, 75) percentile: (134, 177, 227)
[M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 413)
[M::mem_pestat] mean and std.dev: (182.75, 71.19)
[M::mem_pestat] low and high boundaries for proper pairs: (1, 506)
[M::mem_pestat] analyzing insert size distribution for orientation RF...
[M::mem_pestat] (25, 50, 75) percentile: (125, 227, 409)
[M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 977)
[M::mem_pestat] mean and std.dev: (210.27, 167.43)
[M::mem_pestat] low and high boundaries for proper pairs: (1, 1261)
[M::mem_pestat] skip orientation RR as there are not enough pairs
[M::mem_pestat] skip orientation RF
[M::mem_process_seqs] Processed 1309276 reads in 506.988 CPU sec, 28.235 real sec
[W::bseq_read] the 2nd file has fewer sequences.
[M::process] read 774742 sequences (106186594 bp)...
[M::mem_pestat] # candidate unique pairs for (FF, FR, RF, RR): (2, 533411, 8, 4)
[M::mem_pestat] skip orientation FF as there are not enough pairs
[M::mem_pestat] analyzing insert size distribution for orientation FR...
[M::mem_pestat] (25, 50, 75) percentile: (134, 177, 226)
[M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 410)
[M::mem_pestat] mean and std.dev: (182.10, 71.36)
[M::mem_pestat] low and high boundaries for proper pairs: (1, 502)
[M::mem_pestat] skip orientation RF as there are not enough pairs
[M::mem_pestat] skip orientation RR as there are not enough pairs
[M::mem_process_seqs] Processed 1314136 reads in 497.150 CPU sec, 27.629 real sec
[gzread] <fd:4>: invalid distance code
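The `[W::bseq_read] the 2nd file has fewer sequences` warning just before the gzread error suggests that the R2 stream ended earlier than R1. One way to check this is to count the records that can actually be decompressed from each mate file. The following is only a minimal sketch: the function names `count_fastq_records` and `compare_mate_counts` are hypothetical, and it assumes standard 4-line FASTQ records.

```shell
# Hypothetical helper: count the 4-line FASTQ records that can be decompressed
# from a .gz file. gzip errors are silenced, so a damaged file yields roughly
# the number of records readable before the damage.
count_fastq_records() {
    gzip -dc "$1" 2>/dev/null | awk 'END { print int(NR / 4) }'
}

# Hypothetical helper: print both counts and return non-zero if they differ.
compare_mate_counts() {
    local n1 n2
    n1=$(count_fastq_records "$1")
    n2=$(count_fastq_records "$2")
    printf 'R1\t%s\nR2\t%s\n' "$n1" "$n2"
    [ "$n1" -eq "$n2" ]
}
```

For the files in this post, the call would be `compare_mate_counts S4_L002_R1_001.fastq.gz S4_L002_R2_001.fastq.gz`. A mismatch only shows that the two streams differ in length; it does not say why.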

Thanks

8.4 years ago

Can you please run:

gzip --test S4_L002_R1_001.fastq.gz S4_L002_R2_001.fastq.gz
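Since BWA is part of a pipeline here, the same check could be run on every input before alignment starts. The following is a minimal sketch, not a definitive implementation: the function name `check_gzip_integrity` is hypothetical, and the only real tool it relies on is `gzip --test` as used above.

```shell
# Hypothetical helper (not part of bwa or gzip): test every file passed as an
# argument and report which ones fail decompression.
check_gzip_integrity() {
    local failed=0
    local f
    for f in "$@"; do
        # gzip --test exits non-zero when the compressed data is damaged.
        if gzip --test "$f" 2>/dev/null; then
            printf 'OK\t%s\n' "$f"
        else
            printf 'CORRUPT\t%s\n' "$f"
            failed=1
        fi
    done
    # Return 1 if at least one file failed, so a pipeline can stop early.
    return "$failed"
}
```

A pipeline step could then call `check_gzip_integrity *.fastq.gz` and skip alignment when it returns non-zero.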

Pierre,

Thank you for the quick reply, and for suggesting gzip --test.

S4_L002_R2_001.fastq.gz failed the gzip test. I also looked at my FastQC results and realized that this file failed the FastQC check as well. I started my analysis from the FASTQ files, and the Lane 2 R2 file seems to be the problem. It looks like I need to regenerate the FASTQ files for this sample from the BCL files using bcl2fastq2. Please let me know if there is an alternative solution. Thank you again.

*************************************************************************************
***                                     gzip test                                                            ***
*************************************************************************************

[sgr@genlabcs 0200_FASTQ]$
[sgr@genlabcs 0200_FASTQ]$ gzip --test S4_L002_R1_001.fastq.gz
[sgr@genlabcs 0200_FASTQ]$
[sgr@genlabcs 0200_FASTQ]$ gzip --test S4_L002_R1_001.fastq.gz S4_L002_R2_001.fastq.gz
gzip: S4_L002_R2_001.fastq.gz: invalid compressed data--format violated
[sgr@genlabcs 0200_FASTQ]$
[sgr@genlabcs 0200_FASTQ]$
*************************************************************************************
*************************************************************************************
*************************************************************************************
***                                     fastQC log file                                                   ***
*************************************************************************************
Started analysis of S4_L002_R2_001.fastq.gz
Approx 5% complete for S4_L002_R2_001.fastq.gz
Approx 10% complete for S4_L002_R2_001.fastq.gz
Approx 15% complete for S4_L002_R2_001.fastq.gz
Approx 20% complete for S4_L002_R2_001.fastq.gz
Approx 25% complete for S4_L002_R2_001.fastq.gz
Approx 30% complete for S4_L002_R2_001.fastq.gz
Approx 35% complete for S4_L002_R2_001.fastq.gz
Approx 40% complete for S4_L002_R2_001.fastq.gz
Approx 45% complete for S4_L002_R2_001.fastq.gz
Approx 50% complete for S4_L002_R2_001.fastq.gz
Approx 55% complete for S4_L002_R2_001.fastq.gz
Approx 60% complete for S4_L002_R2_001.fastq.gz
Approx 65% complete for S4_L002_R2_001.fastq.gz
Approx 70% complete for S4_L002_R2_001.fastq.gz
Approx 75% complete for S4_L002_R2_001.fastq.gz
Failed to process file S4_L002_R2_001.fastq.gz
uk.ac.babraham.FastQC.Sequence.SequenceFormatException: ID line didn't start with '@'
        at uk.ac.babraham.FastQC.Sequence.FastQFile.readNext(FastQFile.java:158)
        at uk.ac.babraham.FastQC.Sequence.FastQFile.next(FastQFile.java:125)
        at uk.ac.babraham.FastQC.Analysis.AnalysisRunner.run(AnalysisRunner.java:76)
        at java.lang.Thread.run(Thread.java:745)
*************************************************************************************
*************************************************************************************
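The FastQC exception above (`ID line didn't start with '@'`) can be reproduced outside FastQC by scanning the decompressed stream for the first record that breaks the FASTQ layout. This is a rough sketch under stated assumptions: the function name `first_bad_fastq_record` is hypothetical, and it assumes unwrapped 4-line records (header, sequence, `+` separator, qualities).

```shell
# Hypothetical helper: read FASTQ text on stdin and report the first line that
# breaks the 4-line record layout. Exits 0 if nothing is wrong, 1 otherwise.
first_bad_fastq_record() {
    awk '
        BEGIN { bad = 0 }
        # Line 1 of every record must be a header starting with "@".
        NR % 4 == 1 && substr($0, 1, 1) != "@" {
            printf "line %d: header does not start with @\n", NR
            bad = 1
            exit
        }
        # Line 3 of every record must be the "+" separator.
        NR % 4 == 3 && substr($0, 1, 1) != "+" {
            printf "line %d: separator does not start with +\n", NR
            bad = 1
            exit
        }
        END {
            # A line count that is not a multiple of 4 means a truncated record.
            if (!bad && NR % 4 != 0) {
                printf "line %d: file ends mid-record\n", NR
                bad = 1
            }
            exit bad
        }
    '
}
```

For the file in this thread, the call would be `gzip -dc S4_L002_R2_001.fastq.gz 2>/dev/null | first_bad_fastq_record`. The reported line number only indicates where the decompressed text stops looking like FASTQ; it does not repair the file.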