Paired FASTQ files do not show equal numbers of reads
7.4 years ago
K.Nijbroek ▴ 100

Hi,

I have a sample consisting of two mate-paired FASTQ files. I ran Trimmomatic in paired-end mode on both files and obtained forward_accepted.fq and reverse_accepted.fq. When I then aligned the two files with Bowtie2 in forward-reverse orientation, it reported an inconsistency between the number of reads in forward_accepted.fq and in reverse_accepted.fq.

Is there an easy way to fix this, such as a Picard/GATK/samtools command that can delete the incorrect read pairs or continue with only the correct ones? The best solution for me would take the 2 incorrect FASTQ files as input and output 2 correct FASTQ files.

mate-pairs Paired FASTQ • 9.0k views

Did you check gunzip -c your.fastq.gz | wc -l for all your files?
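
The same check can be sketched in Python. This is only an illustration: it assumes the common layout of exactly four lines per record (no wrapped sequences), and the names open_fastq and count_records are made up for this example rather than taken from any tool in this thread.

```python
import gzip


def open_fastq(path):
    """Open a FASTQ file as text, handling .gz files transparently."""
    if str(path).endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "r")


def count_records(lines):
    """Return the number of records in an iterable of FASTQ lines.

    Assumes exactly 4 lines per record; raises ValueError when the
    line count is not a multiple of 4, which hints at truncation.
    """
    n_lines = sum(1 for _ in lines)
    if n_lines % 4 != 0:
        raise ValueError(f"{n_lines} lines is not a multiple of 4")
    return n_lines // 4


# Usage (file names as in the question):
# with open_fastq("forward_accepted.fq") as fh:
#     print(count_records(fh))
```

Equal counts for both files only show that the files contain the same number of records, not that the records are in the same order.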


Both FASTQ files have exactly the same number of lines. It seems Bowtie2 is doing something odd, then?


Are there any empty FASTQ records, i.e. with length(seq) == 0?
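
One way to look for such records is sketched below in Python. It assumes four lines per record, and empty_records is a hypothetical helper name used only for this example.

```python
def empty_records(lines):
    """Yield (record_number, header) for records with an empty sequence line.

    Assumes exactly 4 lines per record; record numbers start at 1.
    """
    header = None
    for i, line in enumerate(lines):
        if i % 4 == 0:
            # First line of a record: remember the header for reporting.
            header = line.rstrip("\r\n")
        elif i % 4 == 1 and not line.strip():
            # Second line of a record is the sequence; report it if empty.
            yield i // 4 + 1, header


# Usage (file name as in the question):
# with open("forward_accepted.fq") as fh:
#     for rec_no, header in empty_records(fh):
#         print(rec_no, header)
```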


Bowtie2: 'Error, fewer reads in file specified with -2 than in file specified with -1'. Both FASTQ files have exactly the same number of lines, and no sequence has length 0 (I require minlen 50). I can't figure out why all the other samples prepared with the same processing steps work while this one doesn't...


I wonder if one of the lines got misformatted such that Bowtie2 thinks the file ends earlier. Perhaps try using -u to limit how many reads are aligned and see how far it gets before terminating.
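
A misformatted record can also be searched for directly. The Python sketch below assumes four lines per record and checks only the basic structure ('@' header, '+' separator, matching sequence and quality lengths); first_malformed_record is a hypothetical name, and this is not how Bowtie2 itself parses its input.

```python
from itertools import zip_longest


def first_malformed_record(lines):
    """Return (record_number, reason) for the first bad record, else None.

    Assumes exactly 4 lines per record; record numbers start at 1.
    """
    it = iter(lines)
    # Pull four consecutive lines at a time from the same iterator;
    # missing lines at the end of the input are filled with None.
    records = zip_longest(it, it, it, it)
    for rec_no, (header, seq, plus, qual) in enumerate(records, start=1):
        if qual is None:
            return rec_no, "record has fewer than 4 lines"
        if not header.startswith("@"):
            return rec_no, "header line does not start with '@'"
        if not plus.startswith("+"):
            return rec_no, "separator line does not start with '+'"
        if len(seq.rstrip("\r\n")) != len(qual.rstrip("\r\n")):
            return rec_no, "sequence and quality lengths differ"
    return None


# Usage (file name as in the question):
# with open("reverse_accepted.fq") as fh:
#     print(first_malformed_record(fh))
```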


I will do that. I've also run Bowtie2 on the raw FASTQ files (before trimming), and that works fine. It seems like the trimming is doing something wrong... although the trimmed files have the same number of lines...
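
Since equal line counts do not rule out reads being out of order, the read names can be compared pair by pair. The Python sketch below assumes four lines per record and that mates share a name once an optional trailing /1 or /2 and anything after the first whitespace are dropped; read_name and first_name_mismatch are hypothetical helper names.

```python
from itertools import islice, zip_longest


def read_name(header):
    """Reduce a FASTQ header line to a bare read name.

    Drops the leading '@', any comment after the first whitespace,
    and a trailing '/1' or '/2' mate suffix (an assumed convention).
    """
    parts = header[1:].split()
    name = parts[0] if parts else ""
    if name.endswith(("/1", "/2")):
        name = name[:-2]
    return name


def first_name_mismatch(lines1, lines2):
    """Return (record_number, name1, name2) for the first pair whose
    names differ, or None when both inputs are in sync.

    A name of None means that input ran out of records first.
    """
    headers1 = islice(lines1, 0, None, 4)  # every 4th line is a header
    headers2 = islice(lines2, 0, None, 4)
    pairs = zip_longest(headers1, headers2)
    for rec_no, (h1, h2) in enumerate(pairs, start=1):
        n1 = read_name(h1) if h1 is not None else None
        n2 = read_name(h2) if h2 is not None else None
        if n1 != n2:
            return rec_no, n1, n2
    return None


# Usage (file names as in the question):
# with open("forward_accepted.fq") as f1, open("reverse_accepted.fq") as f2:
#     print(first_name_mismatch(f1, f2))
```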


I did another run on the trimmed FASTQ files and somehow everything went well this time. Now I'm completely flabbergasted...

7.4 years ago

It's likely that you trimmed the files separately, rather than as a pair. Trimmomatic can handle the files as a pair, so just have it do so.

Note that there are also threads on resyncing the FASTQ files if you absolutely insist on going that route. "How To Sort Two Mate Pair (Fastq) Files So That The Order Of The Identifiers Is The Same?" and "Combining The Paired Reads From Illumina Run" are two go-to references for this.
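
For illustration, resyncing can be sketched in Python as follows. This is not the method from the linked threads. It assumes four lines per record, mate names that match after dropping a trailing /1 or /2, and files small enough that the second one fits in memory. The helper names read_name, parse_records and resync are hypothetical.

```python
def read_name(header):
    """Bare read name: no '@', no comment, no trailing /1 or /2."""
    parts = header[1:].split()
    name = parts[0] if parts else ""
    if name.endswith(("/1", "/2")):
        name = name[:-2]
    return name


def parse_records(lines):
    """Yield (name, four_line_record) tuples; incomplete tails are dropped."""
    it = iter(lines)
    for rec in zip(it, it, it, it):
        yield read_name(rec[0]), rec


def resync(lines1, lines2):
    """Return two lists of lines holding only reads present in both inputs.

    Output order follows the first input, so the two lists are in sync.
    The whole second input is held in memory (assumption: it fits).
    """
    mates = {name: rec for name, rec in parse_records(lines2)}
    out1, out2 = [], []
    for name, rec in parse_records(lines1):
        mate = mates.get(name)
        if mate is not None:
            out1.extend(rec)
            out2.extend(mate)
    return out1, out2


# Usage (input names as in the question, output names are placeholders):
# with open("forward_accepted.fq") as f1, open("reverse_accepted.fq") as f2:
#     out1, out2 = resync(f1, f2)
# with open("forward_synced.fq", "w") as o1, open("reverse_synced.fq", "w") as o2:
#     o1.writelines(out1)
#     o2.writelines(out2)
```

Reads without a mate are simply discarded here; keeping them in a separate singletons file would be a straightforward extension.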


I didn't trim the files separately; I used paired-end mode. That's why I find it so strange that the FASTQ files are inconsistent in this case, because it works for other samples.


That's quite odd then. If you can make a reproducible example of that, it would be helpful to submit a bug report to the Trimmomatic authors. In any case, the two threads that I linked to provide some ways to recover from this.
