Trimmomatic trims most of the reverse strand
19 months ago
hkarakurt ▴ 150

Hello everyone, I have paired-end data. It comes from a small region of the human genome, not WGS or WES. I use BWA for alignment and then do variant calling. Since I was getting some false positives, I wanted to trim the reads.

I used trimmomatic with command:

trimmomatic PE -phred33 -threads 2 read1.fastq.gz read2.fastq.gz read1_trimmed.fastq read2_trimmed.fastq LEADING:3 TRAILING:3 SLIDINGWINDOW:4:15

I do not know why but I have this output:

Input Read Pairs: 1834513 Both Surviving: 1822034 (99.32%) Forward Only Surviving: 5740 (0.31%) Reverse Only Surviving: 417 (0.02%) Dropped: 6322 (0.34%)
TrimmomaticPE: Completed successfully

I do not have any adapters since they are removed in the bcl2fastq step. In FASTQ quality control, the forward and reverse reads look similar in quality, but Trimmomatic trimmed many reads in the reverse file. I cannot align these files; BWA gives this warning:

[bseq_read] the 2nd file has fewer sequences.
[process] read 155012 sequences (16693324 bp)...
[bseq_read] the 2nd file has fewer sequences.

Has anyone faced a problem like this before?

Thank you in advance

fastq trim trimmomatic bwa • 623 views
19 months ago
GenoMax 115k

It looks like you did not specify output files for the unpaired reads from Read 1 and Read 2, so those reads likely ended up in your trimmed files. That is probably why your trimmed files are no longer in sync. Either repeat the trimming with

trimmomatic PE -phred33 -threads 2 read1.fastq.gz read2.fastq.gz read1_trimmed.fastq read1_unpaired.fastq read2_trimmed.fastq read2_unpaired.fastq LEADING:3 TRAILING:3 SLIDINGWINDOW:4:15

or use a tool from the BBMap suite to re-sync your trimmed data files, separating out the orphan reads.
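
Before re-running BWA, it may help to confirm whether the two trimmed files really are out of sync. The following is only a rough sketch, assuming uncompressed four-line FASTQ records and read IDs that match between mates up to the first space (or an optional trailing /1 or /2). The function names count_reads, read_ids and check_pairs are made up for illustration and are not part of Trimmomatic, BWA or BBMap.

```shell
# Hypothetical helpers for a quick pairing sanity check (POSIX sh + awk).

# Count records in an uncompressed FASTQ file (4 lines per record).
count_reads() {
    awk 'END { print NR / 4 }' "$1"
}

# Print one read ID per record: the header up to the first whitespace,
# with a trailing /1 or /2 removed if present.
read_ids() {
    awk 'NR % 4 == 1 { id = $1; sub(/\/[12]$/, "", id); print id }' "$1"
}

# Report whether two FASTQ files list the same read IDs in the same order.
check_pairs() {
    ids1=$(mktemp)
    ids2=$(mktemp)
    read_ids "$1" > "$ids1"
    read_ids "$2" > "$ids2"
    if cmp -s "$ids1" "$ids2"; then
        echo "in sync"
    else
        echo "out of sync"
    fi
    rm -f "$ids1" "$ids2"
}

# Example usage (file names taken from the command above):
#   count_reads read1_trimmed.fastq
#   count_reads read2_trimmed.fastq
#   check_pairs read1_trimmed.fastq read2_trimmed.fastq
```

If the two counts differ, or check_pairs reports "out of sync", the files should not be passed to BWA as a pair until they are re-trimmed or re-synced.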

Edit: The title of the thread is not accurate, since the stats show a high percentage of pairs with both reads surviving.
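
As a quick check of that point, the percentages can be recomputed from the counts in the Trimmomatic summary posted in the question. This is just an illustrative sketch; pct is a made-up helper name.

```shell
# pct is a hypothetical helper: prints part/total as a percentage
# rounded to two decimals.
pct() {
    awk -v part="$1" -v total="$2" 'BEGIN { printf "%.2f\n", 100 * part / total }'
}

# Counts copied from the Trimmomatic summary in the question.
total=1834513
pct 1822034 "$total"   # both surviving    -> 99.32
pct 5740 "$total"      # forward only      -> 0.31
pct 417 "$total"       # reverse only      -> 0.02
pct 6322 "$total"      # dropped           -> 0.34
```

The four counts add up to the 1834513 input pairs, and only about 0.7% of pairs lost one or both reads, so the reverse reads were not trimmed away in bulk.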


