3.0 years ago
tomas4482
I used fastp and FastQC for quality control and trimming of some concatenated fq files, and I found something odd.
In the per-base sequence content plot, the G/C and A/T lines cross near the tail, at around 90 bp.
At first I thought trimming the tail would solve this problem, but it did not. After trimming the front 15 bp and the tail with fastp -f 15 -t 15 (I also tried 10 bp for the tail), the base content lines still crossed, now at around 60 bp.
This occurs in only a few of the concatenated fq files; the others are fine, so concatenation does not seem to be the cause. Does anyone know what is happening here?
Thanks.
For the first 15 bases, have you seen: https://sequencing.qcfail.com/articles/positional-sequence-bias-in-random-primed-libraries/
I've read this document before. I don't think the front bias is problematic.
The real problems are: 1. I don't understand why and how the base content changes so much at the tail (while the G:C and A:T ratios remain normal). 2. No matter how much I trim from the tail (I've tried 10 bp and 15 bp), the bias is not removed. After filtering, it moves to an upstream position rather than disappearing.
What does it mean? Do you have any idea?
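To see where the lines cross independently of the FastQC plot, the per-position base fractions can be tabulated directly from the FASTQ file before and after trimming. The following is only a minimal sketch, not how fastp or FastQC compute their plots: the helper names `open_fastq` and `base_content` are made up for illustration, and it assumes plain 4-line FASTQ records (no wrapped sequence lines). It also reports how many reads cover each position, since positions near the tail may be covered by fewer reads when read lengths differ.

```python
import gzip
from collections import Counter


def open_fastq(path):
    """Open a FASTQ file as text, handling gzip by file extension (hypothetical helper)."""
    return gzip.open(path, "rt") if path.endswith(".gz") else open(path)


def base_content(lines):
    """Return a list with one (depth, fractions) tuple per read position.

    `lines` is any iterable of FASTQ lines; 4-line records are assumed.
    `depth` is the number of reads that reach that position and
    `fractions` maps each of A, C, G, T, N to its share at that position.
    """
    counts = []  # counts[pos] is a Counter of bases seen at that position
    for i, line in enumerate(lines):
        if i % 4 != 1:  # the sequence is the second line of each record
            continue
        seq = line.strip().upper()
        while len(counts) < len(seq):
            counts.append(Counter())
        for pos, base in enumerate(seq):
            counts[pos][base] += 1

    result = []
    for c in counts:
        depth = sum(c.values())
        result.append((depth, {b: c[b] / depth for b in "ACGTN"}))
    return result
```

For example, iterating over `base_content(open_fastq("sample.fq.gz"))` (the file name is a placeholder) and printing the tuples for the raw and the trimmed file side by side would show whether the crossing point stays at a fixed distance from the 3' end of the reads or at a fixed distance from the 5' end.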
What kind of data is this? Have you tried to align it? That pattern may be a result of the library prep method.