GC/AT base content lines cross at the tail of reads

3.0 years ago

tomas4482 ▴ 430

I used fastp and fastqc for quality control and trimming of some concatenated fq files. But I found something weird.

The G/C and A/T base content lines cross near the tail of the reads, at around 90bp.

(Figure: crossed tail)

At first I thought trimming the tail would solve this problem, but it did not. When I trim the front 15bp and the tail 10bp using fastp -f 15 -t 15 , the base content lines still cross, now at around 60bp.

(Figure: after trimming)

This occurs in only a few of the concatenated fq files. The others are fine, so concatenation itself does not seem to be the cause. Does anyone know what is happening here?
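For comparing an affected file against an unaffected one outside of fastqc, the per-position base content can be recomputed directly. The following is only a minimal sketch, assuming plain 4-line FASTQ records; the function names `read_fastq_sequences` and `per_position_fractions` are made up for illustration and are not part of fastp or fastqc.

```python
from collections import Counter
import io


def read_fastq_sequences(handle):
    """Yield the sequence line of each record, assuming 4 lines per record."""
    for i, line in enumerate(handle):
        if i % 4 == 1:
            yield line.strip().upper()


def per_position_fractions(seqs):
    """Return one dict per 0-based read position with A/C/G/T/N fractions."""
    counts = []
    for seq in seqs:
        # Grow the list so that reads of different lengths are handled.
        while len(counts) < len(seq):
            counts.append(Counter())
        for pos, base in enumerate(seq):
            counts[pos][base] += 1
    fractions = []
    for c in counts:
        total = sum(c.values())
        fractions.append({b: c[b] / total for b in "ACGTN"})
    return fractions


# Tiny in-memory example (hypothetical reads, not real data).
demo = "@r1\nACGT\n+\nIIII\n@r2\nAGGT\n+\nIIII\n"
fractions = per_position_fractions(read_fastq_sequences(io.StringIO(demo)))
print(fractions[1])
```

For a real file, the same functions could be fed an opened file handle (for gzipped input, one opened with `gzip.open(path, "rt")`), and the resulting fractions plotted per position for each file.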

Thanks.

DNA sequence WGS WES • 857 views

ADD COMMENT • link updated 3.0 years ago by GenoMax 147k • written 3.0 years ago by tomas4482 ▴ 430

For the first 15 bases, have you seen: https://sequencing.qcfail.com/articles/positional-sequence-bias-in-random-primed-libraries/

ADD REPLY • link 3.0 years ago by GenoMax 147k

I've read this document before. I don't think the front bias is problematic.

The real problems are:

1. I don't understand why and how the base content changes so much at the tail (while the G:C and A:T ratios remain normal).
2. No matter how much I trim from the tail (I've tried 10bp and 15bp), the bias is not removed. After trimming, it moves to an upstream position rather than disappearing.

What does it mean? Do you have any idea?
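One way to look into the second point is to check the read length distribution and to measure GC content by distance from the 3' end instead of from the 5' end. This is only a rough sketch under the assumption that the reads are available as plain strings; `length_histogram` and `gc_fraction_from_3prime` are hypothetical helper names, and the result is a diagnostic view rather than an explanation of the bias.

```python
from collections import Counter


def length_histogram(seqs):
    """Count how many reads have each length."""
    return Counter(len(s) for s in seqs)


def gc_fraction_from_3prime(seqs):
    """GC fraction by distance from the 3' end (index 0 is the last base)."""
    gc = []
    total = []
    for seq in seqs:
        for dist, base in enumerate(reversed(seq.upper())):
            # Extend the counters the first time this distance is seen.
            if dist == len(total):
                gc.append(0)
                total.append(0)
            total[dist] += 1
            if base in "GC":
                gc[dist] += 1
    return [g / t for g, t in zip(gc, total)]


# Hypothetical reads of mixed length, for illustration only.
reads = ["ACGT", "GG"]
print(length_histogram(reads))
print(gc_fraction_from_3prime(reads))
```

If the length histogram shows more than one read length in the affected files, that would be worth mentioning when describing the data.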

ADD REPLY • link 3.0 years ago by tomas4482 ▴ 430

What kind of data is this? Have you tried aligning it? That pattern may be a result of the library prep method.

ADD REPLY • link 3.0 years ago by GenoMax 147k
