Question: Why does the FastQC sequence length distribution fail after adapter trimming?
newbie wrote, 5 months ago:

Dear all,

I have downloaded some already published raw data (fastqs). Initially, I did QC and found adapter content in both forward and reverse reads.

Below you can see the fastqc details before adapter trimming of both forward and reverse reads:

[image: FastQC report before adapter trimming]

To remove the adapter content I used cutadapt like below:

cutadapt -a AGATCGGAAGAGCACACGTCTGAACTCCAGTCA -A AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGT -o tr_sample_R1.fastq.gz -p tr_sample_R2.fastq.gz sample_R1.fastq.gz sample_R2.fastq.gz

After adapter trimming, the FastQC report looks like this:

[image: FastQC report after adapter trimming]

So, I have some questions:

1) Before adapter trimming, the sequence length distribution looked fine, but after trimming FastQC flags it as failed. Why is that?

2) I see some bias in the first 10-15 bases. What should I do about it? Is it really a problem?

3) Why does the GC content have multiple peaks?

Could you please help me understand these points? Thanks in advance.

swbarnes2 (United States) wrote, 5 months ago:

I don't think any of this is a problem. You didn't really even have to trim adapters.

Sam wrote, 5 months ago:

You can read here about the bias in the first bases.

As to the sequence length distribution, just think of what cutting adapters means... Are reads expected to be of the same length, once you cut adapters, or not?
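
To make that concrete, here is a toy sketch in Python (not cutadapt's actual algorithm: it only does exact matching, and the insert sequences and the 50-cycle read length are made up for illustration). Reads whose insert is shorter than the read length run into the adapter, so trimming shortens them by different amounts, while reads with long inserts stay at full length.

```python
from collections import Counter

# R1 adapter from the cutadapt command in the question, and its first 13 bases.
FULL_ADAPTER = "AGATCGGAAGAGCACACGTCTGAACTCCAGTCA"
ADAPTER_PREFIX = FULL_ADAPTER[:13]


def make_read(insert: str, read_length: int) -> str:
    """Simulate a fixed-length read: insert, then adapter read-through, then padding."""
    return (insert + FULL_ADAPTER + "A" * read_length)[:read_length]


def trim_3prime_adapter(read: str, adapter: str = ADAPTER_PREFIX) -> str:
    """Cut the read at the first exact occurrence of the adapter.

    Simplification: real trimmers also tolerate mismatches and partial
    adapter matches at the read end; this toy version does not.
    """
    pos = read.find(adapter)
    return read if pos == -1 else read[:pos]


# Hypothetical inserts of 20, 32 and 80 bp, sequenced with 50-cycle reads.
inserts = ["ACGT" * 5, "ACGT" * 8, "ACGT" * 20]
reads = [make_read(ins, 50) for ins in inserts]
trimmed = [trim_3prime_adapter(r) for r in reads]

lengths_before = Counter(len(r) for r in reads)
lengths_after = Counter(len(r) for r in trimmed)
print("before:", dict(lengths_before))
print("after: ", dict(lengths_after))
```

Before trimming every read has the same length; afterwards the lengths follow the insert sizes. FastQC's sequence length module reports exactly this kind of spread, so a warning or failure there is the expected result of trimming rather than a sign that the trimming went wrong.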
