Error in sequence length distribution
3.4 years ago

Hello everyone,

I have a question regarding the sequence length distribution. I used cutadapt to trim the adapters from paired-end reads with the following command:

cutadapt -a AGATCGGAAGAGCACACGTCTGAACTCCAGTCAC -A AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGT -j 6 -o filter/R-2T1_R1_001.filter.fastq.gz -p filter/R-2T1_R2_001.filter.fastq.gz R-2T1_R1_001.fastq.gz R-2T1_R2_001.fastq.gz

However, after adapter removal, FastQC reports an error for the sequence length distribution. There was no error before adapter removal. I would highly appreciate any suggestions on how to resolve this.


Thank you so much

RNA-Seq • 2.4k views

A couple of points:

  1. This should be a Question-type post, not a Tool-type post. The latter is used to introduce new tools, while the former covers all questions about existing tools.
  2. Please see How to add images to a Biostars post to add your images properly - we recommend imgbb as an image hosting service, as it seems to work all over the world and doesn't need a user account to be created.
  3. Please use the formatting bar (especially the code option) to present your post better. You can use backticks for inline code (`text` becomes text), or select a chunk of text and use the highlighted button to format it as a code block. I've done it for you this time.

I have made the changes now; bioinformatics.queries, please stop editing the post. In future posts (no longer in this one), please apply what _r_am listed.

3.4 years ago
ATpoint 81k

If, after trimming, a fraction of reads is shorter than the bulk (which is expected), this warning/error gets triggered. You can probably ignore it. Proceed with downstream analysis; FastQC (as good as it is) is a very basic QC tool, and most of its warnings can be ignored or are not critical. The only things I really look at are adapter contamination and per-base quality, to see whether there is a general error/failure in the sequencing run. I would only come back and look at the other metrics if something odd happens during downstream analysis that needs to be chased down.

Thanks for your response. However, it shows an error rather than a warning for the sequence length distribution. Should I still ignore it? Could you please also suggest how to check what fraction of reads is shorter than the bulk?

Yep. Warning and error are basically the same; the tool has internal thresholds that decide whether it raises a warning or an error. I would just go on with the analysis in your case.
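
To check what fraction of reads is shorter than the bulk, one option is to tabulate the read lengths in the trimmed file yourself. The following is a minimal Python sketch, not part of cutadapt or FastQC. It assumes standard 4-line FASTQ records (no wrapped sequence lines), and the function names `read_length_counts` and `fraction_shorter_than_bulk` are hypothetical helpers made up for this illustration. "Bulk" is taken here to mean the most common read length, which is an assumption.

```python
import gzip
from collections import Counter


def read_length_counts(fastq_path):
    """Count read lengths in a FASTQ file (gzip-compressed if the name ends in .gz).

    Assumes 4-line FASTQ records, i.e. the sequence is not wrapped.
    """
    opener = gzip.open if str(fastq_path).endswith(".gz") else open
    counts = Counter()
    with opener(fastq_path, "rt") as handle:
        for line_number, line in enumerate(handle):
            # The sequence is the second line of every 4-line record.
            if line_number % 4 == 1:
                counts[len(line.rstrip("\r\n"))] += 1
    return counts


def fraction_shorter_than_bulk(counts):
    """Return (most common read length, fraction of reads shorter than it)."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no reads found")
    # Assumption: the "bulk" is the single most common read length.
    bulk_length = max(counts, key=counts.get)
    shorter = sum(n for length, n in counts.items() if length < bulk_length)
    return bulk_length, shorter / total
```

For the files in this thread, this could be called as `fraction_shorter_than_bulk(read_length_counts("filter/R-2T1_R1_001.filter.fastq.gz"))`, and `sorted(counts.items())` gives the full length table if you want to look at the shortest reads directly.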

OK, I just have one last question: I see 40% duplicated sequences. Do the duplicated sequences need to be removed after alignment in an RNA-seq dataset?
