Question: About The Priority Of Trimming
6.0 years ago by
Tonyzeng300
Tonyzeng300 wrote:

Hi, I have a question. My reads have good overall per-base and per-sequence quality, but QC flags three potential problems at the same time: per-base sequence content (the first 10 bp are unbalanced), a 50 bp overrepresented sequence, and bad Kmer content.

So do I need to remove the first 10 bp first and then trim the 50 bp overrepresented sequence?

I tried removing the first 10 unbalanced bases, and after that the QC no longer showed the overrepresented sequence. However, my Kmer content report now looks even messier...

So I switched to trimming the 50 bp overrepresented sequence instead. However, after using cutadapt I ended up with reads of varying length (from 0 to 88 bp). What should I do next? Continue and trim the first 10 bp? Or ...?

trimming • 4.1k views
modified 6.0 years ago by Devon Ryan91k • written 6.0 years ago by Tonyzeng300
6.0 years ago by
Devon Ryan91k
Freiburg, Germany
Devon Ryan91k wrote:

The unbalanced ~10bp at the 5' end of reads that you mention is likely due to "random hexamer priming", which isn't perfectly random. There's no need to trim these bases off; they won't actually bias mapping. Overrepresented Kmer content could also be normal, depending on the Kmer and the type of sequencing that you're doing. In general, try using something like trim_galore, which clips adapters and removes low-quality bases for you.
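To look at the 5' imbalance directly rather than relying on the QC plot, one option is to tabulate base frequencies per position. The following is only a minimal sketch: the function name `per_position_composition` and the toy reads are made up for illustration, and it assumes the read sequences are already loaded as plain strings.

```python
from collections import Counter


def per_position_composition(reads, n_positions=15):
    """Fraction of A/C/G/T at each of the first n_positions of the reads.

    `reads` is any iterable of sequence strings (hypothetical input;
    FASTQ parsing is left out of this sketch).
    """
    counts = [Counter() for _ in range(n_positions)]
    for seq in reads:
        for i, base in enumerate(seq[:n_positions]):
            counts[i][base] += 1

    fractions = []
    for c in counts:
        total = sum(c.values())  # includes N or other symbols, if present
        fractions.append(
            {b: (c[b] / total if total else 0.0) for b in "ACGT"}
        )
    return fractions


# Toy data, invented for the example.
toy_reads = ["ACGTACGTAC", "ACGAACGTTT", "ACTTACGGAC"]
for pos, frac in enumerate(per_position_composition(toy_reads, 5), start=1):
    print(pos, frac)
```

If the skew is confined to roughly the first 10 positions and the fractions level out afterwards, that is consistent with the priming explanation above rather than with adapter contamination.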

written 6.0 years ago by Devon Ryan91k

Hi Devon, I remember you saying that in RNA-Seq we only need to do 'gentle' trimming (e.g. only remove the adapters). But what about de novo assembly from RNA-Seq? After running FastQC, the 'overrepresented sequences' report shows 0.2% TruSeq Adapter, Index 27, and some other sequences (all of those <0.5%). Do I need to (1) trim all of these overrepresented sequences, (2) just remove the adapters, or (3) leave them as they are? Thank you!

modified 3.5 years ago • written 3.5 years ago by super60

You would want to trim all extraneous sequence (as far as you can recognize it) for any type of NGS analysis, more so for any "de novo" work.

modified 3.5 years ago • written 3.5 years ago by genomax70k

For reference-genome-based RNA-Seq, I have read papers saying that we only need to do 'gentle' trimming (i.e. only remove the adapters). But I am not sure whether the same applies to de novo assembly.

modified 3.5 years ago • written 3.5 years ago by super60

I can't find the exact tweet at the moment, but Titus Brown happened to tweet about this recently, and his recommendation for de novo assembly is to trim adapters but nothing else. I tend to go with him on assembly-related questions, since this is really not my forte.

written 3.5 years ago by Devon Ryan91k

Hi Devon, do you have a source for "There's no need to trim these bases off (e.g. the ~10bp at the 5' end), they won't actually bias mapping"?

written 18 months ago by Picasa460
6.0 years ago by
State College, PA, USA
Biomonika (Noolean)3.1k wrote:

Generally speaking, the first thing you need to do is remove adapters. Those might be the 10bp at the 5' end you are speaking about. Then, low-quality regions should be excluded. This might mean that some reads become too short to keep working with, so it might be a good idea to remove all sequences shorter than a certain threshold. After these steps, you are free to go. You should be careful about removing over-represented k-mers, since these might just as well result from the nature of the data (sample).

Also, please include screenshots of the quality reports that concern you.
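The order described above (adapters first, then low-quality bases, then a length cutoff) can be sketched in a few lines of Python. This is a toy illustration, not how cutadapt or trim_galore work internally: adapter matching here is exact-substring only (real tools tolerate mismatches and partial adapters at the read end), the quality trim is a plain 3' threshold scan, and all function names, thresholds, and example records are assumptions made up for the example. Qualities are assumed to be Phred+33 encoded.

```python
def trim_adapter(seq, qual, adapter):
    """Cut the read at the first exact occurrence of the adapter (simplified)."""
    idx = seq.find(adapter)
    if idx == -1:
        return seq, qual
    return seq[:idx], qual[:idx]


def trim_low_quality_3prime(seq, qual, min_q=20, offset=33):
    """Drop bases from the 3' end while their Phred score is below min_q."""
    end = len(seq)
    while end > 0 and ord(qual[end - 1]) - offset < min_q:
        end -= 1
    return seq[:end], qual[:end]


def clean_reads(records, adapter, min_q=20, min_len=20):
    """Apply adapter trimming, quality trimming, then the length filter.

    `records` is an iterable of (name, sequence, quality_string) tuples.
    """
    kept = []
    for name, seq, qual in records:
        seq, qual = trim_adapter(seq, qual, adapter)
        seq, qual = trim_low_quality_3prime(seq, qual, min_q)
        if len(seq) >= min_len:  # discards reads trimmed down to nothing
            kept.append((name, seq, qual))
    return kept


# Toy records, invented for the example; the adapter string is illustrative.
adapter = "AGATCGGAAGAGC"
records = [
    ("r1", "ACGTACGT" + adapter, "I" * 21),   # adapter at the 3' end
    ("r2", "ACGTACGTACGT", "IIIIIIII####"),   # low-quality tail
    ("r3", adapter + "ACGT", "I" * 17),       # read is adapter from base 1
]
for rec in clean_reads(records, adapter, min_q=20, min_len=8):
    print(rec)
```

A read like `r3`, which consists of adapter from its first base, ends up with length 0 after trimming. That is one way reads of 0 bp can appear after adapter removal, and it is why the length filter comes last.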

modified 6.0 years ago • written 6.0 years ago by Biomonika (Noolean)3.1k