About The Priority Of Trimming
10.6 years ago
Tonyzeng ▴ 310

Hi, I have a question. My reads have good overall per-base and per-sequence quality, but QC flags three potential problems at the same time: per-base sequence content (the first 10 bp are unbalanced), a 50 bp overrepresented sequence, and bad Kmer content.

So do I need to remove the first 10 bp first and then trim the 50 bp overrepresented sequence?

I tried removing the first 10 unbalanced bases and found that QC no longer showed the overrepresented sequence. However, my Kmer content report looks even messier...

So I switched to trimming the 50 bp overrepresented sequence instead. However, after running cutadapt, my reads have a variety of lengths (from 0 to 88 bp). What do I need to do next? Continue and trim the first 10 bp? Or ...?

trimming • 6.2k views
10.6 years ago

The unbalanced ~10 bp at the 5' end of the reads that you mention is likely due to "random hexamer priming", which isn't perfectly random. There's no need to trim these bases off; they won't actually bias mapping. Overrepresented Kmer content could also be normal, depending on the Kmer and the type of sequencing that you're doing. In general, try using something like trim_galore, which clips adapters and removes low-quality bases for you.
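
If you want to look at the 5' imbalance yourself, here is a minimal Python sketch of the kind of per-position base tally that a "per base sequence content" plot summarizes. It is not how FastQC is implemented; the function name `base_composition` and the toy reads are hypothetical, made up purely for illustration.

```python
from collections import Counter

def base_composition(reads, n_positions=10):
    """For each of the first n_positions, return the fraction of A/C/G/T.

    Bases other than A/C/G/T (e.g. N) count toward the total at a position
    but are not reported in the output.
    """
    counts = [Counter() for _ in range(n_positions)]
    for read in reads:
        for i, base in enumerate(read[:n_positions]):
            counts[i][base] += 1
    fractions = []
    for c in counts:
        total = sum(c.values())
        fractions.append({b: (c[b] / total if total else 0.0) for b in "ACGT"})
    return fractions

# Hypothetical toy reads, not real data
reads = ["ACGTACGTAC", "ACGTTTGACA", "GCGTACGTAC", "ACTTACGGAC"]
for pos, frac in enumerate(base_composition(reads, n_positions=3), start=1):
    print(pos, frac)
```

With real data you would expect the fractions to be skewed over roughly the first 10 positions and to flatten out afterwards if the cause is priming bias.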

Hi Devon, I remember you saying that in RNA-Seq we only need to do 'gentle' trimming (e.g. only remove the adapters). But what about de novo assembly from RNA-Seq? After running FastQC, the 'overrepresented sequences' module shows 0.2% TruSeq Adapter, Index 27, and some other sequences (all of them <0.5%). Do I need to (1) trim all of these overrepresented sequences, (2) just remove the adapters, or (3) leave them as they are? Thank you!

You would want to trim all extraneous sequence (as far as you can recognize it) for any type of NGS analysis, more so for any "de novo" work.

For reference-genome-based RNA-Seq, I have read some papers saying that we only need to do 'gentle' trimming (i.e. only remove the adapters). But I am not sure whether the same applies to de novo assembly.

I can't find the exact tweet at the moment, but Titus Brown happened to tweet about this recently and his recommendation for de novo assembly is to trim adapters but nothing else. I tend to go with him on assembly related questions, since this is really not my forte.

Hi Devon, do you have a source for "There's no need to trim these bases off (e.g. the ~10bp at the 5' end), they won't actually bias mapping"?

10.6 years ago

Generally speaking, the first thing you need to do is remove adapters. These might be the 10 bp at the 5' end that you are speaking about. Then, low-quality regions should be excluded. This may leave some reads too short to work with, so it might be a good idea to remove all sequences shorter than a certain threshold. After these steps, you are good to go. You should be careful about removing over-represented k-mers, since these might well result from the nature of the data (sample).

Also, please include screenshots of the quality reports of your concern.
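
To make the order of operations described in this answer concrete (adapters first, then low-quality bases, then a length filter), here is a toy Python sketch. It is only an illustration under simplifying assumptions: the adapter is matched exactly at the 3' side, quality trimming just walks back from the 3' end, and qualities are assumed to be Phred+33. Real tools such as cutadapt or trim_galore handle mismatches and partial adapters and use their own trimming algorithms. All function names, thresholds, the toy reads and the example adapter string are made up for this sketch.

```python
def trim_adapter(seq, qual, adapter):
    """Remove a 3' adapter (exact match only) and everything after it."""
    idx = seq.find(adapter)
    if idx != -1:
        return seq[:idx], qual[:idx]
    return seq, qual

def trim_low_quality_3prime(seq, qual, min_q=20, offset=33):
    """Drop bases from the 3' end while their Phred score is below min_q."""
    end = len(seq)
    while end > 0 and ord(qual[end - 1]) - offset < min_q:
        end -= 1
    return seq[:end], qual[:end]

def process(reads, adapter, min_q=20, min_len=20):
    """Apply adapter trimming, then quality trimming, then a length filter."""
    kept = []
    for seq, qual in reads:
        seq, qual = trim_adapter(seq, qual, adapter)
        seq, qual = trim_low_quality_3prime(seq, qual, min_q)
        if len(seq) >= min_len:  # discard reads that became too short
            kept.append((seq, qual))
    return kept

# Example adapter string and toy reads, for illustration only
ADAPTER = "AGATCGGAAGAGC"
reads = [
    ("ACGT" * 6 + ADAPTER, "I" * 37),        # adapter after 24 good bases
    ("ACGTACGT" + ADAPTER, "I" * 21),        # only 8 bases left after trimming
    ("ACGT" * 6, "I" * 20 + "#" * 4),        # 4 low-quality bases at the 3' end
]
kept = process(reads, ADAPTER)
print([len(seq) for seq, _ in kept])
```

The second read is the case that explains lengths "from 0 to 88bp" after adapter removal: when the adapter sits near the start of a read, almost nothing is left, which is why a minimum-length filter comes last.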
