Question: Adapter Trimming Length Cutoff And Quality Trimming Quality Cutoff
Asked 4.9 years ago by epigene (United States):

I want to do some quality control on raw FASTQ files by adapter trimming and quality trimming. What is a good read-length cutoff for keeping a trimmed read? That is, a read is discarded if its length after trimming is below the cutoff.

Similarly, what quality score cutoff should I use to trim off low-quality bases? Is 20 a good start? And is quality trimming a good idea at all? Many aligners already take base quality into account, so I'm not sure which option is better.

Thanks!

Answer written 4.9 years ago by Charles Warden (Duarte, CA):

You can also use FastQC to assess the quality of your reads (for example, to see whether the quality scores drop off significantly and at what position that drop occurs):

http://www.bioinformatics.babraham.ac.uk/projects/fastqc/
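To make the idea of "where the quality drops off" concrete, here is a minimal Python sketch that computes the mean Phred score at each read position and reports the first position whose mean falls below a cutoff. It is not how FastQC itself works internally; the function names are made up for illustration, and it assumes Phred+33 encoded quality strings.

```python
def mean_quality_per_position(quality_strings, offset=33):
    """Return the mean Phred score at each read position.

    quality_strings: iterable of FASTQ quality lines.
    offset: ASCII offset of the encoding (assumed Phred+33 here).
    """
    totals = []  # sum of scores seen at each position
    counts = []  # number of reads long enough to cover each position
    for qual in quality_strings:
        for i, ch in enumerate(qual):
            if i == len(totals):
                totals.append(0)
                counts.append(0)
            totals[i] += ord(ch) - offset
            counts[i] += 1
    return [t / c for t, c in zip(totals, counts)]


def first_position_below(means, cutoff=20):
    """Return the 0-based index of the first mean below cutoff, or None."""
    for i, mean in enumerate(means):
        if mean < cutoff:
            return i
    return None


if __name__ == "__main__":
    quals = ["IIII#", "III##"]  # 'I' is Q40 and '#' is Q2 in Phred+33
    means = mean_quality_per_position(quals)
    print(means)
    print(first_position_below(means, cutoff=20))
```

The reported position is only a rough guide; the per-position box plots from FastQC show the spread of scores as well as the mean.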

You can then carry out the length, quality, etc. trims with the fastx-toolkit:

http://hannonlab.cshl.edu/fastx_toolkit/


I know these tools. Do you have a recommendation for the minimum length of trimmed reads to keep?

(reply written 4.9 years ago by epigene)

If you are asking about the minimum final read length, one option is to keep only uniquely aligned reads, which removes ambiguous reads. Alternatively, I think the first short-read sequences were ~35 bp, so I probably wouldn't go below that.
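As an illustration of a minimum-length filter, the sketch below reads 4-line FASTQ records and keeps only reads of at least 35 bp. The 35 bp default simply mirrors the figure mentioned above and is not a validated recommendation; the function names are hypothetical, and the parser assumes unwrapped 4-line records.

```python
from itertools import islice


def read_fastq(lines):
    """Yield (header, sequence, quality) from 4-line FASTQ records."""
    it = iter(lines)
    while True:
        record = list(islice(it, 4))
        if len(record) < 4:
            return  # end of input (or a truncated final record)
        header, seq, _plus, qual = (line.rstrip("\n") for line in record)
        yield header, seq, qual


def filter_by_length(records, min_length=35):
    """Yield only the records whose sequence has at least min_length bases."""
    for header, seq, qual in records:
        if len(seq) >= min_length:
            yield header, seq, qual


if __name__ == "__main__":
    lines = [
        "@r1\n", "A" * 40 + "\n", "+\n", "I" * 40 + "\n",
        "@r2\n", "ACGT\n", "+\n", "IIII\n",
    ]
    for header, seq, qual in filter_by_length(read_fastq(lines)):
        print(header, len(seq))
```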

If you are asking about how much to trim off the read, I think it will depend upon your own samples.

I don't typically trim any reads when working with reference-based alignments.

For de novo assembly, it can sometimes help to trim based on quality scores (such as a stretch of 20 or 30 nucleotides with >Q20), trim out adapter sequences, and remove mono-nucleotide reads. However, I don't believe I have ever trimmed based on length alone. Perhaps an extra 2-3 nt could help detect small adapter sequences, but I think there are cases where even the steps I listed are not necessary to get good contigs (meaning that no trimming at all might have been OK).
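One possible reading of "a stretch of 20 or 30 nucleotides with >Q20" is to keep the longest run of bases whose scores all exceed Q20 and discard the read if that run is too short. The Python sketch below does this under that assumption. It is not the algorithm of any particular trimming tool, the function names are hypothetical, and it assumes Phred+33 encoding.

```python
def longest_high_quality_run(qual, cutoff=20, offset=33):
    """Return (start, end) of the longest stretch with every score > cutoff.

    end is exclusive; (0, 0) means no base passed the cutoff.
    """
    best_start, best_end = 0, 0
    run_start = None
    for i, ch in enumerate(qual):
        if ord(ch) - offset > cutoff:
            if run_start is None:
                run_start = i  # a new high-quality run begins here
        else:
            if run_start is not None and i - run_start > best_end - best_start:
                best_start, best_end = run_start, i
            run_start = None
    # Handle a run that extends to the end of the read.
    if run_start is not None and len(qual) - run_start > best_end - best_start:
        best_start, best_end = run_start, len(qual)
    return best_start, best_end


def trim_to_high_quality(seq, qual, cutoff=20, min_run=20):
    """Trim a read to its longest high-quality run, or return None if too short."""
    start, end = longest_high_quality_run(qual, cutoff)
    if end - start < min_run:
        return None
    return seq[start:end], qual[start:end]


if __name__ == "__main__":
    seq = "NN" + "A" * 25 + "NNN"
    qual = "##" + "I" * 25 + "###"  # Q2 ends around a Q40 core
    print(trim_to_high_quality(seq, qual))
```

Trimming only from the 3' end is a common alternative, since quality usually degrades toward the end of the read.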

(reply written and modified 4.9 years ago by Charles Warden)

Yeah, I was thinking about the minimum final read length. I guess keeping only unique alignments is the better option here. I think the decision to trim or not comes down to the size of the fragments: trimming matters more for libraries with small fragments.

(reply written 4.9 years ago by epigene)