How can I filter RNA-seq reads to remove reads where more than 50% of the read is low quality (Q<20)?
3.9 years ago
mar77 ▴ 40

I'm trying to filter out any reads that consist of more than 50% low quality reads (Q<20). I've already removed adapter contamination and reads with more than 10% unknown bases (N), both using cutadapt. Can anyone recommend a tool for this filtering step? I've previously seen NGSQC Toolkit recommended, but I believe it is no longer supported and I am having trouble accessing it.

Also, I am using paired-end reads. Do these need to be processed together, and do I need a tool that supports this?

RNA-Seq filtering quality genome • 990 views
3.9 years ago
Papyrus ★ 2.9k

Assuming you meant "consist of more than 50% low quality bases", I would recommend fastp. It is quite easy to use and can discard reads based on the percentage of bases within a read that do not meet a quality threshold (--unqualified_percent_limit and --qualified_quality_phred should be what you're looking for).
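To make the criterion concrete, here is a minimal Python sketch of the check those two options describe. It assumes Phred+33 encoded quality strings (the usual encoding in current Illumina FASTQ files); the function names are hypothetical and this is not fastp's own code, only an illustration of the rule.

```python
def low_quality_percent(quality: str, min_phred: int = 20, offset: int = 33) -> float:
    """Return the percentage of bases whose Phred score is below min_phred.

    Assumes `quality` is the fourth line of a FASTQ record, Phred+33 encoded.
    """
    if not quality:
        return 0.0
    low = sum(1 for ch in quality if ord(ch) - offset < min_phred)
    return 100.0 * low / len(quality)


def passes_filter(quality: str, min_phred: int = 20, max_low_percent: float = 50.0) -> bool:
    """Keep a read unless MORE than max_low_percent of its bases are low quality."""
    return low_quality_percent(quality, min_phred) <= max_low_percent


# 'I' encodes Q40 and '#' encodes Q2 under Phred+33.
print(passes_filter("IIII####"))  # exactly 50% low quality: kept
print(passes_filter("III#####"))  # 62.5% low quality: discarded
```

Note that a read with exactly 50% low-quality bases is kept here, since the question asks to drop reads with more than 50%.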

It handles paired-end reads. Personally, I would do all the filtering with the same tool.
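For paired-end data, a fastp call could be assembled roughly as in the sketch below. The file names are placeholders and the helper function is hypothetical; the thresholds (Q20, 50%) are taken from the question. Keep in mind that fastp also applies some filters of its own by default, so it is worth checking its documentation before running it on data that has already been trimmed.

```python
import shlex


def build_fastp_command(in1, in2, out1, out2, min_phred=20, max_low_percent=50):
    """Build a fastp argument list for paired-end quality filtering.

    Both mates are passed in one call so the output files stay in sync.
    """
    return [
        "fastp",
        "--in1", in1,
        "--in2", in2,
        "--out1", out1,
        "--out2", out2,
        # Bases with a Phred score below this value count as unqualified.
        "--qualified_quality_phred", str(min_phred),
        # Reads with more than this percentage of unqualified bases are dropped.
        "--unqualified_percent_limit", str(max_low_percent),
    ]


# Placeholder file names, not taken from the question.
cmd = build_fastp_command(
    "sample_R1.fastq.gz", "sample_R2.fastq.gz",
    "sample_R1.filtered.fastq.gz", "sample_R2.filtered.fastq.gz",
)
print(shlex.join(cmd))
# To actually run it (requires fastp on PATH):
# import subprocess; subprocess.run(cmd, check=True)
```

Giving both files to a single call is what keeps the pairs together; filtering R1 and R2 separately would leave the two files out of sync.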

(Also, as anticipated by its name, it is quite fast).
