Question: What is the optimal read length filter for Kallisto post adapter removal?
18 months ago by
bipin20 wrote:

I am using 151 bp paired-end RNA-seq reads to study differentially expressed genes between two conditions. The reads are pseudoaligned to a reference transcriptome using Kallisto (index created with the default k-mer size of 31).

However, ~16% of the reads have adapter contamination, with the adapter sequence starting in the middle of the read in some cases. The FastQC adapter content plot looks like this: [plot not included]

I am using Trim Galore to remove the adapter contamination; however, I am unsure what minimum length cutoff to apply after adapter removal to balance preventing multimapping against losing reads.

I tested a 50 bp cutoff, which results in the loss of ~100,000 read pairs (0.5%) and adds/removes ~30 genes from the DESeq2 significant list (~2,400 significant genes in total).
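To put those numbers in proportion, here is a quick back-of-the-envelope calculation. It uses only the figures quoted above; the derived library size is a rough estimate implied by "~100,000 pairs is 0.5%", not a measured value, and the variable names are purely illustrative.

```python
# Figures quoted in the post; everything derived from them is approximate.
lost_pairs = 100_000        # read pairs dropped at the 50 bp cutoff
lost_fraction = 0.005       # 0.5% of all read pairs
changed_genes = 30          # genes added to / removed from the significant list
significant_genes = 2_400   # size of the DESeq2 significant list

# Library size implied by "100,000 pairs is 0.5%".
total_pairs = lost_pairs / lost_fraction

# Share of the significant list that moved when the cutoff was applied.
changed_share = changed_genes / significant_genes

print(f"Implied library size: ~{total_pairs:,.0f} read pairs")
print(f"Share of significant list that changed: {changed_share:.2%}")
```

So the 50 bp cutoff removes a small slice of a library of roughly 20 million pairs and shifts a little over 1% of the significant list.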

Kallisto works fine without adapter removal too, but I suspect it might result in spurious multimapping for reads whose non-adapter portion is very short (>31 and <50 bp).
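One way to make the concern about short reads concrete: a read of length L contains L - k + 1 overlapping k-mers, so a read only slightly longer than k = 31 offers very few k-mers to match against the index. The sketch below just evaluates that formula; `kmer_count` is a hypothetical helper, and whether so few k-mers actually leads to spurious assignments in this dataset is not something the arithmetic can show.

```python
K = 31  # default k-mer size used for the index in this post

def kmer_count(read_length: int, k: int = K) -> int:
    """Number of overlapping k-mers in a read (0 if the read is shorter than k)."""
    return max(0, read_length - k + 1)

# Compare reads near the k-mer size with an untrimmed 151 bp read.
for length in (31, 35, 50, 75, 151):
    print(f"{length:>3} bp read -> {kmer_count(length):>3} k-mers")
```

A 31 bp read contributes a single k-mer and a 50 bp read contributes 20, compared with 121 for an untrimmed 151 bp read.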

What would be an optimal read length cutoff in this scenario, and how could I determine it?
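One possible way to explore this empirically is to trim once with a permissive cutoff, then tabulate how many pairs each candidate cutoff would discard before re-running quantification. The following is a minimal sketch under stated assumptions: `read_lengths` and `pairs_lost` are hypothetical helpers, the two trimmed FASTQ files are assumed to list mates in the same order, and a pair is assumed to be dropped when either mate falls below the cutoff (worth confirming against the trimmer's paired-end documentation). The toy lengths stand in for real data.

```python
import gzip
from itertools import islice

def read_lengths(fastq_path):
    """Yield the sequence length of every record in a plain or gzipped FASTQ file."""
    opener = gzip.open if str(fastq_path).endswith(".gz") else open
    with opener(fastq_path, "rt") as handle:
        # The sequence is the 2nd line of each 4-line FASTQ record.
        for seq in islice(handle, 1, None, 4):
            yield len(seq.rstrip())

def pairs_lost(r1_lengths, r2_lengths, cutoffs):
    """For each cutoff, count pairs in which either mate is shorter than the cutoff."""
    pairs = list(zip(r1_lengths, r2_lengths))
    return {c: sum(1 for a, b in pairs if min(a, b) < c) for c in cutoffs}

# Toy post-trimming lengths (made up for illustration, not from the dataset).
demo_r1 = [151, 151, 40, 33, 151, 60, 25, 151]
demo_r2 = [151, 120, 42, 33, 151, 58, 25, 149]

result = pairs_lost(demo_r1, demo_r2, (20, 31, 35, 50, 75))
for cutoff, lost in result.items():
    print(f"cutoff {cutoff:>2} bp: {lost} of {len(demo_r1)} pairs lost")
```

On real data the two length lists would come from `read_lengths(...)` applied to the trimmed R1 and R2 files. Plotting pairs lost against cutoff, alongside how the significant gene list changes, would show whether the results are sensitive to the choice at all.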

rna-seq kallisto