Question: What is the optimal read length filter for Kallisto post adapter removal?
gravatar for bipin
2.0 years ago by
bipin20 wrote:

I am using 151bp paired end RNA seq reads to study differentially expressed genes between two conditions. The reads are aligned to a reference transcriptome using Kallisto(index created using default kmer size of 31).

However ~16% of the reads have an adapter contamination with the adapter sequence starting in the middle of the read in some cases. The fastqc plot for adapter contamination look like this

I am using trim-galore to remove the adapter contamination however I am unsure as to what min length cutoff post adapter removal I should keep to optimize between preventing multimapping and losing reads.

I tested with 50 bp which results in loss of ~100000 read pairs(0.5%) and adds/removes ~30 genes from the significant list from DESeq2(total significant genes ~2400).

Kallisto works fine without the adapter removal too but I suspect it might result in spurious multimapping for reads which have very small >31 & <50 bp non adapter portion.

What would be an optimal read length cutoff in this scenario or how can I figure out the cutoff in this case?

rna-seq kallisto • 1.1k views
ADD COMMENTlink modified 2.0 years ago • written 2.0 years ago by bipin20
Please log in to add an answer.


Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1672 users visited in the last hour