Question: What is the optimal read length filter for Kallisto post adapter removal?
11 months ago, bipin wrote:

I am using 151 bp paired-end RNA-seq reads to study differentially expressed genes between two conditions. The reads are pseudoaligned to a reference transcriptome using Kallisto (index built with the default k-mer size of 31).

However, ~16% of the reads have adapter contamination, and in some cases the adapter sequence starts in the middle of the read. The FastQC adapter content plot looks like this: [plot not included]

I am using Trim Galore to remove the adapter contamination. However, I am unsure what minimum length cutoff to apply after adapter removal to balance preventing multimapping against losing reads.
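For reference, the trimming step with a varying cutoff could be scripted roughly as follows. This is only a sketch: it assumes the `trim_galore` executable is on the PATH and relies on its `--paired`, `--length` and `--output_dir` options, while the file names and the `build_trim_command` helper are hypothetical.

```python
import shlex


def build_trim_command(r1, r2, min_length, out_dir):
    """Build (but do not run) a Trim Galore command for one length cutoff.

    The helper name and argument layout are illustrative only.
    """
    return [
        "trim_galore",
        "--paired",                  # treat r1/r2 as mates of the same fragment
        "--length", str(min_length), # minimum length kept after trimming
        "--output_dir", out_dir,
        r1,
        r2,
    ]


# Hypothetical file names; one output directory per tested cutoff.
for cutoff in (31, 40, 50):
    cmd = build_trim_command(
        "sample_R1.fastq.gz", "sample_R2.fastq.gz", cutoff, f"trimmed_len{cutoff}"
    )
    # To actually run it: subprocess.run(cmd, check=True)
    print(shlex.join(cmd))
```

Keeping each cutoff in its own output directory makes it easy to quantify and compare the runs afterwards.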

I tested a 50 bp cutoff, which results in the loss of ~100,000 read pairs (0.5%) and adds/removes ~30 genes from the DESeq2 significant list (~2,400 significant genes in total).
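To put those numbers in proportion, here is a quick back-of-the-envelope calculation. It uses only the approximate figures quoted above, so the results are rough estimates rather than measured values.

```python
# Approximate figures from the 50 bp test described above.
lost_pairs = 100_000        # read pairs discarded at the 50 bp cutoff
lost_fraction = 0.005       # 0.5% of all read pairs
changed_genes = 30          # genes entering or leaving the significant list
significant_genes = 2_400   # total significant genes reported by DESeq2

# Implied library size: lost pairs divided by the fraction they represent.
total_pairs = lost_pairs / lost_fraction

# Share of the significant list that changed after filtering.
changed_fraction = changed_genes / significant_genes

print(f"Approximate library size: {total_pairs:,.0f} read pairs")
print(f"Share of significant list that changed: {changed_fraction:.2%}")
```

So roughly 20 million read pairs in total, and a change of a little over 1% of the significant list.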

Kallisto also works fine without adapter removal, but I suspect that this might produce spurious multimapping for reads whose non-adapter portion is very short (>31 bp and <50 bp).
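One way to think about the 31 to 50 bp range is in terms of how many k-mers a trimmed read can offer: a read of length L contains L - k + 1 overlapping k-mers, and a read shorter than k contains none. A small sketch, assuming the default k = 31 mentioned above (the `kmers_in_read` helper is illustrative, not part of Kallisto):

```python
K = 31  # default k-mer size used for the index


def kmers_in_read(read_length, k=K):
    """Number of overlapping k-mers in a read of the given length."""
    return max(0, read_length - k + 1)


for length in (30, 31, 40, 50, 151):
    print(f"{length:>3} bp read -> {kmers_in_read(length)} k-mers")
```

A read trimmed to just above 31 bp carries only a handful of k-mers, so there is much less sequence to narrow down which transcripts it is compatible with than for a full-length read.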

What would be an optimal read length cutoff in this scenario, or how can I figure out the cutoff in this case?

rna-seq kallisto