What is the best way to handle unmapped reads from RNA-Seq data?
7.2 years ago

I used tophat2 to map RNA-Seq reads to a draft genome. The alignment rate is around 75-80% for all samples. When I take the unmapped reads and BLAST them, they hit the same organism, which suggests the unmapped reads may carry useful information. How do I deal with the unmapped reads and include them in DE analysis or any other downstream analysis? Should I switch to an entirely different pipeline, such as Trinity?

RNA-Seq tophat2 trinity

You may want to look at this paper.

Maybe allow a few more mismatches with tophat?

Thanks, but I think it's more about an incomplete genome than an alignment problem.

7.2 years ago

I tried STAR and the mapping percentage increased to 90-92% (with tophat2 it was only 75-85%). I will try BBMap soon.
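
To compare aligners on the same footing, it can help to compute the mapping percentage the same way for each output. Below is a minimal sketch, not taken from any of the tools mentioned here, that counts mapped records in SAM text using the standard FLAG bits (0x4 unmapped, 0x100 secondary, 0x800 supplementary). The function name `mapping_rate` is made up for illustration, and for paired-end data it counts each mate as its own record rather than counting fragments.

```python
def mapping_rate(sam_lines):
    """Return (mapped, total, percent) over primary records in SAM text lines.

    Assumption: each element of sam_lines is one line of an uncompressed SAM file.
    """
    UNMAPPED = 0x4          # read is unmapped
    SECONDARY = 0x100       # secondary alignment of an already counted read
    SUPPLEMENTARY = 0x800   # supplementary (chimeric) alignment

    mapped = 0
    total = 0
    for line in sam_lines:
        # Skip blank lines and header lines (they start with '@').
        if not line.strip() or line.startswith("@"):
            continue
        flag = int(line.rstrip("\n").split("\t")[1])
        # Count every read once: ignore secondary/supplementary records.
        if flag & (SECONDARY | SUPPLEMENTARY):
            continue
        total += 1
        if not flag & UNMAPPED:
            mapped += 1
    percent = 100.0 * mapped / total if total else 0.0
    return mapped, total, percent
```

With a real file this could be called as `mapping_rate(open("sample.sam"))`, where `sample.sam` is a placeholder name.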

7.2 years ago

I suggest using a more sensitive aligner (BBMap), so you have fewer unmapped reads and thus less bias.


Okay. I will try that.


Note that BBMap has a parameter "maxindel" which defaults to "maxindel=16000". This is fine for plants, fungi, and microbes, but if you are sequencing vertebrates (or anything else with introns longer than ~16kbp) you should increase it to about the 98th percentile of intron length in that organism (in mammals this means around 100kbp to 200kbp). All other parameters can be left as default.
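
If you have an annotation for the organism (or a close relative), the 98th-percentile rule of thumb above can be worked out roughly as follows. This is only a sketch: the function names, the nearest-rank percentile, and the input layout (one list of 1-based inclusive exon `(start, end)` pairs per transcript) are assumptions made for illustration, and the 16000 floor is simply the default quoted above.

```python
import math


def intron_lengths(exons):
    """Intron lengths for one transcript given (start, end) exon pairs."""
    ordered = sorted(exons)
    lengths = []
    for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:]):
        gap = next_start - prev_end - 1  # bases strictly between two exons
        if gap > 0:
            lengths.append(gap)
    return lengths


def percentile(values, pct):
    """Nearest-rank percentile of a non-empty list (pct is an integer 1..100)."""
    if not values:
        raise ValueError("percentile of an empty list is undefined")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def suggest_maxindel(transcripts, pct=98, floor=16000):
    """Suggest a maxindel value: the pct-th percentile intron length,
    but never below the default of 16000 mentioned in the post."""
    lengths = [n for exons in transcripts for n in intron_lengths(exons)]
    if not lengths:
        return floor
    return max(floor, percentile(lengths, pct))
```

For example, a single transcript with exons at `(1, 100)` and `(150101, 150200)` has one 150000 bp intron, so `suggest_maxindel` would return 150000 for it.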


Hi, is BBMap output compatible with Cufflinks/StringTie?


It's SAM, so you can convert it to BAM and sort it (e.g. with samtools). By the way, did you trim your reads for quality?


For Cufflinks, you should add the flag "xs=firststrand" (or whichever strandedness applies to your library) because Cufflinks needs the XS tag, and "intronlen=10" so that introns are printed in CIGAR strings as 'N' instead of 'D'. If samtools is installed, BBMap can write BAM directly instead of SAM if you name the output file something.bam.
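
Putting the flags from this thread together, a command line could be assembled roughly like this. Treat it as a sketch under assumptions: `build_bbmap_command` is a made-up helper, the file names are placeholders, and "xs=firststrand" is copied as written above, so check it against the BBMap help text for your version and library type before relying on it.

```python
def build_bbmap_command(ref, reads, out, maxindel=None, for_cufflinks=False):
    """Build a BBMap argument list from the options discussed in this thread.

    ref, reads and out are file paths; an out path ending in .bam relies on
    samtools being installed, as noted in the post above.
    """
    cmd = ["bbmap.sh", f"ref={ref}", f"in={reads}", f"out={out}"]
    if maxindel is not None:
        # Raise this for organisms with long introns (default is 16000).
        cmd.append(f"maxindel={maxindel}")
    if for_cufflinks:
        # Flags quoted from the thread: XS tag for Cufflinks, and introns
        # written as 'N' instead of 'D' in CIGAR strings.
        cmd += ["xs=firststrand", "intronlen=10"]
    return cmd


# Placeholder file names, for illustration only.
example = build_bbmap_command(
    "genome.fa", "reads.fq", "mapped.bam", maxindel=200000, for_cufflinks=True
)
print(" ".join(example))
```

The resulting list can be passed to `subprocess.run(example, check=True)` once the paths point at real files.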