What is the best way to handle unmapped reads from RNA-Seq data
7.2 years ago

I used TopHat2 to map RNA-seq reads to a draft genome. The alignment rate is around 75-80% for all samples. When I BLAST the unmapped reads, they hit the same organism, which suggests they may carry useful information. How should I deal with the unmapped reads and include them in DE analysis or other downstream analysis? Should I switch to an entirely different pipeline, such as Trinity?

RNA-Seq tophat2 trinity • 4.4k views

You may want to look at this paper.

Maybe allow a few more mismatches with TopHat?

Thanks, but I think it's more about the incomplete genome than an alignment problem.

7.2 years ago

I tried STAR and the mapping percentage increased to 90-92% (with TopHat2 it was only 75-85%). I will try BBMap soon.

7.2 years ago

I suggest using a more sensitive aligner (BBMap), so you have fewer unmapped reads and thus less bias.


Okay. I will try that.


Note that BBMap has a parameter "maxindel" which defaults to "maxindel=16000". This is fine for plants, fungi, and microbes, but if you are sequencing vertebrates (or anything else with introns longer than ~16kbp) you should increase it to about the 98th percentile of intron length in that organism (in mammals this means around 100kbp to 200kbp). All other parameters can be left as default.
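
If it helps, here is a rough sketch of how one might pick a "maxindel" value from the advice above. It assumes you already have a list of intron lengths (for example, derived from an annotation); the function name `suggest_maxindel`, the nearest-rank percentile method, and the example numbers are my own illustrative choices, not part of BBMap.

```python
import math


def suggest_maxindel(intron_lengths, percentile=98.0, default=16000):
    """Suggest a BBMap maxindel value from observed intron lengths.

    Uses the nearest-rank percentile of the intron lengths and never goes
    below BBMap's default of 16000 (an illustrative rule of thumb only).
    """
    if not intron_lengths:
        raise ValueError("need at least one intron length")
    ordered = sorted(intron_lengths)
    # Nearest-rank percentile: smallest value with at least p% of data at or below it.
    rank = max(1, math.ceil(percentile * len(ordered) / 100))
    return max(default, ordered[rank - 1])


if __name__ == "__main__":
    # Made-up intron lengths in bp, purely for demonstration.
    example = [i * 1000 for i in range(1, 101)]
    print("maxindel=%d" % suggest_maxindel(example))
```

For organisms with short introns the function simply returns the default, which matches the point that the default is fine for plants, fungi, and microbes.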


Hi, is BBMap output compatible with Cufflinks/StringTie?


It's SAM, so you can convert it to BAM and sort it with samtools. By the way, did you trim your reads for quality?


For Cufflinks, you should add the flag "xs=firststrand" (or whichever strand setting matches your library), because Cufflinks needs the XS tag, and "intronlen=10" so that introns in CIGAR strings are printed as 'N' instead of 'D'. If samtools is installed, BBMap can output BAM directly instead of SAM if you give the output file a name ending in .bam.
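
To make the flags above concrete, here is a small sketch that only assembles a BBMap command line and does not run anything. The "xs", "intronlen", and "maxindel" settings are the ones mentioned in this thread; the file names and the helper `bbmap_command` are hypothetical, and you should check the flags against the help text of your installed BBMap version before relying on them.

```python
import shlex


def bbmap_command(reads, reference, output, maxindel=16000,
                  xs="firststrand", intronlen=10):
    """Build (but do not execute) a bbmap.sh argument list.

    File names are placeholders; xs/intronlen/maxindel follow the advice
    given in this thread for Cufflinks-friendly output.
    """
    return [
        "bbmap.sh",
        f"in={reads}",
        f"ref={reference}",
        f"out={output}",          # a .bam name requests BAM output (needs samtools)
        f"maxindel={maxindel}",   # raise for organisms with long introns
        f"xs={xs}",               # strand setting so XS tags are written
        f"intronlen={intronlen}", # long deletions reported as 'N' in CIGAR
    ]


if __name__ == "__main__":
    cmd = bbmap_command("reads.fq.gz", "draft_genome.fa", "mapped.bam")
    print(shlex.join(cmd))
```

Building the command as a list keeps each key=value pair separate, so it can be handed to `subprocess.run` later without any shell quoting issues.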
