I am using the GENCODE mouse annotation (vM17) to quantify genes from RNA-seq data, and I am also interested in lncRNA quantification. I used two different annotation files: to quantify all transcripts, I used the comprehensive annotation file (primary assembly); to quantify just the lncRNAs, I used the GENCODE mouse lncRNA annotation file. Since the comprehensive file should already contain all the lncRNAs, I compared the FPKM values calculated with the two files.
What I observe is that the FPKM values are different, although the trend is the same. For example, across three conditions, the comprehensive annotation file gives A = 2, B = 4, C = 8, whereas the lncRNA annotation file gives A = 6, B = 11, C = 23 (values are for illustration only). I would like to ask the experts whether I should use the FPKM values from the lncRNA annotation or from the comprehensive file.
My assumption is that with the lncRNA annotation, a read falling in a region that overlaps an mRNA is counted towards the lncRNA, because no mRNA annotation is present. With the comprehensive annotation, in contrast, such a read is assigned according to where its overlap is larger. This is just my guess.
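For reference, here is a minimal sketch of the standard FPKM definition, showing how the same fragment count for a gene can produce different FPKM values when the library-size term changes. All numbers and variable names are hypothetical. The sketch assumes the denominator is the number of fragments assigned to annotated features, which depends on the quantification tool and its settings; it does not model read reassignment between overlapping features.

```python
def fpkm(fragments, length_bp, total_fragments):
    """Fragments per kilobase of transcript per million counted fragments."""
    return fragments * 1e9 / (length_bp * total_fragments)

# Hypothetical numbers, for illustration only.
gene_fragments = 200               # fragments assigned to one lncRNA
gene_length_bp = 2_000             # length of that lncRNA in bases

# Assumption: the library-size term only counts fragments assigned to
# features present in the annotation, so it shrinks with a smaller annotation.
total_comprehensive = 30_000_000   # counted with the comprehensive file
total_lncrna_only = 10_000_000     # counted with the lncRNA-only file

fpkm_comprehensive = fpkm(gene_fragments, gene_length_bp, total_comprehensive)
fpkm_lncrna_only = fpkm(gene_fragments, gene_length_bp, total_lncrna_only)
ratio = fpkm_lncrna_only / fpkm_comprehensive

print(fpkm_comprehensive, fpkm_lncrna_only, ratio)
```

Under this assumption, every gene in a sample is scaled by the same factor, which would fit values that differ while the trend across conditions stays the same.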
Please help me understand which I should choose: the comprehensive or the lncRNA annotation?