I searched previous posts and the Nanopore Community forum but couldn't find a definitive answer, so I'm asking for your advice here. I know that analysing long-read RNA-seq data is still complicated because there is no general consensus on how to do it, but I'd be interested in your opinion.
I'd like to measure the number of genes detected in several RNA-seq libraries that I generated with different nanopore library kits (DNA ligation of RT-PCR products, direct cDNA and direct RNA).
So far I've been using HTSeq with the intersection-nonempty mode, but it seems that the union mode is now the most recommended one. I've also seen posts stating that featureCounts is much better, and there's even an option for long reads.
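To make sure I'm comparing the right things, here is a toy sketch of how I understand the two HTSeq modes from the htseq-count documentation. It is not HTSeq code: the function name `assign` and the gene names are made up for illustration, and each read is reduced to a list of feature sets, one per aligned position.

```python
def assign(position_feature_sets, mode):
    """Toy overlap resolution for one read (hypothetical helper, not HTSeq's API).

    position_feature_sets: list of sets, the genes overlapping each aligned position.
    mode: "union" or "intersection-nonempty".
    """
    if mode == "union":
        # Every gene touched anywhere along the read is a candidate.
        candidates = set().union(*position_feature_sets)
    elif mode == "intersection-nonempty":
        # Only genes present at every position that overlaps at least one gene.
        nonempty = [s for s in position_feature_sets if s]
        candidates = set.intersection(*nonempty) if nonempty else set()
    else:
        raise ValueError(f"unknown mode: {mode}")

    if not candidates:
        return "no_feature"
    if len(candidates) == 1:
        return next(iter(candidates))
    return "ambiguous"


# Read inside geneA, partly overlapped by geneB.
nested = [{"geneA"}, {"geneA", "geneB"}]
# Read running from geneA into a neighbouring, non-overlapping geneB.
spanning = [{"geneA"}, {"geneB"}]

for name, read in [("nested", nested), ("spanning", spanning)]:
    for mode in ("union", "intersection-nonempty"):
        print(name, mode, assign(read, mode))
```

If this reading is right, a read that touches two genes is ambiguous under union, but under intersection-nonempty it is either assigned to the gene common to all positions or counted as no_feature.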
I compared the output of the three options, but I can't say it really helped me decide:
HTSeq + intersection-nonempty mode: 238K assigned + 0 ambiguous + 18K no_features + 24K too low quality
HTSeq + union mode: 210K assigned + 44K ambiguous + 2K no_features + 24K too low quality
featureCounts + long-reads mode: 219K assigned + 59K ambiguous + 2K no_features + 0 too low quality
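For easier comparison, here is a small sketch that turns the three summaries above into percentages of the total. The counts are the rounded values (in thousands of reads) copied from the lines above; the dictionary keys and the `percentages` helper are just names I picked.

```python
# Counts in thousands of reads, copied from the three summaries above.
summaries = {
    "HTSeq intersection-nonempty": {
        "assigned": 238, "ambiguous": 0, "no_feature": 18, "too_low_quality": 24,
    },
    "HTSeq union": {
        "assigned": 210, "ambiguous": 44, "no_feature": 2, "too_low_quality": 24,
    },
    "featureCounts long-read mode": {
        "assigned": 219, "ambiguous": 59, "no_feature": 2, "too_low_quality": 0,
    },
}


def percentages(counts):
    """Return each category as a percentage of the total, rounded to 0.1."""
    total = sum(counts.values())
    return {category: round(100 * n / total, 1) for category, n in counts.items()}


for name, counts in summaries.items():
    print(f"{name}: total {sum(counts.values())}K", percentages(counts))
```

All three runs add up to the same 280K reads, so the tools differ only in how they distribute those reads across the categories.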
Based on those observations, would you recommend one over the other two? If so, why? I'm trying to understand what makes one better suited to my needs and/or to the analysis of long-read RNA-seq data in general.
Thanks a lot.