Hello everyone !
I am trying to find a way to estimate the length of the transcripts we are getting from a Nanopore cDNA experiment.
For Illumina data, we have been using the tin.py module from RSeQC: https://academic.oup.com/bioinformatics/article/28/16/2184/325191 .
Naively, we tried running this tool on our test samples, which were sequenced on both Illumina and Nanopore.
The results we got are unexpected and puzzling. For a given subset of transcripts of interest (to avoid running on the whole transcriptome), we are in general getting lower TIN values for Nanopore sequencing than for Illumina sequencing.
I can't really wrap my head around it. I am questioning whether this tool is appropriate for long reads or not. Hence my post here to gather some opinions!
Any idea why this difference is observed? Are there other tools that could do what we are looking for?
Best,
Roxane
What kind of aligner did you use? Also, how good is the reference gene model?
I used the dorado aligner, and the reference is Hg38.
dorado internally uses minimap2, so you should have got the best results already, more or less. You could run minimap2 manually and confirm.
Was this direct RNA sequencing or cDNA sequencing? What do you mean by "integrity"? Are you thinking that the sequence you got from Nanopore does not reflect reality?
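To make the TIN comparison above a bit more concrete, here is a minimal sketch of an entropy-based, TIN-like score. It is not the RSeQC code: the function name `tin_like_score` and the toy coverage vectors are hypothetical, and the formula (100 * exp(Shannon entropy of per-position coverage) / number of positions) is my understanding of how TIN is defined. It only illustrates that coverage concentrated on one part of a transcript gives a lower score than even coverage, whatever the sequencing platform.

```python
import math


def tin_like_score(coverage):
    """Return a TIN-like score (0 to 100) for one transcript.

    coverage: per-position read depth along the transcript (hypothetical input).
    """
    k = len(coverage)
    total = sum(coverage)
    if k == 0 or total == 0:
        return 0.0
    # Turn depths into a probability distribution, skipping uncovered positions.
    probs = [c / total for c in coverage if c > 0]
    # Shannon entropy (natural log) of the coverage distribution.
    entropy = -sum(p * math.log(p) for p in probs)
    # exp(entropy) is the "effective" number of evenly covered positions.
    return 100.0 * math.exp(entropy) / k


# Toy data (assumed, not real measurements):
uniform = [10] * 100                   # even coverage over the whole transcript
one_sided = [0] * 50 + [10] * 50       # coverage only on one half

print(round(tin_like_score(uniform), 1))
print(round(tin_like_score(one_sided), 1))
```

If the Nanopore reads for your transcripts of interest pile up towards one end, a score of this kind would drop even when the reads themselves are fine, so it may be worth looking at the per-transcript coverage profiles before concluding anything about the tool.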