Question: What is a better strategy for finding novel transcripts?
8 months ago, jsw940 wrote:

Hi, I've only recently started learning RNA-seq. I am using nanopore technology (cDNA sequencing) to find novel transcripts, but I don't know exactly which analysis tools are suited to this. I have read several papers on the topic but couldn't fully understand them.

For example, I wanted to visualize my sequencing data with IGV. But IGV only takes mapped sequencing data, which (as far as I know) excludes novel transcripts. And when I use Gffcompare, I cannot extend my analysis to the data categorized as "u".

These gaps in my knowledge leave me confused. So, is there an established bioinformatics pipeline for finding novel transcripts before doing actual verification such as RT-qPCR, cloning, and so on?

And thank you all! Every question and answer here has been very helpful for my studies.

modified 8 months ago by Amitm • written 8 months ago by jsw940

Do you have a reference genome available? If so, reads from novel transcripts should still be mapped.

modified 8 months ago • written 8 months ago by genomax

Yes, I used minimap2 with GRCh38.p13.genome. Honestly, I have difficulty choosing a reference genome and annotation. How can I choose a genome version that is compatible with my analysis plan? Until now, I have followed the genome versions used in papers.

written 7 months ago by jsw940
8 months ago, Amitm wrote:

Hi, I assume this is an organism for which a reference genome and annotation are available. My experience is with human and mouse on the Illumina platform. When a reference genome is available, the first approach would be transcriptome assembly after alignment: align with STAR, then assemble with StringTie. One of the result files is a GTF containing the known as well as the novel transcripts assembled (read the manual for the appropriate parameters to use).

Once you are at this stage, load your STAR-aligned BAM file into IGV along with the StringTie GTF. Choose any gene locus and check the transcripts assembled by StringTie. The GTF gets loaded beneath the 'Gene' track in IGV. The 'Gene' track (blue in the image) shows the known isoforms, and your StringTie GTF (pink and green in the image; 2 samples) should show the known and any novel transcripts (Tx) assembled. IGV link -

This strategy works well if you suspect a couple of loci. If you want a transcriptome-wide comparison in which 'novel' transcripts are also considered, then JunctionSeq is another approach. It is not assembly-based; it uses (junction) counts, but it evaluates all novel uses of exons in known transcripts. As far as I am aware, completely novel transcripts are beyond this tool's scope, but the advantage is statistical rigour for transcriptome-wide comparison, especially if you have a set of 'test' vs. 'control' samples.

Finally, you could try de novo Tx assembly if a reference annotation is not available or you suspect something really wild is going on. Time and downstream interpretation are the limiting factors here.
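On the Gffcompare "u" category raised in the question: those are transcripts that do not overlap any reference transcript, so they are natural candidates to inspect as a separate IGV track. Below is a minimal sketch, assuming the input is a Gffcompare annotated GTF in which each transcript line carries a `class_code` attribute. The function name, the toy records, and any file names are hypothetical, and real files may need extra handling.

```python
import re

# Attribute patterns as they appear in column 9 of a GTF line.
CLASS_CODE_RE = re.compile(r'class_code "([^"]+)"')
TRANSCRIPT_ID_RE = re.compile(r'transcript_id "([^"]+)"')


def filter_by_class_code(gtf_lines, wanted_code="u"):
    """Keep transcript and exon lines of transcripts with the given class code.

    Assumption: class_code is present on the 'transcript' lines, and exon
    lines can be matched back to their transcript through transcript_id.
    """
    lines = [ln.rstrip("\n") for ln in gtf_lines
             if ln.strip() and not ln.startswith("#")]

    # First pass: collect IDs of transcripts that have the wanted class code.
    keep_ids = set()
    for ln in lines:
        fields = ln.split("\t")
        if len(fields) < 9 or fields[2] != "transcript":
            continue
        code = CLASS_CODE_RE.search(fields[8])
        tid = TRANSCRIPT_ID_RE.search(fields[8])
        if code and tid and code.group(1) == wanted_code:
            keep_ids.add(tid.group(1))

    # Second pass: keep every feature line belonging to those transcripts.
    kept = []
    for ln in lines:
        fields = ln.split("\t")
        if len(fields) < 9:
            continue
        tid = TRANSCRIPT_ID_RE.search(fields[8])
        if tid and tid.group(1) in keep_ids:
            kept.append(ln)
    return kept


def _row(feature, start, end, attrs):
    # Helper that builds one toy GTF line (made-up data for illustration only).
    return "\t".join(["chr1", "StringTie", feature, str(start), str(end),
                      ".", "+", ".", attrs])


EXAMPLE_GTF = [
    _row("transcript", 100, 500,
         'transcript_id "STRG.1.1"; gene_id "STRG.1"; class_code "=";'),
    _row("exon", 100, 500, 'transcript_id "STRG.1.1"; gene_id "STRG.1";'),
    _row("transcript", 9000, 9800,
         'transcript_id "STRG.2.1"; gene_id "STRG.2"; class_code "u";'),
    _row("exon", 9000, 9300, 'transcript_id "STRG.2.1"; gene_id "STRG.2";'),
    _row("exon", 9600, 9800, 'transcript_id "STRG.2.1"; gene_id "STRG.2";'),
]

for line in filter_by_class_code(EXAMPLE_GTF, "u"):
    print(line)
```

With a real file you would pass an open file handle instead of `EXAMPLE_GTF`, write the returned lines to a new GTF, and load that file in IGV next to the BAM.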

written 8 months ago by Amitm

Thank you very much! I'm now following your guidelines. My raw data could not be mapped with STAR (it said that my reads are too short, so I'm adjusting some parameters). Instead, I used minimap2 and stringtie2, and I could finally visualize my data and check my transcripts. Your answer is so helpful to me. I can tell I need to study more and more. Thank you!
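For anyone following the same minimap2 and StringTie route, here is a rough sketch of the command sequence, written as a small Python helper that only builds and prints the commands without running them. It assumes minimap2's spliced preset (`-ax splice`), `samtools sort` and `samtools index`, and StringTie's long-read mode (`-L`) guided by a reference annotation (`-G`). All file names and the function name are placeholders, and the options should be checked against the help output of your installed versions.

```python
import shlex


def build_long_read_commands(reads, genome, annotation, prefix):
    """Return the command lines for a minimap2 -> samtools -> StringTie run.

    Nothing is executed here; the caller decides how to run the commands.
    All paths are placeholders supplied by the caller.
    """
    sam = f"{prefix}.sam"
    bam = f"{prefix}.sorted.bam"
    gtf = f"{prefix}.stringtie.gtf"
    return [
        # Spliced alignment of long cDNA reads against the genome.
        ["minimap2", "-ax", "splice", "-o", sam, genome, reads],
        # Coordinate-sort the alignments; StringTie and IGV need a sorted BAM.
        ["samtools", "sort", "-o", bam, sam],
        # Index the BAM so that IGV can load it.
        ["samtools", "index", bam],
        # Reference-guided assembly in long-read mode.
        ["stringtie", "-L", "-G", annotation, "-o", gtf, bam],
    ]


commands = build_long_read_commands(
    "reads.fastq", "genome.fa", "annotation.gtf", "sample1"
)
for cmd in commands:
    print(shlex.join(cmd))
```

The resulting sorted BAM and StringTie GTF are the two files to load into IGV, as described in the answer above.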

written 7 months ago by jsw940

