What is a better strategy for finding novel transcripts?
4.2 years ago
jsw940 ▴ 10

Hi, I started learning RNA-seq only recently. I am using nanopore cDNA sequencing to find novel transcripts, but I don't know which analysis tools are suited to this. I have read several papers on the topic but couldn't fully understand them.

For example, I wanted to visualize my sequencing data with IGV. But IGV only takes mapped sequencing data, which (as far as I know) excludes novel transcripts. And when I use Gffcompare, I cannot extend my analysis to the transcripts categorized as "u".
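One way to carry the "u" transcripts forward is to pull them out of Gffcompare's annotated GTF and work with that subset. The following is a minimal sketch, assuming the annotated GTF stores a `class_code` attribute on each `transcript` line; the example records, the `STRG.*` IDs and the function names are made up for illustration.

```python
import re

# Matches GTF attributes of the form: key "value";
ATTR_RE = re.compile(r'(\S+) "([^"]*)"')


def parse_attributes(field):
    """Turn the 9th GTF column into a dict of attribute name -> value."""
    return dict(ATTR_RE.findall(field))


def transcripts_with_class(gtf_lines, class_code="u"):
    """Return (transcript_id, chrom, start, end, strand) for matching transcripts."""
    hits = []
    for line in gtf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header comments
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9 or cols[2] != "transcript":
            continue  # only transcript records are assumed to carry class_code
        attrs = parse_attributes(cols[8])
        if attrs.get("class_code") == class_code:
            hits.append((attrs.get("transcript_id"), cols[0],
                         int(cols[3]), int(cols[4]), cols[6]))
    return hits


# Hypothetical records in the style of an annotated GTF.
example_gtf = [
    'chr1\tStringTie\ttranscript\t1000\t3000\t.\t+\t.\t'
    'transcript_id "STRG.1.1"; gene_id "STRG.1"; class_code "=";',
    'chr1\tStringTie\texon\t1000\t1500\t.\t+\t.\t'
    'transcript_id "STRG.1.1"; gene_id "STRG.1";',
    'chr1\tStringTie\ttranscript\t5000\t7000\t.\t-\t.\t'
    'transcript_id "STRG.2.1"; gene_id "STRG.2"; class_code "u";',
]

unknown = transcripts_with_class(example_gtf)
```

With a real file, the lines could be passed in via `open("your_prefix.annotated.gtf")`; the resulting coordinates can then be typed into IGV to inspect each candidate locus.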

These gaps in my knowledge leave me confused. So, is there an established pipeline for finding novel transcripts computationally, before doing actual verification such as RT-qPCR, cloning, and so on?

And thank you to all of you! Every question and answer here has been very helpful for my studying.

RNA-Seq sequencing • 1.5k views

Do you have a reference genome available? If so, reads from novel transcripts should still be mapped.
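To illustrate why genome alignment does not lose novel transcripts: a spliced aligner records each intron as an `N` operation in the SAM CIGAR string, whether or not that intron is annotated. The sketch below, with a made-up alignment and a made-up set of known introns, derives intron coordinates from a CIGAR string and flags the ones missing from the annotation.

```python
import re

CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")
# CIGAR operations that advance the position on the reference.
REF_CONSUMING = set("MDN=X")


def introns_from_alignment(pos, cigar):
    """Return 1-based inclusive (start, end) intervals for each N operation.

    pos is the 1-based leftmost reference position (SAM column 4).
    """
    introns = []
    ref = pos
    for length, op in CIGAR_RE.findall(cigar):
        length = int(length)
        if op == "N":
            introns.append((ref, ref + length - 1))
        if op in REF_CONSUMING:
            ref += length
    return introns


# Hypothetical read: two introns, one annotated and one not.
read_introns = introns_from_alignment(100, "50M200N30M2D20M100N40M5S")
known_introns = {(150, 349)}  # assumed to come from a reference annotation
novel_introns = [i for i in read_introns if i not in known_introns]
```

A single read with an unannotated junction is weak evidence on its own, so in practice the junctions would be counted across many reads before anything is called novel.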


Yes, I used minimap2 with GRCh38.p13.genome. Honestly, I have difficulty choosing a reference genome and annotation. How can I choose a genome version compatible with my analysis plan? Until now, I have followed the genome versions used in papers.

4.2 years ago
Amitm ★ 2.2k

Hi, I assume this is an organism for which a reference genome and annotation are available. My experience is with human and mouse data on the Illumina platform.

When the reference genome is known, the first scenario would be to do transcriptome assembly after alignment: align with STAR, then assemble with StringTie. One of the result files is a GTF containing the known as well as the novel transcripts that were assembled (read the manual for the appropriate parameters to use).

Once you are at this stage, load your STAR-aligned BAM file into IGV, along with the StringTie GTF. Choose any gene locus and you can check the transcripts assembled by StringTie. The GTF gets loaded beneath the 'Gene' track in IGV. The 'Gene' track (set in blue colour in image) has the known isoforms, and your GTF from StringTie (set in pink and green colour in image; 2 samples) should show the known and any novel transcripts (Tx) assembled. IGV link - https://ibb.co/GC2qTFc

This strategy works well if you suspect a couple of loci. If you want to do a transcriptome-wide comparison in which 'novel' transcripts are also considered, then JunctionSeq is another approach. It is not assembly-based; it uses (junction) counts, but it evaluates all novel uses of exons in known transcripts. As far as I am aware, completely novel transcripts are beyond this tool's scope, but the advantage here is statistical rigour for transcriptome-wide comparison, especially if you have a bunch of 'test' vs. 'control' samples.

Finally, you could try de novo Tx assembly if a reference annotation is not available or you suspect something really wild is going on. Time and downstream interpretation are the limiting factors here.
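As a rough way to list the candidate novel transcripts in a StringTie GTF before opening IGV, the sketch below separates assembled transcripts by whether they carry a `reference_id` attribute. This assumes StringTie was run with a reference annotation and, as I understand its output format, tags transcripts that match a reference transcript with `reference_id`. The example records and IDs are invented.

```python
import re

# Matches GTF attributes of the form: key "value";
ATTR_RE = re.compile(r'(\S+) "([^"]*)"')


def split_known_novel(gtf_lines):
    """Return (known_ids, novel_ids) from the transcript lines of a GTF."""
    known, novel = [], []
    for line in gtf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header comments
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9 or cols[2] != "transcript":
            continue
        attrs = dict(ATTR_RE.findall(cols[8]))
        tid = attrs.get("transcript_id")
        # Assumption: only reference-matching transcripts carry reference_id.
        if "reference_id" in attrs:
            known.append(tid)
        else:
            novel.append(tid)
    return known, novel


# Hypothetical StringTie-style records.
stringtie_gtf = [
    "# StringTie example header",
    'chr2\tStringTie\ttranscript\t100\t900\t1000\t+\t.\t'
    'gene_id "STRG.5"; transcript_id "STRG.5.1"; '
    'reference_id "ENST_EXAMPLE_1"; ref_gene_name "GENE_A";',
    'chr2\tStringTie\ttranscript\t100\t1200\t1000\t+\t.\t'
    'gene_id "STRG.5"; transcript_id "STRG.5.2";',
    'chr2\tStringTie\texon\t100\t300\t1000\t+\t.\t'
    'gene_id "STRG.5"; transcript_id "STRG.5.2"; exon_number "1";',
]

known_ids, novel_ids = split_known_novel(stringtie_gtf)
```

Transcripts without a `reference_id` include both new isoforms of known genes and entirely new loci; the Gffcompare class codes give a finer breakdown of those two cases.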


Thank you very much! I'm following your guidelines now. My raw data could not be mapped with STAR (it said that my reads are too short, so I'm adjusting some parameters). Instead, I used minimap2 and StringTie2, and I could finally visualize my data and check my transcripts. Your answer was so helpful to me. I can feel I need to study more and more. Thank you!
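For readers who want to reproduce the minimap2 plus StringTie2 route, here is a minimal sketch of how the steps could be chained. It only builds and prints the commands rather than running them. The file names are placeholders, and the options shown (`-ax splice` for spliced long-read alignment in minimap2, `-L` for long-read mode and `-G` for the reference annotation in StringTie) are a common starting point, not the poster's actual settings, which the thread does not give.

```python
import shlex


def build_pipeline(reference_fasta, reads_fastq, annotation_gtf, prefix):
    """Return the commands for align -> sort -> index -> assemble as argument lists."""
    sam = f"{prefix}.sam"
    bam = f"{prefix}.sorted.bam"
    out_gtf = f"{prefix}.stringtie.gtf"
    return [
        # Spliced alignment of long cDNA reads to the genome.
        ["minimap2", "-ax", "splice", "-o", sam, reference_fasta, reads_fastq],
        # Coordinate-sort and index so IGV and StringTie can read the alignments.
        ["samtools", "sort", "-o", bam, sam],
        ["samtools", "index", bam],
        # Reference-guided assembly in long-read mode.
        ["stringtie", "-L", "-G", annotation_gtf, "-o", out_gtf, bam],
    ]


# Placeholder file names, for illustration only.
commands = build_pipeline("genome.fa", "reads.fastq", "annotation.gtf", "sample1")
for cmd in commands:
    print(shlex.join(cmd))
```

Each list could be passed to `subprocess.run(cmd, check=True)` once the tools are installed; the sorted BAM and the StringTie GTF are the two files to load into IGV.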
