Question

Introns within a strand-specific mRNA library

0

7.5 years ago

seta ★ 1.9k

Hi all friends,

As we all know, surveying alternative splicing is one of the applications of RNA-seq. When the "TruSeq Stranded mRNA Sample Prep Kit" is used for library preparation, could you please tell me whether unspliced mRNA transcripts are present in the library? And how can I identify the intronic regions in the absence of a sequenced genome?

Thanks in advance

RNA sequencing alternative splicing • 1.5k views

ADD COMMENT • link updated 7.5 years ago by harold.smith.tarheel ★ 4.9k • written 7.5 years ago by seta ★ 1.9k

score 1 · Answer 1 · 2016-11-04

1

7.5 years ago

i.sudbery 19k

You will probably get some intronic reads, even in polyA-selected libraries, but unless it is a genuine intron retention event, I would expect the expression level of the assembled intron-retaining transcripts to be much lower than that of the spliced ones.

Thus, you could try removing transcripts that account for only a small fraction of a gene's output.
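
As a rough illustration of that filter, here is a minimal Python sketch. It assumes you already have per-transcript expression estimates (for example TPM from a quantifier run on your assembly) as `(gene_id, transcript_id, expression)` tuples. The function name, the IDs, the numbers and the 10% cutoff are all made up for the example, not a recommended threshold.

```python
from collections import defaultdict

def filter_minor_isoforms(transcripts, min_fraction=0.1):
    """Return IDs of transcripts contributing at least `min_fraction`
    of the summed expression of their gene.

    `transcripts` is an iterable of (gene_id, transcript_id, expression)
    tuples; the 0.1 default is an arbitrary illustrative cutoff.
    """
    transcripts = list(transcripts)

    # First pass: total expression per gene.
    gene_totals = defaultdict(float)
    for gene_id, _tx_id, expr in transcripts:
        gene_totals[gene_id] += expr

    # Second pass: keep transcripts above the fractional cutoff.
    kept = []
    for gene_id, tx_id, expr in transcripts:
        total = gene_totals[gene_id]
        if total > 0 and expr / total >= min_fraction:
            kept.append(tx_id)
    return kept

# Hypothetical values, for illustration only.
example = [
    ("geneA", "geneA.spliced", 95.0),
    ("geneA", "geneA.intron_retained", 5.0),
    ("geneB", "geneB.iso1", 60.0),
    ("geneB", "geneB.iso2", 40.0),
]
print(filter_minor_isoforms(example))
```

A fractional cutoff like this would also discard a genuinely retained intron that happens to be a minor isoform, so the threshold is a trade-off rather than a clean separator.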

ADD COMMENT • link 7.5 years ago by i.sudbery 19k

0

Those intron-retaining transcripts would potentially help answer the second question @seta has, correct?

and how I can identify the intron region in the absence of sequenced genome?

The question is how reliably every instance of those can be identified (I assume they would be few and far between unless the library has been deeply sequenced), especially in the absence of genomic sequence.

ADD REPLY • link 7.5 years ago by GenoMax 141k

score 1 · Answer 2 · 2016-11-04

Unspliced (pre-)mRNA is a small but non-zero fraction of the total. In principle, you could build gene models from your data, map the raw reads to the gene models, and identify those reads that contain insertions relative to the models. Note that this will also identify reads derived from low-abundance isoforms with alternative exon insertions, and those may swamp your pre-mRNA signal. You may be able to discriminate some of those by translating the encoded insertion - most (but not all) alternative exons will maintain the reading frame of the flanking exons.
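
To make the insertion and reading-frame idea concrete, here is a minimal Python sketch. It assumes the reads were aligned to your gene models with an aligner that writes standard SAM CIGAR strings, and that you know the codon phase at the insertion point from the model's predicted ORF. The function names and the toy read are hypothetical. Note also that many short-read aligners will soft-clip a read that only partly covers a long insertion rather than report an `I` operation, so this only catches insertions contained within a read.

```python
import re

CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
STOP_CODONS = {"TAA", "TAG", "TGA"}

def insertions_from_cigar(cigar, read_seq):
    """Return (reference_offset, inserted_sequence) for each I operation.

    reference_offset is counted from the alignment start on the gene model.
    """
    read_pos = 0
    ref_pos = 0
    found = []
    for length, op in CIGAR_OP.findall(cigar):
        length = int(length)
        if op == "I":                      # consumes read only
            found.append((ref_pos, read_seq[read_pos:read_pos + length]))
            read_pos += length
        elif op in "M=X":                  # consumes read and reference
            read_pos += length
            ref_pos += length
        elif op == "S":                    # soft clip: consumes read only
            read_pos += length
        elif op in "DN":                   # consumes reference only
            ref_pos += length
        # H and P consume neither read nor reference
    return found

def maintains_reading_frame(inserted_seq, phase=0):
    """True if the insertion length is a multiple of 3 and it adds no
    in-frame stop codon. `phase` is the number of bases of the current
    codon already used before the insertion (0, 1 or 2); codons that
    span the insertion boundary are not checked here.
    """
    seq = inserted_seq.upper()
    if len(seq) % 3 != 0:
        return False
    start = (3 - phase) % 3
    codons = (seq[i:i + 3] for i in range(start, len(seq) - 2, 3))
    return not any(codon in STOP_CODONS for codon in codons)

# Toy read: 4 matching bases, a 9-base insertion, 4 matching bases.
toy_read = "ACGT" + "GGGAAACCC" + "TTGA"
for offset, ins in insertions_from_cigar("4M9I4M", toy_read):
    print(offset, ins, maintains_reading_frame(ins))
```

Insertions that fail this check are better pre-mRNA candidates than those that pass, but as noted above the frame rule is not absolute in either direction.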

You may also be able to discern the signature of conserved splice elements (donor and acceptor sites) in introns, but those require either a clean intronic data set to identify those elements or prior knowledge of those sequences. If you have any genomic sequence, you may be able to build a training data set from known intronic sequences (i.e., align your gene models to identify bona fide introns from gDNA sequence).
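
As a small sketch of what looking for a splice-site signature could mean in practice, the Python below checks a candidate sequence for the canonical GT...AG intron boundaries and tabulates per-position base frequencies at the donor end of a training set. It assumes every sequence has already been oriented to the transcript sense strand (which the stranded library should allow). The function names and the three training sequences are invented for illustration, and a real training set would need far more introns.

```python
from collections import Counter

def has_canonical_boundaries(candidate):
    """True if the sequence starts with GT and ends with AG, the
    boundaries found in most (not all) spliceosomal introns."""
    seq = candidate.upper()
    return len(seq) >= 4 and seq.startswith("GT") and seq.endswith("AG")

def donor_frequencies(known_introns, width=6):
    """Per-position base frequencies over the first `width` bases of
    each training intron. Introns shorter than `width` are skipped."""
    usable = [s.upper() for s in known_introns if len(s) >= width]
    if not usable:
        return []
    counts = [Counter() for _ in range(width)]
    for seq in usable:
        for i, base in enumerate(seq[:width]):
            counts[i][base] += 1
    n = len(usable)
    return [{base: c / n for base, c in pos.items()} for pos in counts]

def donor_score(candidate, freqs):
    """Product of the training frequencies of the candidate's bases at
    each donor position; 0.0 if any base was never seen there."""
    seq = candidate.upper()
    if len(seq) < len(freqs):
        return 0.0
    score = 1.0
    for i, pos in enumerate(freqs):
        score *= pos.get(seq[i], 0.0)
    return score

# Invented training introns, for illustration only.
training = ["GTAAGTCCCAG", "GTGAGTTTTAG", "GTAAGAAATAG"]
freqs = donor_frequencies(training)

for cand in ["GTAAGTACGCAG", "ACGTTTGGACCA"]:
    print(cand, has_canonical_boundaries(cand), donor_score(cand, freqs))
```

With no pseudocounts, any base unseen in training zeroes the score, so this is only useful as a crude first pass before anything more careful.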