Question: Best Approach To Predict Novel And Alternative Splicing Events From RNA-Seq Data
Hi All,

I am looking for the best approach to predict novel and alternative splicing events from RNA-seq data. So far I have tried Cufflinks and Trans-ABySS, but both have limitations that make them difficult to use. Cufflinks requires a huge amount of memory and takes weeks to run (and I had only 20 human transcriptomes!), while Trans-ABySS does not estimate expression levels (e.g. FPKM) for transcripts and has a high false-positive rate. I have also heard about a method called RSEM (I am going to try it), but I think it only quantifies known transcripts from RNA-seq data.

So I am wondering: in your experience, what is the best approach for identifying and quantifying both novel and alternative splicing events?

Thanks very much :-)

Washington University School of Medicine, St. Louis, USA
I think you are going to find that the field is still very much an area of active research. You may find that clearly separating the challenges of alignment, transcript assembly/prediction, and quantification is helpful.

For example you might try some combination of:

  1. STAR or TopHat for alignment
  2. Cufflinks (in de novo mode) or Trinity for transcriptome assembly
  3. Flux Capacitor or Cufflinks (in reference-only mode) for transcript quantification
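
As a rough sketch of how these three steps could be chained for one sample, here is a small Python helper that only builds the command lines (it does not run anything). It assumes STAR for step 1 and Cufflinks for steps 2 and 3; the function name, directory layout, and thread count are made up for illustration, and the flags shown are the commonly documented ones, so please check them against the versions you have installed.

```python
from pathlib import Path


def build_pipeline_commands(sample, fastq1, fastq2, star_index,
                            annotation_gtf, out_dir, threads=8):
    """Return argument lists for align -> assemble -> quantify (hypothetical helper).

    Nothing is executed here; each list can be passed to subprocess.run().
    """
    out = Path(out_dir) / sample
    star_prefix = f"{out}/star_"
    # STAR writes its coordinate-sorted BAM under this name for a given prefix.
    bam = f"{star_prefix}Aligned.sortedByCoord.out.bam"

    # Step 1: spliced alignment with STAR.
    # --outSAMstrandField intronMotif adds the XS strand attribute that
    # Cufflinks expects for unstranded libraries.
    align = [
        "STAR",
        "--runThreadN", str(threads),
        "--genomeDir", str(star_index),
        "--readFilesIn", str(fastq1), str(fastq2),
        "--outSAMtype", "BAM", "SortedByCoordinate",
        "--outSAMstrandField", "intronMotif",
        "--outFileNamePrefix", star_prefix,
    ]

    # Step 2: de novo assembly with Cufflinks (no annotation supplied).
    assemble = ["cufflinks", "-p", str(threads), "-o", f"{out}/assembly", bam]

    # Step 3: reference-only quantification; -G restricts Cufflinks to the
    # transcripts in the given annotation.
    quantify = [
        "cufflinks", "-p", str(threads),
        "-G", str(annotation_gtf),
        "-o", f"{out}/quant", bam,
    ]
    return {"align": align, "assemble": assemble, "quantify": quantify, "bam": bam}
```

Keeping the steps as separate commands makes it easy to swap in another tool for one of them (for example TopHat for alignment) without touching the rest.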

Here are some relevant threads that you may not have noticed:

Here is a recent paper showing a workflow: "RNA sequencing of cancer reveals novel splicing alterations". They started with Cufflinks/Cuffdiff and combined these with MATS ("a Bayesian framework for flexible detection of differential alternative splicing from RNA-Seq data") and ASprofile ("a suite of programs for extracting, quantifying and comparing alternative splicing (AS) events from RNA-seq data").

A list of alternative splicing tools and resources can be found in this post: Recommended tools for alternative splicing detection from RNA-seq data

As you are experienced in the field, may I ask a question? Comparing Cufflinks and Trans-ABySS, Trans-ABySS has a very nice output format: it clearly points to the novel event (e.g. a skipped exon at some position), and you can even extract the sequence of novel transcripts. However, I could not find this kind of information in the Cufflinks results. Cuffcompare marks novel transcripts, but the position and type of event are not clear. Where do you find this type of information? Also, in my experience Cuffdiff took 2 weeks to run on 12 human transcriptomes. Is the Cufflinks running time usually this long? If so, how can it be applied to large-scale studies? Many thanks in advance,

I have modified my answer to include a link to a recent paper by a group that started with Cufflinks/Cuffdiff output and extracted useful differential splicing results from it using additional downstream tools and scripts.

Thanks for the paper. It is really helpful. I have a question about the paper that I was hoping you could help me with. They mention that they aligned RNA from the novel isoforms to the human ORFeome to find functional ORFs. I did not understand how they extracted the cDNA sequences from the Cufflinks analysis. Many thanks for all your help,

(written 5.9 years ago by Sahel)

Hey Sahel,

Did you find a way to extract the cDNA sequences from Cufflinks? Could you please share?
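
One possible way to do this (a sketch, not necessarily what the paper's authors did): Cufflinks writes its assembled transcripts to a GTF file, so the cDNA of each transcript can be rebuilt by concatenating its exon sequences from the genome. The gffread utility is commonly used for this (`gffread -w transcripts.fa -g genome.fa transcripts.gtf`). The Python below illustrates the same idea on in-memory data; the function names are hypothetical, and it assumes the genome is available as a dict of chromosome name to sequence and that every exon line carries a `transcript_id` attribute.

```python
from collections import defaultdict

_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Reverse-complement a DNA string (A/C/G/T/N only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def transcript_sequences(gtf_lines, genome):
    """Rebuild spliced transcript (cDNA) sequences from GTF exon records.

    gtf_lines: iterable of GTF text lines (e.g. an open transcripts.gtf).
    genome: dict mapping chromosome name -> sequence string (assumed to
            fit in memory; an indexed FASTA reader would be used in practice).
    """
    exons = defaultdict(list)  # transcript_id -> [(start, end), ...]
    info = {}                  # transcript_id -> (chrom, strand)
    for line in gtf_lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "exon":
            continue
        chrom, strand, attrs = fields[0], fields[6], fields[8]
        start, end = int(fields[3]), int(fields[4])
        # Attributes look like: gene_id "X"; transcript_id "Y"; ...
        tid = attrs.split('transcript_id "')[1].split('"')[0]
        exons[tid].append((start, end))
        info[tid] = (chrom, strand)

    sequences = {}
    for tid, coords in exons.items():
        chrom, strand = info[tid]
        # GTF coordinates are 1-based and inclusive.
        seq = "".join(genome[chrom][s - 1:e] for s, e in sorted(coords))
        # Transcripts with unknown strand (".") are left in genomic orientation.
        sequences[tid] = reverse_complement(seq) if strand == "-" else seq
    return sequences
```

The resulting sequences could then be written out as FASTA and aligned against an ORF collection with whatever aligner you prefer.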

Providing a review of the pros and cons of the different methods for discovering novel and alternative splicing events is, I think, beyond the scope of this forum. I found this review a good start (paywalled, unfortunately). NAR also has one or more software papers in every issue on new methods for detecting splicing events.

I also found these links useful:

A comprehensive list of software used to study splicing in RNA-seq:

There is also a nice slide from Eduardo Eyras on figshare describing how to study splicing from RNA-seq.

I hope this helps.

Did you try MISO?

