Question: Annotating gene fusion predictions
Chris Miller (Washington University in St. Louis, MO) wrote, 3.4 years ago:

I've got a list of SV calls, including breakpoints, and can easily enough winnow them down to those that are candidate gene fusions (intersect gene body or intron, same direction, strand, etc).

Now I'd like to know whether they're predicted to give in-frame or out-of-frame fusions. So far, I'm unable to find any annotation tool that can do this in a straightforward manner.  Visualization would also be nice, but isn't necessary.

Any suggestions?

Chris Miller (Washington University in St. Louis, MO) wrote, 3.4 years ago:

I had good results using PRADA (suggested by Roel Verhaak on Twitter): http://bioinformatics.mdanderson.org/main/PRADA:Overview. Specifically, the prada-frame command takes simple input that I already had (gene name, breakpoint location) and lists the consequences for every transcript that matches.

mikhail.shugay (Czech Republic, Brno, CEITEC) wrote, 3.4 years ago:

Have you tried this one?

Oncofuse: Prediction Of Driver Gene Fusions From NGS Data

Note that it is most straightforward to use with output from one of the popular fusion detection tools; otherwise you need to convert your data into the right input format.


Thanks for the suggestion - I've got Oncofuse up and running and the output makes sense. I'm a bit concerned that it's dropping a large number of fusion candidates, though. I already have these events mapped to Ensembl transcripts and believe them to be valid. Are there options that will allow for retaining these events? Or is there a straightforward way to replace the RefSeq annotations with ones from Ensembl that may be more inclusive?

(Reply written 3.4 years ago by Chris Miller)

Indeed, it focuses on canonical transcripts from RefSeq, one per RefSeq gene. Extending Oncofuse to isoform level and dealing with junction mapping ambiguity will definitely require a substantial re-write. Adding other genes/transcripts will require additional rounds of annotation and feature selection.

As for your original post, I believe it is not that hard to write a script that tells you whether a junction joins exons that are in or out of frame. The tables for Ensembl genes and transcripts (Gencode V20/Ensembl 76) downloaded from the UCSC Genome Browser include exon frames. For 0-based coordinates, compute each exon's remainder as (end - start + frame) % 3, then check whether the 5' exon's remainder matches the 3' exon's frame. Of course, the hard part is handling ambiguous cases.
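
The remainder check described above could be sketched roughly as follows. This is only an illustration under stated assumptions: exon coordinates are UCSC-style (0-based start, exclusive end), both exons are entirely coding, and the breakpoints fall in introns so that whole exons are joined. The Exon type and the function names are hypothetical and are not part of Oncofuse, PRADA, or any UCSC tool. Exons that contain the CDS start or end, and the ambiguous cases mentioned above, are not handled.

```python
from typing import NamedTuple, Optional


class Exon(NamedTuple):
    """Hypothetical container for one exon row from a UCSC gene table."""
    start: int  # 0-based, inclusive (as in exonStarts)
    end: int    # 0-based, exclusive (as in exonEnds)
    frame: int  # exonFrames value: 0, 1 or 2; -1 means no frame (non-coding)


def exon_remainder(exon: Exon) -> Optional[int]:
    """Return (end - start + frame) % 3, or None if the exon has no frame."""
    if exon.frame < 0:
        return None
    return (exon.end - exon.start + exon.frame) % 3


def fusion_in_frame(five_prime: Exon, three_prime: Exon) -> Optional[bool]:
    """Compare the 5' exon's remainder with the 3' exon's frame.

    Returns True (in frame), False (out of frame), or None when either
    exon lacks a frame and the simple check cannot decide.
    """
    remainder = exon_remainder(five_prime)
    if remainder is None or three_prime.frame < 0:
        return None
    return remainder == three_prime.frame


if __name__ == "__main__":
    # Made-up coordinates, purely for illustration.
    upstream = Exon(start=100, end=251, frame=0)    # 151 coding bases
    downstream = Exon(start=500, end=600, frame=1)
    print(fusion_in_frame(upstream, downstream))
```

Returning None rather than False for exons without a frame keeps "cannot tell" separate from "out of frame", which may help when some transcripts are left unannotated.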

(Reply written 3.4 years ago by mikhail.shugay)

Thanks for the response, Mikhail. I still have a handful of transcripts for which neither program annotates the frame, so your suggestions about calculating it myself may still come in handy.  Thanks for a nice package and the advice!

(Reply written 3.4 years ago by Chris Miller)