Question: Gene Fusion Detection: RNA-Seq Data
10
6.3 years ago by
KS340
KS340 wrote:

Hello everyone

I am trying to analyze RNA-Seq data. I am a beginner in this process and am trying to learn the software used for analyzing RNA-Seq data. I have 10 tumor samples with matching normal samples, and I need to find gene fusions in these samples as part of a school exercise. Could anyone please suggest how to proceed?

Thanks

ADD COMMENTlink modified 23 months ago by aditisk0 • written 6.3 years ago by KS340
1

Here are some reviews of RNA fusion detection tools:

http://www.hindawi.com/journals/bmri/2013/340620/
http://www.oapublishinglondon.com/article/617
http://www.nature.com/articles/srep21597

ADD REPLYlink modified 18 months ago • written 4.7 years ago by Malachi Griffith16k
35
3.0 years ago by
stianlagstad890
Oslo, Norway
stianlagstad890 wrote:

Tools capable of detecting fusion genes:

Most of these use RNA-seq data, some use WGS data, and some use both. They are listed alphabetically. I will add to the list when I discover more.

1. Barnacle: http://bmcgenomics.biomedcentral.com/articles/10.1186/1471-2164-14-550
2. Bellerophontes: http://bioinformatics.oxfordjournals.org/content/28/16/2114.long
3. BreakDancer: http://www.nature.com/nmeth/journal/v6/n9/abs/nmeth.1363.html
4. BreakFusion: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3389765/
5. BreakPointer: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3561864/
6. ChimeraScan: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3187648/
7. Comrad: http://bioinformatics.oxfordjournals.org/content/27/11/1481.long
8. CRAC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4053775/
9. deFuse: http://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1001138
10. Dissect: http://bioinformatics.oxfordjournals.org/content/28/12/i179.abstract
11. EBARDenovo: http://bioinformatics.oxfordjournals.org/content/early/2013/03/01/bioinformatics.btt092
12. EricScript: http://bioinformatics.oxfordjournals.org/content/28/24/3232
13. FusionAnalyser: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3439881/
14. FusionCatcher: http://biorxiv.org/content/early/2014/11/19/011650.full-text.pdf+html
15. FusionFinder: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3384600/
16. FusionHunter: http://bioinformatics.oxfordjournals.org/content/27/12/1708.long
17. FusionMap: http://bioinformatics.oxfordjournals.org/content/27/14/1922
18. FusionQ: http://www.biomedcentral.com/1471-2105/14/193
19. FusionSeq: http://www.genomebiology.com/2010/11/10/R104
20. IDP-fusion: http://nar.oxfordjournals.org/content/early/2015/06/03/nar.gkv562.full
21. iFUSE: http://bioinformatics.oxfordjournals.org/content/29/13/1700.long
22. InFusion: https://bitbucket.org/kokonech/infusion/wiki/Home
23. INTEGRATE: http://www.ncbi.nlm.nih.gov/pubmed/26556708
24. JAFFA: http://www.genomemedicine.com/content/7/1/43
25. LifeScope: http://www.thermofisher.com/no/en/home/life-science/sequencing/next-generation-sequencing/solid-next-generation-sequencing/solid-next-generation-sequencing-data-analysis-solutions/lifescope-data-analysis-solid-next-generation-sequencing/lifescope-genomic-analysis-software-solid-next-generation-sequencing.html
26. MapSplice: http://www.ncbi.nlm.nih.gov/pubmed/20802226
27. MOJO: https://github.com/cband/MOJO
28. nFuse: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3483554/
29. Pegasus: http://www.biomedcentral.com/1752-0509/8/97
30. PRADA: http://www.ncbi.nlm.nih.gov/pubmed/24695405
31. ShortFuse: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3072550/
32. SnowShoes-FTD: http://nar.oxfordjournals.org/content/39/15/e100
33. SOAPFuse: http://www.genomebiology.com/2013/14/2/R12
34. SOAPFusion: http://www.ncbi.nlm.nih.gov/pubmed/24123671
35. STAR: http://bioinformatics.oxfordjournals.org/content/29/1/15
36. STAR-Fusion: https://github.com/STAR-Fusion/STAR-Fusion/wiki
37. TopHat-Fusion: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3245612/
38. TRUP: http://www.genomebiology.com/2015/16/1/7
39. ViralFusionSeq: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3582262/


Other useful programs:

Chimeraviz: https://bioconductor.org/packages/devel/bioc/html/chimeraviz.html (disclaimer: I created this)

Chimera: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4253834/
OncoFuse: http://bioinformatics.oxfordjournals.org/content/29/20/2539.long
FuMa: http://bioinformatics.oxfordjournals.org/content/early/2015/12/09/bioinformatics.btv721.abstract


Articles comparing gene fusion finders:

The structure of state of art gene fusion-finder algorithms
Application of next generation sequencing to human gene fusion detection: computational tools, features and perspectives
Comprehensive evaluation of fusion transcript detection algorithms and a meta-caller to combine top performing methods in paired-end RNA-seq data

ADD COMMENTlink modified 18 months ago by Malachi Griffith16k • written 3.0 years ago by stianlagstad890

Wow. Great post!

ADD REPLYlink written 3.0 years ago by Obi Griffith17k

Thanks! I'm creating tools to visualize gene fusions for my master's thesis, so I've done quite a few searches :)

ADD REPLYlink written 3.0 years ago by stianlagstad890

Excellent. Please share those tools here as soon as you are able. This is an area that is lacking, so it's a good choice for your master's!

ADD REPLYlink written 3.0 years ago by Obi Griffith17k

You might be interested in the question I just posted: Which program, tool, or strategy do you use to visualize genomic rearrangements?

ADD REPLYlink written 3.0 years ago by stianlagstad890

Thanks for this summary! 

Based on your experience so far in this field, which of these tools would you recommend for someone who intends to analyze fusion transcripts in several species, all available in the Ensembl genome dataset?

ADD REPLYlink written 2.9 years ago by VHahaut1.1k

I don't know enough to answer that question, sorry. The three articles that compare fusion finders might help you.

ADD REPLYlink written 2.9 years ago by stianlagstad890
10
6.3 years ago by
Obi Griffith17k
Washington University, St Louis, USA
Obi Griffith17k wrote:

You could try TopHat-Fusion. But if you are really just getting started with RNA-Seq analysis, I would start with simple expression-level and differential expression analysis. To do that, you could start by installing and learning how to use the Tuxedo suite of software (Bowtie, TopHat, Cufflinks, Cuffdiff, CummeRbund). Once you have mastered those, you can proceed to the slightly more advanced TopHat-Fusion (which now comes together with TopHat2). They provide a tutorial on their website.

I strongly recommend a workshop/course on the subject. However, to get you started, why not work through last year's Canadian Bioinformatics Workshop (CBW) tutorial on RNA Sequence Analysis. You can find it on the 2011 course page under the "Informatics on High Throughput Sequencing Data" course. There are probably several other lectures there that will be helpful as well.

ADD COMMENTlink modified 6.3 years ago • written 6.3 years ago by Obi Griffith17k
1

@Griffith: link to "tutorial on their website" does not work.

ADD REPLYlink written 6.3 years ago by Dataminer2.5k

Thanks for noticing that. Fixed.

ADD REPLYlink written 6.3 years ago by Obi Griffith17k
1

Let me add my five cents. I've actually been using TopHat-Fusion (and then TopHat2) for most of my RNA-seq analysis. I think this is a gold standard in the field. Unfortunately for me (and others, see e.g. http://seqanswers.com/forums/showthread.php?t=13096), tophat-fusion-post is really hard to get working. TopHat-Fusion produces numerous false positives (most of the reported fusions come from the same transcript), as well as a lot of read-through events. I would like to recommend our post-filtering pipeline: http://bioinformatics.oxfordjournals.org/content/early/2013/08/24/bioinformatics.btt445 I hope you'll find it useful.
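
To make the filtering idea concrete, here is a minimal sketch of two simple post-filters: one that drops candidates whose two ends fall in the same gene, and one that flags likely read-throughs (different genes close together on the same chromosome and strand). The record layout, the function names, and the 100 kb gap are assumptions made for illustration only; this is not the pipeline from the paper linked above, nor part of tophat-fusion-post.

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    """Hypothetical fusion candidate record (not a real tool's output format)."""
    gene5: str    # 5' partner gene
    chrom5: str
    pos5: int     # breakpoint position in the 5' partner
    strand5: str
    gene3: str    # 3' partner gene
    chrom3: str
    pos3: int     # breakpoint position in the 3' partner
    strand3: str


# Assumed cutoff: partners closer than this on the same strand are
# treated as possible read-throughs. Tune for your own data.
READTHROUGH_MAX_BP = 100_000


def is_same_gene(c: Candidate) -> bool:
    """Both ends map to the same gene, so this is not a gene-to-gene fusion."""
    return c.gene5 == c.gene3


def is_readthrough(c: Candidate, max_gap: int = READTHROUGH_MAX_BP) -> bool:
    """Different genes, same chromosome and strand, and close together."""
    return (
        c.gene5 != c.gene3
        and c.chrom5 == c.chrom3
        and c.strand5 == c.strand3
        and abs(c.pos3 - c.pos5) <= max_gap
    )


def post_filter(candidates: List[Candidate]) -> List[Candidate]:
    """Keep candidates that are neither same-gene nor likely read-throughs."""
    return [
        c for c in candidates
        if not is_same_gene(c) and not is_readthrough(c)
    ]
```

Candidates that survive this kind of filter are still only candidates; coverage checks and wet-lab validation remain separate steps.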

ADD REPLYlink written 5.1 years ago by mikhail.shugay3.3k

Not gold-standard, but very popular.

ADD REPLYlink written 5.1 years ago by Sean Davis25k
1

Agreed. I don't think anything in the current crop can be considered a gold standard for fusion detection.

ADD REPLYlink written 5.0 years ago by Obi Griffith17k

Thanks! These filters should really be a part of tophat-fusion. Especially the gene-to-gene filter, but also the in-frame filter.

ADD REPLYlink written 5.0 years ago by Danielk550

It is important to note that in-frame filtering remains a complex question. Based on an analysis of manually mapped fusion junctions available in the literature [http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0004805], there can be indels that repair or break the fusion frame, and these have a very high impact on the presence and function of the fusion protein. I'm not sure whether currently available fusion detection software can handle this issue effectively (correct me if I'm wrong).
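
A small worked example of why junction indels matter for an in-frame filter, written as a hedged Python sketch. It assumes both breakpoints lie inside the coding sequence of their genes and ignores UTRs, premature stop codons, and alternative transcripts; the function and parameter names are hypothetical and do not come from any fusion caller.

```python
def is_in_frame(cds5_bases: int, cds3_offset: int, junction_indel: int = 0) -> bool:
    """Return True if the 3' partner is read in its original reading frame.

    cds5_bases     -- coding bases contributed by the 5' partner, counted
                      from its start codon up to the breakpoint
    cds3_offset    -- coding bases of the 3' partner that lie upstream of
                      the breakpoint and are therefore lost in the fusion
    junction_indel -- net bases inserted (+) or deleted (-) at the junction
    """
    # The first retained base of the 3' partner sits at position
    # cds5_bases + junction_indel in the fused coding sequence. Its
    # original position was cds3_offset. The frame is kept when the two
    # positions are congruent modulo 3.
    return (cds5_bases + junction_indel - cds3_offset) % 3 == 0
```

With these definitions, a junction with 301 coding bases from the 5' partner and a 3' offset of 150 is out of frame, but a single-base deletion at the junction restores the frame. A filter that looks only at annotated exon boundaries would miss that.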

ADD REPLYlink written 5.0 years ago by mikhail.shugay3.3k

I have a problem with RNA-seq. I ran TopHat2-Fusion and got a results.html file that lists 11 candidates. Now I want to validate them. How can I choose fusion candidates for validation?

ADD REPLYlink written 5.5 years ago by charitrakumarmishra0
2

@charitrakumarmishra: Ask it as a separate question!

ADD REPLYlink written 5.5 years ago by Rm7.7k
5
6.3 years ago by
Belgium, Brussels
Nicolas Rosewick6.6k wrote:

Here's a list of fusion gene detection tools working with RNA-seq data

ADD COMMENTlink modified 6.3 years ago by Malachi Griffith16k • written 6.3 years ago by Nicolas Rosewick6.6k
3
6.3 years ago by
Deutschland
David Langenberger8.2k wrote:

You could take the RNA-seq datasets and map them against a reference genome using mapping tools that can handle split reads. Two tools I can recommend are segemehl and TopHat-Fusion.

They use reads that overlap splice sites and therefore appear to be split when mapped back to the genome (one part maps to one exon, the other part to the next exon, with the intron in between) to predict splice sites and/or fusion transcripts. The basic difference is that at a splice site the two parts map relatively close to each other, whereas in a fusion transcript they map far apart, on different strands, or even on different chromosomes.
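
As a rough illustration of that distinction, here is a minimal Python sketch that labels a split read from the mapped positions of its two parts. The `SplitSegment` record, the `classify_split` function, and the 200 kb distance cutoff are hypothetical choices made for this example; they are not segemehl or TopHat-Fusion output formats or parameters.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SplitSegment:
    """One mapped part of a split read (hypothetical record)."""
    chrom: str
    pos: int      # genomic coordinate of the split point
    strand: str   # "+" or "-"


# Assumed cutoff: two parts further apart than this are not treated as
# an ordinary intron. The value is an illustration, not a recommendation.
MAX_INTRON_BP = 200_000


def classify_split(left: SplitSegment, right: SplitSegment,
                   max_intron_bp: int = MAX_INTRON_BP) -> str:
    """Label a split read as an ordinary splice junction or a fusion candidate."""
    if left.chrom != right.chrom:
        return "fusion-candidate: different chromosomes"
    if left.strand != right.strand:
        return "fusion-candidate: different strands"
    if abs(right.pos - left.pos) > max_intron_bp:
        return "fusion-candidate: distant on same chromosome"
    return "splice-junction"
```

Anything labelled as a candidate here is only a fusion junction candidate; read-throughs and mapping artefacts still have to be filtered out downstream.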

How to use segemehl for your problem (on the segemehl link, every step is explained in more detail):

  • create an index for your genome (you have to do it only once)
  • run segemehl
  • run haarz, which will call the splice junctions
ADD COMMENTlink written 6.3 years ago by David Langenberger8.2k
2

segemehl is NOT a fusion gene finder! segemehl is a simple aligner exactly like Bowtie2, BWA, STAR, etc. This is stated by the segemehl people here: http://seqanswers.com/forums/showthread.php?t=40765&highlight=segemehl

Finding splice junctions is different from finding fusion genes!

ADD REPLYlink modified 4.3 years ago • written 4.3 years ago by enxxx23170
1

I did not claim that segemehl is a 'fusion-gene-finder' per se, did I? I just explained, how you can use segemehl together with haarz to predict 'splice junctions', which can be fusion-junctions. :) 

ADD REPLYlink modified 4.3 years ago • written 4.3 years ago by David Langenberger8.2k

The question is asking for a fusion finder! TopHat-Fusion is in the same sentence as SEGEMEHL in your answer, when TopHat-Fusion is a fusion gene finder and SEGEMEHL is not!

Also, the three steps above are a method for finding splice junctions. Please note that finding splice junctions does not mean that one is finding fusion genes! Most of what will be found with those three steps (that is, 98%) will be readthroughs!

ADD REPLYlink written 4.3 years ago by enxxx23170
1

I see your point. Sorry for being somewhat off-topic then. I did not want to send anyone down the wrong track, nor upset you. Thanks for clarifying the difference of "fusion-junction" and "fusion-gene", of course it is something different and I should have been much more detailed in the first place. Just wanted to help though. :) Next time more precise! :)

ADD REPLYlink modified 4.3 years ago • written 4.3 years ago by David Langenberger8.2k

I see your point too, and I think this confusion comes from the authors of SEGEMEHL, who state in their paper that SEGEMEHL readily identifies fusion transcripts without the need for separate post-processing.

Here is the paper:

Hoffmann et al. A multi-split mapping algorithm for circular RNA, splicing, trans-splicing, and FUSION DETECTION, Genome Biol. 2014. http://www.ncbi.nlm.nih.gov/pubmed/24512684

and here is the quote from the above article:

"Implemented in the segemehl mapping tool, it readily identifies conventional splice junctions, collinear and non-collinear fusion transcripts, and trans-spliced RNAs, without the need for separate post-processing..."

ADD REPLYlink written 4.3 years ago by enxxx23170
1

I believe that overall gene fusions with both breakpoints in introns will be simply the most frequent class of fusions. So if a software could be sensitive enough to spot a junction between exons of different genes and report it - it is capable of fusion detection

ADD REPLYlink written 4.3 years ago by mikhail.shugay3.3k

According to this definition, BLAST/BLAT/BOWTIE/BOWTIE2/BWA are fusion finders too!

Only a few hundred fusion genes are known altogether across all cancers! See the Mitelman database of gene fusions and COSMIC (Catalogue of Somatic Mutations in Cancer)!

If one simply runs any aligner on one sample, one will "discover" thousands of candidate fusion genes, which makes one wonder how it is possible that one sample has more "gene fusions" than all known validated fusion genes in all cancers (which come from thousands of samples)?

ADD REPLYlink modified 4.3 years ago • written 4.3 years ago by enxxx23170

So, you wanted to use it for fusion-gene detection, but it didn't work and now you are upset? I'm really sorry, if our statements confused you and you wasted your time. Perhaps we can help you running segemehl and haarz to get some fusion-junctions and downstream some fusion-genes out of it? 

ADD REPLYlink modified 4.3 years ago • written 4.3 years ago by David Langenberger8.2k

Also a note for the rest of interested people:

If any of you is interested in learning how to use segemehl to detect fusion transcripts and/or circularized RNAs, I can recommend you the following hands-on course:

Discovering standard and non-standard RNA transcripts - How to detect canonical splicing, circular RNAs, trans-splicing, and fusion transcripts 

The developers of the algorithm will explain to you step by step how you can use segemehl to detect standard and non-standard transcripts. They will ensure that all of you understand the difference between 'fusion-junctions' and 'fusion-genes' and what exactly you can do with segemehl and its downstream analysis tools (such as lack or haarz).

ADD REPLYlink written 4.3 years ago by David Langenberger8.2k

If you wish I may give a lecture/presentation at your course about finding fusion genes and fusion genes finders where a clear and easy to understand explanation will be given about:

- what fusion genes really are, 

- what somatic fusion genes really are, 

- how many validated fusion genes are known today in the scientific literature, 

- what is the difference between conjoined genes and somatic fusion genes,

- what is the difference between germline fusion gene and somatic fusion gene,

- what is the difference between alternative splicing and fusion gene,

- why fusion genes are more interesting than readthroughs,

- how the validation in the wet-lab is done for fusion genes, and

- why it is important to do wet-lab validation of bioinformatic predictions (for example, in 2012 ENCODE found bioinformatically that over 80% of the human genome is functional, and biologists proved that claim very wrong; see: D. Graur, "On the immortality of television sets: “function” in the human genome according to the evolution-free gospel of ENCODE", 2013, http://gbe.oxfordjournals.org/content/early/2013/02/20/gbe.evt028.short?rss=1 )

ADD REPLYlink modified 4.3 years ago • written 4.3 years ago by enxxx23170

Giving a lecture is actually a great idea! Unfortunately, the workshop is already prepared, announced and there are no free slots left. Nevertheless, I will ask around at the bioinformatics group of the University of Leipzig and I'm pretty sure there are more than enough interested people for a talk. Thanks for your offer! I will come back to you and then we can discuss about a date. Can you send me your contact information? (Is there a possibility for private messages here? If not, send your contact info to david.langenberger@ecseq.com)

ADD REPLYlink modified 4.3 years ago • written 4.3 years ago by David Langenberger8.2k

Hi enxx23,

I have a great interest in this lecture, can you share the ppt with me?

ADD REPLYlink written 2.7 years ago by Lilian0
1

Most of the things identified by TopHat-Fusion will also be readthroughs. That's why some explicit readthrough filtering is usually added to the software. And by the way, those readthroughs are as real as gene fusions, with the mRNA molecules actually present in cells.

ADD REPLYlink written 4.3 years ago by mikhail.shugay3.3k

Again, only a few hundred gene fusions are known across all cancers! TopHat-Fusion finds ~130 000 candidate fusion genes in 4 samples (from: http://www.hindawi.com/journals/bmri/2013/340620/ )! That is over 1000 times more than all gene fusions known in all cancers! Surely 99.99% of those 130 000 fusions do not exist and are just false positives! There are fusion finders that offer the full package and require no additional filtering!

ADD REPLYlink written 4.3 years ago by enxxx23170
3
6.3 years ago by
Washington University School of Medicine, St. Louis, USA
Malachi Griffith16k wrote:

In addition to splice junction discovery, MapSplice is capable of detecting gene fusions in RNA-seq data. From their website:

MapSplice is an algorithm for mapping RNA-seq data to reference genome for splice junction discovery. Features of MapSplice include:

  1. alignment of both short reads < 75bp and long reads >= 75bp.
  2. both CPU and memory efficiency.
  3. detection of small exons.
  4. discovery of canonical, semi-canonical and non-canonical junctions.
  5. splice inference based on the alignment quality and diversity of reads mapped to a junction.
  6. identification of chimeric events (intra-chromosomes and inter-chromosomes, inter-strands) with long reads.
  7. identification of chimeric events (intra-chromosomes and inter-chromosomes, inter-strands) with short paired-end reads.
  8. support paired-end reads and single-end reads
ADD COMMENTlink written 6.3 years ago by Malachi Griffith16k
3
5.1 years ago by
Washington University School of Medicine, St. Louis, USA
Malachi Griffith16k wrote:

FusionCatcher is another option.

FusionCatcher searches for novel/known fusion genes, translocations, and chimeras in RNA-seq data (paired-end reads from Illumina NGS platforms like Solexa and HiSeq) from diseased samples. The aims of FusionCatcher are: very good detection rate for finding candidate fusion genes, very easy to use (i.e. no a priori knowledge of databases and bioinformatics is needed in order to run FusionCatcher), to be as automatic as possible (i.e. the FusionCatcher will choose automatically the best parameters in order to find candidate fusion genes, e.g. finding automatically the adapters, building the exon-exon junctions automatically based on the length of the input reads, etc.) while providing the best possible detection rate for finding fusion genes

Current citation for FusionCatcher:

D. Nicorici, M. Satalan, H. Edgren, S. Kangaspeska, A. Murumagi, O. Kallioniemi, S. Virtanen, O. Kilkku, FusionCatcher – a tool for finding somatic fusion genes in paired-end RNA-sequencing data, bioRxiv, Nov. 2014, DOI:10.1101/011650

ADD COMMENTlink modified 3.6 years ago • written 5.1 years ago by Malachi Griffith16k
2
6.3 years ago by
Sean Davis25k
National Institutes of Health, Bethesda, MD
Sean Davis25k wrote:

You can take a look at using GSNAP for this purpose, also.

ADD COMMENTlink modified 6.3 years ago by Malachi Griffith16k • written 6.3 years ago by Sean Davis25k
2
6.3 years ago by
Washington University School of Medicine, St. Louis, USA
Malachi Griffith16k wrote:

Trans-ABySS has been used to successfully identify gene-fusions from RNA-seq data by the group that developed it (full disclosure, I used to belong to that group).

ADD COMMENTlink written 6.3 years ago by Malachi Griffith16k
2
23 months ago by
Ron810
United States
Ron810 wrote:

Here is a fusion detection tool for supervised analysis. You have to provide your own list of fusion genes to check whether they are present. https://github.com/FusionInspector/FusionInspector/wiki Also, it gives the BAM files that show fusion support.

ADD COMMENTlink modified 23 months ago • written 23 months ago by Ron810
1
6.3 years ago by
Washington University School of Medicine, St. Louis, USA
Malachi Griffith16k wrote:

BreakFusion is another option that was recently reported and released.

ADD COMMENTlink written 6.3 years ago by Malachi Griffith16k
1
6.1 years ago by
Rm7.7k
Danville, PA
Rm7.7k wrote:

For Gene-fusion detection you can also try SnowShoes-FTD:

Article describing it: http://nar.oxfordjournals.org/content/early/2011/05/27/nar.gkr362.full

ADD COMMENTlink written 6.1 years ago by Rm7.7k
1
4.3 years ago by
enxxx23170
Finland
enxxx23170 wrote:

Here is a comparison of several Fusion Genes Finders:

https://code.google.com/p/fusioncatcher/wiki/comparison

ADD COMMENTlink written 4.3 years ago by enxxx23170

Still, this only provides some clues about the good recall of FusionCatcher. It's not very clear what the precision is (afaik the FusionCatcher paper is not yet available as an e-pub). So I believe the choice of fusion detection software has to be made via trial and error :)

ADD REPLYlink written 4.3 years ago by mikhail.shugay3.3k

Precision and FDR are also presented there!

SOAPfuse has that kind of statistics. http://genomebiology.com/2013/14/2/R12

If this helps you we are using deFuse. ;-)

ADD REPLYlink modified 4.3 years ago • written 4.3 years ago by enxxx23170
0
6.3 years ago by
JC6.8k
Mexico
JC6.8k wrote:

A third vote for TopHat-Fusion. I recently used it to look for fusions in brain cancer samples; it predicts a lot of possible fusions (you can filter the results by coverage and run a secondary analysis for mapping specificity with BLAT).

ADD COMMENTlink written 6.3 years ago by JC6.8k
0
6.3 years ago by
KS340
KS340 wrote:

Is TopHat fusion included in Galaxy Server (http://main.g2.bx.psu.edu/)??

ADD COMMENTlink written 6.3 years ago by KS340
2

This should probably be asked as a comment on one of the existing answers suggesting tophat-fusion or, since multiple answers suggested tophat-fusion, as a comment to your original question. But, definitely not as an answer to your question.

ADD REPLYlink written 6.3 years ago by Obi Griffith17k
0
5.1 years ago by
Rm7.7k
Danville, PA
Rm7.7k wrote:

The chimera Bioconductor package can use RNA-seq STAR outputs too.

ADD COMMENTlink written 5.1 years ago by Rm7.7k
0
4.7 years ago by
Washington University School of Medicine, St. Louis, USA
Malachi Griffith16k wrote:

Genomon Fusion is yet another option.

ADD COMMENTlink written 4.7 years ago by Malachi Griffith16k
0
4.7 years ago by
Washington University School of Medicine, St. Louis, USA
Malachi Griffith16k wrote:

SOAPfusion

"A novel tool for fusion discovery with paired-end RNA-Seq reads. The tool follows a different strategy by “finding fusions directly and verifying them”, differentiating it from all other existing tools by “finding the candidate regions and searching for the fusions afterwards”. This enables the fusion discovery process to be more effective and sensitive, also with a specular performance under low coverage of sequencing far more better than other tools."

Not to be confused with:

SOAPfuse

"An open source tool developed for genome-wide detection of fusion transcripts from paired-end RNA-Seq data. By comparing with previously released tools, SOAPfuse has a good performance. It is developed in perl. So far, it is developed only for analysis on human being RNA-Seq data."

ADD COMMENTlink modified 4.7 years ago • written 4.7 years ago by Malachi Griffith16k

Ok I'm quite confused now :) How are these tools performing compared to each other? Is one of them just a newer version of another?

ADD REPLYlink written 4.7 years ago by mikhail.shugay3.3k
0
3.3 years ago by
syxbestmayer0 wrote:

Have you ever used SOAPfuse? Can you tell me how to use it, including the details of the command?

ADD COMMENTlink written 3.3 years ago by syxbestmayer0
0
23 months ago by
aditisk0
aditisk0 wrote:

Hello everyone,

I had a concern about the outputs from two gene fusion calling tools that I was able to get to work. I ran TopHat-Fusion and STAR-Fusion on one of my RNA-Seq samples, and the output gene fusion lists don't match at all.

Has anyone else faced a similar issue before ? Any thoughts on why this could be the case would be really appreciated.

Thanks a lot.

ADD COMMENTlink written 23 months ago by aditisk0

Please provide more info:

  1. How was the output compared? Have you checked whether the outputs match when allowing a +/- shift in breakpoint coordinates? Are there any matches if the fusion lists are collapsed to gene pairs?

  2. What is the average number of fusions you've got with each tool? What about the coverage: number of reads spanning the junction itself, etc?

  3. Are you sure that there are any fusions in your sample? :)
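
The checks in point 1 can be sketched in a few lines of Python. This assumes each tool's output has already been parsed into dictionaries with hypothetical keys (gene5, gene3, chrom5, pos5, chrom3, pos3); neither TopHat-Fusion nor STAR-Fusion writes this format directly, and the 10 bp tolerance is an arbitrary example value.

```python
from typing import Dict, List, Set, Tuple

Call = Dict[str, object]  # one parsed fusion call (hypothetical layout)


def gene_pairs(calls: List[Call]) -> Set[Tuple[str, str]]:
    """Collapse calls to unordered gene pairs, ignoring breakpoints."""
    return {tuple(sorted((c["gene5"], c["gene3"]))) for c in calls}


def breakpoints_match(a: Call, b: Call, tol: int = 10) -> bool:
    """True if both breakpoints agree within +/- tol bases."""
    return (
        a["chrom5"] == b["chrom5"]
        and a["chrom3"] == b["chrom3"]
        and abs(a["pos5"] - b["pos5"]) <= tol
        and abs(a["pos3"] - b["pos3"]) <= tol
    )


def compare(calls_a: List[Call], calls_b: List[Call], tol: int = 10):
    """Return shared gene pairs and the call pairs with matching breakpoints."""
    shared_pairs = gene_pairs(calls_a) & gene_pairs(calls_b)
    shared_breakpoints = [
        (a, b)
        for a in calls_a
        for b in calls_b
        if breakpoints_match(a, b, tol)
    ]
    return shared_pairs, shared_breakpoints
```

If the gene-pair overlap is also empty, the disagreement is probably not just a coordinate convention issue, and points 2 and 3 become the more useful questions.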

ADD REPLYlink modified 23 months ago • written 23 months ago by mikhail.shugay3.3k

Actually, TopHat-Fusion and STAR-Fusion need a lot of improvement. TopHat-Fusion is consistently at the bottom of all comparisons of fusion finders. TopHat-Fusion calls hundreds of thousands of fusions per sample, when it is well known that fusions are very rare and one finds about one fusion for every 10 samples analyzed. Therefore it is expected that the lists of fusions from TopHat-Fusion and STAR-Fusion do not match at all.

ADD REPLYlink written 17 months ago by enxxx23170