4.5 years ago by
Czech Republic, Brno, CEITEC
First, thanks for the feedback. I'm the author of Oncofuse. I would like to note that the chimeric junctions in RNAstar are selected based on the following criteria:
"the segments belong to different chromosomes, or different strands, or are far from each other"
Junctions on the same chromosome that didn't make it into the "chimeric" category should be quite close together. Oncofuse filters out all junctions in which the reads belong to the same gene, as those are splicing events, while the tool is solely focused on gene fusions. Moreover, tools like Tophat-fusion report lots of such junctions.
If the reads come from genes that are close to each other, then there is a possibility that the transcript is of a readthrough nature. Such transcripts often occur in normal tissues, so a priori the likelihood of them being oncogenic is lower (yet there are many counter-examples).
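To make the distinction above concrete, here is a minimal sketch of how a junction could be triaged into splicing, possible readthrough, or fusion candidate. This is not the actual Oncofuse code: the function name, the labels, and the `readthrough_max` distance cutoff are all hypothetical and chosen only for illustration.

```python
def classify_junction(gene_a, gene_b, gene_distance, readthrough_max=100_000):
    """Triage a junction by the genes its two read parts map to.

    gene_distance is the distance in bp between the two genes, or None
    when they lie on different chromosomes. readthrough_max is a
    hypothetical cutoff, not a value taken from any tool.
    """
    if gene_a == gene_b:
        # Both parts fall in the same gene: an ordinary splicing event.
        return "splicing"
    if gene_distance is not None and gene_distance <= readthrough_max:
        # Neighbouring genes on the same chromosome: may be a readthrough.
        return "possible_readthrough"
    # Distant genes or different chromosomes: keep as a fusion candidate.
    return "fusion_candidate"


calls = [
    classify_junction("GENE1", "GENE1", 0),
    classify_junction("GENE1", "GENE2", 5_000),
    classify_junction("GENE1", "GENE3", None),
]
```

A "possible_readthrough" label here only flags the junction; as noted above, some such transcripts are still relevant, so dropping them outright would be a separate decision.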
Anyways, I believe the chimeric junction file also reports fusions on the same chromosome (see section 5.2 in https://github.com/alexdobin/STAR/blob/master/doc/STARmanual.pdf).
On the other hand, I agree that there appears to be no clear definition of a parameter that sets the minimal distance between read parts to be considered a chimera or not. The most likely option for this is
If there is really a chance that STAR misses important chimeric transcripts in the Chimeric junctions file, then I'll consider implementing a parser for it.
According to the reply here (https://github.com/alexdobin/STAR/issues/8), --alignIntronMax is the parameter that controls which junctions get filtered to "Chimeric" output files. Other options determine which junctions make it to standard output, SJ.out.tab.
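As a rough illustration of the quoted criteria ("different chromosomes, or different strands, or are far from each other"), the rule could be sketched as below. This is not STAR's actual implementation. It assumes that "far" means a distance larger than a threshold playing the role the linked reply attributes to --alignIntronMax; the function name, the `max_intron` argument, and the example value are hypothetical.

```python
def looks_chimeric(chrom_a, strand_a, pos_a, chrom_b, strand_b, pos_b, max_intron):
    """Return True if two aligned segments meet any of the quoted criteria.

    max_intron is an assumed distance threshold in bp, standing in for
    the role attributed to --alignIntronMax in the linked reply.
    """
    if chrom_a != chrom_b:
        return True  # segments on different chromosomes
    if strand_a != strand_b:
        return True  # same chromosome, different strands
    # Same chromosome and strand: chimeric only if the segments are far apart.
    return abs(pos_a - pos_b) > max_intron


MAX_INTRON = 200_000  # example value only, not a STAR default

same_chrom_close = looks_chimeric("chr1", "+", 1_000, "chr1", "+", 50_000, MAX_INTRON)
same_chrom_far = looks_chimeric("chr1", "+", 1_000, "chr1", "+", 900_000, MAX_INTRON)
```

Under these assumptions, a same-chromosome, same-strand junction closer than the threshold would not be treated as chimeric, which is the case discussed above.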