Possible to use Oncofuse on the splice junction output file of rna-star?
9.5 years ago

Hi everyone,

I'm using rna-star for my Illumina RNA-Seq data, and I recently found the tool oncofuse to be really helpful for checking for interesting fusion genes.

The problem I'm having is that rna-star outputs 2 files: a chimeric junction file, which works with oncofuse, and a splice junction file, which doesn't. The splice junction file reports linear fusions on the same chromosome, and because its format is different, oncofuse can't use it.

Does anyone know a way to make oncofuse accept the splice junction file, or to get rna-star to report linear junctions on the same chromosome in the chimeric junction format?

Thanks in advance for any help!

/Mark

oncofuse rna-star RNA-Seq splice-junction • 3.9k views
9.5 years ago

Hello!

First, thanks for the feedback. I'm the author of Oncofuse. I would like to note that the chimeric junctions in rna-star are selected based on the following criteria:

"the segments belong to different chromosomes, or different strands, or are far from each other"

Those junctions that didn't make it into the "chimeric" category while being on the same chromosome should be quite close to each other. Oncofuse filters out all junctions in which both read segments belong to the same gene, as those are splicing events, while the tool is solely focused on gene fusions. Moreover, tools like Tophat-fusion report lots of such junctions.

If the reads come from genes that are close to each other, then there is a possibility that the transcript is of readthrough nature. Such transcripts often occur in normal tissues, so a priori the likelihood of them being oncogenic is lower (yet there are many counter-examples).

Anyways, I believe the chimeric junction file also reports fusions on the same chromosome (see section 5.2 in https://github.com/alexdobin/STAR/blob/master/doc/STARmanual.pdf).

On the other hand, I agree that there appears to be no clearly documented parameter that sets the minimal distance between read parts for them to be considered a chimera. The most likely candidate is the --outSJfilterIntronMaxVsReadN option.

If there is really a chance that STAR misses important chimeric transcripts in Chimeric junctions file, then I'll consider implementing a parser for it.
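
For reference, a parser of this kind could look roughly like the sketch below. It is only an illustration, not Oncofuse code. It assumes the 9-column, tab-separated SJ.out.tab layout described in the STAR manual (chromosome, intron start, intron end, strand coded as 0/1/2, motif, annotation flag, unique reads, multi-mapping reads, maximum overhang). The names `Junction` and `long_junctions` and the 100 kb threshold are hypothetical, chosen just for the example.

```python
import csv
import io
from typing import Iterator, NamedTuple, TextIO


class Junction(NamedTuple):
    chrom: str
    start: int         # first base of the gap (1-based)
    end: int           # last base of the gap (1-based)
    strand: str        # '+', '-', or '.' when undefined
    unique_reads: int  # uniquely mapping reads crossing the junction


# SJ.out.tab codes the strand as 0 (undefined), 1 (+), 2 (-).
STRAND = {"0": ".", "1": "+", "2": "-"}


def long_junctions(handle: TextIO, min_gap: int,
                   min_reads: int = 1) -> Iterator[Junction]:
    """Yield same-chromosome junctions whose gap is at least min_gap bases."""
    for row in csv.reader(handle, delimiter="\t"):
        if len(row) < 9:
            continue  # skip malformed or truncated lines
        start, end = int(row[1]), int(row[2])
        unique = int(row[6])
        if end - start + 1 >= min_gap and unique >= min_reads:
            yield Junction(row[0], start, end,
                           STRAND.get(row[3], "."), unique)


# Two made-up records: a short intron and a long-range junction.
example = (
    "chr1\t1000\t1500\t1\t1\t1\t12\t0\t40\n"
    "chr1\t20000\t350000\t1\t1\t0\t5\t1\t33\n"
)
hits = list(long_junctions(io.StringIO(example), min_gap=100_000))
for hit in hits:
    print(hit)
```

With a real run, `open("SJ.out.tab")` would replace the in-memory example. The remaining long-range junctions would still need gene annotation on both sides before they could be treated as fusion or readthrough candidates.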

EDIT

According to the reply here, --alignIntronMax is the parameter that controls which junctions get filtered to "Chimeric" output files. Other options determine which junctions make it to standard output, SJ.out.tab.
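
To make the relationship between these options concrete, here is a small sketch that only assembles a STAR command line, without running it. The option names (--genomeDir, --readFilesIn, --chimSegmentMin, --alignIntronMax) are real STAR options, but the helper `star_chimeric_args`, the file names, and the numeric values are assumptions made up for the example, not recommended settings.

```python
from typing import List, Sequence


def star_chimeric_args(genome_dir: str, reads: Sequence[str],
                       align_intron_max: int,
                       chim_segment_min: int = 15) -> List[str]:
    """Build (but do not execute) a STAR argument list.

    Hypothetical helper: chimeric output requires --chimSegmentMin > 0,
    and, per the reply referenced above, --alignIntronMax is the gap
    size beyond which collinear segments are treated as chimeric.
    """
    return [
        "STAR",
        "--genomeDir", genome_dir,
        "--readFilesIn", *reads,
        "--chimSegmentMin", str(chim_segment_min),
        "--alignIntronMax", str(align_intron_max),
    ]


# Placeholder paths and values, for illustration only.
args = star_chimeric_args("star_index", ["r1.fastq", "r2.fastq"], 200000)
print(" ".join(args))
```

Keeping the gap limit as an explicit argument makes it easy to see which value decides whether a same-chromosome, same-strand junction lands in SJ.out.tab or in the chimeric output.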

Best regards,
Mike


Hi Mike,

I was hoping you would react. Thanks btw for the very nice tool!

I think I understand the differences between the junction files. My understanding is that if the mapped read segments are on the same strand and the same chromosome, they get reported in the splice junction output, subject to --outSJfilterIntronMaxVsReadN. You definitely need to set that high (~10 Mb, or even 100 Mb?), since otherwise junctions that are very far apart end up in neither the chimeric junction file nor the splice junction file.

We actually had an example of a known fusion gene whose partners were close together on the same chromosome and linear (same strand), and it was not reported in either file because --outSJfilterIntronMaxVsReadN was set too small. I still find this strange, because you could potentially miss junctions this way, although I don't know whether this is still the case in recent versions.

I feel that this could also be something the author of rna-star could think about so I will see if I can contact him as well. (maybe I'm missing something)

Thanks a lot for your reply.

Best,
Mark


Ok now I see, it would be wise to ask the author then. I've just created an issue at the STAR repository.


Please see the edit above; it clarifies a lot.
