Question

define alternative splice sites

6.9 years ago

novicebioinforesearcher ▴ 70

Given junction.bed files from TopHat, how can I define exon splicing events, for example skipped exons, constitutive exons, or skipped junctions?

alternative splicing • 2.0k views

ADD COMMENT • link updated 6.9 years ago by Malcolm.Cook ★ 1.5k • written 6.9 years ago by novicebioinforesearcher ▴ 70

score 0 · Answer 1 · 2017-05-26

6.9 years ago

Devon Ryan 104k

That's what rMATS is for, though it takes the BAM file as input rather than the junction.bed file (which, in my opinion, is essentially worthless).

Also, stop using TopHat. Use something better, like STAR or even HISAT2.

ADD COMMENT • link 6.9 years ago by Devon Ryan 104k

So how do I use BAM files to look for splicing events? Could you explain the algorithm for detecting splicing events?

ADD REPLY • link 6.9 years ago by Ben ▴ 60

Please see the rMATS paper.

ADD REPLY • link 6.9 years ago by Devon Ryan 104k

Thanks @Devon Ryan, I am aware of rMATS. I am picking up someone else's work midway, and I need to annotate junctions that were found to be differentially expressed using DEXSeq. I only have the set of junctions now, with no access to the FASTQ or BAM files.

ADD REPLY • link 6.9 years ago by novicebioinforesearcher ▴ 70

Hmm, I'm sure there's something for this but it's not something I've ever needed to do. For the most part, what you're seeing is just changes in isoform usage.

ADD REPLY • link 6.9 years ago by Devon Ryan 104k

Yes, I guess I need to write something myself.

ADD REPLY • link 6.9 years ago by novicebioinforesearcher ▴ 70

score 0 · Answer 2 · 2017-05-26

Hi NBS:

If you provide:

1. an exhaustive list of the varieties of "exon splicing events" you wish to define,
2. a description of what you mean by each, and
3. an example or two of each, given as a simulated junction.bed file,

... then perhaps someone will be able or inclined to take up your challenge.

In my experience, these terms are not consistently defined in the literature so it would be a mistake to try and assume what you really want.

For example, I've never heard of "skipped junctions" as a kind of "exon splicing event".

Similarly, "constitutive exon" is a label given to an exon which appears in every (known) isoform of a gene. But it is not a name for an "exon splicing event".

So you really have to be quite specific in what you are asking for.

That said, I expect that however you frame the question, you will find that knowing just the locations of (putative) introns, as provided by a junction.bed file, will prove insufficient to answer it. This is because these files don't tell you where the surrounding exons begin and end. They just tell you where the introns are.
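To make that concrete, here is a minimal sketch of pulling the intron out of one junctions.bed line. It assumes the usual TopHat layout, a 12-column BED record with two blocks, where each block is the read overhang flanking the junction rather than a full exon. The function name `junction_to_intron` and the example line are made up for illustration.

```python
def junction_to_intron(bed_line):
    """Return (chrom, intron_start, intron_end, strand, n_reads) for one
    TopHat-style junctions.bed line (assumed: BED12 with two blocks)."""
    f = bed_line.rstrip("\n").split("\t")
    chrom, chrom_start, chrom_end = f[0], int(f[1]), int(f[2])
    n_reads, strand = int(f[4]), f[5]
    # Column 11 holds the two overhang lengths, e.g. "50,40"
    block_sizes = [int(x) for x in f[10].rstrip(",").split(",")]
    # The intron is what lies between the two overhang blocks
    # (0-based start, end-exclusive, as in BED).
    intron_start = chrom_start + block_sizes[0]
    intron_end = chrom_end - block_sizes[-1]
    return chrom, intron_start, intron_end, strand, n_reads


# Hypothetical example line, not taken from a real data set
line = "chr1\t100\t400\tJUNC00000001\t12\t+\t100\t400\t255,0,0\t2\t50,40\t0,260"
print(junction_to_intron(line))
```

Note that the block sizes only tell you how far the spanning reads reached on either side; they say nothing about where the flanking exons actually start or end.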

Nonetheless: you might think along these lines:

1. Consider your junctions.bed file(s) as a directed graph, with each line in the file representing an 'edge' connecting a 'donor' site with an 'acceptor' site (where each site is an integer chromosomal coordinate).
2. Then split it up into its 'connected components'.
3. Then relabel the sites in each connected component, replacing each chromosomal coordinate with its rank among the coordinates in that component.

Then each unique graph might correspond to a "kind" of exon splicing event.
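A rough sketch of those three steps in Python follows. It is one possible reading rather than a tested tool: components are found with a small union-find, ignoring edge direction when grouping, and ranks are assigned within each component. The function name `splice_motifs` and the coordinates are hypothetical, and strand and chromosome are ignored.

```python
from collections import defaultdict


def splice_motifs(junctions):
    """Group (donor, acceptor) edges into connected components and relabel
    each coordinate by its rank within its own component."""
    parent = {}

    def find(x):
        # Union-find lookup with path halving
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Step 2: junctions sharing a donor or an acceptor end up in one component
    for donor, acceptor in junctions:
        parent[find(donor)] = find(acceptor)

    components = defaultdict(list)
    for donor, acceptor in junctions:
        components[find(donor)].append((donor, acceptor))

    # Step 3: replace coordinates by their rank (1-based) within the component
    motifs = []
    for edges in components.values():
        coords = sorted({c for edge in edges for c in edge})
        rank = {c: i + 1 for i, c in enumerate(coords)}
        motifs.append(sorted((rank[d], rank[a]) for d, a in edges))
    return sorted(motifs)


# Hypothetical junctions as (donor, acceptor) coordinates
junctions = [(100, 200), (100, 300), (500, 700), (600, 700), (900, 1000), (900, 1100)]
print(splice_motifs(junctions))
```

With these made-up coordinates, two components reduce to the motif (1,2),(1,3) and one reduces to (1,3),(2,3).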

Example

Ignoring strand and chromosome for simplicity, given these junctions as the input directed graph:

The connected components would be:

which would be relabeled as:

Now, you might decide that [[1,2],[1,3]] is the canonical motif for an alternative acceptor event (of which we have 2), and [[1,3],[2,3]] is the canonical motif for an alternative donor event (of which we have 1).
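Counting motifs this way could look like the sketch below. The lookup table contains only the two motifs named above; anything else is reported as "unclassified". The names `CANONICAL_MOTIFS` and `count_events` and the input list are hypothetical.

```python
from collections import Counter

# Only the two motifs discussed above; extend at your own risk
CANONICAL_MOTIFS = {
    ((1, 2), (1, 3)): "alternative acceptor",
    ((1, 3), (2, 3)): "alternative donor",
}


def count_events(motifs):
    """Count how often each relabeled motif matches a named canonical motif."""
    counts = Counter()
    for motif in motifs:
        # Normalise to a sorted tuple of tuples so edge order does not matter
        key = tuple(sorted(tuple(edge) for edge in motif))
        counts[CANONICAL_MOTIFS.get(key, "unclassified")] += 1
    return counts


# Hypothetical relabeled components
motifs = [
    [[1, 2], [1, 3]],
    [[1, 3], [2, 3]],
    [[1, 3], [1, 2]],          # same motif as the first, edges in another order
    [[1, 2], [1, 4], [3, 4]],  # not in the table
]
print(count_events(motifs))
```

The labels are only as good as the assumption that the junctions alone determine the event type.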

BUT, remember, you don't know where the surrounding exons end, so you might well be making a mistake in so doing.

If you know and represent the extent of the surrounding exons (as might be inferred from RNA-Seq coverage, or given as known in a GTF file), this kind of approach extends nicely. It is a little trickier, but doable.

FWIW: I still wonder if these categories are really biologically meaningful. Many different schemes have been devised to classify them (a good review appears in: A General Definition and Nomenclature for Alternative Splicing Events) but less interesting work has substantiated that these classes are biologically relevant, my prior efforts notwithstanding. I would appreciate being educated contrariwise here... For instance: do we know that different RBPs control switching between A3SS (Alternative 3' Splice Sites) than control switching between, say, MXE (Mutually Exclusive Exons)? That would be interesting!