That's what rMATS is for, though it takes the BAM file as input rather than the junction.bed file (which, in my opinion, is essentially worthless).
Also, stop using TopHat. Use something better, like STAR or even HISAT2.
If you provide
- an exhaustive list of the varieties of "exon splicing events" you wish to define
- descriptions of what you mean by each
- an example or two of each given as a simulated junction.bed file
... then perhaps someone will be able or inclined to take your challenge.
In my experience, these terms are not consistently defined in the literature so it would be a mistake to try and assume what you really want.
For example, I've never heard of "skipped junctions" as a kind of "exon splicing event".
Similarly, "constitutive exon" is a label given to an exon which appears in every (known) isoform of a gene. But it is not a name for an "exon splicing event".
So you really have to be quite specific in what you are asking for.
That said, I expect that however you frame the question, you will find that knowing just the locations of (putative) introns, as provided by a junction.bed file, will prove insufficient to answer it. This is because these files don't tell you where the surrounding exons begin and end. They just tell you where the introns are.
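To make that limitation concrete, here is a minimal sketch of pulling the intron coordinates out of one junction line. It assumes the TopHat-style BED12 layout, in which each junction is written as two blocks (the maximal read overhangs on either side) flanking the intron; the function name `junction_to_intron` and the example line are hypothetical, made up for illustration.

```python
def junction_to_intron(bed_line):
    """Return (chrom, intron_start, intron_end, strand) for one junction line.

    Assumption: TopHat-style BED12, i.e. exactly two blocks (read overhangs)
    flanking the intron. Coordinates are 0-based, half-open, as in BED.
    """
    fields = bed_line.rstrip("\n").split("\t")
    chrom, chrom_start, strand = fields[0], int(fields[1]), fields[5]
    block_sizes = [int(x) for x in fields[10].rstrip(",").split(",")]
    block_starts = [int(x) for x in fields[11].rstrip(",").split(",")]
    intron_start = chrom_start + block_sizes[0]   # where the left overhang ends
    intron_end = chrom_start + block_starts[1]    # where the right overhang begins
    return chrom, intron_start, intron_end, strand


# Hypothetical junction: overhangs of 100 and 50 bases around one intron.
line = "chr1\t1000\t1350\tJUNC00000001\t12\t+\t1000\t1350\t255,0,0\t2\t100,50\t0,300"
print(junction_to_intron(line))
```

Note that the block sizes (100 and 50 here) only reflect how far reads happened to overhang the junction. They are not exon boundaries, which is exactly the information that is missing.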
Nonetheless, you might think along these lines:
Consider your junction.bed file(s) as directed graph(s), with each line in the file representing an 'edge' connecting a 'donor' site with an 'acceptor' site (where the sites are integers giving the chromosomal coordinate).
Then split it up into a set of its 'connected components'.
Then relabel each connected component, replacing each chromosomal coordinate with its rank among the coordinates in that component.
Then each unique graph might correspond to a "kind" of exon splicing event.
For example (ignoring strand and chromosome for simplicity), given these junctions as the input directed graph:
    1100 1200
    1100 1300
    2100 2200
    2100 2300
    3100 3400
    3200 3400
The connected components would be:
    1100 1200
    1100 1300

    2100 2200
    2100 2300

    3100 3400
    3200 3400
which would be relabeled as:

    1 2
    1 3

    1 2
    1 3

    1 3
    2 3
Now, you might decide that [[1,2],[1,3]] is the canonical motif for an alternative acceptor event (of which we have 2), and [[1,3],[2,3]] is the canonical motif for an alternative donor event (of which we have 1).
BUT, remember, you don't know where the surrounding exons end, so you might well be making a mistake in so doing.
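The graph idea above can be sketched in a few lines of Python. This is only an illustration under the same simplifications (strand and chromosome ignored, so "donor" and "acceptor" assume the plus strand); the helper names `connected_components` and `relabel` are my own, and the union-find is just one convenient way to get the components.

```python
from collections import Counter, defaultdict


def connected_components(edges):
    """Group directed (donor, acceptor) edges into weakly connected components."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    # Union the two ends of every junction.
    for donor, acceptor in edges:
        parent[find(donor)] = find(acceptor)

    # Collect the edges belonging to each component root.
    groups = defaultdict(list)
    for donor, acceptor in edges:
        groups[find(donor)].append((donor, acceptor))
    return list(groups.values())


def relabel(component):
    """Replace each coordinate by its rank among the sites of this component."""
    sites = sorted({site for edge in component for site in edge})
    rank = {site: i for i, site in enumerate(sites, start=1)}
    return tuple(sorted((rank[d], rank[a]) for d, a in component))


# The example junctions from above (strand and chromosome ignored).
junctions = [(1100, 1200), (1100, 1300),
             (2100, 2200), (2100, 2300),
             (3100, 3400), (3200, 3400)]

components = connected_components(junctions)
motif_counts = Counter(relabel(c) for c in components)
for motif, count in sorted(motif_counts.items()):
    print(motif, count)
```

On the example this counts the motif ((1, 2), (1, 3)) twice and ((1, 3), (2, 3)) once, matching the two alternative acceptor candidates and the one alternative donor candidate. Which motif you map to which event name is your decision, and the caveat about unknown exon boundaries still applies.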
If you know and represent the extent of the surrounding exons (as might be inferred from RNA-Seq coverage, or given as known in a GTF file), this kind of approach extends nicely. It is a little trickier, but doable.
FWIW: I still wonder whether these categories are really biologically meaningful. Many different schemes have been devised to classify them (a good review appears in "A General Definition and Nomenclature for Alternative Splicing Events"), but far less work has substantiated that these classes are biologically relevant, my prior efforts notwithstanding. I would appreciate being educated to the contrary here... For instance: do we know that the RBPs that control switching between A3SS (Alternative 3' Splice Site) events differ from those that control switching between, say, MXE (Mutually Exclusive Exon) events? That would be interesting!