I previously called splicing events (e.g., intron retention) on a large cohort using the alternative splicing detection software SplAdder (https://spladder.readthedocs.io/en/latest/spladder_cohort.html). The event output from SplAdder is stored in GFF3 format, with each event represented as a "mini gene". For example, in the intron retention case:
chr1 intron_retention gene 629062 629433 . + . ID=intron_retention.3;GeneName="ENSG00000225972.1";HasNovelJunction="Y"
chr1 intron_retention mRNA 629062 629433 . + . ID=intron_retention.3_iso1;Parent=intron_retention.3;GeneName="ENSG00000225972.1";HasNovelJunction="Y"
chr1 intron_retention exon 629062 629256 . + . Parent=intron_retention.3_iso1
chr1 intron_retention exon 629324 629433 . + . Parent=intron_retention.3_iso1
chr1 intron_retention mRNA 629062 629433 . + . ID=intron_retention.3_iso2;Parent=intron_retention.3;GeneName="ENSG00000225972.1";HasNovelJunction="N"
chr1 intron_retention exon 629062 629433 . + . Parent=intron_retention.3_iso2
The event intron_retention.3 has two isoforms: intron_retention.3_iso1 is the spliced-out version (intron removed, two exons), while intron_retention.3_iso2 is the spliced-in version (intron retained, one continuous exon).
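To make the structure concrete, here is a minimal sketch of how the retained intron coordinates could be recovered from such a file. It assumes that the `_iso1` isoform always has exactly two exons and that the intron is the gap between them; the function names `parse_attributes` and `retained_introns` are hypothetical helpers, not part of SplAdder.

```python
from collections import defaultdict

example = """\
chr1 intron_retention gene 629062 629433 . + . ID=intron_retention.3;GeneName="ENSG00000225972.1";HasNovelJunction="Y"
chr1 intron_retention mRNA 629062 629433 . + . ID=intron_retention.3_iso1;Parent=intron_retention.3;GeneName="ENSG00000225972.1";HasNovelJunction="Y"
chr1 intron_retention exon 629062 629256 . + . Parent=intron_retention.3_iso1
chr1 intron_retention exon 629324 629433 . + . Parent=intron_retention.3_iso1
chr1 intron_retention mRNA 629062 629433 . + . ID=intron_retention.3_iso2;Parent=intron_retention.3;GeneName="ENSG00000225972.1";HasNovelJunction="N"
chr1 intron_retention exon 629062 629433 . + . Parent=intron_retention.3_iso2
"""


def parse_attributes(field):
    """Turn a GFF3 attribute column (key=value;key=value) into a dict."""
    attrs = {}
    for item in field.strip().split(";"):
        if "=" in item:
            key, value = item.split("=", 1)
            attrs[key] = value.strip('"')
    return attrs


def retained_introns(gff_lines):
    """Return {event_id: (chrom, intron_start, intron_end, strand)}.

    Assumption: the spliced-out isoform is named <event>_iso1 and has
    exactly two exons; the intron is the 1-based inclusive gap between them.
    """
    exons = defaultdict(list)  # isoform ID -> list of (start, end)
    info = {}                  # isoform ID -> (chrom, strand)
    for line in gff_lines:
        if not line.strip() or line.startswith("#"):
            continue
        # Split on any whitespace so tab- and space-separated input both work.
        cols = line.split(None, 8)
        if len(cols) < 9 or cols[2] != "exon":
            continue
        parent = parse_attributes(cols[8])["Parent"]
        exons[parent].append((int(cols[3]), int(cols[4])))
        info[parent] = (cols[0], cols[6])

    introns = {}
    for iso, ex in exons.items():
        if not iso.endswith("_iso1") or len(ex) != 2:
            continue
        (s1, e1), (s2, e2) = sorted(ex)
        chrom, strand = info[iso]
        event = iso[: -len("_iso1")]
        introns[event] = (chrom, e1 + 1, s2 - 1, strand)
    return introns


introns = retained_introns(example.splitlines())
print(introns)
```

For the event above, the intron would span chr1:629257-629323 on the + strand.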
Now I have a new set of samples as paired-end FASTQ files. Instead of calling new events, I want to 1) map the FASTQ files back to the pre-called events (GFF3), 2) quantify splice-in and splice-out reads, and 3) calculate PSI using the read counts covering the splice junctions (see below; in the intron retention case, the alt exon will be the intron).
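For step 3, the calculation I have in mind can be sketched as follows. This assumes a junction-count definition of PSI in which the splice-in count is the average of the reads crossing the two exon-intron boundaries and the splice-out count is the number of reads spanning the spliced junction; the function name and argument names are hypothetical, and the counts themselves would have to come from steps 1 and 2.

```python
def psi_intron_retention(left_boundary_reads, right_boundary_reads,
                         spliced_junction_reads):
    """Percent spliced-in for one intron retention event.

    Assumed definition: PSI = I / (I + E), where
      I = mean of reads crossing the 5' and 3' exon-intron boundaries
          (splice-in evidence),
      E = reads spanning the exon-exon junction (splice-out evidence).
    Returns None when there is no supporting read at all.
    """
    inclusion = (left_boundary_reads + right_boundary_reads) / 2
    total = inclusion + spliced_junction_reads
    if total == 0:
        return None
    return inclusion / total


# Example with made-up counts: 10 and 30 boundary reads, 20 junction reads.
psi = psi_intron_retention(10, 30, 20)
print(psi)
```

Averaging the two boundaries keeps the splice-in evidence on the same scale as the single splice-out junction; other normalisations are possible.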
I have explored SplAdder itself, but it does not seem to offer an obvious solution for this task. If that is the case, what external tools can I use to achieve the three steps above?