I am working on ~100 samples to detect alternative splicing. TopHat generates a junctions.bed file for each sample. However, each of these BED files has a different number of rows, and the coordinates of a given junction are not the same across samples. I think each junctions.bed file includes both known and novel junctions.
Since I am only interested in known junctions from the Ensembl annotation, how can I map these 100 junctions.bed files to the Ensembl GTF file and obtain a matrix with known junctions as rows and sample IDs as columns?
Or do I need to create an exon-exon junction annotation BED file from Ensembl, then apply RSeQC to obtain read counts for each junction from the mapped .bam files?
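
To make the first option concrete, here is a rough sketch of what I have in mind, not a tested pipeline. As far as I understand the TopHat manual, each junctions.bed row spans the maximal read overhang on both sides of the junction, which would explain why the start/end columns differ between samples; the intron itself should be recoverable as chromStart + first block size and chromEnd - second block size, with the score column holding the number of spanning alignments. The function names (`known_junctions_from_gtf`, `junction_counts_from_bed`, `build_matrix`) are my own, and the sketch assumes the GTF and BED files use the same chromosome names and that the strand reported by TopHat matches the annotated strand.

```python
from collections import defaultdict


def known_junctions_from_gtf(gtf_lines):
    """Collect annotated introns as (chrom, start, end, strand).

    Coordinates are converted to 0-based, half-open intervals so they
    can be compared directly with BED-derived junctions.
    """
    exons = defaultdict(list)   # transcript_id -> [(exon_start, exon_end), ...]
    info = {}                   # transcript_id -> (chrom, strand)
    for line in gtf_lines:
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "exon":
            continue
        transcript_id = None
        for attr in fields[8].split(";"):
            attr = attr.strip()
            if attr.startswith("transcript_id "):
                transcript_id = attr.split(" ", 1)[1].strip('"')
        if transcript_id is None:
            continue
        exons[transcript_id].append((int(fields[3]), int(fields[4])))
        info[transcript_id] = (fields[0], fields[6])

    junctions = set()
    for transcript_id, exon_list in exons.items():
        chrom, strand = info[transcript_id]
        exon_list.sort()
        # Each pair of consecutive exons in a transcript defines one intron.
        for (_, end1), (start2, _) in zip(exon_list, exon_list[1:]):
            # GTF is 1-based inclusive: the intron is [end1, start2 - 1) in BED terms.
            junctions.add((chrom, end1, start2 - 1, strand))
    return junctions


def junction_counts_from_bed(bed_lines):
    """Map (chrom, intron_start, intron_end, strand) -> read count for one sample."""
    counts = {}
    for line in bed_lines:
        if line.startswith("track") or not line.strip():
            continue
        f = line.rstrip("\n").split("\t")
        chrom, start, end = f[0], int(f[1]), int(f[2])
        score, strand = int(f[4]), f[5]
        block_sizes = [int(x) for x in f[10].rstrip(",").split(",")]
        # Trim the two overhang blocks to get the intron coordinates.
        key = (chrom, start + block_sizes[0], end - block_sizes[1], strand)
        counts[key] = counts.get(key, 0) + score
    return counts


def build_matrix(known, sample_counts):
    """Return (rows, samples, matrix); novel junctions are dropped, missing ones are 0."""
    rows = sorted(known)
    samples = sorted(sample_counts)
    matrix = [[sample_counts[s].get(j, 0) for s in samples] for j in rows]
    return rows, samples, matrix
```

In use, I would call `known_junctions_from_gtf` once on the Ensembl GTF, call `junction_counts_from_bed` on each sample's junctions.bed to fill a dict keyed by sample ID, and pass both to `build_matrix`. Junctions absent from the annotation never enter the matrix, and annotated junctions with no reads in a sample get 0.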