Question

junctions.bed file in hisat2

1

Entering edit mode

8.3 years ago

badribio ▴ 290

Hi,

I was wondering if there is a way I can generate junctions.bed file from hisat2 just like the one in tophat.

Hisat2 documentation says use extract_splice_site.py python script but that gets splice site from gtf file, is there a way to know splice sites and no of reads associated with those sites using hisat2?

Thanks

hisat pipeline alignment splicing • 5.3k views

ADD COMMENT • link updated 8.2 years ago by EagleEye 7.6k • written 8.3 years ago by badribio ▴ 290

4

Entering edit mode

8.2 years ago

badribio ▴ 290

Figured out this does the job

https://regtools.readthedocs.io/en/latest/commands/junctions-extract/

ADD COMMENT • link 8.2 years ago by badribio ▴ 290

score 4 · Accepted Answer · 2017-03-22

Easiest solution will be to use featureCounts and counts reads from exon-exon junctions by providing your aligned BAM files. Check manual page 35 for detailed information.

**-J (juncCounts)**    Count the number of reads supporting each exon-exon junction.
Junctions are identified from those exon-spanning reads
(containing ‘N’ in CIGAR string) in input data. The output
result includes names of primary and secondary genes that
overlap at least one of the two splice sites of a junction. Only
one primary gene is reported, but there might be more than
one secondary gene reported. Secondary genes do not overlap
more splice sites than the primary gene. When the primary
and secondary genes overlap same number of splice sites, the
gene with the smallest leftmost base position is selected as
the primary gene. Also included in the output result are the
position information for the left splice site (‘Site1’) and the
right splice site (‘Site2’) of a junction. These include chromosome
name, coordinate and strand of the splice site. In
the last columns of the output, number of supporting reads is
provided for each junction for each library