Question: Detecting a small set of given Intron-Exon junctions in an RNA-Seq dataset
4.1 years ago by
France/Nantes/Institut du Thorax - INSERM UMR1087
Pierre Lindenbaum128k wrote:

Hi all, I've been given a set of RNA-Seq fastqs (4 samples) and a collaborator would like to know how many of the defined intron-exon junctions in his favorite gene are detected in those reads.

My idea was to clean the reads with cutadapt, align with Bowtie and create a custom program to loop over the CIGAR strings and detect the junctions.

Or is there an existing tool for this? Bonus if the tool can tell me whether two junctions belong to the same transcript.
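For reference, the CIGAR-walking idea could look roughly like the sketch below. It is only a minimal illustration, assuming SAM text input from a splice-aware aligner (an intron only shows up as an `N` operation in a spliced alignment) and assuming the target junctions are given as 1-based inclusive intron coordinates. The function names are made up for this example.

```python
import re

# CIGAR operations that consume reference bases (per the SAM specification).
REF_CONSUMING = set("MDN=X")
CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")


def junctions_from_cigar(pos, cigar):
    """Return (intron_start, intron_end) pairs, 1-based inclusive,
    one per N operation. `pos` is the SAM POS field (1-based)."""
    junctions = []
    ref = pos
    for length, op in CIGAR_RE.findall(cigar):
        length = int(length)
        if op == "N":
            # The skipped region is the intron itself.
            junctions.append((ref, ref + length - 1))
        if op in REF_CONSUMING:
            ref += length
    return junctions


def count_target_junctions(sam_lines, targets):
    """Count alignments supporting each (chrom, start, end) in `targets`."""
    counts = {t: 0 for t in targets}
    for line in sam_lines:
        if line.startswith("@"):
            continue  # header line
        fields = line.rstrip("\n").split("\t")
        flag, chrom, pos, cigar = int(fields[1]), fields[2], int(fields[3]), fields[5]
        if flag & 0x4 or cigar == "*":
            continue  # unmapped read
        for start, end in junctions_from_cigar(pos, cigar):
            key = (chrom, start, end)
            if key in counts:
                counts[key] += 1
    return counts
```

Something like `samtools view file.bam` piped into `count_target_junctions(sys.stdin, targets)` would be one way to feed it. Note that this counts alignments, not reads, so multi-mapped reads would be counted more than once.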

junction rna-seq intron exon • 1.6k views
4.1 years ago by
United States
igor10k wrote:

If you use the STAR aligner, it will output high-confidence collapsed splice junctions in a tab-delimited file (SJ.out.tab).
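As a rough sketch of how the given junctions could then be looked up in that file: the code below assumes the standard SJ.out.tab layout (column 1 chromosome, columns 2 and 3 the first and last intron base, 1-based, column 7 the uniquely mapping read count, column 8 the multi-mapping read count). The function names and the target list are hypothetical.

```python
import csv


def read_star_junctions(path):
    """Map (chrom, intron_start, intron_end) -> (unique_reads, multi_reads)
    from a STAR SJ.out.tab file."""
    junctions = {}
    with open(path, newline="") as handle:
        for row in csv.reader(handle, delimiter="\t"):
            if len(row) < 9:
                continue  # skip malformed or empty lines
            key = (row[0], int(row[1]), int(row[2]))
            junctions[key] = (int(row[6]), int(row[7]))
    return junctions


def lookup(junctions, targets):
    """Return read counts for each target junction; (0, 0) if not detected."""
    return {t: junctions.get(t, (0, 0)) for t in targets}
```

Running `lookup(read_star_junctions("SJ.out.tab"), targets)` once per sample would give a small table of junction counts for the gene of interest.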


That was simple and it worked fine, thanks.

(reply written 4.0 years ago by Pierre Lindenbaum)
4.1 years ago by
Washington University School of Medicine, St. Louis, USA
Malachi Griffith18k wrote:

You can use the tool RegTools to get this information from a BAM file. You would first align your reads to the reference genome using TopHat2, STAR, HISAT2, etc., then use regtools to both extract and annotate the exon-exon junctions (i.e. exon-intron ... intron-exon).

Code is here:

Documentation is here:

You would start with regtools junctions extract, followed by regtools junctions annotate.

Usage looks like this:

regtools junctions extract [options] indexed_alignments.bam
regtools junctions annotate [options] junctions.bed ref.fa annotations.gtf

The annotate result will, among other things, tell you which junctions correspond to which transcripts, what novel junctions might be observed in your RNA-seq data, how these relate to known transcripts, etc. The annotations.gtf file could be a list of known transcript annotations. For example, we often use the GTF files that you can download directly from Ensembl.
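To illustrate what "which junctions correspond to which transcripts" means in practice, here is a minimal sketch that derives the known junctions of each transcript directly from the exon records of a GTF file. This is not how regtools is implemented, just an independent illustration; it assumes a standard 9-column GTF with a transcript_id attribute (as in Ensembl GTFs), and the function names are made up.

```python
import re
from collections import defaultdict

TRANSCRIPT_ID_RE = re.compile(r'transcript_id "([^"]+)"')


def transcript_junctions(gtf_lines):
    """Map (chrom, intron_start, intron_end) -> set of transcript IDs.
    Coordinates are 1-based inclusive, like the GTF itself."""
    exons = defaultdict(list)
    for line in gtf_lines:
        if line.startswith("#"):
            continue  # comment / header line
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "exon":
            continue
        match = TRANSCRIPT_ID_RE.search(fields[8])
        if match:
            exons[(fields[0], match.group(1))].append((int(fields[3]), int(fields[4])))

    junctions = defaultdict(set)
    for (chrom, transcript), spans in exons.items():
        spans.sort()
        # An intron is the gap between two consecutive exons of one transcript.
        for (_, left_end), (right_start, _) in zip(spans, spans[1:]):
            junctions[(chrom, left_end + 1, right_start - 1)].add(transcript)
    return junctions


def shared_transcripts(junctions, a, b):
    """Transcripts containing both junction a and junction b (may be empty)."""
    return junctions.get(a, set()) & junctions.get(b, set())
```

A non-empty result from `shared_transcripts` means the two junctions are compatible with at least one annotated transcript. It does not prove that the reads supporting them came from the same molecule.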

4.1 years ago by
Manhattan, NY
Khader Shameer18k wrote:

I have used MapSplice; it is better to use the more recent version, MapSplice 2.

MapSplice: Accurate mapping of RNA-seq reads for splice junction discovery
