Question: How many 'novel' splice junctions/splice events are reasonably expected from human RNA,
10 weeks ago by
RNAseqer 40
wrote:

Hello all,

I was just wondering what a reasonable percentage of 'novel' splice junctions/splice events is for human RNAseq data using the program. I am new to RNAseq and am just running some published human RNAseq data through my pipeline to familiarize myself with the programs and protocols. When I performed this splice junction analysis, I got what was, to me, an eyebrow-raising estimate of novel splice junctions/events:

Splicing junctions:
- Complete novel = 62%
- Partial novel = 5%
- Annotated = 34%

Splicing events:
- Complete novel = 17%
- Partial novel = 1%
- Known = 81%

Should I be worried about that 62% complete novel splice junction estimate?

If you are interested, here is what I've done:

I am using 104 bp paired-end reads from fragments averaging 250 bp (the distribution of inner distances has a standard deviation of 50).

From the GTF file Homo_sapiens.GRCh38.95.gtf.gz, I created a BED file using the following command line:

$ awk '{if($3 != "gene") print $0}' homo_sapiens_grch38.95_chameleon_cleaned.gtf | grep -v "^#" | gtfToGenePred /dev/stdin /dev/stdout | genePredToBed stdin Homo_sapiens.GRCh38.95.bed
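To make explicit what the awk/grep part of that pipeline does before the records reach gtfToGenePred, here is a minimal Python sketch of the same filtering step. It assumes tab-separated GTF lines with the feature type in the third column; the function name `drop_gene_records` and the four example lines are hypothetical and only there for illustration.

```python
from typing import Iterable, Iterator


def drop_gene_records(lines: Iterable[str]) -> Iterator[str]:
    """Yield GTF lines that are neither comments nor 'gene' features."""
    for line in lines:
        if line.startswith("#"):
            continue  # header/comment lines, as removed by grep -v "^#"
        fields = line.rstrip("\n").split("\t")
        if len(fields) > 2 and fields[2] == "gene":
            continue  # whole-gene records, as removed by the awk test on $3
        yield line


# Made-up records in GTF layout (not taken from the real annotation file).
example = [
    "#!genome-build GRCh38\n",
    '1\thavana\tgene\t11869\t14409\t.\t+\t.\tgene_id "G1";\n',
    '1\thavana\ttranscript\t11869\t14409\t.\t+\t.\tgene_id "G1"; transcript_id "T1";\n',
    '1\thavana\texon\t11869\t12227\t.\t+\t.\tgene_id "G1"; transcript_id "T1";\n',
]
kept = list(drop_gene_records(example))
print(len(kept))
```

Only the transcript and exon records survive, which is what the downstream conversion to BED needs.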

My BAM file was generated from the HISAT2 .sam output using the command line:

samtools view -bS testoutput3.sam | samtools sort -o testoutput3.bam

Using the program with the following command line:

$ -r Homo_sapiens.GRCh38.95.bed -i testoutput3.bam -o out

I got the following output:

Reading reference bed file:  Homo_sapiens.GRCh38.95.bed  ...  Done
Load BAM file ...  Done
total = 14081359

Total splicing  Events: 14081359
Known Splicing Events:  11341230
Partial Novel Splicing Events:  99514
Novel Splicing Events:  2348855

Total splicing  Junctions:  441831
Known Splicing Junctions:   148196
Partial Novel Splicing Junctions:   21482
Novel Splicing Junctions:   272153
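
As a sanity check, the percentages quoted earlier can be recomputed from these raw counts. The sketch below is only illustrative: the helper name `percentages` is hypothetical, and it assumes each share is taken relative to the sum of the three categories. For junctions that sum equals the reported total of 441831; for events the three categories add up to 13789599, which is less than the reported total of 14081359, so the choice of denominator matters there.

```python
def percentages(counts):
    """Return each category's share of the summed counts, in percent."""
    total = sum(counts.values())
    return {name: 100.0 * n / total for name, n in counts.items()}


# Counts copied from the program output above.
junctions = {"known": 148196, "partial_novel": 21482, "novel": 272153}
events = {"known": 11341230, "partial_novel": 99514, "novel": 2348855}

for label, counts in (("junctions", junctions), ("events", events)):
    shares = percentages(counts)
    print(label, {name: round(p, 1) for name, p in shares.items()})
```

For junctions this reproduces the roughly 62% novel figure, so that estimate does follow from the counts and is not a plotting artifact.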

Many thanks for any advice/input/help you can give!

Double-check the version that you are using. Note the release notes:

RSeQC v2.6.1
Fix bug in “” in that it would report some “novel splice junctions” that don’t exist in the BAM files. This happened when reads were clipped and spliced mapped simultaneously.
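
If you want to script that check, a minimal sketch follows. It assumes you already have the version string reported by your installation; the helper names `version_tuple` and `predates_fix` are hypothetical, and the only fact taken from the release notes is that the fix landed in v2.6.1.

```python
def version_tuple(version):
    """Turn a dotted version string such as '2.6.1' or 'v2.6.1' into (2, 6, 1)."""
    return tuple(int(part) for part in version.strip().lstrip("v").split("."))


def predates_fix(installed, fixed="2.6.1"):
    """True if the installed version is older than the release containing the fix."""
    return version_tuple(installed) < version_tuple(fixed)


print(predates_fix("2.6.0"))   # True
print(predates_fix("2.6.1"))   # False
print(predates_fix("2.10.0"))  # False
```

Comparing integer tuples rather than raw strings avoids the trap where "2.10.0" sorts before "2.6.1" alphabetically.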


written 9 weeks ago by Kevin Blighe 41k
