Question: Splice junction sites and number of split reads from STAR-aligned BAM
10 months ago by
filippo.martignano10 wrote:

Hi everyone!

I'm working with some RNA-seq BAM files aligned with STAR. I want to detect splice sites (both known and novel) along with their coverage, i.e. how many reads are spliced at a given position.

Basically, I'm looking for a tool that does exactly what the sashimi plot does in IGV, but without a graphical interface, so that I can process a large amount of data.

Any suggestions?

Thanks!

modified 10 months ago by trausch1.2k • written 10 months ago by filippo.martignano10

Too many. While it's old, this should get you started. Good luck!

Best Approach To Predict Novel And Alternative Splicing Events From Rna-Seq Data

modified 10 months ago • written 10 months ago by Eric Lim1.3k

Thank you very much, Eric! I already knew about that thread, but it's a bit of a "too much information" situation for me. Let me explain: I'm looking for something as "raw" as possible (as I said, basically just the raw number of split reads at a given position). All the tools I've checked so far are designed to infer isoform expression or to estimate possible isoform structures, which means they apply filtering criteria that I don't want, since I'm after raw data. I could check every tool suggested in the other thread, but it would take ages to find one that suits my uncommon purpose. That's why I'm asking here: can anyone suggest a tool that does exactly what I'm looking for, without any "fancy" statistical filtering step?

thanks.

written 10 months ago by filippo.martignano10

rMATS (http://rnaseq-mats.sourceforge.net/rmats4.0.1/) is particularly popular among biostars members. I recently played around with SGSeq (https://bioconductor.org/packages/3.7/bioc/vignettes/SGSeq/inst/doc/SGSeq.html) and thought it was pretty good, but it was a bit slow with well-covered sequencing data. There are at least a dozen more tools that were developed to specifically address that. At Stoke, we ended up developing an internal version ourselves in order to stay as comprehensive as we can.

written 10 months ago by Eric Lim1.3k
10 months ago by
caggtaagtat500 wrote:

Hi, I can recommend the R package "spliceSites" to you and I'm not the developer.

It transforms BAM files into one big gap table: a large table in which each row represents a gap in the read alignments that is supported by at least one read. Among other things, it gives the coordinates of the splice sites and the number of reads that have the respective gap in their alignment. After you annotate the gap table, it also tells you whether the splice donor or acceptor is annotated in the GTF file you used for the mapping; if not, it gives the distance from the respective splice site to the nearest suitable annotated splice site.
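To make the idea of a gap table concrete, here is a minimal Python sketch that counts gapped alignments per junction from SAM text lines (for example the output of `samtools view`). This is not how spliceSites is implemented; the function names `gaps_in_read` and `gap_table` are made up for illustration. It assumes that every `N` operation in the CIGAR string is a splice gap, which is how spliced aligners such as STAR report introns.

```python
import re
from collections import Counter

# One CIGAR operation: a length followed by an operation code (SAM spec).
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def gaps_in_read(pos, cigar):
    """Yield (first_intron_base, last_intron_base), 1-based inclusive,
    for every N operation of an alignment starting at reference position pos."""
    ref = pos
    for length, op in CIGAR_OP.findall(cigar):
        length = int(length)
        if op == "N":
            yield ref, ref + length - 1
        # Only these operations advance the position on the reference.
        if op in "MDN=X":
            ref += length


def gap_table(sam_lines):
    """Count alignments per (chrom, gap_start, gap_end) from SAM text lines.

    Hypothetical helper: no filtering beyond skipping headers and unmapped
    reads, so secondary alignments and both mates of a pair are all counted.
    """
    counts = Counter()
    for line in sam_lines:
        if line.startswith("@"):
            continue  # header line
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 6:
            continue  # not a complete alignment record
        flag, chrom, pos, cigar = int(fields[1]), fields[2], int(fields[3]), fields[5]
        if flag & 0x4 or cigar == "*":
            continue  # unmapped read or no alignment information
        for start, end in gaps_in_read(pos, cigar):
            counts[(chrom, start, end)] += 1
    return counts
```

Each key of the returned counter corresponds to one row of such a table. Note that in paired-end data both mates can span the same junction, so these are alignment counts rather than fragment counts; filter on the SAM flag first if that matters for you.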

I'm studying splicing, and after data preparation I basically only work with these gap tables, since they describe quite nicely the read coverage of splice sites and the usage of unannotated ones.

You can also use multiple BAM files to create one gap table, which makes it possible to compare the usage of a splice site across samples. The read coverage of splice sites is given in total reads and in rpmg, which is the read coverage normalized to the total read count in one BAM file or one group of BAM files.
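As a rough illustration of that kind of normalization, the sketch below scales raw junction counts to counts per million reads so that samples of different depth can be compared. The exact denominator that spliceSites uses for rpmg is not spelled out here, so `total_reads` is an assumption you have to supply yourself, and `per_million` is a hypothetical helper, not a spliceSites function.

```python
def per_million(counts, total_reads):
    """Scale raw junction counts to counts per million reads.

    counts      -- mapping of junction key to raw read count
    total_reads -- total read count of the sample or sample group
                   (assumed denominator, see the note above)
    """
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    scale = 1_000_000 / total_reads
    return {site: n * scale for site, n in counts.items()}


# Example: the same junction in two samples of different depth.
sample_a = per_million({("chr1", 110, 159): 5}, total_reads=2_000_000)
sample_b = per_million({("chr1", 110, 159): 5}, total_reads=4_000_000)
```

With equal raw counts, the junction in the shallower sample gets the higher normalized value, which is the point of comparing normalized rather than raw coverage across samples.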

I should mention the 2-pass run of the STAR aligner, in case you don't know about it already, since it took me a while to learn about it. Apparently, the STAR developer Alexander Dobin recommends the 2-pass run for analyses of splice site usage, especially for the analysis of unannotated splice sites.

So, if you are using R, I would definitely give the package "spliceSites" a try, in particular the function readTabledBamGaps, since it gives you very raw output describing splice site usage in your data.

Edit: fixed some grammar

modified 10 months ago • written 10 months ago by caggtaagtat500
10 months ago by
trausch1.2k wrote:

Alfred should work

alfred count_jct -g Homo_sapiens.GRCh37.75.gtf.gz input.star.bam
written 10 months ago by trausch1.2k