Splice site usage from RNA-seq data
5.2 years ago
Suicyte ▴ 10

Is there an online resource or downloadable dataset that would give me data on splice site usage from published RNA-seq experiments? Ideally, I would like to answer questions like "which spliceform is dominant in which tissue" and similar.

I have seen online data that gives multi-tissue expression profiles for multiple RefSeq entries per gene; one example is [http://medicalgenomics.org]. However, for all the instances I checked, the intensities were almost the same across isoforms. I guess that the maintainers count a lot of non-discriminatory reads for all isoforms, thus blurring the differences. Or am I missing something?

In response to some comments/answers, let me be more precise:

I am currently interested in human transcripts, but I might be interested in zebrafish next week, who knows. I am well aware that I can download raw RNA-seq data from GEO and elsewhere, but I was hoping that somebody has done such analyses before, since they address a rather common problem. There are several data sources out there that provide abundance data for more than one isoform per gene. However, since the bulk of the reads cannot be uniquely assigned to one isoform, I was hoping for an analysis focusing on those reads that make this distinction possible. It was of course imprecise (or even wrong) of me to talk about "splice forms", since the usual short reads can at best tell us about the usage of one particular splice site.

splicing RNA-Seq database • 1.2k views

You can simply download any RNA-seq dataset (the FASTQ files) and then process the data using some combination of HISAT2 / StringTie, DEXSeq, and rMATS. This would give you information on the expression of different splice isoforms.

Online repositories where RNA-seq is commonly stored include:

  • SRA
  • GEO
  • EGA
  • ArrayExpress

I do not know anything about the Medicalgenomics website. For specific queries, I would contact them directly.


In addition, if you are interested in this kind of thing, you can also look at the coverage of exon junctions using R packages such as "spliceSites".
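To illustrate what "coverage of exon junctions" means in practice, here is a minimal Python sketch. It is not the spliceSites API; it only shows the underlying idea that a spliced read carries an `N` operation in its SAM CIGAR string, and that counting those gaps gives per-junction read support. The function names and the toy coordinates are made up for the example.

```python
import re
from collections import Counter

# One CIGAR operation: a length followed by an operation code (SAM format).
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def junctions_from_alignment(chrom, pos, cigar):
    """Return (chrom, intron_start, intron_end) for every N operation.

    pos is the 1-based leftmost reference position, as in a SAM record.
    Intron coordinates are 1-based and inclusive.
    """
    junctions = []
    ref_pos = pos
    for length, op in CIGAR_OP.findall(cigar):
        length = int(length)
        if op == "N":  # skipped reference region, i.e. a spliced-out intron
            junctions.append((chrom, ref_pos, ref_pos + length - 1))
        if op in "MDN=X":  # only these operations consume reference bases
            ref_pos += length
    return junctions


def count_junctions(alignments):
    """Count how many alignments support each junction."""
    counts = Counter()
    for chrom, pos, cigar in alignments:
        counts.update(junctions_from_alignment(chrom, pos, cigar))
    return counts


# Toy alignments with hypothetical coordinates: (chrom, pos, CIGAR).
toy_alignments = [
    ("chr1", 100, "50M200N50M"),  # spliced read
    ("chr1", 120, "30M200N70M"),  # same junction, different read start
    ("chr1", 100, "50M300N50M"),  # same intron start, alternative intron end
    ("chr1", 500, "100M"),        # unspliced read, contributes no junction
]

junction_counts = count_junctions(toy_alignments)
for junction, n in sorted(junction_counts.items()):
    print(junction, n)
```

With real data you would read the alignments from a BAM file rather than a hand-written list, and you would probably want to keep only uniquely mapped reads.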

5.2 years ago

What you are asking is not trivial: a very large proportion of genes have multiple isoforms that all contribute to expression, so in many cases there will be no clearly dominant one. The answer also depends on which organism you are interested in (human vs. non-human) and on what you mean by "spliceform". I'll try to answer all combinations:

If you are interested in human and you mean a specific splice junction, the best resource I know of is probably ASCOT, where they have reprocessed all human data (published until a few years ago). Alternatively, the Recount2 database can give you the raw junction counts.

If you are interested in human and you mean a specific isoform, the better option is probably GTEx, where you can search for gene/transcript expression across all human tissues.

If you are interested in non-human organisms, I am not aware of any resource that has re-analyzed all the data, so there you would probably need to download and process the data yourself, as @Kevin suggests (quite easy these days). For how to analyse alternative splicing in such data, please refer to this answer for considerations and tools.
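Whichever route you take, once you have raw junction counts per tissue (for example from a junction-count resource like the one mentioned above, or from your own processing), the question "which splice site is used where" comes down to simple arithmetic: for junctions that share one end, divide each junction's count by the total for that shared site. A minimal Python sketch, assuming a hypothetical nested-dictionary layout and made-up counts (this is not the file format of any particular database):

```python
from collections import defaultdict


def junction_usage(counts, shared="start"):
    """Relative usage of junctions that share one intron boundary.

    counts: {tissue: {(chrom, intron_start, intron_end): read_count}}
    shared: "start" groups junctions by intron start, "end" by intron end.
    Returns {tissue: {junction: fraction of reads at the shared site}}.
    """
    idx = 1 if shared == "start" else 2
    usage = {}
    for tissue, junction_counts in counts.items():
        # Total reads over all junctions using the same shared boundary.
        totals = defaultdict(int)
        for junction, n in junction_counts.items():
            totals[(junction[0], junction[idx])] += n
        usage[tissue] = {
            junction: n / totals[(junction[0], junction[idx])]
            for junction, n in junction_counts.items()
            if totals[(junction[0], junction[idx])] > 0
        }
    return usage


# Hypothetical counts: two alternative junctions leaving the same exon.
JUNCTION_A = ("chr1", 150, 349)
JUNCTION_B = ("chr1", 150, 449)
toy_counts = {
    "liver": {JUNCTION_A: 90, JUNCTION_B: 10},
    "brain": {JUNCTION_A: 20, JUNCTION_B: 60},
}

usage = junction_usage(toy_counts, shared="start")
for tissue, fractions in usage.items():
    dominant = max(fractions, key=fractions.get)
    print(tissue, dominant, round(fractions[dominant], 2))
```

Because only junction-spanning reads enter the calculation, reads that cannot discriminate between the alternatives do not blur the result. Sites with few supporting reads will give noisy fractions, so a minimum-count filter is advisable on real data.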


Excellent answer! GTEx is exactly what I was looking for, thanks. Recount2 looks rather complicated, and I was too lazy to really get to the bottom of it. ASCOT appears to have counts for the exons, but in my particular case, the splice forms differ in using two alternative junctions of the same exon. I am not sure whether this data can be gleaned from ASCOT. In GTEx, it was quite intuitive to find.
