Question

Detection of aberrant mRNA splicing

6.8 years ago

ÁngelMG • 0

Hello,

I am collaborating on a project in which we are trying to detect aberrant mRNA splicing in tumor cells from patients with leukemia, and to study the metabolic pathways affected by the aberrant proteins produced in these cells.

To tackle this problem, I first planned to filter the data flagged as "j" (which identifies novel alternative splicing transcripts in the novelSplicingVariant files), followed by an analysis of expression level and enrichment (using R libraries and/or web services such as those provided by Reactome, GO or DAVID).
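As an illustration of the filtering step described above, here is a minimal Python sketch. It assumes a tab-separated table with columns named `tracking_id`, `class_code`, `gene_short_name` and `FPKM` (as in Cufflinks-style tracking files); the actual layout of the novelSplicingVariant files may differ. The `min_fpkm` cutoff is a hypothetical placeholder, not a recommended value.

```python
import csv
import io


def novel_junction_isoforms(tracking_file, min_fpkm=1.0):
    """Return (tracking_id, gene_short_name, FPKM) for class code 'j' isoforms.

    tracking_file: an open, tab-separated text file with a header row.
    min_fpkm: hypothetical expression cutoff; choose it for your own data.
    """
    reader = csv.DictReader(tracking_file, delimiter="\t")
    kept = []
    for row in reader:
        # 'j' marks a novel isoform sharing at least one junction with a known one
        if row["class_code"] != "j":
            continue
        fpkm = float(row["FPKM"])
        if fpkm >= min_fpkm:
            kept.append((row["tracking_id"], row["gene_short_name"], fpkm))
    return kept


# Tiny made-up table, for illustration only (not real data)
EXAMPLE = (
    "tracking_id\tclass_code\tgene_short_name\tFPKM\n"
    "TCONS_1\t=\tGENE_A\t12.5\n"
    "TCONS_2\tj\tGENE_A\t3.2\n"
    "TCONS_3\tj\tGENE_B\t0.4\n"
)

if __name__ == "__main__":
    for record in novel_junction_isoforms(io.StringIO(EXAMPLE)):
        print(record)
```

With a real file, the function would be called on `open(path)` instead of the in-memory example.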

But once this approach has been explored and exhausted, I would like to know whether, starting from the raw RNA-Seq fastq files, there is an alternative, reliable and more specific way to address this problem, if possible with tools that are publicly available on Galaxy's servers.

On the other hand, I would also like to know:

What, approximately, is the minimum FPKM value (the data is paired-end) below which a given isoform (aberrant or not) could be considered not relevant for the case study.
Whether there is a specific workflow or protocol similar to Tuxedo for this purpose, since I have not found any to date.
How I could obtain the assembled sequence of the aberrant mRNA.

For the analysis performed by the company that sequenced the samples, they followed an approach similar to the following:

1.- Fastqc (Quality Control of reads).
2.- Trimmomatic (Preprocessing of reads).
3.- TopHat (Mapping to the reference genome with a splice-aware aligner).
4.- Cufflinks (Assembly of aligned reads that contain paired-end information. This provides information on known transcripts, novel transcripts, and alternative splicing transcripts, with their expression profiles).

They also used:

5.- STAR with GATK (SNV calling of RNA-seq data: mapping quality reassignment, indel realignment, and base recalibration. The reads created in the previous step were used for variant calling with HaplotypeCaller).
6.- deFuse (Fusion gene prediction).

Thank you in advance for your help and answers.

Regards,

Ángel MG.

mRNA aberrant splicing sequence RNA-Seq • 2.5k views

ADD COMMENT • link updated 6.8 years ago by Kristoffer Vitting-Seerup ★ 4.0k • written 6.8 years ago by ÁngelMG • 0

You need control samples to compare to the tumor RNA-seq.

ADD REPLY • link 6.8 years ago by Ben ▴ 60

Hi Ben, I have RNA-seq samples of both groups (control samples and case samples of the same type of cell).

ADD REPLY • link 6.8 years ago by ÁngelMG • 0

score 2 · Answer 1 · 2017-06-23

I can interpret your question in three different ways:

You are interested in changes in splicing because such changes could give a cancer an advantage - aka transcript/isoform switches.
You are interested in novel transcripts identified only in cancers.
You are interested in whether alternative splicing is compromised.

All of which I have suggestions for:

Possibility 1. This is definitely a possibility - and there are a few recent articles that suggest this is a very widespread phenomenon (one of my articles and the article by Hector et al.).

Please note that to find such isoform switches you do NOT need to switch from a known to a novel (from '=' to 'j') isoform - it could be an isoform switch between two known isoforms.

If you are interested in finding isoform switches, I would recommend my tool IsoformSwitchAnalyzeR (see the vignette for a full introduction). It directly parses Cufflinks/Cuffdiff output (although other tools such as Kallisto/Salmon/RSEM are also supported) and enables identification of isoform switches with predicted functional consequences. The consequences can be chosen from a long list that includes gain/loss of protein domains, etc.

Possibility 2. This can be viewed as an isoform switch between known and novel isoforms (see possibility 1), which can be analyzed with my tool IsoformSwitchAnalyzeR by specifying 'isoform_class_code' when finding consequences with the analyzeSwitchConsequences() function (see this part of the vignette).

Possibility 3. If you are interested in whether alternative splicing in general is compromised, a viable approach is to look for genome-wide increases in isoforms with intron retention (which can be viewed as a measure of splicing efficiency), like we did in this paper. Such an analysis can also be performed with IsoformSwitchAnalyzeR via the global analysis methods implemented (see the second part of this section of the vignette).
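To make the idea behind possibility 3 concrete, here is a rough Python sketch of the quantity being compared: the fraction of expressed isoforms with intron retention in each condition. It is not how IsoformSwitchAnalyzeR implements its global analysis. The record layout `(condition, isoform_id, has_intron_retention)` and the example values are assumptions made up for illustration, and a real analysis would also need a statistical test across replicates.

```python
from collections import Counter


def intron_retention_fractions(records):
    """Return {condition: fraction of isoforms with intron retention}.

    records: iterable of (condition, isoform_id, has_intron_retention),
    one entry per expressed isoform per condition (hypothetical layout).
    """
    total = Counter()
    retained = Counter()
    for condition, _isoform_id, has_ir in records:
        total[condition] += 1
        if has_ir:
            retained[condition] += 1
    return {condition: retained[condition] / total[condition] for condition in total}


# Made-up records, for illustration only (not real data)
EXAMPLE_RECORDS = [
    ("control", "iso1", False),
    ("control", "iso2", False),
    ("control", "iso3", True),
    ("control", "iso4", False),
    ("tumor", "iso1", True),
    ("tumor", "iso2", False),
    ("tumor", "iso3", True),
    ("tumor", "iso4", False),
]

if __name__ == "__main__":
    for condition, fraction in intron_retention_fractions(EXAMPLE_RECORDS).items():
        print(condition, fraction)
```

A higher fraction in the tumor samples than in the controls would point toward reduced splicing efficiency, subject to proper testing.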

Good luck

Kristoffer