Making a custom Transcriptome
3.3 years ago
rdmorris95 ▴ 10

Dear All,

I am trying to quantify the expression of splice variants of a gene I am interested in. I would typically do this with kallisto (via the Galaxy server), which requires a reference transcriptome. There is one splice variant I am particularly interested in: it is well documented in the literature, with reports going back to the early 90s! However, the variant is very poorly annotated and is not listed as a splice variant on Ensembl. Is there a way I could modify an existing Ensembl transcriptome (or make a completely new transcriptome) so that it includes this variant?

Many Thanks

Ryan

RNA-Seq rna-seq alignment sequencing gene • 1.3k views

From what I remember when using kallisto last time, the reference transcriptome is a simple FASTA file with multiple sequences. If so, that would be a text file to which you can add your own sequence(s) using any editor.


Just add it as a sequence and add a special string that you're aware of to the header? (E.g. >mysequence.)
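A minimal sketch of that idea in Python, assuming the transcriptome is an uncompressed FASTA file. The function names, the `custom_N2_Src` ID, and the file name in the usage note are hypothetical, chosen only for illustration. The script refuses to append a record whose ID already exists, so the custom header stays unique and easy to find in the quantification output.

```python
def read_fasta_ids(path):
    """Return the set of record IDs (first word after '>') in a FASTA file."""
    ids = set()
    with open(path) as handle:
        for line in handle:
            if line.startswith(">"):
                fields = line[1:].split()
                if fields:
                    ids.add(fields[0])
    return ids


def _ends_with_newline(path):
    """True if the file is empty or its last byte is a newline."""
    with open(path, "rb") as handle:
        handle.seek(0, 2)          # jump to end of file
        if handle.tell() == 0:
            return True
        handle.seek(-1, 2)         # step back one byte
        return handle.read(1) == b"\n"


def append_transcript(fasta_path, record_id, sequence, width=60):
    """Append one record to a FASTA file, refusing duplicate IDs."""
    if record_id in read_fasta_ids(fasta_path):
        raise ValueError(f"ID already present in FASTA: {record_id}")
    # Strip whitespace/newlines from the pasted sequence and upper-case it.
    seq = "".join(sequence.split()).upper()
    if not seq:
        raise ValueError("empty sequence")
    wrapped = "\n".join(seq[i:i + width] for i in range(0, len(seq), width))
    prefix = "" if _ends_with_newline(fasta_path) else "\n"
    with open(fasta_path, "a") as handle:
        handle.write(f"{prefix}>{record_id}\n{wrapped}\n")
```

Usage would look something like `append_transcript("transcriptome.fa", "custom_N2_Src", my_cdna_sequence)`, where `my_cdna_sequence` is the transcript sequence you have assembled yourself from the literature or a database record.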


As others have pointed out, yes, it is possible to add the sequence to your custom transcriptome (don't forget to rebuild the index and any other derived files once you're done).
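For a local kallisto install, rebuilding the index could be scripted roughly as below. This is a sketch, assuming `kallisto` is on your `PATH`; the helper names and file names are hypothetical. It wraps the standard `kallisto index -i <index> <fasta>` call. On Galaxy the equivalent step would presumably be pointing the tool at the edited FASTA instead of a built-in reference, if your server allows that.

```python
import shutil
import subprocess


def kallisto_index_command(fasta_path, index_path):
    """Build the argument list for `kallisto index -i <index> <fasta>`."""
    return ["kallisto", "index", "-i", str(index_path), str(fasta_path)]


def rebuild_index(fasta_path, index_path):
    """Rebuild the kallisto index from the (edited) transcriptome FASTA."""
    if shutil.which("kallisto") is None:
        raise RuntimeError("kallisto executable not found on PATH")
    cmd = kallisto_index_command(fasta_path, index_path)
    # check=True raises CalledProcessError if kallisto exits with an error.
    subprocess.run(cmd, check=True)
    return cmd
```

Calling `rebuild_index("transcriptome.fa", "transcriptome.idx")` after every edit to the FASTA keeps the index in sync with the sequences you are quantifying against.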

What species are we talking about here? If it is in Ensembl and the splice variant is, as you indicate, well known and described, it seems a bit odd to me that it is not included in the datasets Ensembl offers.


Hi Ryan,

As lieven.sterck suggests, Ensembl would be very interested to revisit this annotation. Are you working with human data, or another species? In any case, please send further information about the missing transcript and the evidence to the Ensembl Helpdesk as we'd love to improve the annotation.

Best wishes

Ben


Hi Ben,

Alongside this post I have also emailed the Ensembl Helpdesk about this subject. I am working with human data and I am planning on quantifying the levels of neuronal Src variants in brain tumour cell lines and patient data.

As for the variants, there are two neuronal variants of Src (N1 and N2), which are 542 aa and 553 aa long, respectively. These variants arise from the insertion of microexons between exons 3 and 4. Because of this, N1-Src contains a six-amino-acid insert in the n-src loop of its Src homology 3 (SH3) domain, while in N2-Src the N1 and N2 mini-exons together insert a total of 17 amino acids: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4509517/

In the Ensembl entry for Src (https://www.ensembl.org/Homo_sapiens/Gene/Summary?db=core;g=ENSG00000197122;r=20:37344685-37406050), the 542-amino-acid species (N1) is annotated (SRC-202), but N2-Src has no annotation.

Best Wishes

Ryan


Hi Ryan,

Thank you for your message to the helpdesk. I have forwarded your query to the GENCODE manual annotation team to investigate in more detail. They will respond to you directly.

Best wishes

Ben
