Question: How to deal with the FPKM values for isoforms in RNA-seq for particular gene
5.1 years ago by
India
ancient_learner610 wrote:

This might be trivial, but being new to RNA-seq data, I am confused about how to assign an FPKM value to a gene that has 3 or 4 isoforms.

I have downloaded analysed RNA-seq data from http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE52450 and noticed that transcripts belonging to the same gene (isoforms) have different FPKM values, which is expected. For my analysis, is it OK to sum the FPKM values of all isoforms to represent that gene's expression, or should I keep the values as they are?

Example:

geneA isoform1 2.98
geneA isoform2 5.98
geneA isoform3 2.43

Can I make it geneA 11.39 (2.98 + 5.98 + 2.43)?

rna-seq isoforms fpkm • 4.3k views
5.1 years ago by
Sukhdeep Singh9.8k
Netherlands
Sukhdeep Singh9.8k wrote:

In principle that should work, but divide the sum by the number of isoforms, or by the isoform lengths, depending on what you want, to get a normalized value. The result would be called the averaged gene expression.
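As a minimal sketch of the two options, here are the three example values from the question run through both calculations. The function name `summarize_gene` and the tuple layout are made up for illustration; the sum gives the total for the gene, and dividing it by the number of isoforms gives the average described above.

```python
from collections import defaultdict

# Example rows from the question: (gene, isoform, FPKM).
rows = [
    ("geneA", "isoform1", 2.98),
    ("geneA", "isoform2", 5.98),
    ("geneA", "isoform3", 2.43),
]

def summarize_gene(rows):
    """Return {gene: (sum_fpkm, mean_fpkm)} over that gene's isoforms."""
    per_gene = defaultdict(list)
    for gene, _isoform, fpkm in rows:
        per_gene[gene].append(fpkm)
    return {
        gene: (sum(values), sum(values) / len(values))
        for gene, values in per_gene.items()
    }

total, mean = summarize_gene(rows)["geneA"]
print(f"geneA sum={total:.2f} mean={mean:.2f}")  # geneA sum=11.39 mean=3.80
```

Which of the two numbers to carry forward depends on the downstream analysis, so it may be worth keeping both columns until that is decided.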
I checked the files you are using. Generally there is another file named `gene_exp.diff`, which has the expression value per gene generated with the Tuxedo suite, so you don't have to calculate it yourself.
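If it helps, here is a rough sketch of pulling the per-gene values out of such a file with the Python standard library. The two data rows are invented, and the column names (`gene`, `value_1`, `value_2`) are what I would expect in a cuffdiff header, so treat them as assumptions and check them against your own file. The helper `read_gene_fpkm` is a hypothetical name.

```python
import csv
import io

# Made-up two-row stand-in for a tab-delimited gene_exp.diff file.
# The header names are assumed; verify them against the real file.
example = (
    "test_id\tgene_id\tgene\tlocus\tsample_1\tsample_2\tstatus\tvalue_1\tvalue_2\n"
    "XLOC_000001\tXLOC_000001\tgeneA\tchr1:100-900\tq1\tq2\tOK\t11.39\t4.20\n"
    "XLOC_000002\tXLOC_000002\tgeneB\tchr1:2000-3500\tq1\tq2\tOK\t0.75\t1.50\n"
)

def read_gene_fpkm(handle):
    """Map gene name -> (value_1, value_2), the FPKM in each condition."""
    reader = csv.DictReader(handle, delimiter="\t")
    return {
        row["gene"]: (float(row["value_1"]), float(row["value_2"]))
        for row in reader
    }

gene_fpkm = read_gene_fpkm(io.StringIO(example))
print(gene_fpkm["geneA"])
```

For a real file you would pass an open file handle instead of the `io.StringIO` wrapper, for example `with open("gene_exp.diff") as handle: ...`.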

 

For a more detailed answer, check this thread: "How do I get one FPKM value per gene?"

There is also some raw code provided by user mgogol. Use it with discretion after reading everything, as it assumes you have certain output files from cufflinks/cuffdiff.


+1 for using `gene_exp.diff`, except I think `genes.fpkm_tracking` is the file you would typically look for if you ran cufflinks on your own (to get FPKM values for each sample).

written 5.1 years ago by Charles Warden (7.2k)

Yes, Charles, you are right. `gene_exp.diff` is the output of `cuffdiff` from the differential expression analysis, though it does report raw FPKM values.

written 5.1 years ago by Sukhdeep Singh (9.8k)

Hi, thank you for your suggestions. I have downloaded the differential expression testing file, in which value_1 and value_2 correspond to the expression values for genes at 2 different stages. Again, they seem to be reported per transcript: I annotated the RefSeq IDs with gene names and checked. Since you said it would be fine to take the average of the FPKM values, I think I will go ahead with that.

 

written 5.1 years ago by ancient_learner (610)