Question: TCGA RNAseq, RNAseqV2 RPKM meaning
ad30 (United States) wrote, 4.1 years ago:

Hi, I'm having some trouble understanding exactly how RPKM is calculated for TCGA RNAseq and RNAseqV2 data, specifically for 'genes' and 'exons'. First of all, what do they mean by the following?

>>>>>>>

a composite gene model was
generated by merging all overlapping exons (as defined by the genomic mapping)
from each associated reference transcript.  Thus, each composite gene model is
essentially the union of all associated reference transcripts.

>>>>>>>

Do they mean they simply counted every read that aligns to any transcript of the gene? Do they mean they counted only reads over the overlapping exons and discarded the rest? Or did they count only the reads aligning to some funky model obtained by aligning the transcripts and trimming them?
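To make the first reading concrete, here is a minimal sketch of what "merging all overlapping exons" could mean if the composite model is literally the union of exon intervals across transcripts. This is only an interpretation of the quoted text, not a confirmed description of the TCGA pipeline; the function names `merge_exons` and `composite_length` and all coordinates are invented for illustration.

```python
from typing import List, Tuple

# Half-open [start, end) genomic coordinates; an assumption for this sketch.
Interval = Tuple[int, int]


def merge_exons(exons: List[Interval]) -> List[Interval]:
    """Return the union of exon intervals pooled from all transcripts of a gene."""
    merged: List[Interval] = []
    for start, end in sorted(exons):
        if merged and start <= merged[-1][1]:
            # Overlaps or touches the previous block, so extend that block.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def composite_length(exons: List[Interval]) -> int:
    """Total number of bases covered by the merged (composite) exons."""
    return sum(end - start for start, end in merge_exons(exons))


# Hypothetical gene with two reference transcripts (coordinates are made up).
transcript_a = [(100, 200), (300, 400)]
transcript_b = [(150, 250), (300, 350), (500, 600)]

composite = merge_exons(transcript_a + transcript_b)
print(composite)                                       # [(100, 250), (300, 400), (500, 600)]
print(composite_length(transcript_a + transcript_b))   # 350
```

Under this reading nothing is trimmed away: a read overlapping an exon of any transcript would fall inside the composite model, and the composite length (350 here) is shorter than the sum of the individual exon lengths (450) because shared bases are counted once.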

 

Also, there seems to be a discrepancy in how RPKM is calculated for the 'gene'. Here they simply use gene length, and I'm not sure whether that means the mRNA length or something else:

>>>>>>

RPKM for a given GeneX is calculated by:  (raw read counts × 10^9) / (total reads × length of GeneX).

>>>>>>

 

Here they calculate RPKM using the sum of the exon lengths:

>>>>>>>

RPKM is calculated using the formula:
(number of reads mapped to all exons in a gene x 1,000,000,000)/(NORM_TOTAL x sum of the lengths of all exons in the gene )
[Note: NORM_TOTAL = the total number of reads that are mapped to all exons from the composite gene models. (i.e. sum of the fractional read count for all exons)]

>>>>>>>
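The two quoted formulas have the same shape (reads × 10^9 divided by a library-size term times a length term), so any discrepancy would come from which denominators are plugged in. The sketch below illustrates how much the result can shift. All numbers are invented, and treating "length of GeneX" as the full genomic span is purely an assumption made to show the effect, since the quoted text does not say which length is meant.

```python
def rpkm(read_count: float, total_reads: float, length_bp: float) -> float:
    """Reads per kilobase per million: (reads * 10^9) / (total reads * length in bp)."""
    return read_count * 1e9 / (total_reads * length_bp)


# Invented example values for a single gene in a single sample.
reads_on_gene = 500                # reads assigned to the gene
total_mapped_reads = 20_000_000    # "total reads" in the first formula
norm_total = 15_000_000            # reads mapped to composite-model exons only
genomic_span_bp = 10_000           # ASSUMED meaning of "length of GeneX"
exon_length_sum_bp = 2_000         # sum of exon lengths in the composite model

# First formula, with the assumed genomic-span length.
rpkm_gene_length = rpkm(reads_on_gene, total_mapped_reads, genomic_span_bp)

# Second formula, with NORM_TOTAL and summed exon lengths.
rpkm_exon_sum = rpkm(reads_on_gene, norm_total, exon_length_sum_bp)

print(rpkm_gene_length)          # 2.5
print(round(rpkm_exon_sum, 2))   # 16.67
```

If "length of GeneX" in the first formula actually means the summed exon length of the composite model, the only remaining difference would be the library-size term (all mapped reads versus reads mapped to exons).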

Also, whatever the actual method turns out to be, is it the same for RNAseq and RNAseqV2 data? Here are the links I'm looking at:

 

https://tcga-data.nci.nih.gov/tcgafiles/ftp_auth/distro_ftpusers/anonymous/tumor/read/cgcc/unc.edu/illuminaga_rnaseq/rnaseq/unc.edu_READ.IlluminaGA_RNASeq.mage-tab.1.3.0/DESCRIPTION.txt

 

https://tcga-data.nci.nih.gov/tcgafiles/ftp_auth/distro_ftpusers/anonymous/tumor/laml/cgcc/bcgsc.ca/illuminaga_rnaseq/rnaseq/bcgsc.ca_LAML.IlluminaGA_RNASeq.mage-tab.1.9.0/DESCRIPTION.txt

 

rpkm rna-seq bioinformatics