Question: Why are there genes that are not present in TCGA normalized RNA-seq data?
Vasei wrote (3.4 years ago):

I was comparing TCGA gene expression data obtained from microarrays with data obtained from RNA sequencing. I downloaded the normalized data from firebrowse, and when comparing the gene names on the two platforms I noticed that nearly 1000 genes are present in the microarray data but not in the RNA-seq data. This contradicts my understanding of RNA sequencing, because I thought that RNA sequencing should give us the whole transcriptome! So why are there genes that are not present in the TCGA normalized RNA-seq data?

(Sorry, I am a beginner in bioinformatics!)
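
For readers who want to reproduce this kind of comparison, here is a minimal sketch in Python. It assumes the RNA-seq identifiers have a `SYMBOL|EntrezID` form and that the microarray file lists plain symbols; both are assumptions about the file layout, and the gene lists below are toy values (`GENEX` is a made-up placeholder), not real TCGA output.

```python
def gene_symbol(identifier):
    """Return the upper-cased symbol part of an identifier such as 'TP53|7157'."""
    return identifier.split("|", 1)[0].strip().upper()


def compare_gene_sets(microarray_ids, rnaseq_ids):
    """Return (shared, microarray_only, rnaseq_only) as sorted lists of symbols."""
    array_genes = {gene_symbol(g) for g in microarray_ids}
    seq_genes = {gene_symbol(g) for g in rnaseq_ids}
    # Assumption: '?' marks an entry without a usable symbol, so it is ignored.
    array_genes.discard("?")
    seq_genes.discard("?")
    return (
        sorted(array_genes & seq_genes),
        sorted(array_genes - seq_genes),
        sorted(seq_genes - array_genes),
    )


# Toy identifiers for illustration only.
microarray_ids = ["TP53", "BRCA1", "GENEX", "EGFR"]
rnaseq_ids = ["TP53|7157", "BRCA1|672", "EGFR|1956", "?|100130426"]

shared, array_only, seq_only = compare_gene_sets(microarray_ids, rnaseq_ids)
print("shared:", shared)
print("microarray only:", array_only)
print("RNA-seq only:", seq_only)
```

Normalizing the identifiers before taking set differences matters: comparing raw strings would report every gene as missing if one file carries the `|EntrezID` suffix and the other does not.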


Is the data prefiltered, e.g. were all genes with an expression <1 FPKM filtered out? To what depth was the data sequenced? Perhaps the genes you are missing are just very lowly expressed.

Or is the RNA-seq polyA enriched and the microarray isn't, and you're looking at rRNA genes?

Just a few thoughts...

written 3.4 years ago by WouterDeCoster

Thanks. I think the data is not filtered, because many genes have 0 reads across many samples, and there are exactly 20532 genes in all the cohorts.

I will take a look at preparation methods, but I think both of them are using polyA filtering and measuring mRNA levels.
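
The reasoning above (genes with zero reads would not survive an expression filter) can be checked with a small sketch like the following. The matrix is a toy dictionary with hypothetical gene names, and the threshold is expressed in whatever unit the matrix uses, so the value 1.0 is only an assumption borrowed from the "<1 FPKM" example.

```python
def summarize_low_expression(matrix, threshold=1.0):
    """Summarize genes that a simple expression filter would have removed.

    matrix maps a gene name to its list of expression values across samples.
    """
    all_zero = [gene for gene, values in matrix.items() if max(values) == 0]
    never_reaching = [gene for gene, values in matrix.items() if max(values) < threshold]
    return {
        "n_genes": len(matrix),
        "all_zero": all_zero,
        "never_reaching_threshold": never_reaching,
    }


# Toy expression matrix; gene names and values are invented.
toy_matrix = {
    "GENE_A": [0.0, 0.0, 0.0],
    "GENE_B": [0.2, 0.5, 0.0],
    "GENE_C": [10.0, 250.0, 31.0],
}

summary = summarize_low_expression(toy_matrix)
print(summary)
```

If genes show up in `never_reaching_threshold`, a filter of the form "drop genes below the threshold in every sample" was probably not applied, which is consistent with the missing genes having a different cause.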

written 3.4 years ago by Vasei
Jordan Anaya (US/Charlottesville) wrote (3.4 years ago):

Yes, RNA-seq will give you the whole transcriptome, but you then have to take those reads and map them to transcripts. The transcript definitions that TCGA used when counting the reads determine which genes appear in the normalized data from firebrowse. As a result, the transcripts in the normalized data may not be the same as the transcripts represented on the microarray chips that were used.
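
To make the point concrete, here is a deliberately simplified toy in Python. It is not how TCGA or any real quantification tool works: reads are reduced to single positions on one made-up chromosome, genes are non-overlapping intervals, and all names and coordinates are hypothetical. It only illustrates that a gene absent from the annotation never gets a row in the output, even when reads from it were sequenced.

```python
def count_reads(read_positions, annotation):
    """Count reads per annotated gene.

    annotation maps a gene name to an inclusive (start, end) interval.
    Reads that fall outside every interval are tallied as unassigned.
    """
    counts = {gene: 0 for gene in annotation}
    unassigned = 0
    for position in read_positions:
        for gene, (start, end) in annotation.items():
            if start <= position <= end:
                counts[gene] += 1
                break  # toy assumption: genes do not overlap
        else:
            unassigned += 1
    return counts, unassigned


# The same toy reads, counted against two different annotations.
reads = [105, 120, 150, 310, 320, 505, 510, 515]
annotation_small = {"GENE_A": (100, 200), "GENE_B": (300, 400)}
annotation_large = {"GENE_A": (100, 200), "GENE_B": (300, 400), "GENE_C": (500, 600)}

counts_small, lost_small = count_reads(reads, annotation_small)
counts_large, lost_large = count_reads(reads, annotation_large)
print(counts_small, "unassigned:", lost_small)
print(counts_large, "unassigned:", lost_large)
```

The reads near position 500 exist in both runs, but `GENE_C` only appears in the table built from the larger annotation. A microarray has the analogous constraint in the other direction: it can only report genes for which probes were designed.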

If you are looking at gene expression you may want to check out my data portal www.oncolnc.org.


Thank you. That sounds completely correct.

written 3.4 years ago by Vasei