Question: RNA Raw Count and Gene Expression Correlation
22 months ago by
salvatore.digiorgio wrote:


Starting from the TCGA HTSeq - Counts dataset, I used TCGAanalyze_Normalization (TCGAbiolinks package), based on edgeR, to obtain a normalized count matrix. Can I perform a correlation test on these data, or is it better to start from HTSeq - FPKM? Thank you.

edger rna-seq tcgabiolinks • 1.1k views

Could you provide a little more info on your design? What are you hoping to correlate? If you wanted to find the most highly expressed genes, for example, then the normalized count matrix will suffice. If you are looking to compare expression profiles between genes across a series of samples, then an additional standardization or transformation would help.

I also agree with Wouter that your edgeR normalization is better than FPKM.
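To illustrate the kind of transformation meant here, below is a minimal Python sketch, not the TCGAbiolinks or edgeR implementation. It assumes a small made-up count matrix (the gene names and values are hypothetical), computes library sizes from that toy matrix only, applies a log2(CPM + 1) transform, and then correlates two genes across samples with Pearson and Spearman coefficients. With real data the library sizes would come from the full matrix, and the normalized counts from the earlier step could be log-transformed directly.

```python
import math

# Hypothetical toy matrix: gene -> counts per sample (same sample order for every gene).
counts = {
    "GENE_A": [10, 20, 30, 40],
    "GENE_B": [100, 210, 290, 405],
    "GENE_C": [500, 400, 300, 200],
}

def log_cpm(counts, prior=1.0):
    """Return log2(CPM + prior) per gene; library size = column sum of this matrix."""
    n_samples = len(next(iter(counts.values())))
    lib_sizes = [sum(row[j] for row in counts.values()) for j in range(n_samples)]
    return {
        gene: [math.log2(row[j] / lib_sizes[j] * 1e6 + prior) for j in range(n_samples)]
        for gene, row in counts.items()
    }

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

expr = log_cpm(counts)
print("Pearson A vs B:", pearson(expr["GENE_A"], expr["GENE_B"]))
print("Spearman A vs B:", spearman(expr["GENE_A"], expr["GENE_B"]))
print("Spearman A vs C:", spearman(expr["GENE_A"], expr["GENE_C"]))
```

Spearman is rank-based, so it is less sensitive to the exact transformation than Pearson, which is one reason it is often preferred for a first look at gene-gene correlation.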

written 22 months ago by Jake Warner

I'm trying to find the correlation between the expression levels of a small set of genes within the sample set of one cancer type, not across the sample sets of different cancer types. I hope I made myself clear. What would be the right course?

modified 22 months ago • written 22 months ago by salvatore.digiorgio
22 months ago by
lessismore wrote:

It's much better to start from normalized data: FPKM, which takes into consideration the library size and the gene length.
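For reference, the FPKM calculation can be sketched in a few lines of Python. This is the general textbook definition (fragments per kilobase of transcript per million mapped fragments), not necessarily the exact pipeline behind the HTSeq - FPKM files; the choice of denominator (which fragments count as "mapped") varies between pipelines and is an assumption here, as are the example numbers.

```python
def fpkm(count, gene_length_bp, total_mapped_fragments):
    """FPKM = count * 1e9 / (gene length in bp * total mapped fragments).

    The factor 1e9 combines the per-kilobase (1e3) and per-million (1e6) scaling.
    """
    return count * 1e9 / (gene_length_bp * total_mapped_fragments)

# Hypothetical example: same count and library size, different gene lengths.
short_gene = fpkm(count=500, gene_length_bp=1000, total_mapped_fragments=10_000_000)
long_gene = fpkm(count=500, gene_length_bp=2000, total_mapped_fragments=10_000_000)
print("short gene FPKM:", short_gene)
print("long gene FPKM:", long_gene)
```

The longer gene gets half the FPKM of the shorter one for the same count, which shows the length correction. Gene length is constant across samples, so it matters mainly when comparing different genes within a sample rather than the same gene across samples.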


But OP generated normalized counts using edgeR, which is a more sophisticated method than FPKM. Wouldn't edgeR normalization be superior to a simple FPKM transformation?

written 22 months ago by WouterDeCoster