Question: How to get the correlation between the counts for each gene at the same timepoint (two replicates)
Lila M 800 wrote:

Hi everybody, I have counts (obtained with HTSeq) for a lot of genes (~58,000) at different time points, with replicates.

gene                           t1_S1    t1_S2
ENSG00000000003.14              0        0
ENSG00000000005.5               0        0
ENSG00000000419.12              1        3
 [...]

I would like to calculate the correlation between the counts for each gene at the same timepoint, to understand how reproducible the replication timing and progression are in each repeat. Any suggestions?

modified 10 months ago by Nicolas Rosewick 8.4k • written 10 months ago by Lila M 800

Check out the cor function in R. Different kinds of correlation measures are available, including Spearman and Pearson.
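
As a minimal sketch of that suggestion, assuming a small made-up count table with the two replicate columns from the question (`t1_S1`, `t1_S2`; the values are invented for illustration), `cor` returns one coefficient per pair of columns:

```r
# Toy counts: rows are genes, columns are the two replicates (values invented).
counts <- data.frame(
  t1_S1 = c(0, 0, 1, 25, 310, 47),
  t1_S2 = c(0, 0, 3, 30, 290, 52)
)

# One coefficient describing how well the two replicates agree overall.
pearson  <- cor(counts$t1_S1, counts$t1_S2, method = "pearson")
spearman <- cor(counts$t1_S1, counts$t1_S2, method = "spearman")

# Passing the whole table gives a samples-by-samples matrix instead.
cor_matrix <- cor(counts, method = "spearman")
print(cor_matrix)
```

Spearman works on ranks, so it is less sensitive than Pearson to a handful of very highly expressed genes.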

written 10 months ago by ATpoint 26k

This is what I am doing, but as I have a huge number of genes, R gets stuck. This is what I'm trying:

xx <- read.table(file="matrix_count", sep="\t", header = T)
cor(t(xx), method="pearson")

Any other suggestions?

written 10 months ago by Lila M 800

Do I understand correctly that you aim to calculate 58000 correlation coefficients?

written 10 months ago by ATpoint 26k

Read count correlation between samples

written 10 months ago by h.mon 28k

Nicolas Rosewick 8.4k wrote:

Do you want to test the correlation between the different timepoints or between the different genes?

Let's say you have 10 timepoints and 58,000 genes.

To compare the different timepoints:

cor(xx, method="pearson")

will give you a 10 x 10 matrix, so 100 correlation values. The matrix is symmetric and its diagonal is always 1, so only 45 of those are distinct pairs; I guess the cor function is smart enough not to compute both col A vs col B and col B vs col A.

To compare the different genes (in a pairwise manner):

cor(t(xx), method="pearson")

Here the result is a 58,000 x 58,000 matrix, i.e. 3.364e+09 correlations (or 1,681,971,000 distinct pairs if the cor function is smart). That's why R crashes: it takes too long to compute so many correlations.
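
The counts quoted above can be checked with a few lines of arithmetic. This is only a sketch: `n_samples` and `n_genes` are illustrative names, and the memory figure assumes the result is stored as a dense matrix of 8-byte doubles.

```r
n_samples <- 10
n_genes   <- 58000

# Number of cells in the full matrix returned by cor().
cells_samples <- n_samples^2            # 100
cells_genes   <- n_genes^2              # 3.364e+09

# Distinct off-diagonal pairs: n * (n - 1) / 2.
pairs_samples <- choose(n_samples, 2)   # 45
pairs_genes   <- choose(n_genes, 2)     # 1681971000

# Approximate memory for a dense gene-by-gene matrix of 8-byte doubles.
gene_matrix_gb <- cells_genes * 8 / 1e9
print(gene_matrix_gb)
```

Under that assumption the gene-by-gene matrix alone needs roughly 27 GB, so memory is likely to be a problem as well as run time.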


Edit based on OP comments

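Use the coefficient of variation: https://en.wikipedia.org/wiki/Coefficient_of_variation

# Per-gene coefficient of variation (rows = genes, columns = replicates).
# dat must hold only the numeric count columns; all-zero genes give NaN.
dat.coeff.var <- apply(dat, 1, function(x) sd(x) / mean(x))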
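
A small worked example may help here. It is a sketch on an invented toy matrix (the gene names and values are made up), and the filtering and sorting step at the end is an assumption about how one might use the result, not part of the original suggestion:

```r
# Toy count matrix: rows are genes, columns are replicates (values invented).
dat <- matrix(
  c(0, 0,
    1, 3,
    100, 100,
    40, 60),
  ncol = 2, byrow = TRUE,
  dimnames = list(c("geneA", "geneB", "geneC", "geneD"), c("t1_S1", "t1_S2"))
)

# Coefficient of variation per gene: standard deviation divided by mean.
dat.coeff.var <- apply(dat, 1, function(x) sd(x) / mean(x))

# Genes with zero counts in every replicate give NaN (0 / 0); drop them,
# then order genes from least to most variable between replicates.
cv_ranked <- sort(dat.coeff.var[is.finite(dat.coeff.var)])
print(cv_ranked)
```

A value near 0 means the replicates agree closely for that gene. With only two replicates and low counts (such as 1 vs 3) the coefficient of variation is noisy, so large values for weakly expressed genes should be read with care.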
modified 10 months ago • written 10 months ago by Nicolas Rosewick 8.4k

Maybe I explained badly what I want. I want to know the correlation for, let's say, gene ENSG00000000003.14 between the two replicates, to see whether each gene differs between replicates. I'm not interested in the correlation between ENSG00000000003.14 and ENSG00000000005.5. Does that make more sense?

written 10 months ago by Lila M 800

OK, so you want to check the correlation between replicates. Then use cor(xx, method="pearson").

modified 10 months ago • written 10 months ago by Nicolas Rosewick 8.4k

Not exactly, because that gives me the correlation between replicates, and what I want to know is whether the count for gene ENSG00000000003.14 differs between t1_S1 and t1_S2 (and likewise for the other genes).

written 10 months ago by Lila M 800

Maybe use the coefficient of variation (https://en.wikipedia.org/wiki/Coefficient_of_variation): dat.coeff.var <- apply(dat,1,function(x){sd(x)/mean(x)})

modified 10 months ago • written 10 months ago by Nicolas Rosewick 8.4k

That's exactly what I want! Thanks!

written 10 months ago by Lila M 800

OK, great. I edited my answer so that it records the right solution. If the answer suits you, you can accept it.

written 10 months ago by Nicolas Rosewick 8.4k

There is no correlation for a single pair of measures. The correlation between samples will give you a general view of how similar samples are, and you can plot the values to check outliers. However, you have to take into account sample sequencing depth.
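
One way to account for sequencing depth before comparing samples is sketched below. It uses counts per million (CPM) followed by a log transform, which is my assumption about a reasonable normalisation rather than something named in this thread; the toy values are invented so that the second replicate is sequenced about twice as deep.

```r
# Toy counts: rows are genes, columns are replicates (values invented).
counts <- matrix(
  c(10, 20,
    200, 410,
    0, 0,
    55, 100,
    735, 1470),
  ncol = 2, byrow = TRUE,
  dimnames = list(paste0("gene", 1:5), c("t1_S1", "t1_S2"))
)

# Counts per million: divide each column by its library size, scale by 1e6.
lib_size <- colSums(counts)
cpm <- sweep(counts, 2, lib_size, "/") * 1e6

# Log-transform with a pseudocount of 1 so zero counts stay defined.
log_cpm <- log2(cpm + 1)

# Overall agreement between replicates after depth normalisation.
replicate_cor <- cor(log_cpm[, "t1_S1"], log_cpm[, "t1_S2"], method = "spearman")
print(replicate_cor)
```

Plotting the two `log_cpm` columns against each other, for example with `plot(log_cpm[, 1], log_cpm[, 2])`, then shows which genes fall far from the diagonal.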

written 10 months ago by h.mon 28k

Do you know any other way to do that?

written 10 months ago by Lila M 800