Question: Do More Replicates Result In Insignificant P-Values?
RT350 wrote, 8.7 years ago (European Union):

Hi All,

I am analyzing Affymetrix data for stress vs. control. I have 4 replicates for control and 5 replicates for stress. Normally I use the stringency criteria fold change > log(1.5) and adjusted p-value < 0.01. For this analysis, I can see genes with significant fold change, but the p-value is not significant. And yes, some preliminary analyses done earlier showed differential expression of a few genes. Do I have to relax my stringency criterion for the adjusted p-value (what would be preferable)? Is there any chance that a larger number of replicates has increased the p-values?

Can anyone please help me with this?

Many thanks, Ritu

modified 8.7 years ago by Obi Griffith18k • written 8.7 years ago by RT350

If the adjusted p value is actually a False Discovery Rate, then requiring 0.01 is overly stringent.

written 8.7 years ago by Sean Davis26k

It is certainly possible that you have real differences which are not reaching significance because of your small sample size. Increasing replicates may help. But also try some pre-filtering, which I suggest in my answer below.

written 8.7 years ago by Obi Griffith18k
John1.5k wrote, 8.7 years ago:

Statistically, provided that you have control over the sources of error, a higher number of replicates gives higher power and fewer false discoveries, meaning that whatever decision comes out is more likely to be true. There are problems with low-degrees-of-freedom data (as is usual in genomic designs, since we cannot afford, for example, 30 or more total treatments) and with high numbers of treatments (the chance of getting false positives is high), so you need a good statistical consultation.

written 8.7 years ago by John1.5k
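The power point above can be illustrated with base R's power.t.test (a sketch with an assumed effect size of one standard deviation, not the poster's data): the power of a two-sample t-test rises as replicates per group increase, so adding replicates should not by itself push p-values up.

```r
# Hypothetical effect: a difference of 1 SD between stress and control.
# power.t.test ships with base R (stats package).
p4  <- power.t.test(n = 4,  delta = 1, sd = 1, sig.level = 0.05)$power
p10 <- power.t.test(n = 10, delta = 1, sd = 1, sig.level = 0.05)$power

# Power with 10 replicates per group exceeds power with 4.
stopifnot(p10 > p4)
```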
RT350 wrote, 8.7 years ago (European Union):

Thanks Sean and John for your help.

Sean - I tried with a relaxed stringency criterion, adj.P.Val < 0.05. Still, no genes satisfy this criterion.

But if I just look at my unadjusted p-values (unadjusted P < 0.01), they look good and I have 400 candidate genes that are differentially expressed. Can I use the unadjusted p-value criterion instead? I am not a statistician, so I am not clear on when to use adjusted p-values and when to use unadjusted ones. Can anyone please explain?

Below I am copying a few lines of my top results. Why is there so much difference between the adjusted and unadjusted p-values? Any help/suggestions would be much appreciated.

Many thanks, R.

ID      logFC       AveExpr   t          P.Value       adj.P.Val  B
xx1_at  -2.0682486  8.298777  -5.671953  4.754428e-05  0.4441129   1.16804549
xx3_at  -1.1045124  7.838776  -5.446288  7.206374e-05  0.4441129   0.90168939
xx9_at  -0.9933025  5.900082  -5.378236  8.180199e-05  0.4441129   0.81951230
xx5_at   0.5784688  5.979741   5.211694  1.118385e-04  0.4441129   0.61478979
xx2_at  -1.1998221  8.423590  -5.174542  1.199786e-04  0.4441129   0.56842423
xx8_at  -1.7810264  7.939280  -5.071211  1.460053e-04  0.4441129   0.43814027
xx4_at  -0.4775558  2.965913  -5.026975  1.588722e-04  0.4441129   0.38177252
xx3_at   0.4783773  6.547145   4.917570  1.959800e-04  0.4441129   0.24084966
xx2_at  -0.6317992  2.953137  -4.795892  2.479321e-04  0.4441129   0.08161861
xx7_at  -0.5901271  3.606651  -4.743114  2.747013e-04  0.4441129   0.01174557
xx2_at  -0.6615366  5.228286  -4.708753  2.937133e-04  0.4441129  -0.03400289
xx4_at  -1.5438633  5.824030  -4.674656  3.139213e-04  0.4441129  -0.07960106
xx2_at   0.5362903  4.838442   4.613063  3.541284e-04  0.4441129  -0.16246898

written 8.7 years ago by RT350
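For readers wondering where columns like logFC, AveExpr, t, P.Value, adj.P.Val and B come from: this is the output format of limma's topTable. A minimal, self-contained sketch on simulated data (not the poster's arrays; only the 4-control / 5-stress group sizes are taken from the question):

```r
library(limma)

set.seed(1)
# Simulated matrix of normalized log2 intensities: 1000 probesets x 9 arrays
X <- matrix(rnorm(9000), nrow = 1000,
            dimnames = list(paste0("g", 1:1000), NULL))
group  <- factor(c(rep("control", 4), rep("stress", 5)))
design <- model.matrix(~ group)

# Fit the linear model and moderate the t-statistics
fit <- eBayes(lmFit(X, design))

# Same columns as in the post: logFC, AveExpr, t, P.Value, adj.P.Val, B;
# adj.P.Val is Benjamini-Hochberg by default.
tt <- topTable(fit, coef = 2)
```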

Why are your adj.P.Val values all the same number? And what is the last column? I think you've pasted slightly wrong data or done the correction wrong. If you're doing multiple testing, then you should always perform a multiple hypothesis testing correction and use the adjusted p-values.

written 8.7 years ago by Nathan Harmston1.1k

It's not uncommon for adjusted p-values to share the same value (depending on which correction method you use).

written 8.7 years ago by Obi Griffith18k
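To expand on the point about ties (a toy example, not the poster's numbers): Benjamini-Hochberg multiplies each sorted p-value by m/rank and then takes a running minimum from the largest p-value downward, capped at 1. That running minimum routinely flattens many adjusted values to the same number.

```r
# 3 small p-values among 10 tests; BH ties the top 3 and ties the rest at 1.
p  <- c(1e-04, 2e-04, 3e-04, 0.5, 0.6, 0.7, 0.8, 0.9, 0.95, 1.0)
bh <- p.adjust(p, method = "BH")

# p[i] * m / i gives 1e-03, 1e-03, 1e-03 for the first three ranks and
# values >= 1 afterwards; the cumulative minimum from the right then
# yields three ties at 1e-03 and seven ties at 1.
stopifnot(isTRUE(all.equal(bh, c(rep(1e-03, 3), rep(1, 7)))))
```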

Please don't post new questions as answers: edit the original question or use comments under answers.

written 8.7 years ago by Neilfws49k
Obi Griffith18k wrote, 8.7 years ago (Washington University, St Louis, USA):

Have you tried a pre-filtering step to reduce the total number of tests? This should be unbiased with regard to your comparison (i.e., do not use FC or p-value). It is common with Affymetrix expression datasets to filter out genes with very low (or extremely high) variance (or coefficient of variation) across all samples. It is also common to filter out genes which are not considered "present" or "expressed above background" in at least some minimum percentage of samples. If you have a matrix of normalized log2 expression values (e.g., from rma or gcrma) you can use something like:

library(genefilter)
# Preliminary gene filtering
X <- data
# Un-log2 the values, then keep only genes passing the following criteria
# (recommended in the multtest/MTP documentation):
#   - at least 20% of samples have raw intensity greater than 100
#   - the coefficient of variation (sd/mean) is between 0.7 and 10
ffun <- filterfun(pOverA(p = 0.2, A = 100), cv(a = 0.7, b = 10))
filt <- genefilter(2^X, ffun)
# Subset the expression matrix to the genes that passed the filter
filt_Data <- X[filt, ]
written 8.7 years ago by Obi Griffith18k

Thanks Obi. But I have already done the filtering on the normalized data, and that has not helped either. Could this have to do with the quality of my arrays? I have analyzed a few more arrays with just two replicates and the results were fine. Any other ideas are welcome.

written 8.7 years ago by RT350

It could be a problem with your arrays. If you do quality checks and look at the overall distributions do you see anything unusual? The other possibility is simply that there aren't significant differences between your treatments. Or, that there were sample mix-ups. As an aside, I wouldn't trust p-values obtained from a test with only 2 replicates. So, that's probably not a good baseline for comparison.

written 8.7 years ago by Obi Griffith18k