Help interpreting p value distribution
2.1 years ago

Hello everyone!

I am using Transcriptome Analysis Console (TAC) for differential expression (DE) testing of gene expression data and R for downstream analyses and visualization. I generated volcano plots for three separate datasets (let's call them A, B, and C), and the distribution of p-values does not look uniform: there are gaps, as can be seen in the images below. Even in dataset B, whose p-value distribution otherwise looks good, there are no genes with -log10(p) around 4.

I have never seen this pattern before and I am not sure what to make of it. What do you think this indicates? Perhaps poor quality of the data?

enter image description here

microarray gene expression differential • 398 views
2.1 years ago

That looks like a systematic error introduced by the processing itself. Perhaps filtering went wrong, or some normalization step is interfering.

The main point is that it is systematic and not a natural error process.

It is hard to see how merely poor-quality, noisy data would, on its own, leave a narrow band of p-values empty.
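If you want to locate the empty bands numerically rather than by eye, here is a minimal sketch in Python (standard library only). It is not anything TAC does internally. The function name `find_gaps`, the bin width of 0.1, and the simulated p-values are all assumptions made for illustration; you would substitute the p-value column exported from your own results.

```python
import math
import random


def find_gaps(p_values, bin_width=0.1):
    """Return (low, high) ranges of -log10(p) that contain no values.

    Only the region between the smallest and largest observed
    -log10(p) is scanned. bin_width is an arbitrary choice.
    """
    scores = [-math.log10(p) for p in p_values if 0 < p <= 1]
    if not scores:
        return []
    first_bin = int(min(scores) / bin_width)
    last_bin = int(max(scores) / bin_width)
    counts = [0] * (last_bin + 1)
    for s in scores:
        counts[int(s / bin_width)] += 1

    gaps = []
    start = None
    for i in range(first_bin, last_bin + 1):
        if counts[i] == 0 and start is None:
            start = i  # an empty run begins here
        elif counts[i] > 0 and start is not None:
            # the empty run ended just before bin i
            gaps.append((round(start * bin_width, 6), round(i * bin_width, 6)))
            start = None
    return gaps


# Simulated data (an assumption, not real TAC output): uniform p-values
# with everything in 1.0 <= -log10(p) < 1.5 removed, to mimic a
# systematic filtering artifact.
rng = random.Random(0)
simulated = [rng.random() for _ in range(20000)]
simulated = [p for p in simulated
             if p > 0 and not (1.0 <= -math.log10(p) < 1.5)]

gaps = find_gaps(simulated)
print(gaps)
```

The simulated band shows up as a gap from 1.0 to 1.5. Any further gaps reported near the largest -log10(p) values should be read with care: very few genes fall in the extreme tail, so empty bins there can occur by chance. A gap in a densely populated region is the one that points to a processing step.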
