Question: Using FDR to gate values in NGS comparison t-tests
BarristanTheBold (United States) wrote 4.8 years ago:

New to NGS analysis, but that's the task I've been assigned. I have received NGS data that I am trying to decipher.

I’m attempting to learn what exactly is meant by "unadjusted p-value" and "FDR" when looking at comparison t-tests of genes (the comparisons are between NGS of animals treated with drug or placebo). I understand the basic concepts, but not how to make practical use of them. Most of the values seem fairly large (well over 0.1 for p-values, in the 0.1 to 0.9 range for FDR) when looking at data sets of ~20,000 to 40,000 members. My goal here is to determine a value for each that would allow me to gate on the genes with meaningful expression differences. Is there a specific value I should use as the boundary, or some way to calculate it based on the sample size or something?

Devon Ryan (Freiburg, Germany) wrote 4.8 years ago:

Ignore unadjusted p-values completely. Unadjusted p-values, also called "raw p-values" or simply p-values, have little relevance individually when you perform multiple testing (see this XKCD comic for a nice example of why multiple testing and fishing for changes increase false-positive rates). A common threshold for adjusted p-values (or FDR) is 0.1 (as with p-value thresholds in general, there's some wiggle room here). That's a bit higher than the typical 0.05 that you'd use with a raw p-value, but it turns out to be a convenient trade-off. After making a list of significant findings, sort them by fold-change to help prioritize results.
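
As a rough sketch of the "gate on FDR, then sort by fold-change" workflow: the thread doesn't say which adjustment method produced the FDR column, so this assumes the Benjamini-Hochberg procedure, which is a common choice. The gene names, p-values, and log2 fold changes are made-up toy values, and `benjamini_hochberg` is just a hypothetical helper written out for illustration.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values, in input order."""
    m = len(pvalues)
    # Indices of the p-values, sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical toy results: (gene, raw p-value, log2 fold change).
results = [
    ("geneA", 0.0001, 0.03),
    ("geneB", 0.004, 1.6),
    ("geneC", 0.03, -2.1),
    ("geneD", 0.20, 3.0),
    ("geneE", 0.75, 0.1),
]

fdr = benjamini_hochberg([p for _, p, _ in results])

# Gate on FDR first (0.1 is the threshold suggested above) ...
significant = [
    (gene, q, lfc) for (gene, _, lfc), q in zip(results, fdr) if q <= 0.1
]
# ... then rank the survivors by the size of the change, up or down.
ranked = sorted(significant, key=lambda r: abs(r[2]), reverse=True)

for gene, q, lfc in ranked:
    print(f"{gene}\tFDR={q:.4f}\tlog2FC={lfc:+.2f}")
```

In this toy set, geneD has the largest fold change but does not pass the FDR gate, while geneA passes easily and ends up last in the ranking because its change is tiny. On real data you would normally use the FDR column your pipeline already reports, or a library routine such as `multipletests(..., method="fdr_bh")` from statsmodels, instead of a hand-rolled function.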


This. I see so many people making the mistake of assuming that a low p-value means a large effect size.

written 4.8 years ago by David Westergaard

Isn't the way to combat that to just lower your threshold for calling something significant?

written 4.8 years ago by BarristanTheBold

Never confuse statistical significance and biological relevance.

written 4.8 years ago by Devon Ryan

No. The p-value is a measure of significance, and is therefore more closely related to variation and sample consistency. If all the drug-treated samples were at 102.1% expression plus or minus 0.001, this would give high certainty of a difference without much biological relevance, compared to another gene at 300% plus or minus 50. As Devon said, use FDR to gate, then sort for high fold change. Significance and fold change will tend to be correlated.
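
To put rough numbers on that example, here is a small sketch that computes a Welch t-statistic by hand for two invented genes: one shifted to about 102.1% with almost no spread, one at about 300% with a spread of roughly 50. The sample values and the `welch_t` helper are assumptions made up for illustration, not data from the thread.

```python
import math
import statistics


def welch_t(a, b):
    """Welch's t-statistic for two independent samples (b minus a)."""
    se = math.sqrt(
        statistics.variance(a) / len(a) + statistics.variance(b) / len(b)
    )
    return (statistics.mean(b) - statistics.mean(a)) / se


# Hypothetical expression values, as percent of the placebo mean.
tight_placebo = [100.000, 100.001, 99.999, 100.000]
tight_treated = [102.100, 102.101, 102.099, 102.100]

noisy_placebo = [100.0, 60.0, 140.0, 100.0]
noisy_treated = [300.0, 250.0, 350.0, 300.0]

t_tight = welch_t(tight_placebo, tight_treated)
t_noisy = welch_t(noisy_placebo, noisy_treated)

fold_tight = statistics.mean(tight_treated) / statistics.mean(tight_placebo)
fold_noisy = statistics.mean(noisy_treated) / statistics.mean(noisy_placebo)

print(f"tight gene: fold change {fold_tight:.3f}, t = {t_tight:.1f}")
print(f"noisy gene: fold change {fold_noisy:.3f}, t = {t_noisy:.1f}")
```

The tightly clustered gene gets a far larger t-statistic (and so a far smaller p-value) even though it changes by only about 2%, while the gene that triples gets a much more modest statistic. Lowering the significance threshold therefore would not help: it would keep the first gene and could drop the second.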

written 4.8 years ago by karl.stamm