Hello everyone, I have another trivial question. I've searched around but haven't been able to find a solid answer.
Very trivial question: How do I remove rows under a certain pValue in a DGELRT object? I have ran glmTreat on a DGEGLM object that was generated by glmQLFit. I have found genes that are highly under or overexpressed, but I have also found junk reads that have pValues of 1 or close to 1. Here are the topTags from an LRT object that was output from calling glmTreat on a DGEGLM object:
> treat <- glmTreat(DGEList8_GLMFit, contrast= FBvsEC_contrast, lfc=log2(1.5))
> topTags(treat)
GeneID Chr Start End Strand Length logFC unshrunk.logFC logCPM PValue
606500 606500 chr16 89561430 89561501 + 72 -19.74088 -19.74600 16.82385 5.418552e-24
26799 26799 chr6 85677294 85677368 - 75 -25.79423 -25.85765 13.96527 1.082539e-23
26806 26806 chr1 173865968 173866028 - 61 -20.87574 -20.88726 16.12826 1.085020e-23
26790 26790 chr18 49491664 49491729 - 66 -29.88506 -30.66800 12.39961 1.431579e-23
26809 26809 chr17 28723430 28723491 + 62 -20.77361 -20.78272 16.01472 2.803338e-23
9297 9297 chr11 62853904 62853968 - 65 -20.12804 -20.13388 15.97089 2.244666e-22
26788 26788 chr16 2155023 2155105 - 83 -24.98016 -25.05564 13.71254 2.657334e-22
692212 692212 chr1 28578743 28578822 - 80 -22.17292 -22.18588 14.83805 4.610559e-22
26769 26769 chr1 173864175 173864217 - 43 -23.65515 -23.70433 13.75147 4.003719e-21
26805 26805 chr1 75787889 75787972 + 84 -24.33920 -24.40447 13.22989 1.191277e-20
FDRs were omitted but they are there. How would remove the hits that have a pValue less than 0.05?
Also, I've read somewhere that the pValues shown here are not the real pValues, and are actually a hypothesis and mean-adjusted pValue that are specific to the glmTreat function itself? If you could kindly explain what glmTreat does, I would greatly appreciate it because I was not able to find any good answers after searching for a few hours.
Damon
Thank you for the reply. But what if the FDR is low, the log change is high, but the pValue is high for a certain gene? Would that gene still be considered significant?
I don't think this is a realistic example. How would FDR be high and p be low? FDR in simple terms "penalizes" or "down-corrects" p-values to account for multiple-testing, so it is always lower than the nominal p-value. If not something very odd is happening.