How to deal with extreme values of k in RUV-seq?
11 days ago
telroyjatter ▴ 210

I discovered that a certain lab is abusing RUVseq by using a high value of k to squash variance and obtain a large number of DEGs. Have you ever seen this, and if so, what did you do?

ruvseq • 257 views

Have you tried analyzing the same data with a different method and comparing the results? Any tool can be misused, much as a wrench can be used as a hammer under certain circumstances. Many genomic methods work like a screen: they filter candidate genes by some often arbitrary threshold, with the idea of then pursuing those genes further with other methods (i.e., they are hypothesis-generating). In some cases, experimental conditions dictate that this is all one can do; in other cases it might simply be sloppy, flawed thinking or design, and a colossal waste of time and resources. More information would be needed to make that call (though it appears you already have it?). FWIW, you might alter the title of your post to be more objective, something like "how to deal with extreme values of k in RUV Seq", or something more directly related to your concern (validating, justifying, or exploring the choice of k).


Thank you, I changed the title of the post. I could re-analyze the data and demonstrate the value of k chosen is much too high, but I guess I'm wondering what others' experiences have been with this.

Thanks again.


If you think it is a problem, you could make a figure that plots the number of DEGs against k, from 0 up to the value they used. If your concerns are justified, you might see a trend in which the number of DEGs is a function of k; I have seen things like that before. The problem with methods like RUV is that there is (imo) no automated yet robust way to estimate a good k.
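A minimal sketch of that tabulation, in Python for illustration only. The callable `run_de` is hypothetical: it stands in for whatever pipeline you use (RUVSeq plus a DE tool, typically in R) and is assumed to return one adjusted p-value per gene for a given k. The 0.05 cutoff is also an assumption.

```python
import numpy as np

def count_degs_by_k(run_de, k_values, alpha=0.05):
    """Tabulate the number of DEGs for each k.

    run_de: hypothetical user-supplied callable, k -> array of adjusted p-values.
    Returns a list of (k, n_deg) pairs in the order of k_values.
    """
    rows = []
    for k in k_values:
        padj = np.asarray(run_de(k), dtype=float)
        # Genes with NaN (e.g. filtered out upstream) never count as DEGs.
        n_deg = int(np.sum(padj < alpha))
        rows.append((k, n_deg))
    return rows

def deg_trend(rows):
    """Differences in DEG count between consecutive k values."""
    counts = [n for _, n in rows]
    return [b - a for a, b in zip(counts, counts[1:])]
```

Plot the pairs returned by `count_degs_by_k(run_de, range(0, k_max + 1))`, or inspect `deg_trend`: a DEG count that keeps climbing with k, with no plateau, is the pattern described above.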


It's not clear based on this post what you are seeking to accomplish. Are you asking how to issue a formal rebuttal? How to warn a colleague about the potential misuse of statistics?

By the way, large values of k in RUVseq should not generate a large number of DEGs when properly applied, as each factor removes a residual degree of freedom. Is RUVseq being misapplied (i.e., by using the corrected counts in DE instead of the original counts with the factors entering as covariates)?
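To illustrate the degrees-of-freedom point, here is a toy null simulation in Python. It is a sketch under stated assumptions, not RUVSeq itself: Gaussian data stand in for log-scale expression, there is no true differential expression, the factors are the top-k singular vectors of first-pass residuals (loosely in the spirit of a residual-based RUV), and a plain per-gene t-test replaces a count-based DE model. The function name and the 6-vs-6 design are made up for the example.

```python
import numpy as np

# Two-sided 5% critical values of Student's t, keyed by residual df.
T_CRIT = {10: 2.228, 9: 2.262, 8: 2.306, 7: 2.365, 6: 2.447, 5: 2.571, 4: 2.776}
N_PER_GROUP = 6  # fixed so that the table above covers k = 0..6

def simulate_null_calls(k, n_genes=2000, seed=0):
    """Return (n_naive, n_covariate): genes called at 5% on pure-noise data."""
    rng = np.random.default_rng(seed)
    n = 2 * N_PER_GROUP
    group = np.repeat([0.0, 1.0], N_PER_GROUP)
    X = np.column_stack([np.ones(n), group])   # intercept + group
    Y = rng.normal(size=(n, n_genes))          # null data, samples x genes

    # Factors: top-k left singular vectors of the first-pass residuals.
    beta0, *_ = np.linalg.lstsq(X, Y, rcond=None)
    U, _, _ = np.linalg.svd(Y - X @ beta0, full_matrices=False)
    W = U[:, :k]

    # (b) Proper use: original data, factors as covariates, df = n - 2 - k.
    XW = np.column_stack([X, W])
    beta, *_ = np.linalg.lstsq(XW, Y, rcond=None)
    rss = ((Y - XW @ beta) ** 2).sum(axis=0)
    df_cov = n - 2 - k
    c = np.linalg.inv(XW.T @ XW)[1, 1]
    t_cov = beta[1] / np.sqrt(rss / df_cov * c)

    # (a) Misuse: subtract the factors, then test as if nothing was removed.
    Y_corr = Y - W @ beta[2:]
    b, *_ = np.linalg.lstsq(X, Y_corr, rcond=None)
    rss_naive = ((Y_corr - X @ b) ** 2).sum(axis=0)
    df_naive = n - 2                           # ignores the k fitted factors
    c_naive = np.linalg.inv(X.T @ X)[1, 1]
    t_naive = b[1] / np.sqrt(rss_naive / df_naive * c_naive)

    n_naive = int((np.abs(t_naive) > T_CRIT[df_naive]).sum())
    n_cov = int((np.abs(t_cov) > T_CRIT[df_cov]).sum())
    return n_naive, n_cov

if __name__ == "__main__":
    for k in range(0, 7):
        print(k, simulate_null_calls(k))
```

In this setup the two counts coincide at k = 0, and the "corrected data" count can only grow with k: the residual variance shrinks while the test still assumes n - 2 degrees of freedom. The covariate version pays for each factor with a stricter critical value. Since every call here is a false positive by construction, a DEG count that rises with k in the naive column is an artifact of the misuse.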