Question: How to extract the classification/regression metrics from a GWAS so that I can compare different tools?
9 months ago, b.ambrozio wrote:

If I understand correctly, GWAS is essentially a feature-selection approach built on a classification or a regression model, depending on whether the underlying trait is qualitative or quantitative, respectively.

My question is, how can I extract the classification/regression metrics from the executed GWAS algorithm when I'm using, for example, PLINK, GCTA, SAIGE, or BOLT-LMM?

Hypothetical scenario: I'm looking for causal SNPs for type 2 diabetes in a highly unbalanced (case:control = 1:100) and relatively large dataset (N > 6k). I know that SAIGE is usually the best choice for such a scenario, but I want to compare its results with those of the other tools as well. For classification algorithms, we usually build a confusion matrix (true/false positives, true/false negatives) and from it calculate accuracy, precision, recall (sensitivity), F1 score, etc.
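
To make these metrics concrete, here is a minimal sketch of how they follow from the four confusion-matrix counts. The function name and the counts are made up for illustration (60 cases and 6000 controls, i.e. roughly the 1:100 ratio above); they do not come from any GWAS tool.

```python
def classification_metrics(tp, fp, fn, tn):
    """Derive standard metrics from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    # Guard against empty denominators (e.g. no positive predictions).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0  # recall == sensitivity
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical counts: 60 cases (30 detected), 6000 controls (60 misclassified).
metrics = classification_metrics(tp=30, fp=60, fn=30, tn=5940)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these made-up counts the accuracy is about 0.985 while the F1 score is only 0.4, which is why accuracy alone says little on such an unbalanced dataset.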

Therefore, how do I get the confusion matrix from a GWAS that is based on a classification algorithm? Is it possible to go beyond the GWAS and run a classification that uses the SNPs it selected as features, with any of the tools mentioned?
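
A rough sketch of what "going beyond the GWAS" could look like, under stated assumptions: the SNPs that passed the significance threshold are kept with their effect sizes, a weighted allele-count score is computed for individuals who were not part of the GWAS, and the score is thresholded to predict case status. The SNP IDs, weights, genotypes, and cutoff below are all hypothetical, and the helper functions are not part of any of the tools above.

```python
def risk_score(genotype, weights):
    """Weighted sum of allele counts (0, 1 or 2) over the selected SNPs."""
    return sum(w * genotype.get(snp, 0) for snp, w in weights.items())


def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels, where 1 = case."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn


# Hypothetical effect sizes for two SNPs that passed the GWAS threshold.
weights = {"rs_a": 0.4, "rs_b": 0.25}

# Hypothetical held-out individuals: (allele counts, true case status).
held_out = [
    ({"rs_a": 2, "rs_b": 1}, 1),
    ({"rs_a": 0, "rs_b": 1}, 1),
    ({"rs_a": 1, "rs_b": 1}, 0),
    ({"rs_a": 0, "rs_b": 0}, 0),
    ({"rs_a": 1, "rs_b": 0}, 0),
]

cutoff = 0.5  # arbitrary score cutoff, chosen only for illustration
y_true = [label for _, label in held_out]
y_pred = [1 if risk_score(g, weights) >= cutoff else 0 for g, _ in held_out]

tp, fp, fn, tn = confusion_counts(y_true, y_pred)
print(f"TP={tp} FP={fp} FN={fn} TN={tn}")
```

The evaluation samples are kept separate from the GWAS samples here because scoring the same individuals used to pick the SNPs would give overly optimistic metrics.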

I have found a lot of material comparing "false positives", "statistical power", etc., but I still don't understand how these were evaluated, since I don't see how to obtain a confusion matrix from the GWAS models. I mean, I don't see any classification happening after the feature selection (after the SNP p-values are assigned).
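
For context, "false positives" and "power" in tool comparisons are commonly counted per SNP rather than per sample, typically on simulated phenotypes where the causal SNPs are known by construction. A minimal sketch of that bookkeeping, assuming made-up SNP IDs and p-values and the conventional genome-wide threshold of 5e-8:

```python
def snp_level_confusion(pvalues, causal_snps, alpha=5e-8):
    """Count SNP-level (tp, fp, fn, tn): 'positive' = p-value below alpha."""
    tp = fp = fn = tn = 0
    for snp, p in pvalues.items():
        significant = p < alpha
        causal = snp in causal_snps
        if significant and causal:
            tp += 1
        elif significant and not causal:
            fp += 1
        elif causal:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


# Hypothetical p-values from one tool on a simulated phenotype.
pvalues = {"snp1": 1e-12, "snp2": 3e-9, "snp3": 0.2,
           "snp4": 4e-8, "snp5": 0.7, "snp6": 1e-3}
# Hypothetical set of SNPs simulated as causal.
causal_snps = {"snp1", "snp2", "snp6"}

tp, fp, fn, tn = snp_level_confusion(pvalues, causal_snps)
power = tp / (tp + fn)                # share of causal SNPs detected
false_positive_rate = fp / (fp + tn)  # share of null SNPs called significant
print(f"TP={tp} FP={fp} FN={fn} TN={tn}")
print(f"power={power:.3f} FPR={false_positive_rate:.3f}")
```

This is a simplification: it treats each SNP independently and ignores linkage disequilibrium, so a significant SNP that merely tags a nearby causal one is counted as a false positive here.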

Tags: bolt-lmm, plink, metrics, saige, gcta