Question: How to extract the classification/regression metrics from a GWAS so that I can compare different tools?
9 months ago by b.ambrozio

If I understand correctly, GWAS is essentially a feature selection approach based on a classification or regression algorithm, depending on whether the underlying trait is qualitative or quantitative, respectively.

My question is, how can I extract the classification/regression metrics from the executed GWAS algorithm when I'm using, for example, PLINK, GCTA, SAIGE, or BOLT-LMM?

Hypothetical scenario: I'm looking for causal SNPs for type-2 diabetes in a highly unbalanced (case:control = 1:100) and relatively large dataset (N > 6k). I know that SAIGE is usually the best choice for such a scenario, but I want to compare its results with those of the other tools as well. Usually, for classification algorithms, we use a confusion matrix (true/false positives, true/false negatives), and from that we can calculate accuracy, precision, recall (sensitivity), F1 score, etc.
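To make the metrics mentioned above concrete, here is a minimal sketch of how they follow from the four confusion-matrix counts. The counts are made-up illustration values chosen to mimic a 1:100 case-control ratio; they are not output from PLINK, GCTA, SAIGE, or BOLT-LMM, and the function name is hypothetical.

```python
def classification_metrics(tp, fp, fn, tn):
    """Derive common metrics from the four confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    # Guard against empty denominators (e.g. no positive predictions at all).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0  # recall == sensitivity
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical counts: 60 cases and 6000 controls (1:100).
model = classification_metrics(tp=30, fp=60, fn=30, tn=5940)

# Trivial baseline that labels every sample as a control.
all_controls = classification_metrics(tp=0, fp=0, fn=60, tn=6000)

for name, metrics in (("model", model), ("all-controls", all_controls)):
    print(name, {k: round(v, 3) for k, v in metrics.items()})
```

With this much imbalance, the all-controls baseline reaches a higher accuracy than the hypothetical model while having zero recall, which is why precision, recall, and F1 are the more informative numbers in such a dataset.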

Therefore, how do I get the confusion matrix from a GWAS based on classification algorithms? Is it possible to go beyond the GWAS and run a classification using the features (SNPs) selected by each of the mentioned tools?
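A toy sketch of what such a post-GWAS classification step could look like: sum the genotype dosages of the selected SNPs weighted by per-SNP effect sizes, threshold the resulting score, and tally a confusion matrix. Everything here is an assumption for illustration: the weights, genotypes, phenotypes, and threshold are invented, and none of the mentioned tools is claimed to produce this directly.

```python
def risk_score(dosages, weights):
    """Weighted sum of allele dosages (0/1/2) over the selected SNPs."""
    return sum(d * w for d, w in zip(dosages, weights))


def confusion_matrix(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels, 1 = case, 0 = control."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


# Hypothetical effect sizes for three SNPs selected by a GWAS tool.
weights = [0.5, 0.25, 1.0]

# Hypothetical held-out samples (rows = individuals, columns = SNPs).
# Scoring the same samples used for SNP selection would give
# over-optimistic metrics.
genotypes = [[2, 1, 1], [0, 0, 0], [1, 2, 0],
             [0, 1, 1], [2, 2, 2], [0, 0, 1]]
phenotypes = [1, 0, 1, 0, 1, 0]

threshold = 1.0  # arbitrary cut-off chosen for the example
scores = [risk_score(g, weights) for g in genotypes]
predictions = [1 if s > threshold else 0 for s in scores]

tp, fp, fn, tn = confusion_matrix(phenotypes, predictions)
print("scores:", scores)
print("tp, fp, fn, tn:", (tp, fp, fn, tn))
```

Repeating this with the SNP set and effect sizes reported by each tool would give one confusion matrix per tool, which is the comparison the question asks about.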

I found a lot of material comparing "false positives", "statistical power", etc., but I still don't understand how these were evaluated, since I don't see how to obtain a confusion matrix from the GWAS models. I mean, I don't see any classification happening after the feature selection (after the SNP p-values are assigned).

Tags: bolt-lmm, plink, metrics, saige, gcta