Performing the Cochran-Armitage test on chunks of data from a VCF file in R
8 weeks ago
Eliza • 0

I have SNP data from a VCF file. I need to perform the Cochran-Armitage trend test, which is implemented by the CATT function (documentation: https://search.r-project.org/CRAN/refmans/CATT/html/CATT.html). I want to run it for every unique sn_id. This is the call to the function:

CATT(data$is_severe,data$encoding)


I'm not sure how to perform this on each SNP as a group. For example, the chunk of data below has the same value in the sn_id column (chr1 69511), and I want to apply the CATT function to those 4 observations. There are many different chunks like this in the data, so I would like to run the test on each one of them and get the p-value:

9           1        2   chr1 69511
10          1        2   chr1 69511
11          1        1   chr1 69511
12          0        1   chr1 69511


If there is any way to do this in Python, that would also be great.

R SNP Armitage VCF
Are you doing this on a VCF or the data format in the screenshot?

rpolicastro, on the data frame format.

This can also be done with list columns but I think splitting the data is a little syntactically clearer.

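library("dplyr")
library("CATT")

# Assumes the grouping column is named snp_id; |> and \(x) need R >= 4.1.
results <- data |>
  split(~snp_id) |>
  lapply(\(x) {
    # Run the trend test on the rows belonging to one SNP.
    catt_results <- CATT(x$is_severe, x$encoding)
    data.frame(p.value = catt_results$p.value, z.score = catt_results$statistic)
  }) |>
  bind_rows(.id = "snp_id")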
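Since Python was also asked about, here is a rough sketch of the same split-and-test idea using only the standard library. It is not the CATT package: it relies on the identity that the Cochran-Armitage chi-square equals N times the squared Pearson correlation between the binary outcome and the genotype score, so small numerical differences from CATT are possible. The function names (`catt_z_p`, `catt_by_group`) and the second SNP in the example are made up for illustration, and returning `None` for groups with no variation is my assumption about how degenerate groups should be handled.

```python
import math
from collections import defaultdict


def catt_z_p(y, x):
    """Two-sided Cochran-Armitage trend test using z = sqrt(N) * r.

    y: binary outcomes (0/1), x: genotype scores (e.g. 0/1/2).
    Returns (z, p), or None when the test is undefined (empty group,
    or no variation in y or x).
    """
    n = len(y)
    if n == 0:
        return None
    sx, sy = sum(x), sum(y)
    sxx = sum(v * v for v in x)
    syy = sum(v * v for v in y)
    sxy = sum(a * b for a, b in zip(y, x))
    var_x = n * sxx - sx * sx
    var_y = n * syy - sy * sy
    if var_x == 0 or var_y == 0:
        return None
    z = (n * sxy - sx * sy) * math.sqrt(n) / math.sqrt(var_x * var_y)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p-value
    return z, p


def catt_by_group(rows):
    """rows: iterable of (is_severe, encoding, snp_id) tuples."""
    groups = defaultdict(lambda: ([], []))
    for is_severe, encoding, snp_id in rows:
        groups[snp_id][0].append(is_severe)
        groups[snp_id][1].append(encoding)
    return {snp: catt_z_p(y, x) for snp, (y, x) in groups.items()}


rows = [
    (1, 2, "chr1 69511"),
    (1, 2, "chr1 69511"),
    (1, 1, "chr1 69511"),
    (0, 1, "chr1 69511"),
    # Hypothetical SNP with a single genotype value: no trend can be tested.
    (1, 1, "chr1 100"),
    (0, 1, "chr1 100"),
]
results = catt_by_group(rows)
for snp, res in results.items():
    print(snp, res)
```

With a real table you would build `rows` from the is_severe, encoding and SNP id columns. Groups that come back as `None` are the ones worth inspecting by hand.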

rpolicastro, I get this error: during wrapup: 0 (non-NA) cases. Error: no more error handlers available (recursive errors?); invoking 'abort' restart