Question: Single rare variant approaches
gravatar for Tash
9 months ago by
Tash0 wrote:

Hi all,

I am new to this forum - forgive me if this is a very basic question.

I am currently analysing a WES dataset, and have identified 4 rare variants of interest in the same gene (a total of 8 cases share mutations in this gene in a sample of 50 cases). Two of the variants have a MAF < 1%, while the other two are around 1.2%.

I have read that it may be appropriate to perform a burden association test in PLINK /SEQ for the rarer variants in this gene to obtain a score of 'mutational load' (I guess to increase power), but I am hoping to get some general guidance in what would be the best approach to analyse single rare variants?


next-gen • 348 views
ADD COMMENTlink modified 9 months ago by Nandini720 • written 9 months ago by Tash0
gravatar for Nandini
9 months ago by
Nandini720 wrote:

You can try Variant tools that includes SKAT as mentioned by @WouterDeCoster . These can take vcf files as input . There are tutorials on the website which can help find your way with these tests. otherwise if your comfortable to try R, have a look at SKAT and R

ADD COMMENTlink modified 9 months ago • written 9 months ago by Nandini720

Oh yeah, that might be easier than going in R yourself and less painful than what I described to do manually.
But I like my analysis authentic, artisanal, GMO and gluten-free ;-)

ADD REPLYlink written 9 months ago by WouterDeCoster35k
gravatar for WouterDeCoster
9 months ago by
WouterDeCoster35k wrote:

SKAT and SKAT-O (in R) are two commonly used tools for rare variant association testing, using also more sophisticated models than simple burden test.

ADD COMMENTlink written 9 months ago by WouterDeCoster35k

Appreciate your reply.

Do you know of any good example data I could download to get a feel for how to perform SKAT in R? To me it is not completely straightforward what the input file should entail (again... newbie), but it might help to see something in front of me.

ADD REPLYlink written 9 months ago by Tash0

I usually use plink to generate a .raw file:

plink --vcf yourvariants.vcf --out projectname
plink --hwe 0.001 --bfile projectname --max-maf 0.01 --recode A --out projectname

Make sure to use the right .fam file, fill in conditions (phenotype)

And then use the following Rscript, saved as e.g. SKAT.R:

Rscript SKAT.R projectname.raw



cat("WARNING: this script doesn't perform a sanity check on your data and therefore it's your own responsibility to provide correct input data.\n")
args = unlist(strsplit(commandArgs(trailingOnly = TRUE)," "))
basename = strsplit(basename(args), "\\.")[[1]][1]
dat <- read.table(args[1], header=TRUE, stringsAsFactors=FALSE) 
phen = ifelse(dat$PHENOTYPE == 2, 0, 1) #Convert affected = 2 to affected = 0
geno = as.matrix(dat[,-c(1:6)])
obj <- SKAT_Null_Model(phen~1, out_type="D")
skatT <- SKAT(geno, obj, method="optimal.adj")

warnings = names(as.vector(warnings()))
wardf <- data.frame(V1=paste("warning", 1:length(warnings), ':', sep=""), V2=warnings)
output = rbind(
        V1=c("individuals", "controls", "cases", paste("pval for rho=", skatT$param$rho, ":", sep=""), "test_p:"),
        V2=c(length(phen), table(phen)[[2]], table(phen)[[1]], skatT$param$p.val.each, skatT$p.value)),
write.table(output, paste(basename, "_SKAT_details.txt", sep=""), col.names=F, quote=F, row.names=F, sep="\t")
ADD REPLYlink modified 9 months ago • written 9 months ago by WouterDeCoster35k
Please log in to add an answer.


Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1267 users visited in the last hour