Question: SNPRelate LD Pruning List of Loci
1
gravatar for jessivannatta
3 months ago by
jessivannatta10 wrote:

Hi there,

I'm having trouble finding which SNPs have been pruned after LD pruning. I want to be able to list those loci that are being removed so I can remove them from my dataset for future analyses. I'll be honest that I am very new to this stuff!

Here is my code (forgive me for any errors!):

library(gdsfmt)
library(SNPRelate)

### File Conversion - VCF to GDS ###
vcf.fn <- "H:PhD Research/UNEAK_Bat_West1+3+4/Filtered Data/CORA 50%.vcf" snpgdsVCF2GDS(vcf.fn, "CORA.gds", method="biallelic.only")
snpgdsSummary("CORA.gds") 

setwd("H:/PhD Research/UNEAK_Bat_West1+3+4/SNPRelate/")

### Clean-up Linkage Disequilibrium ###
CORA.gds <- snpgdsOpen("CORA.gds", readonly=FALSE, allow.duplicate=TRUE, allow.fork=FALSE) 
set.seed(1000) 
snpset <- snpgdsLDpruning(CORA.gds, ld.threshold=0.8) 
names(snpset) head(snpset$chr0) 
snp.id <- unlist(snpset) 
read.gdsn(index.gdsn(CORA.gds, "snpset"))

I'm trying to read the GDS field in the last line but can't seem to get an output. Instead I get an error:

Error in inherits(node, "gdsn.class") : No such GDS node "snpset"!

I'm also trying to calculate HWE equilibrium and have come across very little documentation on how to do this, but here is my code:

### Clean-up HWE ###
CORA.gds <- snpgdsOpen("CORA.gds", readonly=FALSE, allow.duplicate=FALSE, allow.fork=FALSE)
sample.id <- read.gdsn(index.gdsn(CORA.gds, "sample.id"))
p <- snpgdsHWE(CORA.gds, sample.id=NULL, snp.id=NULL, with.id=TRUE) 
summary(p) 
plot(-log10((1:length(p))/length(p)), -log10(p[order(p)]), xlab="-log10(expected P)", ylab="-log10(observed P)", main="QQ Plot")

Any advice at all on any of these aspects would be much appreciated!

snprelate snp • 268 views
ADD COMMENTlink modified 3 months ago by 159932162500 • written 3 months ago by jessivannatta10

Hey, thank you for re-posting here. Just for the LD part, I think that you may in fact want to select ld.threshold=0.2 - 0.8 seems too high? This may not change anything, though.

How did you build this tutorial so far? - following this: http://corearray.sourceforge.net/tutorials/SNPRelate/ ?

The packages loaded correctly?

Have you performed any pre-processing on your VCF?; Did both snpgdsVCF2GDS and snpgdsSummary run okay?

ADD REPLYlink written 3 months ago by Kevin Blighe41k

Hi Kevin,

Thanks for your response. Yes, I was following the tutorial you mention to build my code, but it doesn't mention how to visualize the markers once they have been pruned.

All of the packages loaded correctly and all of the functions worked. It tells me I have 2,745 markers selected from my original 3,569, but I'm just not sure how to visualize them.How do I know which markers have been pruned so I can remove them from my dataset and use the cleaned set for future analyses? Is there a way to write the .gds to a .csv or something similar? Also, I did not perform any pre-processing on my VCF. I took it just as it is and converted it to .gds. Let me know your thoughts.

ADD REPLYlink written 3 months ago by jessivannatta10

Why not just get the difference between the list of 3569 markers and that of 2745? I have not used SNPRelate. I use PLINK (recommended by 15993216250) for this type of analysis.

ADD REPLYlink written 3 months ago by Kevin Blighe41k

Yes, I am trying to get the difference between the two lists but I'm not sure how to visualize that. How do I see which markers were removed from the list of 3569 to make it 2745? How can I get an output? If I can't figure out SNPRelate, I am going to give PLINK a try. Thank you for your input!

ADD REPLYlink written 3 months ago by jessivannatta10
0
gravatar for 15993216250
3 months ago by
159932162500 wrote:

How about trying a function of plink called --indep-pairwise to do SNP pruning?

ADD COMMENTlink written 3 months ago by 159932162500

Thank you for your suggestion. I'm not very familiar with plink, but I feel as if it might be too complex for someone who has very little experience with using the command line, like myself. However, I'm very open to giving it a try. Do you have a simple code you would recommend for SNP pruning in plink?

ADD REPLYlink written 3 months ago by jessivannatta10
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1508 users visited in the last hour