file formatting for SNPRelate
6.7 years ago
lovestowell ▴ 10

Hello!

I am trying to use the R package SNPRelate to calculate pairwise relatedness between 8 males and 8 females. I have a VCF file with ~13,000 SNPs, and snpgdsVCF2GDS() converts it to a GDS file without complaint. However, when I call functions that act on the GDS file, such as snpgdsLDpruning() or snpgdsIBDMLE(), I get the error "There is no SNP!" because all ~13,000 SNPs were removed as non-autosomal, even though almost all of my SNPs are autosomal. Is my VCF improperly formatted or missing information, and if so, how can I add it back in? Any suggestions? Thanks.

#open vcf

parents.vcf <- "~/Data/parents.vcf"

#convert vcf to gds

snpgdsVCF2GDS(parents.vcf,"parents.gds")

#open gds

parents.gds <- snpgdsOpen("parents.gds")

#calculate relatedness

snpgdsIBDMLE(parents.gds)

>Identity-By-Descent analysis (MLE) on SNP genotypes:
Removing 13354 SNP(s) on non-autosomes
Error in .InitFile2(cmd = "Identity-By-Descent analysis (MLE) on SNP genotypes:",  :
There is no SNP!

SNPRelate relatedness VCF GDS file format • 4.2k views
6.7 years ago
lovestowell ▴ 10

In case anyone else is having the same problem and also didn't read the man pages carefully...

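Both snpgdsVCF2GDS() and snpgdsIBDMLE() take extra arguments that address this (ignore.chr.prefix and autosome.only, respectively). The chromosome names in my VCF carry a "chr_" prefix, so I passed ignore.chr.prefix="chr_" during conversion to have them read as plain chromosome numbers. I wasn't sure whether the sex chromosome in my original VCF file was part of the trouble, so I removed it first and then ran SNPRelate.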

In bash shell:

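# drop every record on the sex chromosome (header lines start with '#' and are kept)
grep -v '^chr_Sex' parents.vcf > parents.autosomal.vcf
# vcftools appends ".recode.vcf" to the --out prefix, giving parents.autosomal.recode.vcf
vcftools --vcf parents.autosomal.vcf --recode --recode-INFO-all --out parents.autosomal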
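
Before converting, it can help to confirm which chromosome names the VCF actually contains, since those names are what the autosome filter acts on. The following is a minimal sketch rather than part of my original workflow: the function name count_chroms is hypothetical, and it assumes a plain-text, tab-delimited (uncompressed) VCF.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of SNPRelate or vcftools):
# print each distinct CHROM value in a VCF followed by its record count.
count_chroms() {
    # Skip header lines (those starting with '#'), tally column 1,
    # then sort so the output order is stable.
    awk -F'\t' '!/^#/ { n[$1]++ } END { for (c in n) print c "\t" n[c] }' "$1" | sort
}

# Example call; the file name is an assumption:
# count_chroms parents.autosomal.vcf
```

If chr_Sex still shows up in the output, the grep filter did not match the name used in the file. The prefix shared by the remaining names is the string to pass as ignore.chr.prefix.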


In R:

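library(SNPRelate)
# path to the recoded VCF with the sex chromosome removed
parents.vcf <- "~/Data/parents.autosomal.recode.vcf"
# strip the "chr_" prefix so chromosome names are read as plain numbers
snpgdsVCF2GDS(parents.vcf, "parents.gds", ignore.chr.prefix = "chr_")
parents.gds <- snpgdsOpen("parents.gds")
IBDMLE <- snpgdsIBDMLE(parents.gds)
snpgdsClose(parents.gds)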