Question: Population information for Fst in StAMPP
0
gravatar for aesculus
17 months ago by
aesculus10
aesculus10 wrote:

Hi,

I want to calculate P values for pairwise Fst estimates among populations for my RADseq SNP dataset. I have been trying to use the R package StAMPP. The accepted input formats are .stampp, .vcf, .csv and an R genlight object. See: https://cran.r-project.org/web/packages/StAMPP/StAMPP.pdf

The problem I have is that .vcf and genlight objects do not seem to contain a population column for individuals, but this is required by StaMPP to estimate allele frequencies (and then Fst estimates).

Is there a way to include population information in a vcf file or genlight object?

Alternatively, is there an easy way to produce a .stampp file? The format is hard to reproduce manually with a dataset of tens of thousands of SNPs.

Packages

   {r}
    #LOAD LIBRARIES
    library(pinfsc50)
    library(vcfR)
    library(adegenet)
    library(StAMPP)

Method 1

##1.1 Read fstat .dat into R and convert to genind object
genind<-read.fstat("f002-p3-r90-m10.dat", quiet=FALSE)

No genind2genlight command?

Method 2

##2.1 Read vcf file into into R
vcf<-read.vcfR("f002-p3-r60-m10.vcf", verbose = FALSE)
##2.2 Convert vcf into genlight object
x <- vcfR2genlight(vcf)
##2.3 Convert genlight to allele frequencies for StAMPP
data.freq<-stamppConvert(x, type = "genlight")

Error in pop.num[i] <- which(pop.names[i] == pops) : replacement has length zero

Need a population column in the vcf file or genlight object!

stampp snp fst adegenet vcf • 938 views
ADD COMMENTlink modified 16 months ago • written 17 months ago by aesculus10
0
gravatar for aesculus
16 months ago by
aesculus10
aesculus10 wrote:

Luke Pembleton very kindly provided this successful solution, should anyone else need it:

## as per your post 
##2.1 Read vcf file into into R 
vcf<-read.vcfR("g002-p3-r60-m10.vcf", verbose = FALSE) 
##2.2 Convert vcf into genlight object 
x <- vcfR2genlight(vcf) 

### convert genlight object to a matrix in stampp format 
x2 <- as.matrix(x) #convert genlight object to matrix 
sample <- row.names(x2) #sample names 

#pop.names <- pop(x) #extract ploidy info from genlight object (however this is not available when imported via a vcf file) 
#pop.names <- #provide a vector here of the same length of the number of samples, with a corresponding population name/id
#pop.names <- c(")
#e.g. c("popA", "popA", "popB", "popB", "popB", "popC", "popC") 

ploidy <- ploidy(x) #extract ploidy info from genlight object 
x2 = x2 * (1/ploidy) #convert allele counts to frequency 
x2[is.na(x2)] = NaN 
format <- vector(length = length(sample))
#format id for the genotype data
format[1:length(format)] = "freq"  

x.stampp <- as.data.frame(cbind(sample, pop.names, ploidy, format, x2)) #convert to basic r data.frame suitable to stamppConvert 

geno <- stamppConvert(x.stampp, 'r') 

#you should now be able to run all the stampp commands with 'geno' as the input object 
# e.g. fst <- stamppFst(geno)
ADD COMMENTlink written 16 months ago by aesculus10
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 2511 users visited in the last hour