Hello,
I want to detect local adaptation between different populations using pooled sequencing data.
Luckily, there is a reliable R package for this: pcadapt (https://cran.r-project.org/web/packages/pcadapt/index.html)
The struggle is converting my VCF files to "a frequency matrix with n rows and L columns (where n is the number of populations and L is the number of genetic markers)" https://bcm-uga.github.io/pcadapt/articles/pcadapt.html#a--reading-genotype-data, so going from a VCF file to .bed (PLINK binary biallelic genotype table) https://www.cog-genomics.org/plink2/formats#bed, with one row per population and the allele frequencies in the columns.
I can just calculate the allele frequency and format the table into the required format using Excel... But there must be a better way to automate the process.
If anyone knows an R plug-in or another way to do this conversion, I would greatly appreciate it.
Have you tried, eh, plink? https://www.cog-genomics.org/plink/1.9/data
Yes I have, thank you
I was hoping for an R based way
With PLINK it's kind of annoying, because it assumes that the pools I have in my VCF file are individuals. I'm sure there is a solution, but I just used the vcfR package in R.
Thanks for the input.
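
For anyone landing here with the same problem: the conversion itself is small enough to script. Below is a minimal sketch in Python (not the vcfR code mentioned above, which was not posted). It assumes each pool is a sample column in the VCF, that the FORMAT column carries an AD field with "ref,alt" read counts, and that the ALT allele frequency is the one wanted. Multiallelic sites and sites where any pool has missing or zero depth are skipped, which is a simplification. The function names pool_frequencies and write_pool_matrix are made up for this example.

```python
import io


def pool_frequencies(vcf_lines):
    """Return (pools, sites, matrix), where matrix[i][j] is the ALT allele
    frequency of pool i at site j, computed from the AD read counts."""
    pools, sites, columns = [], [], []
    for line in vcf_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue  # skip blank lines and meta-information lines
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            pools = fields[9:]  # sample columns = pools (assumption)
            continue
        chrom, pos, alt, fmt = fields[0], fields[1], fields[4], fields[8].split(":")
        if "," in alt or "AD" not in fmt:
            continue  # keep only biallelic sites that have an AD field
        ad_index = fmt.index("AD")
        freqs = []
        for sample in fields[9:]:
            parts = sample.split(":")
            counts = parts[ad_index].split(",") if ad_index < len(parts) else []
            if len(counts) != 2 or not all(c.isdigit() for c in counts):
                break  # missing AD in one pool: drop the whole site
            ref_count, alt_count = int(counts[0]), int(counts[1])
            if ref_count + alt_count == 0:
                break  # no reads in this pool: drop the whole site
            freqs.append(alt_count / (ref_count + alt_count))
        else:
            # reached only when no pool triggered a break
            sites.append(f"{chrom}:{pos}")
            columns.append(freqs)
    # transpose: one row per pool (n), one column per marker (L)
    matrix = [[col[i] for col in columns] for i in range(len(pools))]
    return pools, sites, matrix


def write_pool_matrix(matrix, path):
    """Write the n x L matrix as whitespace-separated text, one pool per row."""
    with open(path, "w") as out:
        for row in matrix:
            out.write(" ".join(f"{x:.6f}" for x in row) + "\n")


# Tiny made-up VCF: one usable site, one multiallelic site, one with a missing pool.
example = """##fileformat=VCFv4.2
#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tpoolA\tpoolB
chr1\t100\t.\tA\tG\t.\tPASS\t.\tGT:AD\t0/1:30,10\t0/1:5,15
chr1\t200\t.\tC\tT,G\t.\tPASS\t.\tGT:AD\t0/1:9,1,0\t0/1:8,2,0
chr1\t300\t.\tG\tA\t.\tPASS\t.\tGT:AD\t0/0:20,0\t./.:.
"""
pools, sites, matrix = pool_frequencies(io.StringIO(example))
```

The text file written by write_pool_matrix has the n rows by L columns layout quoted in the question. As far as the pcadapt vignette describes it, such a file is then loaded in R with read.pcadapt(path, type = "pool"); check the vignette for how missing values should be handled before relying on the site-skipping used here.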