How to create the proper genotype text file format for use in Coancestry from a VCF file
2.1 years ago
Karen • 0

Hello, I am trying to create the proper genotype data format for use with Coancestry in R to estimate a relatedness matrix, but I cannot get the format right. I have a VCF file, have extracted the genotypes, and have tried to reformat them to meet the program's requirements. Any help would be greatly appreciated!

Thanks in advance. Karen

Here is my code thus far:
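library(vcfR)  # provides read.vcfR() and extract.gt()

captiveRSvcf_file <- "vcf.vcf"
vcf_data <- read.vcfR(captiveRSvcf_file)

# Genotype calls as a character matrix: rows = loci, columns = samples
genotype_matrix <- extract.gt(vcf_data)
# Transpose so that rows = individuals, columns = loci
genotype_matrix.t <- t(genotype_matrix)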

I am not sure how to split the genotype in each column into two allele columns per locus to get the appropriate format. Here are the requirements for Coancestry...
The file containing the genotype data to be analyzed. The file will need to be in R's working directory, and have the following characteristics: (1) It should be a text file (not an Excel file); (2) It should be space- or tab-delimited; (3) Missing data must be represented as zeros (0); and (4) There should not be a header row containing column names. Column 1 should contain individual identifiers, columns 2 and 3 should contain alleles 1 and 2 for locus 1, columns 4 and 5 should contain alleles 1 and 2 for locus 2, and so on. Thus, the total number of columns should be 2 x the number of loci + 1.
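
One possible way to do the reshaping described in these requirements is sketched below. It is only a sketch under some assumptions: the input is a character matrix shaped like the output of extract.gt() (rows = loci, columns = samples, diploid calls such as "0/1" or "1|1"), and the helper name gt_to_coancestry and the output file name are hypothetical. Because 0 is reserved for missing data in the required format, while VCF uses 0 for the reference allele, the sketch shifts every allele index up by one (REF becomes 1, the first ALT becomes 2, and so on) and treats any incomplete or haploid call as missing.

```r
# Hypothetical helper (not part of vcfR or Coancestry): turn an
# extract.gt()-style matrix into one row per individual with two
# allele columns per locus, using 0 for missing data.
gt_to_coancestry <- function(gt) {
  gt_t <- t(gt)  # rows = individuals, columns = loci
  allele_cols <- lapply(seq_len(ncol(gt_t)), function(j) {
    calls <- gt_t[, j]
    calls[is.na(calls)] <- "./."           # NA calls become explicit missing
    parts <- strsplit(calls, "[/|]")       # split unphased "/" or phased "|"
    t(vapply(parts, function(p) {
      a <- suppressWarnings(as.integer(p[1:2]))
      # Assumption: any incomplete call is coded as missing (0 0);
      # otherwise shift VCF indices by 1 so that 0 is free for "missing".
      if (anyNA(a)) c(0L, 0L) else a + 1L
    }, integer(2)))
  })
  data.frame(id = rownames(gt_t), do.call(cbind, allele_cols),
             stringsAsFactors = FALSE)
}

# Toy input standing in for extract.gt(vcf_data)
gt <- matrix(c("0/1", "1/1", "./.",
               "0|0", NA,    "0/1"),
             nrow = 2, byrow = TRUE,
             dimnames = list(c("locus1", "locus2"),
                             c("ind1", "ind2", "ind3")))
out <- gt_to_coancestry(gt)

# Tab-delimited text file, no header row, no row names, no quotes
write.table(out, "coancestry_genotypes.txt", sep = "\t",
            quote = FALSE, row.names = FALSE, col.names = FALSE)
```

With real data, gt would be replaced by genotype_matrix from the code above. The result has 2 x the number of loci + 1 columns, which matches the column count stated in the requirements.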

coancestry R • 541 views