Using PGDSpider to generate a STRUCTURE file
4 months ago
jiazhou0116

Hi,

I am trying to generate a STRUCTURE file with PGDSpider for a diploid organism. I want the layout where each individual is stored in one row and each locus occupies two consecutive columns, rather than the layout where each individual is stored as two consecutive rows with each locus in one column (both are described as options for the STRUCTURE format: https://web.stanford.edu/group/pritchardlab/software/structure22/readme.pdf). In the spid file, I set: # What is the ploidy of the data? VCF_PARSER_PLOIDY_QUESTION=DIPLOID_ONE_ROW (http://www.cmpg.unibe.ch/software/PGDSpider/PGDSpider%20manual_vers%202-1-1-5.pdf). However, the output still stores each individual as two consecutive rows rather than in one row. Do you have any suggestions?

Thanks,

Jia

4 months ago
jiazhou0116

Hi,

To answer the question I asked earlier: as far as I understand, PGDSpider cannot directly generate a STRUCTURE file with each locus in two consecutive columns in one row. Instead, I used the following R script to convert the two-row output:

# remove population column

all_loci_2=all_loci[,-2]

# split the odd and even entries of the rows

s2=split(all_loci_2, 1:2)

# rename column names

colnames(s2$'1') <- paste(colnames(s2$'1'),"a", sep = "")
colnames(s2$'2') <- paste(colnames(s2$'2'),"b", sep = "")

# Combine two data frames of the same size one column after each other

all_loci3 <- cbind(s2$'1', s2$'2')[order(c(seq_along(s2$'1'), seq_along(s2$'2')))]
# drop column 2, the second copy of the first column (e.g. the individual ID)
all_loci4=all_loci3[,-2]
write.table(all_loci4, file = "SNP_structure_edited.txt")
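
For anyone who wants to check the logic on a small example first, here is a minimal self-contained sketch of the same two-row to one-row conversion. The toy data and the column names (ind, pop, loc1, loc2) are made up for illustration, and it assumes every individual occupies exactly two consecutive rows. Like the script above, it drops the population column.

```r
# Hypothetical two-row table: two individuals, two loci.
all_loci <- data.frame(
  ind  = c("ind1", "ind1", "ind2", "ind2"),
  pop  = c(1, 1, 2, 2),
  loc1 = c(1, 2, 2, 2),
  loc2 = c(3, 3, 4, 3),
  stringsAsFactors = FALSE
)

# Assumption: each individual is stored as exactly two consecutive rows.
stopifnot(nrow(all_loci) %% 2 == 0)
odd  <- all_loci[seq(1, nrow(all_loci), by = 2), ]  # first allele rows
even <- all_loci[seq(2, nrow(all_loci), by = 2), ]  # second allele rows
stopifnot(identical(odd$ind, even$ind))             # rows must pair up

# Locus columns are everything except the ID and population columns.
loci <- setdiff(names(all_loci), c("ind", "pop"))

# Put first and second alleles side by side, then interleave the columns
# so the order becomes loc1a, loc1b, loc2a, loc2b, ...
alleles <- cbind(
  setNames(odd[loci],  paste0(loci, "a")),
  setNames(even[loci], paste0(loci, "b"))
)
interleaved <- alleles[order(rep(seq_along(loci), 2))]

# One row per individual: ID followed by two columns per locus.
one_row <- cbind(ind = odd$ind, interleaved)
rownames(one_row) <- NULL
print(one_row)
```

When writing the result out, passing quote = FALSE and row.names = FALSE to write.table keeps quotation marks and row numbers out of the file. Whether the header line should be kept depends on how STRUCTURE is configured to read the input, so that part is left to the reader.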

Thanks,

Jia