
Question

Using PGDSpider to generate a STRUCTURE file


3.3 years ago

jiazhou0116 ▴ 10

Hi,

I am trying to use PGDSpider to generate a STRUCTURE file for a diploid organism in the one-row-per-individual layout, where each locus occupies two consecutive columns in one row, rather than the layout where each individual is stored as two consecutive rows with one column per locus (both are described as options for the STRUCTURE format: https://web.stanford.edu/group/pritchardlab/software/structure22/readme.pdf). In the spid file, I defined:

# What is the ploidy of the data?
VCF_PARSER_PLOIDY_QUESTION=DIPLOID_ONE_ROW

(see http://www.cmpg.unibe.ch/software/PGDSpider/PGDSpider%20manual_vers%202-1-1-5.pdf). However, the output still stores each individual as two consecutive rows rather than one row. Do you have any suggestions?

Thanks,

Jia

SNP • 1.2k views


score 1 · Answer 1 · 2021-01-04

Hi,

To answer the question I asked a bit earlier: as far as I understand, PGDSpider cannot directly generate a STRUCTURE file with each locus in two consecutive columns in one row. Instead, I used the following R script to convert the two-row output:

# Before reading the table, the separators in the txt file need to be unified:
# the original structure file generated by PGDSpider is separated by both "\t" and " ".
# (sep = "" makes read.table split on any run of whitespace.)
all_loci = read.table("SNP_structure2.txt", header = TRUE, sep = "")

# Remove the population column
all_loci_2 = all_loci[, -2]

# Split the odd and even rows (first and second allele copy of each individual)
s2 = split(all_loci_2, 1:2)

# Rename the columns so the two copies can be told apart
colnames(s2$'1') <- paste(colnames(s2$'1'), "a", sep = "")
colnames(s2$'2') <- paste(colnames(s2$'2'), "b", sep = "")

# Combine the two data frames of the same size, interleaving their columns
all_loci3 <- cbind(s2$'1', s2$'2')[order(c(seq_along(s2$'1'), seq_along(s2$'2')))]

# Drop the duplicated ID column that came from the second data frame
all_loci4 = all_loci3[, -2]

write.table(all_loci4, file = "SNP_structure_edited.txt")
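To check the reshaping without the real data, here is a minimal sketch of the same idea on a toy table. The column names (`ID`, `pop`, `loc1`, `loc2`) and all genotype values are made up for illustration and are not taken from my actual file; the sketch also assumes the two rows of each individual are adjacent and in the same order.

```r
# Toy two-row-per-individual table (hypothetical names and values).
two_row <- data.frame(
  ID   = c("ind1", "ind1", "ind2", "ind2"),
  pop  = c(1, 1, 2, 2),
  loc1 = c(1, 2, 2, 2),
  loc2 = c(3, 3, 4, -9),
  stringsAsFactors = FALSE
)

# Drop the population column, as in the script above.
geno <- two_row[, -2]

# Odd rows hold the first allele copy, even rows the second.
first  <- geno[seq(1, nrow(geno), by = 2), ]
second <- geno[seq(2, nrow(geno), by = 2), ]

# Sanity check: both halves must list the same individuals in the same order.
stopifnot(identical(first$ID, second$ID))

# Suffix the locus columns so the two copies stay distinguishable.
loci <- setdiff(names(geno), "ID")
a <- first[, loci, drop = FALSE]
b <- second[, loci, drop = FALSE]
names(a) <- paste0(loci, "a")
names(b) <- paste0(loci, "b")

# Interleave the columns: loc1a, loc1b, loc2a, loc2b, ...
interleaved <- cbind(a, b)[, order(c(seq_along(loci), seq_along(loci)))]

# Put the single ID column back in front.
one_row <- data.frame(ID = first$ID, interleaved, stringsAsFactors = FALSE)
rownames(one_row) <- NULL
print(one_row)
```

Handling the ID column separately avoids having to delete a duplicated ID column afterwards. If the downstream program expects plain whitespace-separated text, `write.table()` may also need `quote = FALSE` and `row.names = FALSE`, since by default it quotes strings and writes row names.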

Thanks,

Jia