Problem creating STRUCTURE inputs with PLINK
29 days ago

I am running into formatting issues while preparing my data for STRUCTURE analysis. I am creating the STRUCTURE input files with PLINK, but the resulting .strct_in file appears to contain the sample names, which triggers an error in STRUCTURE. Has anyone had this problem before?

    # Base names for outputs
    PLINK_BASE="all.work10.pruned"
    STRUCTURE_BASE="all_for_structure"

    # Step 1: Convert to PLINK format
    plink --vcf "${PLINK_BASE}.vcf.gz" \
          --make-bed --double-id --allow-extra-chr \
          --out "$PLINK_BASE"

    # Step 2: Convert to STRUCTURE format
    plink --bfile "$PLINK_BASE" \
          --recode structure --allow-extra-chr \
          --out "$STRUCTURE_BASE"
vcf population plink structure genetics • 14k views

Please provide the error message.

Can you provide the error message and say which plink version you are using?

For what it's worth, I've encountered many challenges in getting plink to convert to a format STRUCTURE will accept ... I end up using PGDSpider2 sometimes, especially if I just have a single vcf to convert.
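If the sample-name column really is what STRUCTURE is rejecting, one possible workaround is to strip it before running STRUCTURE. The sketch below assumes the layout I understand PLINK 1.9 to write for `--recode structure`: two header lines (marker names, then map distances) followed by one row per sample whose first field is the sample label. The function name `strip_sample_labels` and the output file name are made up for illustration. It may be simpler to leave the file alone and set `LABEL 1` (plus the matching `POPDATA`, `MARKERNAMES` and `MAPDISTANCES` settings) in STRUCTURE's mainparams, so check the actual error first.

```shell
# Hypothetical helper (not part of PLINK or STRUCTURE): drop the first
# whitespace-separated field (assumed to be the sample label) from every
# genotype row, leaving the first two header lines unchanged.
# Assumption: the input has exactly two header lines.
strip_sample_labels() {
    awk 'NR <= 2 { print; next }               # pass header lines through as-is
         { $1 = ""; sub(/^ +/, ""); print }    # blank the label, trim the gap it leaves
        ' "$1"
}

# Example call (file names are assumptions based on the --out prefix above):
# strip_sample_labels all_for_structure.recode.strct_in > all_for_structure.nolabels.txt
```

Note that awk rejoins the remaining fields with single spaces, which should be fine for a whitespace-delimited file. If you strip the labels, the label setting in mainparams has to say the file has none.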
