How should the data be formatted to run an association mapping in PLINK?
8.0 years ago
beausoleilmo ▴ 600

I'm trying to run a PLINK command to perform association mapping.

I created the PLINK genotype files from my VCF like this:

vcftools --vcf output.vcf --plink --out ./plink/output_in_plink

Then I tried to add the phenotypes in PLINK with this:

plink --file output_in_plink --pheno pheno.txt

But it does not return anything useful.

My phenotype file looks like this:

FID            IID        sex   tars wingc mass mbl mbw mbd
fortis         JP3162_for 2     21.3 74.2 15.8 10.34 8.76 9.51
fortis         JP3171_for 1     22.92 75.5 27.6 13.32 11.66 14.35
fortis         SH520_for  1     21.86 71.3 23.5 12.6 10.36 12.05
fortis         JP3402_for 1     22.41 68 24.1 12.52 11.1 12.69
fortis         JP3539_for 2     22.36 68 24.3 12.35 10.28 11.43
scandens       JP3565_sca 1     22.39 68 21.8 14.31 8.44 8.81
fortis         JP3574_for 2     23.61 73 29.1 14.69 12.74 14.47
fortis         JP3582_for 1     21.11 65 16.4 10.04 9.55 9.43
scandens       JP3583_sca 1     20.85 67 20.5 15.1 8.04 8.49
scandens       JP3587_sca 2     21.65 61 20.7 14.7 7.9 7.81
magnirostris   JP3607_mag other 22.99 69 23 13.21 11.31 13.2

I've also tried this one:

plink --file output_in_plink --no-fid --no-parents --pheno pheno1.txt --all-pheno --assoc --maf 0.05 --out run1

But this is not working.

Is the problem in my phenotype file?

PLINK association mapping VCF Phenotype GWAS • 2.9k views
8.0 years ago
Sarthok ▴ 70

I have done a similar analysis using PLINK 1.9. Looking at your phenotype file, all values in the sex column have to be numeric. If you are creating the txt file on a Mac, make sure you save it as UTF-8 (no BOM), because text files saved in the classic Mac encoding do not work with PLINK.
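A quick way to find values like the "other" in the sex column is to scan the file with awk. This is only a minimal sketch: `check_numeric` is a hypothetical helper name, the demo file is made up from two rows of the question, and the pattern accepts plain integers and decimals only, so entries such as NA or numbers in scientific notation would also be reported.

```shell
#!/bin/sh
# Hypothetical helper: list non-numeric values in columns 3 onward of a
# whitespace-delimited phenotype file whose first line is a header.
check_numeric() {
    awk 'NR == 1 { for (i = 1; i <= NF; i++) name[i] = $i; next }
         {
           for (i = 3; i <= NF; i++)
             if ($i !~ /^-?[0-9]+(\.[0-9]+)?$/)
               printf "line %d, column %s: %s\n", NR, name[i], $i
         }' "$1"
}

# Demo file modeled on two rows from the question (not the real data set).
demo=$(mktemp)
printf '%s\n' \
  'FID IID sex tars' \
  'fortis JP3162_for 2 21.3' \
  'magnirostris JP3607_mag other 22.99' > "$demo"

check_numeric "$demo"   # prints: line 3, column sex: other
rm -f "$demo"
```

On the real file this would be run as `check_numeric pheno1.txt`; no output means every phenotype value looks numeric.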

If you paste the output message from PLINK, I might be able to make more specific suggestions.

I have found the PLINK 1.9 forum very helpful.


I ran od -c pheno1.txt in the terminal and saw that each line ends with \n only. I'm going to try it later. For the moment, I modified my R script to create my file like this:

write.table(df.pheno, "~/Desktop/pheno1.txt",
            quote = FALSE,
            col.names = TRUE,
            row.names = FALSE,
            eol = "\r\n")  # This is the argument I needed to add to get \r\n line endings

I've also changed the encoding with TextWrangler.
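Both points (the BOM and the line endings) can also be checked from the shell instead of reading raw od -c output. The following is a rough sketch: `inspect_text` is a hypothetical helper, and the demo file is invented. It only reports what the file contains; it makes no claim about which form PLINK needs.

```shell
#!/bin/sh
# Hypothetical helper: report a UTF-8 BOM and CRLF line endings in a text file.
inspect_text() {
    # A UTF-8 BOM is the three bytes EF BB BF at the very start of the file.
    first=$(head -c 3 "$1" | od -An -tx1 | tr -d ' \n')
    if [ "$first" = "efbbbf" ]; then echo "BOM: yes"; else echo "BOM: no"; fi
    # Count lines whose last character before the newline is a carriage return.
    awk '/\r$/ { n++ } END { printf "CRLF lines: %d\n", n }' "$1"
}

# Demo: a two-line file written with a BOM and \r\n endings (made-up content).
demo=$(mktemp)
printf '\357\273\277FID IID\r\nfortis JP3162_for\r\n' > "$demo"
inspect_text "$demo"
rm -f "$demo"
```

For the demo file this reports a BOM and two CRLF lines; on the real file it would be called as `inspect_text pheno1.txt`.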

The PLINK website (http://pngu.mgh.harvard.edu/~purcell/plink/data.shtml) says this:

Sex (1=male; 2=female; other=unknown)

"If an individual's sex is unknown, then any character other than 1 or 2 can be used."

This is the message I get. PLINK returns only a log file and a nosex file, even though I have a sex column and also tried the file saved without a BOM:

 plink --file output_in_plink --pheno pheno1.txt --out run2

@----------------------------------------------------------@
|        PLINK!       |     v1.07      |   10/Aug/2009     |
|----------------------------------------------------------|
|  (C) 2009 Shaun Purcell, GNU General Public License, v2  |
|----------------------------------------------------------|
|  For documentation, citation & bug-report instructions:  |
|        http://pngu.mgh.harvard.edu/purcell/plink/        |
@----------------------------------------------------------@

Skipping web check... [ --noweb ] 
Writing this text to log file [ run2.log ]
Analysis started:

 Thu Dec 29 12:48:31 2016

Options in effect:
    --noweb
    --file output_in_plink
    --pheno pheno1.txt
    --out run2

840907 (of 840907) markers to be included from [ output_in_plink.map ]
Warning, found 96 individuals with ambiguous sex codes
These individuals will be set to missing ( or use --allow-no-sex )
Writing list of these individuals to [ run2.nosex ]
96 individuals read from [ output_in_plink.ped ] 
0 individuals with nonmissing phenotypes
Assuming a disease phenotype (1=unaff, 2=aff, 0=miss)
Missing phenotype value is also -9
0 cases, 0 controls and 96 missing
0 males, 0 females, and 96 of unspecified sex
Reading alternate phenotype from [ pheno1.txt ] 
0 individuals with non-missing alternate phenotype
Assuming a disease phenotype (1=unaff, 2=aff, 0=miss)
Missing phenotype value is also -9
0 cases, 0 controls and 96 missing
Before frequency and genotyping pruning, there are 840907 SNPs
96 founders and 0 non-founders found
Total genotyping rate in remaining individuals is 1
0 SNPs failed missingness test ( GENO > 1 )
0 SNPs failed frequency test ( MAF < 0 )
After frequency and genotyping pruning, there are 840907 SNPs
After filtering, 0 cases, 0 controls and 96 missing
After filtering, 0 males, 0 females, and 96 of unspecified sex

Analysis finished: Thu Dec 29 12:49:38 2016
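
The log line "0 individuals with non-missing alternate phenotype" suggests that PLINK did not match any row of pheno1.txt to an individual in the .ped file. PLINK matches phenotype rows on the FID and IID pair, so one possible cause (not confirmed anywhere in this thread) is that the first two columns of the two files differ. The sketch below counts how many phenotype rows have a matching pair. `count_matching_ids` is a hypothetical helper, and the demo .ped row is an assumption about what the converted file might look like, not a copy of the real one.

```shell
#!/bin/sh
# Hypothetical helper: count phenotype rows whose FID/IID pair also occurs
# in the first two columns of the .ped file.
count_matching_ids() {
    ped=$1; pheno=$2
    awk 'NR == FNR { seen[$1 " " $2] = 1; next }   # first file: remember .ped IDs
         $1 == "FID" && $2 == "IID" { next }       # second file: skip header row
         { total++; if (($1 " " $2) in seen) hit++ }
         END { printf "%d of %d phenotype rows match the .ped IDs\n", hit, total }' \
        "$ped" "$pheno"
}

# Demo with made-up files: the .ped row is ASSUMED to carry the sample name in
# both ID columns, while the phenotype row uses the species name as FID.
ped_demo=$(mktemp); pheno_demo=$(mktemp)
printf 'JP3162_for JP3162_for 0 0 0 0 A A\n' > "$ped_demo"
printf 'FID IID sex tars\nfortis JP3162_for 2 21.3\n' > "$pheno_demo"
count_matching_ids "$ped_demo" "$pheno_demo"
rm -f "$ped_demo" "$pheno_demo"
```

On the real data this would be `count_matching_ids output_in_plink.ped pheno1.txt`. If few or no rows match, comparing the output of `cut -f1,2 output_in_plink.ped | head` with the first two phenotype columns should show how the IDs differ.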