Question: Problem with PRSice package using dosage files
m93 wrote (20 months ago):

I am trying to use the PRSice package to calculate a polygenic risk score for my group of samples. My input files are .imputed files (generated with IMPUTE2) that have been modified with bash to hold dosage data.

In other words, instead of looking like this (3 probabilities per SNP per individual: AA, AB, BB):

SNP1 rs1 1000 A C 1 0 0 1 0 0
SNP2 rs2 2000 G T 1 0 0 0 1 0
SNP3 rs3 3000 C T 1 0 0 0 1 0
SNP4 rs4 4000 C T 0 1 0 0 1 0
SNP5 rs5 5000 A G 0 1 0 0 0 1

My .imputed files actually look like this (1 dosage value per SNP per individual):

SNP1 rs1 1000 A C 0 0
SNP2 rs2 2000 G T 0 1 
SNP3 rs3 3000 C T 0 1
SNP4 rs4 4000 C T 1 1
SNP5 rs5 5000 A G 1 2

The dosage is calculated as [0 * p(AA)] + [1 * p(AB)] + [2 * p(BB)]. The actual command in bash:

cat file.imputed | awk '{printf $1"\t"$2"\t"$3"\t"$4"\t"$5; for(i=6; i<NF; i+=3) {if($(i+0)==0 && $(i+1)==0 && $(i+2)==0) printf "\tNA"; else printf "\t"$(i+0)*0+$(i+1)*1+$(i+2)*2}; printf "\n"}' > file_dosages.imputed
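As a sanity check on the conversion, here is a minimal sketch that runs the same dosage formula on the five example SNPs above. The temporary file names and the `to_dosage` helper are made up for this illustration; the logic is meant to mirror the one-liner, assuming three probabilities per individual starting at column 6.

```shell
# Toy IMPUTE2-style input: 5 SNPs, 2 individuals, 3 genotype probabilities each.
tmpdir=$(mktemp -d)
cat > "$tmpdir/toy.imputed" <<'EOF'
SNP1 rs1 1000 A C 1 0 0 1 0 0
SNP2 rs2 2000 G T 1 0 0 0 1 0
SNP3 rs3 3000 C T 1 0 0 0 1 0
SNP4 rs4 4000 C T 0 1 0 0 1 0
SNP5 rs5 5000 A G 0 1 0 0 0 1
EOF

# Hypothetical helper: collapse each probability triple into one dosage,
# 0*p(AA) + 1*p(AB) + 2*p(BB). An all-zero triple is written as NA.
to_dosage() {
  awk '{
    printf "%s\t%s\t%s\t%s\t%s", $1, $2, $3, $4, $5
    for (i = 6; i + 2 <= NF; i += 3) {
      if ($i == 0 && $(i+1) == 0 && $(i+2) == 0) printf "\tNA"
      else printf "\t%s", $(i+1) + 2 * $(i+2)
    }
    printf "\n"
  }' "$1"
}

to_dosage "$tmpdir/toy.imputed" > "$tmpdir/toy_dosages.imputed"
cat "$tmpdir/toy_dosages.imputed"
```

On this toy input the result matches the five dosage rows shown above, with tab-separated columns.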

I also have a .sample file which has phenotypes. The format is:

ID_1 ID_2 missing father mother sex pheno cov1 cov2 cov3 cov4 cv5
0 0 0 D D D B D D D D D
sample1 sample1 0 0 0 0 0 0 2 2 1 3
sample2 sample2 0 0 0 0 1 0 0 1 0 3
sample3 sample3 0 0 0 0 0 0 4 999 0 3
sample4 sample4 0 0 0 0 0 0 4 0 1 2
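Because each individual contributes exactly one column to the dosage file, one quick consistency check is to compare the number of individuals in the .sample file with the number of dosage columns per SNP. The sketch below does this on toy files; it assumes two header lines in the .sample file (as in the format above) and five leading SNP columns in the dosage file, and the file names are placeholders.

```shell
tmpdir=$(mktemp -d)

# Toy .sample file: two header lines, then one line per individual.
cat > "$tmpdir/toy.sample" <<'EOF'
ID_1 ID_2 missing father mother sex pheno
0 0 0 D D D B
sample1 sample1 0 0 0 0 0
sample2 sample2 0 0 0 0 1
EOF

# Toy dosage file: five SNP columns, then one dosage value per individual.
cat > "$tmpdir/toy_dosages.imputed" <<'EOF'
SNP1 rs1 1000 A C 0 0
SNP2 rs2 2000 G T 0 1
EOF

# Individuals = non-empty lines after the two header lines.
n_samples=$(awk 'NR > 2 && NF > 0 { n++ } END { print n + 0 }' "$tmpdir/toy.sample")

# Dosage columns per SNP line; sort -u leaves one value if all lines agree.
n_dosage_cols=$(awk '{ print NF - 5 }' "$tmpdir/toy_dosages.imputed" | sort -u)

if [ "$n_samples" = "$n_dosage_cols" ]; then
  echo "OK: $n_samples samples, $n_dosage_cols dosage columns per SNP"
else
  echo "MISMATCH: $n_samples samples, dosage columns per SNP: $n_dosage_cols"
fi
```

If the SNP lines disagree on the number of columns, `n_dosage_cols` holds several values and the check reports a mismatch.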

My command in PRSice is:

R --file=/Users/bob/Downloads/PRSice_v1.25/PRSice_v1.25.R -q --args \
plink /Users/bob/Downloads/plink-1.07-mac-intel/plink \
base ../../ORs.txt \
target ../../file_dosages.imputed \
dosage T \
dos.format 1 \
dos.impute2 T \
dos.sep.fam ../../Pheno_files_for_PRSice/file.sample \
dos.fam.is.samp T \
slower 0.000000001 \
sinc 0.01 \
supper 0.5 \
covary F \
clump.snps F \
report.individual.scores T

However, I get the following error:

Error in file(file, "rt") : cannot open the connection
Calls: read.table -> file
Execution halted
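The R message only says that some file passed to read.table could not be opened; it does not name the file. One way to rule out a plain path problem is to test each input path from the working directory in which the command is launched. This is a minimal sketch: `check_readable` is a hypothetical helper, not part of PRSice, and it cannot detect an intermediate file that PRSice or PLINK failed to write.

```shell
# Hypothetical helper: report whether each given path is a readable regular file.
# Returns non-zero if at least one path is missing or unreadable.
check_readable() {
  missing=0
  for f in "$@"; do
    if [ -f "$f" ] && [ -r "$f" ]; then
      echo "OK       $f"
    else
      echo "MISSING  $f"
      missing=1
    fi
  done
  return "$missing"
}

# Paths copied from the PRSice command above; run from the same working directory.
check_readable \
  ../../ORs.txt \
  ../../file_dosages.imputed \
  ../../Pheno_files_for_PRSice/file.sample \
  /Users/bob/Downloads/plink-1.07-mac-intel/plink \
  || echo "At least one input path is not readable from $(pwd)"
```

If every path is reported as OK, the file that cannot be opened is more likely an intermediate output than one of the inputs.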

The PROFILES.log file says:

Reading dosage information from [ ../../file_dosages.imputed ]
Format set to three genotype probabilities
Writing results to [ PROFILES.assoc.dosage ]

I have the impression that PRSice does not recognise that my dosage format is one value per SNP per individual, since the log reports "Format set to three genotype probabilities". I don't understand why my command doesn't work. Any help would be much appreciated.
