Question: Plink unable to read plink.dosage.gz file
22 months ago, paulwjeffries wrote:

I received imputed dosage files from the Michigan Imputation Server (minimac3). The files are in VCF format and gzip-compressed (.gz). I used DosageConvertor to convert the files to plink dosage files; these files are also compressed.

When I try to use a compressed plink dosage file (for example, fileName.plink.dosage.gz) in a linear regression using plink 1.07, plink returns the error "ERROR: Bad format fdr (sic) dosage file, expecting more columns". However, if I uncompress the file and run the same analysis, plink produces no error and completes the analysis. I did include the argument --Zin with the compressed file and omitted the argument when using the uncompressed file.

I used wc to count the number of lines in the compressed file, and the returned line count was what it should be. I also counted the number of “words”, and the returned count was correct. The number of columns should equal the number of words divided by the number of lines, since each line should have the same number of words. But since plink is not counting the correct number of columns, does this mean there is a delimiter missing, or a delimiter where there should not be one?
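A total word count divided by the line count only gives the average number of columns, so one short line and one long line could cancel out. A per-line check is stricter. Below is a minimal sketch of such a check; the file name and its contents are hypothetical stand-ins for fileName.plink.dosage.gz, and the last line is deliberately one field short.

```shell
# Hypothetical stand-in for fileName.plink.dosage.gz: a tiny 3-line file
# whose last line is deliberately missing one field.
tmpdir=$(mktemp -d)
dosage_gz="$tmpdir/example.plink.dosage.gz"
printf 'SNP A1 A2 ID1 ID2\nrs1 A G 0.1 1.9\nrs2 C T 0.5\n' | gzip -c > "$dosage_gz"

# Tally how many lines have each field count. awk splits on runs of
# spaces and tabs by default, so mixed delimiters do not matter here.
field_counts=$(gzip -dc "$dosage_gz" \
  | awk '{ n[NF]++ } END { for (k in n) print k, n[k] }' \
  | sort -n)
echo "$field_counts"

# Report the line numbers whose field count differs from the header's.
bad_lines=$(gzip -dc "$dosage_gz" \
  | awk 'NR == 1 { want = NF; next } NF != want { print NR }')
echo "lines with unexpected field count: $bad_lines"
```

If the tally shows a single field count and no line numbers are reported, the columns are consistent and the delimiters are unlikely to be the cause.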

I believe plink files are white space (space or tab) delimited. Nonetheless, I used sed to change each tab to a single white space and consecutive white spaces to a single white space. But plink is still unable to read the compressed file.
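For reference, the whitespace normalization described above can be done in one pass without writing an uncompressed copy to disk. This is only a sketch with hypothetical file names and made-up contents; as noted in the post, normalizing the delimiters did not make plink 1.07 accept the compressed file, so it serves to rule out delimiter problems rather than to fix the error.

```shell
# Hypothetical input: a small gzip-compressed file mixing tabs,
# repeated spaces, and a trailing space.
tmpdir=$(mktemp -d)
raw_gz="$tmpdir/raw.plink.dosage.gz"
clean_gz="$tmpdir/clean.plink.dosage.gz"
printf 'SNP\tA1  A2\t\tID1 \nrs1 A\tG   0.1\n' | gzip -c > "$raw_gz"

# Collapse every run of spaces/tabs to one space, then trim the line ends,
# and recompress the result.
gzip -dc "$raw_gz" \
  | sed -e 's/[[:blank:]][[:blank:]]*/ /g' -e 's/^ //' -e 's/ $//' \
  | gzip -c > "$clean_gz"

cleaned=$(gzip -dc "$clean_gz")
printf '%s\n' "$cleaned"
```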

Of course, I can run the analysis with uncompressed files, but it would be nice to keep the files compressed. Can anyone suggest what the problem might be?

All advice is appreciated, Paul

snp plink software error • 1.1k views
modified 22 months ago by chrchang523 • written 22 months ago by paulwjeffries

Hi Paul, I was wondering whether you managed to solve the problem of using compressed dosage files? I am at the same stage right now, having received my dosage files from the Michigan Imputation Server. I have used DosageConvertor to convert the files to plink dosage format and now have a set of compressed plink.dosage files. I need to perform some QC filtering on these in terms of MAF and HWE, and was wondering whether it is better to uncompress the files before performing these steps.

written 18 months ago by dnyanada.gokhale10
22 months ago, chrchang523 (United States) wrote:

The problem is that you are still using plink 1.07 for dosage analysis. plink 2.0 can read VCF dosages directly, and supports the full range of linear/logistic regression options on dosage data rather than the limited set offered by plink 1.x --dosage.

written 22 months ago by chrchang523

Hi,

Do you mean we can use plink to run logistic regression on dose.vcf.gz directly? Could you share the plink command to run this? I have tried many different commands, but none has worked so far. Thank you for your help.

written 6 months ago by gyang09200

This depends on how your dosages are encoded in the VCF, but something like

plink2 --vcf [VCF path] dosage=DS --pheno [phenotype file] --pheno-name [phenotype name] --glm

should work.

modified 6 months ago • written 6 months ago by chrchang523
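Since the right dosage= value depends on how the dosages are encoded, it can help to look at the FORMAT declarations in the VCF header before running plink2. The sketch below does that with standard tools. The file name and the header lines are hypothetical stand-ins and not output from any particular imputation run.

```shell
# Hypothetical stand-in for an imputed dose.vcf.gz: header lines only.
tmpdir=$(mktemp -d)
vcf_gz="$tmpdir/example.dose.vcf.gz"
{
  printf '%s\n' \
    '##fileformat=VCFv4.1' \
    '##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">' \
    '##FORMAT=<ID=DS,Number=1,Type=Float,Description="Estimated alternate allele dosage">' \
    '##FORMAT=<ID=GP,Number=3,Type=Float,Description="Estimated genotype posterior probabilities">'
  printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\n'
} | gzip -c > "$vcf_gz"

# Print the FORMAT IDs declared in the header; stop at the #CHROM line so
# the genotype records of a large file are not scanned.
format_ids=$(gzip -dc "$vcf_gz" \
  | sed -n -e '/^#CHROM/q' -e 's/^##FORMAT=<ID=\([^,]*\),.*/\1/p' \
  | tr '\n' ' ')
echo "FORMAT fields: $format_ids"

# dosage=DS only makes sense when a DS field is declared.
case " $format_ids " in
  *" DS "*) has_ds=yes ;;
  *)        has_ds=no ;;
esac
echo "DS field present: $has_ds"
```

If DS is absent, the dosage= value in the plink2 command would need to name whichever field actually carries the dosages.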

Thank you for your help. So, if I want to add some covariates, could I add --covar ...? Or do I need a different command?

written 6 months ago by gyang09200

Yes, use --covar for that.

written 6 months ago by chrchang523
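Putting the two replies together, a run with covariates might look like the sketch below. Every file name, column name, and output prefix here is a hypothetical placeholder, and it assumes the VCF stores dosages in a DS field. The command is assembled and printed first, and only executed if plink2 and the input files are actually present.

```shell
# All names below are hypothetical placeholders; substitute your own.
vcf="chr1.dose.vcf.gz"   # imputed VCF, assumed to carry a DS FORMAT field
pheno="pheno.txt"        # phenotype file
pheno_name="PHENO1"      # phenotype column to analyze
covar="covar.txt"        # covariate file

# Assemble the command as positional parameters so it can be inspected
# before anything is run. --covar-name selects a subset of covariate columns.
set -- plink2 \
    --vcf "$vcf" dosage=DS \
    --pheno "$pheno" --pheno-name "$pheno_name" \
    --covar "$covar" --covar-name AGE SEX PC1 PC2 \
    --glm \
    --out chr1_glm

cmd_preview="$*"
echo "$cmd_preview"

# Execute only when plink2 is installed and the inputs exist.
if command -v plink2 >/dev/null 2>&1 \
    && [ -f "$vcf" ] && [ -f "$pheno" ] && [ -f "$covar" ]; then
    "$@"
fi
```

Leaving out --covar-name would use every column in the covariate file.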