Question: Plink unable to read plink.dosage.gz file
4 months ago, paulwjeffries wrote:

I received imputed dosage files from the Michigan Imputation Server (minimac3). The files are vcf format and compressed (gz). I used DosageConvertor to convert the files to plink dosage files; these files are also compressed.

When I try to use a compressed plink dosage file (for example, fileName.plink.dosage.gz) in a linear regression using plink 1.07, plink returns the error "ERROR: Bad format fdr (sic) dosage file, expecting more columns". However, if I uncompress the file and run the same analysis, plink produces no error and completes the analysis. I did include the argument --Zin with the compressed file and omitted the argument when using the uncompressed file.

I used wc to count the number of lines in the compressed file, and the line count was what it should be. I also counted the number of “words”, and that count was correct as well. Since each line should have the same number of words, the number of columns should equal the number of words divided by the number of lines. But since plink is not counting the correct number of columns, does this mean there is a delimiter missing, or a delimiter where it should not be?
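A total word count divided by the line count only gives the average number of columns, so a line that is one field short could be masked by another that is one field long. A per-line check is stricter. The following is a minimal sketch, assuming the dosage file is gzip-compressed, whitespace-delimited text; the function names are hypothetical and are not part of plink or DosageConvertor.

```python
import gzip
from collections import Counter


def field_count_profile(path):
    """Return a Counter mapping number-of-fields -> number of lines with that many fields."""
    counts = Counter()
    with gzip.open(path, "rt") as handle:
        for line in handle:
            # split() with no arguments splits on any run of spaces or tabs
            counts[len(line.split())] += 1
    return counts


def first_irregular_lines(path, expected, limit=5):
    """Return up to `limit` (line_number, n_fields) pairs whose field count differs from `expected`."""
    bad = []
    with gzip.open(path, "rt") as handle:
        for number, line in enumerate(handle, start=1):
            n_fields = len(line.split())
            if n_fields != expected:
                bad.append((number, n_fields))
                if len(bad) >= limit:
                    break
    return bad
```

If field_count_profile("fileName.plink.dosage.gz") returns a single key, every line has the same number of columns and the file itself is probably not the problem. If it returns more than one key, first_irregular_lines shows where to look.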

I believe plink files are whitespace-delimited (space or tab). Nonetheless, I used sed to change each tab to a single space and to collapse consecutive spaces into a single space. But plink is still unable to read the compressed file.
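For anyone repeating the delimiter clean-up, it can be done without leaving an uncompressed copy on disk. This is only a sketch of the same normalization in Python, assuming gzip-compressed text input; the function name is hypothetical, and as noted above the normalization did not resolve the plink 1.07 error in this case.

```python
import gzip


def normalize_whitespace_gz(src_path, dst_path):
    """Copy a gzipped text file, joining the fields of each line with single spaces."""
    with gzip.open(src_path, "rt") as src, gzip.open(dst_path, "wt") as dst:
        for line in src:
            # split() drops leading/trailing whitespace and treats any run of
            # spaces or tabs as one delimiter
            fields = line.split()
            dst.write(" ".join(fields) + "\n")
```

The output path should differ from the input path, since the input is still being read while the output is written.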

Of course, I can run the analysis with uncompressed files, but it would be nice to keep the files compressed. Can anyone suggest what the problem might be?

All advice is appreciated, Paul

snp plink software error

Hi Paul, did you manage to solve the problem of using compressed dosage files? I am at the same stage right now, having received my dosage files from the Michigan Imputation Server. I have used DosageConvertor to convert the files to plink dosage format and now have a set of compressed plink.dosage files. I need to perform some QC filtering on these in terms of MAF and HWE, and was wondering whether it is better to uncompress the files before performing these steps.

written 10 days ago by dnyanada.gokhale
4 months ago, chrchang523 (United States) wrote:

The problem is that you are still using plink 1.07 for dosage analysis. plink 2.0 can read VCF dosages directly, and supports the full range of linear/logistic regression options on dosage data rather than the limited set offered by plink 1.x --dosage.
