gunzip error: trailing garbage ignored
12 months ago
pwjeffries • 0

I cannot extract all of the contents of a gzipped vcf file. The file is part of an encrypted tarball I downloaded from dbGaP. After decryption, I was able to extract a directory of files with this command:

 tar -xvf phg001.tar

When I used Plink to convert one of the extracted vcf files to a bed file, I got an error message: Error: Line 20 of .vcf file has fewer tokens than expected.

I counted the number of lines in the file with the help of zcat.

zcat chr22-filtered.dose.vcf.gz | wc -l

Output:

gzip: chr22-filtered.dose.vcf.gz: decompression OK, trailing garbage ignored
19

And if I try to unzip the file, I get a similar message about trailing garbage.

gzip: test22.vcf.gz: decompression OK, trailing garbage ignored

The file is too large to have only 20 lines, and if I count the number of lines without using zcat, there is indeed more to the file.

wc -l chr22-filtered.dose.vcf.gz
3632730 chr22-filtered.dose.vcf.gz
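
Note that running `wc -l` directly on the `.gz` file counts newline bytes (0x0A) in the compressed binary data, so the 3632730 figure is not a count of VCF lines. The sketch below puts the compressed size next to the number of lines gzip can actually decode, which is a fairer comparison. The helper name `gz_summary` is made up for illustration, and the file name in the usage line is the one from the question.

```shell
# Hypothetical helper: compare the size on disk with what gzip can decode.
gz_summary() {
    f="$1"
    # Size of the compressed file in bytes.
    compressed_bytes=$(wc -c < "$f" | tr -d ' ')
    # Number of text lines gzip manages to decode; warnings such as
    # "trailing garbage ignored" are discarded so only the count remains.
    decoded_lines=$( { gzip -dc "$f" 2>/dev/null || true; } | wc -l | tr -d ' ')
    echo "compressed_bytes=$compressed_bytes decoded_lines=$decoded_lines"
}

# Usage (file name taken from the question):
#   gz_summary chr22-filtered.dose.vcf.gz
```

A large `compressed_bytes` value alongside a tiny `decoded_lines` value suggests that most of the file is not valid gzip data.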

How can I extract all of the contents of the gzipped file?

All advice is appreciated.
Paul

vcf gzip gz plink • 2.1k views

Most likely an error occurred during the download, so the first thing to try is downloading the file again. If a fresh copy shows the same problem, the error was introduced when the file was created or uploaded.

To the best of my knowledge, there is no command that will cleanly decompress a file that has errors in it.
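
Before passing a fresh download to Plink, it may help to test it with gzip's own integrity check. This is a minimal sketch, assuming GNU gzip, whose manual documents exit status 0 for success, 1 for an error and 2 for a warning. The helper name `check_gz` is hypothetical.

```shell
# Hypothetical helper: classify a .gz file by the exit status of `gzip -t`.
# Assumes GNU gzip exit codes: 0 = OK, 1 = error, 2 = warning.
check_gz() {
    status=0
    # -t tests the compressed data without writing any output.
    gzip -t "$1" 2>/dev/null || status=$?
    case "$status" in
        0) echo "ok" ;;
        2) echo "warning" ;;   # e.g. "trailing garbage ignored"
        *) echo "error" ;;
    esac
}

# Usage (file name taken from the question):
#   check_gz chr22-filtered.dose.vcf.gz
```

Anything other than `ok` means the copy on disk is not a clean gzip file, in which case downloading it again or contacting the data provider is probably the next step.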
