Question: 1000 Genomes - Sample Subset Of A Vcf File
8.5 years ago, pablo.riesgo wrote:

Hi there,

I'm trying to retrieve genotype data for a given sample from the 1000 genomes FTP repository, following their guidelines:

So I tried to execute something like: tabix -h 1 | perl /nfs/1000g-work/G1K/work/bin/vcftools/perl/vcf-subset -c NA12890

This command fails in vcf-subset with an error like: Wrong number of fields; expected 1101, got 559. The offending line was: [...]

At first sight you might think it has to do with the remote VCF file, but the line where it fails is different on every execution, and the offending lines look as if they have been cut short. Has anybody else had the same problem?

It seems to me that tabix may be cutting the lines when the network is not working very well, but that is just a guess. Any other ideas?
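One way to tell truncated lines apart from a malformed source file is to count the tab-separated fields of every record and compare against the #CHROM header line. The following is only a minimal sketch in Python: the function name `find_truncated_records` is made up for illustration and is not part of tabix or vcftools.

```python
def find_truncated_records(lines):
    """Return (line_number, field_count) for every record whose number of
    tab-separated fields differs from that of the #CHROM header line."""
    expected = None
    bad = []
    for number, line in enumerate(lines, start=1):
        line = line.rstrip("\r\n")
        if not line or line.startswith("##"):
            continue  # blank and meta-information lines have no fixed column count
        n_fields = len(line.split("\t"))
        if line.startswith("#CHROM"):
            expected = n_fields  # the header defines the expected width
        elif expected is not None and n_fields != expected:
            bad.append((number, n_fields))
    return bad
```

Calling `find_truncated_records(sys.stdin)` at the end of the tabix pipe on several runs would show whether the short lines move around between executions (which points at the transfer) or always sit at the same position (which points at the file).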

Regards, Pablo.

PS: I'm going to try downloading these huge files, though I would rather not have to.

8.5 years ago, Leonor Palmeira (Liège, Belgium) wrote:

I am almost certain that an unreliable internet connection is what gives you this random behaviour. In your case, I would never try this over the network: on files this large it is bound to fail at some point because of ordinary network instability.

So yes, downloading this file is probably the solution.
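Once the file is on local disk, the sample column can also be extracted without any network access. The sketch below is a plain-Python stand-in, not vcf-subset itself: `subset_sample` is a hypothetical helper that keeps the nine fixed VCF columns plus one sample column, copies INFO unchanged, and raises an error as soon as it meets a record too short to contain that sample.

```python
def subset_sample(lines, sample):
    """Yield VCF lines reduced to the 9 fixed columns plus one sample column."""
    column = None
    for line in lines:
        line = line.rstrip("\r\n")
        if not line:
            continue  # ignore blank lines
        if line.startswith("##"):
            yield line  # meta-information lines pass through untouched
            continue
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            if sample not in fields[9:]:
                raise ValueError(f"sample {sample!r} not found in header")
            column = fields.index(sample, 9)  # sample columns start at index 9
        elif column is None:
            raise ValueError("data line seen before the #CHROM header")
        elif len(fields) <= column:
            raise ValueError(f"truncated record with {len(fields)} fields")
        yield "\t".join(fields[:9] + [fields[column]])
```

A bgzip-compressed VCF can be read with the standard library, for example `gzip.open("local.vcf.gz", "rt")`, where `local.vcf.gz` is a placeholder for whatever name the downloaded file has.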

8.5 years ago, Michael Dondrup (Bergen, Norway) wrote:

It might be due to tabix, or it might not. Either way, you seem to be restricting the query to chromosome 1 via tabix, but the file already contains only chr1 data. You could therefore simply download the whole file with wget and see what happens.
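After a download of this size it may be worth confirming that the compressed file arrived complete before parsing it, since an interrupted transfer can leave a file that ends mid-stream. A rough sketch, assuming the file is gzip or bgzip compressed; `gzip_is_intact` is an illustrative name, and the check only catches streams that end before their end-of-stream marker or that fail their checksum, not a file cut exactly between two compressed blocks.

```python
import gzip
import zlib


def gzip_is_intact(path, chunk_size=1 << 20):
    """Return True if the gzip file at `path` decompresses to the end
    without error, False otherwise (including unreadable files)."""
    try:
        with gzip.open(path, "rb") as handle:
            # Read and discard the data; only errors matter here.
            while handle.read(chunk_size):
                pass
    except (EOFError, OSError, zlib.error):
        # EOFError: stream ended early; OSError: bad header, CRC mismatch
        # or missing file; zlib.error: corrupt compressed data.
        return False
    return True
```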


Yes, the restriction to chromosome 1 is redundant as the file only contains chromosome 1.


Thanks for the quick answers!

Reply written 8.5 years ago by pablo.riesgo
6.5 years ago, siyu wrote:

I got the same error even after downloading the remote VCF file. Did you find a solution?
