Question: 1000 Genomes - Sample Subset Of A Vcf File
pablo.riesgo (5.2 years ago) wrote:

Hi there,

I'm trying to retrieve genotype data for a given sample from the 1000 genomes FTP repository, following their guidelines: http://www.1000genomes.org/faq/how-do-i-get-sub-section-vcf-file

So I tried to execute something like: tabix -h ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20110521/ALL.chr1.phase1_release_v3.20101123.snps_indels_svs.genotypes.vcf.gz 1 | perl /nfs/1000g-work/G1K/work/bin/vcftools/perl/vcf-subset -c NA12890

This command gives me an error from vcf-subset like: Wrong number of fields; expected 1101, got 559. The offending line was: [...]

At first sight you might think the remote VCF file itself is at fault, but the failing line is different on every execution, and it looks as if the lines are being cut short. Has anybody had the same problem?

It seems to me that tabix may be truncating lines when the network connection is unreliable, but that is just a guess... Any other ideas?

Regards, Pablo.

PS: I'm going to try downloading these huge files, although I did not want to do this...
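
One way to confirm that lines are arriving truncated, independently of vcf-subset, is to count the tab-separated fields on every data line and compare against the #CHROM header line. The following is a minimal sketch in Python; the function name `find_malformed_lines` is hypothetical and is not part of tabix or vcftools.

```python
def find_malformed_lines(lines):
    """Yield (line_number, expected, got) for every data line whose number of
    tab-separated fields differs from the #CHROM header line."""
    expected = None
    for number, line in enumerate(lines, start=1):
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue  # blank lines and meta-information lines have no columns
        n_fields = len(line.split("\t"))
        if line.startswith("#CHROM"):
            expected = n_fields  # the header defines the expected column count
        elif expected is not None and n_fields != expected:
            yield number, expected, n_fields
```

It accepts any iterable of text lines, so it could be fed `sys.stdin` at the end of the tabix pipe, or a locally opened file. If the reported line numbers change from run to run on the same input, that would point towards the transfer rather than the file content.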

Tags: vcf, 1000genomes
Leonor Palmeira (Liège, Belgium; 5.2 years ago) wrote:

I strongly believe it is an internet connection problem that gives you this random behaviour. In your case, I would never try this over the network, as it is bound to fail at some point on such large files because of ordinary network instability.

So yes, downloading this file is probably the solution.
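
After downloading, it may be worth checking that the local copy is complete before running anything on it, since an interrupted download can produce the same kind of truncated line. Below is a minimal sketch in Python that simply decompresses the whole file and reports whether that succeeds; the function name `gzip_is_complete` is hypothetical. Note the assumption: a bgzip-compressed VCF is a series of concatenated gzip blocks, so a file cut exactly at a block boundary would not be caught by this check.

```python
import gzip
import zlib


def gzip_is_complete(path, chunk_size=1 << 20):
    """Return True if the gzip file at `path` decompresses to the end
    without error, False if it is truncated or corrupt."""
    try:
        with gzip.open(path, "rb") as handle:
            # Read and discard the data in chunks to keep memory use low.
            while handle.read(chunk_size):
                pass
    except (EOFError, gzip.BadGzipFile, zlib.error):
        return False
    return True
```

For example, `gzip_is_complete("ALL.chr1.phase1_release_v3.20101123.snps_indels_svs.genotypes.vcf.gz")` could be run once after the download finishes. Comparing the local file size with the size listed on the FTP server is a complementary check.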

Michael Dondrup (Bergen, Norway; 5.2 years ago) wrote:

It might be due to tabix, or not, but you seem to be restricting to chromosome 1 via tabix, right? The file already contains only chr1 data, so you could simply download the whole file with wget and see what happens.
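
Once the file is local, the column for a single sample can also be pulled out without vcf-subset as a cross-check. The sketch below, in Python, keeps the nine fixed VCF columns plus one sample column. It is only an illustration under stated assumptions: the function name `subset_sample` is hypothetical, and unlike vcf-subset it makes no attempt to update INFO fields or to drop sites where the sample carries no variant.

```python
def subset_sample(lines, sample):
    """Yield VCF lines reduced to the 9 fixed columns plus the column of
    `sample`. Meta-information lines (##...) are passed through unchanged."""
    column = None
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        if line.startswith("##"):
            yield line
            continue
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            if sample not in fields[9:]:
                raise ValueError(f"sample {sample!r} not found in header")
            column = 9 + fields[9:].index(sample)
        elif column is None:
            raise ValueError("data line seen before the #CHROM header")
        # A truncated data line raises IndexError here, which flags the problem.
        yield "\t".join(fields[:9] + [fields[column]])
```

With the downloaded file this might be used as `subset_sample(gzip.open(path, "rt"), "NA12890")`, writing each yielded line to an output file.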

pablo.riesgo (5.2 years ago) replied:

Yes, the restriction to chromosome 1 is redundant as the file only contains chromosome 1.

Downloading...

Thanks for the quick answers!

siyu (China; 3.2 years ago) wrote:

I got the same error even after downloading the remote VCF file. Did you find the answer?
