Tool: bed_to_tabix: Download the variants from 1,000 Genomes in the regions defined in one or more BED files.
Juan Manuel Berros (Buenos Aires, Argentina) wrote 3.5 years ago:

I wrote this tool to easily get variant genotypes from different populations, using the data from The 1,000 Genomes Project. You just provide one or more BED files and you get a VCF.

I hope it's useful!



bed_to_tabix downloads a gzipped VCF file with the genotypes of the 2,504 samples from The 1,000 Genomes Project at the regions defined in one or more BED files. It handles the BED sorting, the merging of multiple BED files, and the parallel download of each chromosome's variants with tabix (you can use HTTP URLs if your FTP traffic is blocked), and then merges the resulting VCFs into a single gzipped VCF. Afterwards it cleans up the temporary files, so you end up with a single results file.
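
To give an idea of what the sorting and merging step involves, here is a minimal Python sketch. It is not the tool's actual code: the function names `read_bed` and `merge_intervals` are hypothetical, and it assumes plain BED lines whose first three columns are chromosome, start and end.

```python
from collections import defaultdict


def read_bed(lines):
    """Yield (chrom, start, end) from BED lines, skipping blanks and header lines."""
    for line in lines:
        line = line.strip()
        if not line or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end = line.split()[:3]
        yield chrom, int(start), int(end)


def merge_intervals(intervals):
    """Sort intervals and merge the ones that overlap or touch, per chromosome."""
    by_chrom = defaultdict(list)
    for chrom, start, end in intervals:
        by_chrom[chrom].append((start, end))

    merged = []
    # Chromosome names are sorted as strings here, so "10" comes before "2".
    for chrom in sorted(by_chrom):
        current_start, current_end = None, None
        for start, end in sorted(by_chrom[chrom]):
            if current_end is not None and start <= current_end:
                # Overlapping or adjacent: extend the interval being built.
                current_end = max(current_end, end)
            else:
                if current_end is not None:
                    merged.append((chrom, current_start, current_end))
                current_start, current_end = start, end
        merged.append((chrom, current_start, current_end))
    return merged
```

Intervals read from several BED files can simply be concatenated before calling `merge_intervals`, which is roughly what passing `--in` more than once asks for.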

bed_to_tabix is written in Python, but it can be used as a command line tool without any knowledge of the language.

Installation instructions here:

Example usage:

# Download the regions in regions1.bed to regions1.vcf.gz
bed_to_tabix --in regions1.bed

# Download the regions in regions1.bed, 10 downloads at a time, to 1kg.vcf
bed_to_tabix --in regions1.bed --threads 10 --unzipped --out 1kg

# Download the regions in both bed files to regions1__regions2.vcf.gz
bed_to_tabix --in regions1.bed --in regions2.bed

# Download from the HTTP URLs in case your traffic to FTP is blocked
bed_to_tabix --in regions1.bed --http
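
For the curious, the per-chromosome parallel download behind `--threads` can be sketched roughly as follows. This is an illustration under assumptions, not the tool's implementation: `URL_TEMPLATE` is a placeholder rather than the real 1,000 Genomes address, and `tabix_command` and `run_all` are hypothetical helper names. The one detail worth noting is that BED intervals are 0-based and half-open while tabix regions are 1-based and inclusive, so the start coordinate is shifted by one.

```python
from concurrent.futures import ThreadPoolExecutor

# Placeholder URL (an assumption for illustration), NOT the real 1,000 Genomes location.
URL_TEMPLATE = "ftp://example.org/1kg/chr{chrom}.vcf.gz"


def tabix_command(chrom, intervals, url_template=URL_TEMPLATE):
    """Build the argv list of a `tabix -h <url> <region>...` call for one chromosome."""
    # BED is 0-based, half-open; tabix regions are 1-based, inclusive.
    regions = [f"{chrom}:{start + 1}-{end}" for start, end in intervals]
    return ["tabix", "-h", url_template.format(chrom=chrom)] + regions


def run_all(commands, runner, threads=4):
    """Apply `runner` to every command, at most `threads` at a time, keeping input order."""
    with ThreadPoolExecutor(max_workers=threads) as pool:
        return list(pool.map(runner, commands))
```

In real use, `runner` would be a function that executes the command (for example with `subprocess.run`) and writes its output to a temporary VCF, one per chromosome, to be merged afterwards.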