I wrote this tool to make it easy to get variant genotypes from different populations, using the data from the 1000 Genomes Project. You just provide one or more BED files and you get back a VCF.
I hope it's useful!
--
bed_to_tabix
bed_to_tabix downloads a gzipped VCF file with the genotypes of the 2,504 samples from the 1000 Genomes Project at the regions defined in one or more BED files. The utility handles the BED sorting, the merging of multiple BEDs, and the parallel download of each chromosome's variants with tabix (you can use HTTP URLs in case your FTP traffic is blocked), and it merges the resulting VCFs into a single gzipped VCF. Afterwards, it cleans up the temporary files, so you end up with a single results file.
bed_to_tabix is written in Python, but it can be used as a command line tool without any knowledge of the language.
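For anyone curious about what the sorting and merging of BED files involves, here is a minimal Python sketch of the idea. It is not taken from the bed_to_tabix source; the function names merge_bed_intervals and to_tabix_region are made up for illustration, and I'm assuming plain three-column BED lines. It groups intervals by chromosome, merges the ones that overlap or touch, and turns each merged interval into a tabix-style region string.

```python
from collections import defaultdict


def merge_bed_intervals(bed_lines):
    """Sort BED intervals and merge overlapping or touching ones per chromosome.

    Hypothetical helper for illustration only, not the tool's own code.
    """
    by_chrom = defaultdict(list)
    for line in bed_lines:
        line = line.strip()
        # Skip blank lines, comments and UCSC-style header lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end = line.split()[:3]
        by_chrom[chrom].append((int(start), int(end)))

    merged = []
    # Chromosome names are sorted as strings here ("1", "10", "2", ...).
    for chrom in sorted(by_chrom):
        intervals = sorted(by_chrom[chrom])
        cur_start, cur_end = intervals[0]
        for start, end in intervals[1:]:
            if start <= cur_end:
                # Overlapping or touching: extend the current interval.
                cur_end = max(cur_end, end)
            else:
                merged.append((chrom, cur_start, cur_end))
                cur_start, cur_end = start, end
        merged.append((chrom, cur_start, cur_end))
    return merged


def to_tabix_region(chrom, start, end):
    """Convert a 0-based, half-open BED interval to a 1-based, inclusive region."""
    return f"{chrom}:{start + 1}-{end}"


if __name__ == "__main__":
    example = ["1\t100\t200", "1\t150\t300", "2\t5\t10"]
    for chrom, start, end in merge_bed_intervals(example):
        print(to_tabix_region(chrom, start, end))
```

Merging first means each region is requested only once even if several BED files cover it.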
Installation instructions here: https://github.com/biocodices/bed_to_tabix
Example usage:
# Download the regions in regions1.bed to regions1.vcf.gz
bed_to_tabix --in regions1.bed
# Download the regions in regions1.bed, 10 downloads at a time, to 1kg.vcf
bed_to_tabix --in regions1.bed --threads 10 --unzipped --out 1kg
# Download the regions in both bed files to regions1__regions2.vcf.gz
bed_to_tabix --in regions1.bed --in regions2.bed
# Download from the HTTP URLs in case your traffic to FTP is blocked
bed_to_tabix --in regions1.bed --http
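Once a download finishes, a quick sanity check on the result can be done with nothing but the Python standard library. The sketch below assumes the output is a regular gzip-readable VCF such as regions1.vcf.gz from the first example; summarize_vcf is a hypothetical helper name, not something bed_to_tabix provides. It counts the sample columns in the header and the variant records in the body.

```python
import gzip


def summarize_vcf(path):
    """Return (number of samples, number of variant records) for a gzipped VCF.

    Hypothetical helper for illustration only.
    """
    n_samples = 0
    n_variants = 0
    with gzip.open(path, "rt") as handle:
        for line in handle:
            if line.startswith("##"):
                # Meta-information lines.
                continue
            if line.startswith("#CHROM"):
                # Nine fixed columns (CHROM..FORMAT) precede the sample columns.
                n_columns = len(line.rstrip("\n").split("\t"))
                n_samples = max(n_columns - 9, 0)
                continue
            if line.strip():
                n_variants += 1
    return n_samples, n_variants


if __name__ == "__main__":
    # Assumes the first example above was run and produced this file.
    samples, variants = summarize_vcf("regions1.vcf.gz")
    print(f"{samples} samples, {variants} variants")
```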