Splitting a VCF into 10 Mb window subfiles for a given chromosome
21 months ago

Hello everyone, is there an efficient way to split a given VCF file (let's say for chromosome 1) into several sub-VCF files, each covering a 10 Mb window of that same chromosome? Many thanks in advance!

vcf bedtools bcftools • 568 views
21 months ago

I wrote http://lindenb.github.io/jvarkit/VcfToIntervals.html

(not tested)

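# in.vcf.gz must be bgzipped and indexed (bcftools index) for the region queries below
bcftools view in.vcf.gz |
    java -jar dist/vcf2intervals.jar --bed --distance "10mb" --min-distance 0 |
    awk '{printf("%s:%d-%s\n",$1,int($2)+1,$3);}' |
    while read -r R
do
    # ':' and '-' in the region become '_' in the output file name
    bcftools view -O z -o "${R//[:-]/_}.out.vcf.gz" "in.vcf.gz" "${R}"
done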
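
If you would rather not depend on jvarkit, the same idea can be sketched with plain bash arithmetic plus bcftools. This is an untested sketch, not a drop-in replacement: the function names make_windows and split_vcf are made up for illustration, the chromosome length has to be supplied by you (for example from the ##contig header line or the reference .fai), and it produces fixed 10 Mb windows rather than the intervals vcf2intervals would compute. It also assumes in.vcf.gz is bgzipped and indexed.

```shell
#!/usr/bin/env bash

# Print 1-based, inclusive regions "chrom:start-end" covering 1..length
# in fixed-size windows; the last window is truncated at the chromosome end.
make_windows() {
    local chrom=$1 length=$2 size=$3 start end
    for (( start = 1; start <= length; start += size )); do
        end=$(( start + size - 1 ))
        if (( end > length )); then
            end=$length
        fi
        printf '%s:%d-%d\n' "$chrom" "$start" "$end"
    done
}

# Write one bgzipped VCF per window (hypothetical helper).
# Arguments: indexed VCF, chromosome name, chromosome length, window size (default 10 Mb).
split_vcf() {
    local vcf=$1 chrom=$2 length=$3 size=${4:-10000000} region
    make_windows "$chrom" "$length" "$size" | while read -r region; do
        # ':' and '-' in the region become '_' in the output file name
        bcftools view -O z -o "${region//[:-]/_}.out.vcf.gz" "$vcf" "$region"
    done
}

# Example call (248956422 is the GRCh38 chr1 length; use your own reference's value):
# split_vcf in.vcf.gz chr1 248956422
```

As far as I know, bcftools returns every record that overlaps the requested region, so a long variant spanning a window boundary may end up in two output files.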

Thank you!
