3.6 years ago
dec986
I'm using break_blocks to extract regions of a gVCF file:
zcat NA19238.final_combined.g.vcf.gz | break_blocks --region-file file.bed --exclude-off-target --ref /illumina/runs/con/g1k_v37/human_g1k_v37.fasta > input.g.vcf
This produces a file containing only about 6 of the BED file's 43 sites.
The tab-delimited BED file looks like:
10 96405502 96405502 label1
10 96541616 96541616 label2
10 96540410 96540410 label3
I get similar problems when each end coordinate is one greater than the start:
10 96405502 96405503 label1
10 96541616 96541617 label2
10 96540410 96540411 label3
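For reference, here is a minimal sketch of how the two layouts above differ under the standard BED convention (0-based start, end-exclusive). Whether break_blocks interprets the region file this way is an assumption on my part, and the helper names are hypothetical:

```python
def bed_interval_to_vcf_positions(start, end):
    """Return the 1-based (VCF-style) positions covered by a BED interval.

    Standard BED intervals are 0-based and half-open: [start, end).
    """
    return list(range(start + 1, end + 1))


def vcf_position_to_bed(chrom, pos, label):
    """Build a BED row covering exactly one 1-based VCF position."""
    return (chrom, pos - 1, pos, label)


# First layout (start == end): a zero-length interval covering no position.
empty = bed_interval_to_vcf_positions(96405502, 96405502)

# Second layout (end == start + 1): covers 1-based position start + 1.
single = bed_interval_to_vcf_positions(96405502, 96405503)

# If the site of interest is 1-based position 96405502, the BED row would be:
row = vcf_position_to_bed("10", 96405502, "label1")
```

Under that convention, the first layout selects nothing and the second selects the base after the one listed, so if my coordinates are 1-based positions, neither file targets the sites I intended.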
Almost all of the sites are missing, yet because the gVCF format also records non-variant blocks, nearly all of them should be present. Why are so many sites missing?
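To check which sites the gVCF actually covers, independently of break_blocks, something like the following sketch could be used. It assumes that non-variant blocks carry an `END=` entry in the INFO column (the usual gVCF convention) and that the sites are given as 1-based positions. The function names and the example records are hypothetical:

```python
def record_span(line):
    """Return (chrom, start, end) in 1-based inclusive coordinates for a VCF record."""
    fields = line.rstrip("\n").split("\t")
    chrom, pos, ref, info = fields[0], int(fields[1]), fields[3], fields[7]
    # Default span is the length of the REF allele.
    end = pos + len(ref) - 1
    # A gVCF block overrides this with END=<pos> in the INFO column.
    for entry in info.split(";"):
        if entry.startswith("END="):
            end = int(entry[4:])
    return chrom, pos, end


def covered_sites(sites, vcf_lines):
    """Return the subset of (chrom, pos) sites that fall inside any record."""
    found = set()
    for line in vcf_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header and blank lines
        chrom, start, end = record_span(line)
        for site_chrom, site_pos in sites:
            if site_chrom == chrom and start <= site_pos <= end:
                found.add((site_chrom, site_pos))
    return found


# Hypothetical records for illustration only, not taken from my file.
example_lines = [
    "##fileformat=VCFv4.1",
    "10\t96405000\t.\tA\t.\t.\tPASS\tEND=96406000\tGT\t0/0",
    "10\t96540410\t.\tC\tT\t50\tPASS\t.\tGT\t0/1",
]
example_sites = [("10", 96405502), ("10", 96541616), ("10", 96540410)]
hits = covered_sites(example_sites, example_lines)
```

Running the same check over the real gVCF (for example, line by line from `gzip.open(path, "rt")`) would show whether the sites are absent from the input or are being dropped during extraction.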
Also, when I convert the gVCF to a VCF with GATK's GenotypeGVCFs tool:
gatk --java-options "-Xmx4g" GenotypeGVCFs -R human_g1k_v37.fasta -V input.g.vcf.gz -O output.final_combned.vcf.gz
I get no lines at all in the output. Why is this happening?