Question: Break_blocks conversion of gvcf -> vcf
0
gravatar for dec986
4 months ago by
dec986210
United States
dec986210 wrote:

I have a list of VCFs that I'm converting from gVCF to VCF thus:

zcat HG00096.final_combined.g.vcf.gz | gvcftools-0.17.0/bin/break_blocks --region-file gsa_positions.chromEnd_plus1.bed --ref g1k_v37/human_g1k_v37.fasta --exclude-off-target > vcf/HG00096.bb.vcf 2> break_blocks_err/HG00096.bb.err

the bed file looks like

1   58813   58814   GSA-rs114420996 .   G   A   PASS    .   GT:GQ   ./.:0.0
1   565507  565508  GSA-rs9283150   .   G   A   PASS    .   GT:GQ   ./.:0.0
1   567091  567092  GSA-rs9326622   .   T   C   PASS    .   GT:GQ   ./.:0.0
1   726911  726912  GSA-1:726912    .   A   G   PASS    .   GT:GQ   0/0:0.27129138
1   727840  727841  GSA-rs116587930 .   G   A   PASS    .   GT:GQ   0/0:0.6141555

but the problem is that the output files are missing alternate alleles, e.g.

#CHROM  POS ID  REF ALT QUAL    FILTER  INFO    FORMAT  HG00126
1   58814   .   G   <NON_REF>   .   .   .   GT:DP:GQ:MIN_DP:PL  0/0:31:84
:29:0,84,1192
1   565508  .   G   <NON_REF>   .   .   .   GT:DP:GQ:MIN_DP:PL  0/0:27:50
:27:0,50,857
1   567092  .   T   <NON_REF>   .   .   .   GT:DP:GQ:MIN_DP:PL  0/0:16:45
:16:0,45,675

I've also tried piping to extract_variants from Converting Gvcf Files Into Vcf but this isn't working either.

how can I get the output to not have the NON_REF so that it looks like a correct VCF?

vcf • 153 views
ADD COMMENTlink modified 4 months ago by ATpoint34k • written 4 months ago by dec986210
0
gravatar for ATpoint
4 months ago by
ATpoint34k
Germany
ATpoint34k wrote:

If the bed and the vcf have identical sort order simply take the ALT column from the BED to replace that column:

paste <(cut -f1-5 that.vcf) <(cut -f 7 that.bed | cat <(echo "ALT") /dev/stdin) <(cut -f7-) > fixed.vcf
ADD COMMENTlink written 4 months ago by ATpoint34k

they don't have identical sort order :(

ADD REPLYlink written 4 months ago by dec986210

then sort both files...

ADD REPLYlink written 4 months ago by ATpoint34k
1

the files are of different lengths. I just ended up using a Perl hash, which was much easier.

ADD REPLYlink written 4 months ago by dec986210
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1253 users visited in the last hour