How to create bed file from list of variants with indels and bp values
5 weeks ago
MAPK2 ▴ 40

Hi All,

I have a table with a list of variants and need to create a BED file for them. Some are indels, and some are even listed with +/- bp offsets, as in the table below. How do I create a BED file for such variants?

bed • 315 views
A BED file is a whitespace-delimited file with three required fields: CHR, START and END. You have two fields from which the three can be derived. What is your problem? I don't see any +, only - symbols signifying an insertion or deletion, like the pre-4.2 VCF convention.

Should I just subtract or add bases to get Pos_End? Right, they have not given + or -. I want to search for these variants in my VCF file.

Look up some of these on dbSNP or gnomAD to understand how base positioning works - you'll need some math to calculate the End Pos, and some more math to account for the - Indels.

For example, 2:47803501delC (C>-) is in reality 2:47803500AC>A (AC>A). (https://gnomad.broadinstitute.org/variant/2-47803500-AC-A?dataset=gnomad_r3)

You can of course skip this step if standardization is not something you need. You could do just the End Pos calculation using Start Pos and the number of REF bases (I'll leave calculating the exact formula to you).
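As a rough sketch of that End Pos calculation, assuming the table is tab-separated with columns CHROM, POS (1-based), REF and ALT, and that "-" marks an empty allele: the BED start is POS - 1 and the BED end is the start plus the number of REF bases. The function name `table_to_bed` and the column order are hypothetical, and reporting an insertion (REF "-") as the single base at POS is only one possible convention.

```shell
# Hypothetical helper: turn CHROM, POS, REF, ALT rows (tab-separated, 1-based POS)
# into 0-based, half-open BED intervals. Column order is an assumption.
table_to_bed() {
  awk 'BEGIN { FS = OFS = "\t" }
       {
         start  = $2 - 1                          # BED starts are 0-based
         reflen = ($3 == "-") ? 0 : length($3)    # "-" means no REF bases
         # Assumption: an insertion (REF "-") is reported as the 1 bp at POS
         end = (reflen == 0) ? start + 1 : start + reflen
         print $1, start, end, $3 ">" $4
       }'
}

# The two forms of the deletion discussed above
printf '2\t47803501\tC\t-\n2\t47803500\tAC\tA\n' | table_to_bed
```

With this input, the first row should come out as the interval 47803500-47803501 and the second as 47803499-47803501, so the left-aligned form also covers the anchor base.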

5 weeks ago

Hope this is useful, if you have the original VCF file:

vcf2bed --snvs < variants.vcf > snvs.bed
vcf2bed --insertions < variants.vcf > insertions.bed
vcf2bed --deletions < variants.vcf > deletions.bed

5 weeks ago
heskett ▴ 90

I just do awk '{print $1,$2-1,$2,$3,$4, etc}', although this will only give you the starting position of each variant if it's not an SNV. If you have a VCF, you can use GATK VariantsToTable.
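If the input is a VCF, a variation on that awk command can cover the whole REF allele instead of just the first base. This is a minimal sketch, assuming the standard VCF column order (CHROM, POS, ID, REF, ALT); the function name `vcf_to_bed` is made up for illustration.

```shell
# Hypothetical helper: read a VCF on stdin and print BED intervals spanning REF.
vcf_to_bed() {
  awk 'BEGIN { FS = OFS = "\t" }
       /^#/ { next }                              # skip VCF header lines
       {
         start = $2 - 1                           # convert 1-based POS to 0-based start
         print $1, start, start + length($4), $3, $4, $5
       }'
}

# Usage: vcf_to_bed < variants.vcf > variants.bed
```

For SNVs this gives the same interval as the one-liner above; for deletions the end extends by the length of REF.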