Hi all,
I am performing variant detection using SAMtools on 1000 Genomes data and generating a VCF file. The issue is that I have several variants of interest that do not have an rs#, although I do have their chromosomal coordinates and sequences.
I would like to find the reference alleles for these variants in the reference genome. Is there a diploid version of RefSeq that can be used, or is it possible to search dbSNP by chromosomal coordinates?
The reference allele is already included in the VCF file, in the REF column of each record. dbSNP can also be queried by chromosomal position; one approach is to use the UCSC Genome Browser to look at the chromosomal location.
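To illustrate reading the reference allele directly from the VCF, here is a minimal Python sketch. It assumes a standard tab-delimited VCF in which the first five columns are CHROM, POS, ID, REF and ALT, and in which a variant with no rs number has "." in the ID column. The names `Variant`, `iter_variants` and `without_rs_id` are made up for this example, and the records in `example_vcf` are invented, not real calls.

```python
from typing import Iterable, Iterator, List, NamedTuple


class Variant(NamedTuple):
    """The first five fixed columns of a VCF record."""
    chrom: str
    pos: int
    id: str
    ref: str
    alt: str


def iter_variants(lines: Iterable[str]) -> Iterator[Variant]:
    """Yield one Variant per data line, skipping headers and blank lines."""
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # "##" meta lines and the "#CHROM" header line
        chrom, pos, vid, ref, alt = line.rstrip("\n").split("\t")[:5]
        yield Variant(chrom, int(pos), vid, ref, alt)


def without_rs_id(variants: Iterable[Variant]) -> List[Variant]:
    """Keep only variants whose ID column is the missing value '.'."""
    return [v for v in variants if v.id == "."]


# Invented records for illustration only.
example_vcf = [
    "##fileformat=VCFv4.1\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n",
    "1\t1000\trs123\tG\tA\t50\tPASS\t.\n",
    "1\t2000\t.\tC\tG\t45\tPASS\t.\n",
]

novel = without_rs_id(iter_variants(example_vcf))
for v in novel:
    print(f"{v.chrom}:{v.pos} REF={v.ref} ALT={v.alt}")
```

For a real file, the list can be replaced with an open file handle, e.g. `with open("calls.vcf") as fh: novel = without_rs_id(iter_variants(fh))`, where `calls.vcf` is a placeholder file name.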
A more standard approach to dealing with VCF files, though, is to run one of the many available annotation programs to add rich annotation to the VCF file. GATK has a VariantAnnotator module. You could also try ANNOVAR. There are many others.