Different SNP at a position in a VCF file compared to the REF and ALT alleles
19 months ago
Abbas.M • 0

Hello experts, I have been working with high-throughput genome sequencing data for more than 2 years, and I am confused about the REF and ALT alleles in a VCF file. How are the REF and ALT alleles determined in a VCF file? I searched a lot on the internet but did not find any relevant information about it.

I did a GWAS and identified a SNP associated with my phenotype of interest. When I annotated that SNP with a SNP annotation tool, it showed a G>A change at a specific position, which is associated with the trait I studied. To further confirm the results, I separated the GWAS population into two groups on the basis of the G and A alleles, but the VCF shows a C in one group and a T in the other, instead of G and A. My guess is that the SNP lies on the opposite strand of the DNA, as C is complementary to G, and T is complementary to A. Please give your valuable comments. Below are the columns of the VCF file with the REF and ALT alleles, along with the annotation information.

#CHROM  POS ID  REF ALT QUAL    FILTER          
9   689616  rs1023  C   T   .   .   upstream_gene_variant   c.-1510G>A n.689616C>T
calling sequencing Genome variant • 1.3k views
19 months ago

How REF and ALT alleles are determined in a vcf file?

REF is the base in the reference genome at this position.

ALT lists all the other (non-reference) alleles observed at this POSition.
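To make the REF/ALT columns concrete, here is a minimal sketch that pulls them out of the record posted in the question. The function name `parse_vcf_record` is hypothetical and only for illustration; real VCF lines are tab-delimited, but the pasted line uses spaces, so this sketch assumes splitting on any whitespace is acceptable. Multiple ALT alleles, if present, are comma-separated in the ALT column.

```python
def parse_vcf_record(line):
    """Split one VCF data line and return its first five fixed columns.

    Hypothetical helper for illustration; splits on any whitespace,
    which assumes none of the first five fields contains spaces.
    """
    fields = line.split()
    chrom, pos, variant_id, ref, alt = fields[:5]
    return {
        "chrom": chrom,
        "pos": int(pos),
        "id": variant_id,
        "ref": ref,                # base(s) in the reference genome
        "alts": alt.split(","),    # one or more non-reference alleles
    }


record = parse_vcf_record("9   689616  rs1023  C   T   .   .")
print(record["ref"], record["alts"])
```

For anything beyond a quick check, a dedicated VCF parser is likely a better choice than manual splitting.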


Thank you very much for your response. I know that REF is the base in the reference genome and ALT is the variant allele at that position in my population. But in my data the REF and ALT bases are different from the alleles at that position in the population. It is a G>A change in the population at that position, instead of C or T. Here are some nucleotides around that position in the reference genome, with the G nucleotide in bold text, which is changed to A in the GWAS population: GTCAAGTAGTTCGGTGAAGGGGGAT. But when I looked into the VCF file, it shows REF allele C and ALT allele T. Should it not be G as the REF allele and A as the ALT allele? Why is there a C in REF and a T in ALT? I have checked other positions as well, and found that most of the REF and ALT alleles are determined like this.

18 months ago
Abbas.M • 0

I have found the answer, in case anyone faces such a problem in the future. The gene is on the negative strand of the reference genome, so the bases reported at this position in the VCF file differ from those reported for the gene in the GWAS population, but they are complementary to each other. The VCF file gives C and T (reference genome strand), whereas the gene-level annotation gives G and A. Here C is complementary to G, and T is complementary to A. If I am not wrong in my assessment.
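The strand relationship described above can be checked with a short sketch. VCF alleles are conventionally reported on the forward (plus) strand of the reference, while coding-level annotations such as c.-1510G>A are written on the gene's own strand. The helper names `reverse_complement` and `alleles_on_gene_strand` are hypothetical, and the sketch assumes the gene strand is already known (for example from the annotation file) and that alleles contain only A, C, G, T.

```python
# Translation table mapping each base to its complement.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def alleles_on_gene_strand(ref, alt, gene_strand):
    """Convert forward-strand REF/ALT to the strand of the gene.

    gene_strand is "+" or "-"; for "-" both alleles are
    reverse-complemented, otherwise they are returned unchanged.
    """
    if gene_strand == "-":
        return reverse_complement(ref), reverse_complement(alt)
    return ref, alt


# REF=C, ALT=T from the VCF record, gene on the negative strand.
print(alleles_on_gene_strand("C", "T", "-"))  # ('G', 'A')
```

For single-base alleles the reverse complement is just the complement; reversing only matters for multi-base alleles such as indels.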
