Question: Interpreting allele values in VCF file
11 weeks ago, kanat.yermekbayev wrote:

Hi All,

Could someone please help me interpret the allele values in the VCF file below?

# [1]CHROM  [2]POS  [3]REF  [4]ALT  [5]ALT  [6]QUAL [7]DP   [8]RO   [9]AO   [10]Par-1_DHT02696-8_L6:GT  [11]Par-1_DHT02696-8_L6:DP  [12]Par-1_DHT02696-8_L6:RO  [13]Par-1_DHT02696-8_L6:AO  [14]Par-1_DHT02696-8_L6:AO  [15]Par-2_DHT02696-9_L6:GT  [16]Par-2_DHT02696-9_L6:DP  [17]Par-2_DHT02696-9_L6:RO  [18]Par-2_DHT02696-9_L6:AO  [19]Par-2_DHT02696-9_L6:AO

1.Chr1A 61556   G   A   .   26.7544 4   2   2   0/1 4   2   2   .   .   .   .   .   .
2.Chr1A 95880   C   T   .   57.0319 2   0   2   1/1 2   0   2   .   .   .   .   .   .
3.Chr1A 1156169 G   T   .   1.59189e-14 90  88  2   0/0 35  33  2   .   0/0 55  55  0   .
4.Chr1A 1159646 G   A   .   0.0185916   162 149 13  0/0 67  67  0   .   0/1 95  82  13  .
5.Chr1A 1940398 TG  CG  CA  306.879 27  12  8   1/2 8   0   1   7   0/1 19  12  7   0

Here, I ran freebayes on two BAM files (one from Par-1 and one from Par-2) against the reference. My initial aim was to get a VCF file with three columns for REF, Par-1 and Par-2, but as you can see the third column is mostly empty (why?). Anyway, I tried to understand it, and I wonder if someone could help me with the allele values 0/0, 0/1, 1/1 and 1/2.

Does line-1 say that REF is "G" and Par-1 (ALT-1) is "A"? If so, what about Par-2, which is represented by many dots? Why is the allele value in line-2 1/1? Does line-3 say that REF is "G" and Par-1 (ALT-1) is "T"? And what about Par-2? The same questions apply to line-4 and line-5.

Thanks a lot,
Kanat

Tags: snp, alignment
modified 11 weeks ago by b.bearmi • written 11 weeks ago by kanat.yermekbayev
11 weeks ago, b.bearmi wrote:

0/0, 0/1 and 1/1 refer to the genotype: a reference base called at a position would be 0/0, an alternative in one copy (presuming diploid) would be 0/1, and a substitution in both alleles would be 1/1. Sometimes an alternative is called but the ratio between alternative and reference is less than 0.5 (either sequencing errors or copy number variation); I assume in this situation you would get 0/0 as the genotype in freebayes. A dot means reference did not change. Do Par-1 and Par-2 refer to the same sequenced sample or to different samples? Anyway, looking at the sequence at the exact position of a called SNV might help (Tablet if you are a Mac user, IGV if not; samtools can also do it, but in a less user-friendly manner. I think it is something like samtools view aln.sorted.bam chr2:20,100,000-20,200,000, but you would have to check).
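
To make the ratio mentioned above concrete, here is a minimal Python sketch that computes the alternate-allele fraction AO / (RO + AO) from the per-sample RO and AO counts on lines 3 and 4 of the table in the question. The function name `alt_fraction` and the hard-coded tuples are illustrative assumptions, not part of freebayes. The caller decides genotypes with its own likelihood model, so this is only a sanity check to read alongside the GT column.

```python
def alt_fraction(ro, ao):
    """Return AO / (RO + AO), or None when no reads were counted."""
    total = ro + ao
    if total == 0:
        return None
    return ao / total

# (label, GT, RO, AO) copied from the per-sample columns of the table in the question
calls = [
    ("line-3 Par-1", "0/0", 33, 2),
    ("line-3 Par-2", "0/0", 55, 0),
    ("line-4 Par-1", "0/0", 67, 0),
    ("line-4 Par-2", "0/1", 82, 13),
]

fractions = {label: alt_fraction(ro, ao) for label, gt, ro, ao in calls}

for label, gt, ro, ao in calls:
    print(f"{label}: GT={gt} RO={ro} AO={ao} alt fraction={fractions[label]:.3f}")
```

In this data the Par-2 call on line-4 is 0/1 with an alternate fraction of about 0.14 (13 of 95 reads), while the Par-1 call on line-3 is 0/0 with about 0.06 (2 of 35 reads), so a fraction below 0.5 does not by itself produce 0/0. Note that the QUAL on line-4 is very low (0.0185916), so that call deserves a look in a viewer before being trusted.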

written 11 weeks ago by b.bearmi

Thanks for your quick response.

So it means line-1 should look like "Chr1A 61556 G A G" (with G instead of the dot). What about 1/2 in line-5? Sorry for the stupid question, but can you give an example of a substitution in both alleles (1/1)? In my understanding, 0/1 is GG vs AA (line-1). Par-1 and Par-2 are different samples and refer to Parent-1 and Parent-2.

written 11 weeks ago by kanat.yermekbayev

Unless you are dealing with bacteria, you would have two copies of the same gene (in plants there could be more, see "ploidy"):

0/0 == G | G (G in copy one and G in copy two; thinking of humans, "one from mum and one from dad", unless you are studying something like the X chromosome, in which case only females will have two copies...)
0/1 == A | G
1/1 == A | A

1/2 confuses me a bit. If your variant caller was running two samples simultaneously, it might be a notation for "alt in both samples"; alternatively, it might refer to the depth of coverage (how many reads support the ref / how many support the alt), but I think the former rather than the latter.

modified 11 weeks ago • written 11 weeks ago by b.bearmi
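
For reference, the VCF specification defines the numbers in GT as allele indices: 0 is the REF allele, 1 is the first ALT allele, 2 is the second ALT allele, and so on, while "." marks a missing value (no call for that sample at that site). Below is a minimal Python sketch of that lookup, applied to lines 1, 2 and 5 of the table in the question. `decode_gt` is a hypothetical helper written for illustration; it is not part of freebayes or of any library.

```python
import re

def decode_gt(gt, ref, alts):
    """Translate a GT string such as '0/1' into the alleles it indexes.

    Index 0 is REF, 1 is the first ALT, 2 is the second ALT, and so on.
    A '.' means the call is missing and is returned as None.
    """
    # Drop empty ALT slots ('.') so the list holds only real alleles
    alleles = [ref] + [a for a in alts if a != "."]
    decoded = []
    # '/' separates unphased alleles, '|' separates phased ones
    for index in re.split(r"[/|]", gt):
        decoded.append(None if index == "." else alleles[int(index)])
    return decoded

# (line, REF, [ALT, ALT], Par-1 GT, Par-2 GT) copied from the table in the question
rows = [
    (1, "G", ["A", "."], "0/1", "."),
    (2, "C", ["T", "."], "1/1", "."),
    (5, "TG", ["CG", "CA"], "1/2", "0/1"),
]

for line, ref, alts, gt1, gt2 in rows:
    print(f"line-{line}: Par-1 {gt1} -> {decode_gt(gt1, ref, alts)}, "
          f"Par-2 {gt2} -> {decode_gt(gt2, ref, alts)}")
```

Read this way, 1/2 on line-5 means Par-1 carries one copy of each ALT allele (CG and CA) and no copy of REF, and 1/1 on line-2 means both copies in Par-1 are T. The dots in the Par-2 columns of lines 1 and 2 mean no genotype was reported for Par-2 there; since DP is also missing, the likely reason is that no Par-2 reads covered those positions.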