Question: B allele frequency SNP array
12 months ago by
beaferbl wrote:

Hi everyone! I'm analyzing tumour samples with SNP arrays (Illumina and Affymetrix). Does anyone know whether the software sometimes has "problems" assigning the B-allele frequency correctly? I know that the logR is sometimes recentred, but that doesn't happen with the B-allele frequency, right?

snp • 780 views
12 months ago by
Kevin Blighe wrote:

Correct. The 'logR', more commonly known as the Log R Ratio (LRR), is just the log (base 2) of the probe intensity in, e.g., tumour divided by the intensity in matched normal. It is a crude measure of copy number. When this log2 ratio = 0, there is no difference between tumour and normal.
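To make the definition concrete, here is a minimal sketch of the tumour-versus-normal log2 ratio for a single probe. The function name and the example intensities are made up for illustration, and real pipelines normalise the intensities first (some compare against an expected intensity from reference samples rather than a matched normal), so treat this as the bare arithmetic only.

```python
import math


def log_r_ratio(tumour_intensity, normal_intensity):
    """Return log2(tumour / normal) for one probe (hypothetical helper)."""
    if tumour_intensity <= 0 or normal_intensity <= 0:
        raise ValueError("intensities must be positive")
    return math.log2(tumour_intensity / normal_intensity)


# Equal intensities give 0: no difference between tumour and normal.
print(log_r_ratio(1000.0, 1000.0))
# A higher tumour intensity (e.g. a gain) gives a positive value.
print(log_r_ratio(1500.0, 1000.0))
# A lower tumour intensity (e.g. a loss) gives a negative value.
print(log_r_ratio(500.0, 1000.0))
```

In practice the values are compressed by normal-cell contamination and probe saturation, which is one reason the LRR is only a crude copy-number measure.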

The definition of B-allele frequency (BAF) is never clear; however, it can be generally regarded as the frequency of the allele under study, which may be the minor allele in a population study.
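One common way of putting a number on it, as I understand the Illumina-style approach, is to convert the A and B intensities into an angle (theta) and then interpolate that angle between the AA, AB and BB cluster centres. The sketch below assumes that scheme; the cluster centres (0.05, 0.5, 0.95) are invented for the example, whereas real ones are estimated per SNP from reference samples.

```python
import math


def theta(a_intensity, b_intensity):
    """Angle of the (A, B) intensity point, scaled to 0 (pure A) .. 1 (pure B)."""
    return (2.0 / math.pi) * math.atan2(b_intensity, a_intensity)


def baf(theta_obs, theta_aa, theta_ab, theta_bb):
    """Linearly interpolate theta between cluster centres (hypothetical values).

    AA centre maps to 0.0, AB centre to 0.5, BB centre to 1.0.
    """
    if theta_obs <= theta_aa:
        return 0.0
    if theta_obs >= theta_bb:
        return 1.0
    if theta_obs < theta_ab:
        return 0.5 * (theta_obs - theta_aa) / (theta_ab - theta_aa)
    return 0.5 + 0.5 * (theta_obs - theta_ab) / (theta_bb - theta_ab)


# Assumed cluster centres for one SNP; not taken from any real array.
centres = (0.05, 0.5, 0.95)
for a, b in [(1000.0, 20.0), (1000.0, 1000.0), (20.0, 1000.0), (600.0, 1000.0)]:
    print(a, b, round(baf(theta(a, b), *centres), 3))
```

A heterozygous SNP in a normal diploid sample sits near 0.5; in a tumour with allelic imbalance the heterozygous band splits away from 0.5, which is what makes the BAF useful alongside the LRR.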

There are different points at which the software will struggle to correctly compute the BAF. If your DNA sample is of poor quality, then everything will be difficult to calculate! If we plot the genotype of every SNP for a single sample of good quality, we would see a figure like this:


Here, the arms represent (for A and B alleles):

  • vertical arm: BB (homozygous B)
  • diagonal arm: AB (heterozygous)
  • horizontal arm: AA (homozygous A)

This sample has mostly well-defined genotype calls, as judged by the well proportioned / orthogonal arms. The 'fuzzy bits' between the arms represent genotype calls that are on the borderline - these genotype calls will not be accurate, and neither, therefore, will the BAFs for these.
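To illustrate the idea of 'arms' and borderline calls, here is a toy genotype caller that assigns a SNP to the nearest arm by angle and returns a no-call when the point falls in the fuzzy region between arms. It assumes A intensity on the x-axis and B intensity on the y-axis, and the margin of 0.1 is an arbitrary choice for the example; real callers fit clusters per SNP rather than using fixed angles.

```python
import math


def call_genotype(a_intensity, b_intensity, margin=0.1):
    """Toy caller: nearest arm by angle, 'NC' (no call) if between arms."""
    # 0.0 = horizontal arm (AA), 0.5 = diagonal (AB), 1.0 = vertical arm (BB)
    t = (2.0 / math.pi) * math.atan2(b_intensity, a_intensity)
    arms = {"AA": 0.0, "AB": 0.5, "BB": 1.0}
    for genotype, centre in arms.items():
        if abs(t - centre) <= margin:
            return genotype
    return "NC"


print(call_genotype(1000.0, 50.0))    # close to the horizontal arm
print(call_genotype(1000.0, 1000.0))  # on the diagonal
print(call_genotype(50.0, 1000.0))    # close to the vertical arm
print(call_genotype(1000.0, 400.0))   # between the AA and AB arms
```

In a poor-quality sample, many more points land in the region that this sketch labels "NC", and any BAF computed for them is correspondingly unreliable.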


Conversely, look at a similar plot for this very poor quality DNA sample:


That data would have to be thrown into the trash can.


Things that can affect the calculation of the BAF:

  • allelic cross talk: when a probe for the A allele binds to the B allele sequence, and vice versa
  • allelic imbalance: this occurs when, e.g., homozygous A (AA) signal strengths are lower or higher than homozygous B (BB); I assume it is down to differences in binding affinities between, e.g., GC and AT genotypes

Both of these sources of bias are usually corrected in any processing pipeline.
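As a rough illustration of what a cross-talk correction can look like, the sketch below models the observed A and B signals as a linear mix of the true signals and undoes the mix by inverting a 2x2 matrix. This is a generic linear model, not the method of any particular pipeline, and the leak fractions are invented; real tools estimate them from the data.

```python
def correct_crosstalk(obs_a, obs_b, leak_b_into_a, leak_a_into_b):
    """Undo a simple linear cross-talk model (hypothetical helper).

    Assumed model:
        obs_a = true_a + leak_b_into_a * true_b
        obs_b = leak_a_into_b * true_a + true_b
    """
    det = 1.0 - leak_b_into_a * leak_a_into_b
    if abs(det) < 1e-12:
        raise ValueError("cross-talk matrix is not invertible")
    true_a = (obs_a - leak_b_into_a * obs_b) / det
    true_b = (obs_b - leak_a_into_b * obs_a) / det
    return true_a, true_b


# A homozygous AA SNP: true signal is (1000, 0), but 20% of the A signal
# leaks into the B channel, so the array reports (1000, 200).
a, b = correct_crosstalk(1000.0, 200.0, leak_b_into_a=0.1, leak_a_into_b=0.2)
print(round(a, 6), round(b, 6))
```

Without the correction, that AA SNP would appear to carry some B allele, pulling its BAF away from 0.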

Take a read of my other answer: A: Genotyping, genotype calling or SNP calling?



Thank you for your answer!

Reply written 12 months ago by beaferbl