Question: B allele frequency SNP array
13 days ago, beaferbl wrote:

Hi everyone! I'm analyzing tumour samples with SNP arrays (Illumina and Affymetrix). Does anyone know whether the software sometimes has "problems" assigning the B-allele frequency correctly? I know that the logR is sometimes recentered, but that doesn't happen with the B-allele frequency, right?

11 days ago, Kevin Blighe (Republic of Ireland) wrote:

Correct. The 'logR', more commonly known as the Log R Ratio (LRR), is just the log (base 2) of the probe intensity in, e.g., tumour, divided by the intensity in matched normal. It is a crude measure of copy number. When this log2 ratio equals 0, there is no difference between tumour and normal.
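As a minimal sketch of the ratio just described, assuming one tumour intensity and one matched-normal intensity per probe (the function name and the example values are hypothetical, and real array software may use a reference other than a matched normal):

```python
import math

def log_r_ratio(tumour_intensity, normal_intensity):
    """Return log2(tumour / normal) for a single probe.

    Both intensities must be positive; the inputs are assumed to be
    already-normalised total probe intensities.
    """
    if tumour_intensity <= 0 or normal_intensity <= 0:
        raise ValueError("intensities must be positive")
    return math.log2(tumour_intensity / normal_intensity)

# Hypothetical values, for illustration only
same = log_r_ratio(1000.0, 1000.0)    # equal signal: ratio of 1, log2 of 0
higher = log_r_ratio(1500.0, 1000.0)  # more signal in tumour: positive value
halved = log_r_ratio(500.0, 1000.0)   # half the signal in tumour: -1
```

A value above 0 is consistent with a gain and a value below 0 with a loss, but, as noted, this is only a crude indicator.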

The definition of B-allele frequency (BAF) is never clear; however, it can generally be regarded as the frequency of the allele under study, which may be the minor allele in a population study.

There are different points at which the software will struggle to compute the BAF correctly. If your DNA sample is of poor quality, then everything will be difficult to calculate! If we plot the genotype of every SNP for a single sample of good quality, we would see a figure like this:


Here, the arms represent (for A and B alleles):

  • vertical arm: BB (homozygous B)
  • diagonal arm: AB (heterozygous)
  • horizontal arm: AA (homozygous A)

This sample has mostly well-defined genotype calls, as judged by the well-proportioned, orthogonal arms. The 'fuzzy bits' between the arms represent genotype calls that are on the borderline. These genotype calls will not be accurate, and therefore neither will their BAFs.
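To make the three arms concrete, here is a rough sketch, assuming each SNP is a point (A intensity, B intensity) with A on the horizontal axis. It converts the point to an angle scaled to the range 0 to 1 and makes a toy genotype call from it. The function names and the `tol` threshold are hypothetical. Real array software goes further and derives the BAF by comparing this angle with per-SNP reference genotype clusters, which is not shown here.

```python
import math

def theta(a_intensity, b_intensity):
    """Angle of the point (A, B), scaled so that 0 lies on the A axis
    (horizontal arm) and 1 lies on the B axis (vertical arm)."""
    return (2.0 / math.pi) * math.atan2(b_intensity, a_intensity)

def naive_call(t, tol=0.15):
    """Toy genotype call from theta; `tol` is an arbitrary illustrative
    threshold, not a value taken from any real calling algorithm."""
    if t <= tol:
        return "AA"   # near the horizontal arm
    if abs(t - 0.5) <= tol:
        return "AB"   # near the diagonal arm
    if t >= 1.0 - tol:
        return "BB"   # near the vertical arm
    return "NC"       # no call: a point in the 'fuzzy bits' between arms

# Hypothetical intensities, for illustration only
calls = [naive_call(theta(a, b))
         for a, b in [(1000, 20), (900, 950), (30, 1100), (1000, 400)]]
```

In this toy version, the last point falls between the horizontal and diagonal arms and is left uncalled, which is the situation described above where neither the genotype nor the BAF can be trusted.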


Conversely, look at a similar plot for this very poor quality DNA sample:


That data would have to be thrown into the trash can.


Things that can affect the calculation of the BAF:

  • allelic cross-talk: when a probe for the A allele binds to the B allele sequence, and vice versa
  • allelic imbalance: this occurs when, e.g., homozygous A (AA) signal strengths are lower or higher than homozygous B (BB) signal strengths; I assume this is down to differences in binding affinity between, e.g., GC and AT genotypes

Both of these sources of bias are usually corrected in any processing pipeline.
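As an illustration of what such a correction can look like, here is a toy model of cross-talk, assuming each observed channel is the true signal plus a fixed fraction of the other channel. The function name and the leak coefficients are hypothetical. This is not the method of any particular pipeline, and real software estimates these effects from the data rather than taking them as given.

```python
def correct_cross_talk(obs_a, obs_b, leak_b_into_a, leak_a_into_b):
    """Undo a simple two-channel linear mixing model:

        obs_a = true_a + leak_b_into_a * true_b
        obs_b = leak_a_into_b * true_a + true_b

    Returns the recovered (true_a, true_b). The leak coefficients are
    assumed known here, which is a simplification.
    """
    det = 1.0 - leak_b_into_a * leak_a_into_b
    if det == 0:
        raise ValueError("mixing model is not invertible")
    true_a = (obs_a - leak_b_into_a * obs_b) / det
    true_b = (obs_b - leak_a_into_b * obs_a) / det
    return true_a, true_b

# Hypothetical AA sample: true signal (1000, 0), with 5% of the A signal
# leaking into the B channel, so the observed B intensity is 50.
a, b = correct_cross_talk(1000.0, 50.0, leak_b_into_a=0.10, leak_a_into_b=0.05)
```

Without the correction, the spurious B signal would pull this AA point off the horizontal arm and inflate its BAF.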

Take a read of my other answer: "Genotyping, genotype calling or SNP calling?"



Thank you for your answer!

written 6 hours ago by beaferbl