Question: What causes some individuals to form a separate cluster from the rest of the population in a PCA?
5 months ago by
haneenih760 wrote:


I am doing population genetics structure analyses with cereal crop plants (Fonio millet). I have 157 samples/accessions that are coming from different locations in West Africa.

I've run the Principal component analysis.

The results show that 8 individuals from Togo cluster apart from the rest of the individuals (see the image in the link):


I want to investigate this separation further. What causes these 8 samples to form such a clear cluster, and how can I test the hypotheses?

From a genetic perspective, many processes could be involved in shaping the genetic structure of a species:

- Genetic drift?

- Natural selection?

- Deleterious mutations?

I ran a PCA on each of the two subgenomes separately, and one of the subgenomes separated 8 accessions from south Togo from the rest of Togo's samples and from the other accessions.


I've also conducted a chromosome-wise PCA, and the separation appeared only for chromosomes 3A and 9A.


I didn't find any particular pattern for chromosomes 3A and 9A when looking at FST or allele frequencies.

There must be something, but how can I reveal it?


Tags: population genetics, snp, pca

How to add images to a Biostars post

written 5 months ago by WouterDeCoster43k
5 months ago by
Kevin Blighe56k wrote:

Please first explain how you processed and pre-filtered the data. Do those Togolese Republic individuals comprise a family, by any chance?

Note that the percent explained variation is quite low on both axes, so, the differences to which we are referring here are minute / small. A PCA bi-plot will always 'expand out' the samples in the plot space; so, it can frequently occur that small differences can appear magnified.

For example, if I do a PCA of variants / mutations in tumour and normal samples, then the percent explained variation would be upward of 80%.
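To make the point about explained variation concrete, here is a minimal sketch (pure Python, no PCA library) that estimates the fraction of total variance captured by PC1. It assumes a samples x SNPs matrix of alternative-allele counts (0/1/2) with no missing values; the function names and the two toy matrices are made up for illustration and are not from the poster's data.

```python
import random

def center_columns(matrix):
    """Subtract each column (SNP) mean so the PCA works on centred genotypes."""
    n = len(matrix)
    means = [sum(col) / n for col in zip(*matrix)]
    return [[x - m for x, m in zip(row, means)] for row in matrix]

def pc1_variance_fraction(matrix, iterations=500, seed=0):
    """Fraction of total variance on PC1, via power iteration on the sample Gram matrix."""
    centred = center_columns(matrix)
    n = len(centred)
    # Sample-by-sample Gram matrix; its non-zero eigenvalues match those of
    # the SNP covariance matrix up to a constant factor.
    gram = [[sum(a * b for a, b in zip(r1, r2)) for r2 in centred] for r1 in centred]
    total = sum(gram[i][i] for i in range(n))  # trace = sum of all eigenvalues
    if total == 0:
        return 0.0  # no variation at all
    rng = random.Random(seed)
    v = [rng.random() for _ in range(n)]
    for _ in range(iterations):
        w = [sum(g * x for g, x in zip(row, v)) for row in gram]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    # Rayleigh quotient of the converged unit vector = largest eigenvalue
    top = sum(vi * sum(g * x for g, x in zip(row, v)) for vi, row in zip(v, gram))
    return top / total

# Toy data (hypothetical): two clearly distinct groups vs. diffuse variation.
strong = [[0, 0, 0], [0, 0, 0], [2, 2, 2], [2, 2, 2]]
weak = [[2, 0, 0, 0], [0, 2, 0, 0], [0, 0, 2, 0], [0, 0, 0, 2]]

print(round(pc1_variance_fraction(strong), 3))
print(round(pc1_variance_fraction(weak), 3))
```

In the `strong` matrix a single axis carries essentially all of the variance, whereas in `weak` PC1 carries only about a third of it. A cluster that looks well separated on a low-variance axis reflects a correspondingly small share of the total genetic variation.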


Mm, what you said makes sense actually.

So these are 157 individuals from the same species, Fonio millet (Digitaria exilis), and they come from different locations with different bioclimatic conditions.

The VCF file was filtered according to the following:

  • allow no more than three SNPs in a 10-bp window, and remove indels

  • tolerate 10% missing data per SNP

  • keep only SNPs with a mean depth between 14 and 42 (14 ≤ DP ≤ 42)

  • extract only biallelic SNPs

  • remove individuals that have more than 33% missing data

  • remove SNPs located on the unanchored chromosomes
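For anyone who wants to reproduce these filters, the per-SNP and per-sample steps can be sketched roughly as below. This is only an illustration, not the pipeline that was actually run: the record layout (a dict with `ref`, `alts`, `mean_dp`, `genotypes`) and the function names are hypothetical, genotypes are assumed to be alternative-allele counts with `None` for missing calls, and the 10-bp window filter and the removal of unanchored chromosomes are left out.

```python
def keep_site(site, max_missing=0.10, min_dp=14, max_dp=42):
    """Return True if a site passes the per-SNP filters (biallelic SNP, depth, missingness)."""
    # Biallelic SNP: one REF base and exactly one single-base ALT allele,
    # which also excludes indels.
    if len(site["ref"]) != 1 or len(site["alts"]) != 1 or len(site["alts"][0]) != 1:
        return False
    # Mean depth must lie inside the inclusive range [min_dp, max_dp].
    if not (min_dp <= site["mean_dp"] <= max_dp):
        return False
    gts = site["genotypes"]
    missing = sum(g is None for g in gts) / len(gts)
    return missing <= max_missing

def samples_to_drop(sites, sample_names, max_missing=0.33):
    """Names of samples whose missing-genotype fraction across sites exceeds max_missing."""
    drop = []
    for i, name in enumerate(sample_names):
        missing = sum(site["genotypes"][i] is None for site in sites) / len(sites)
        if missing > max_missing:
            drop.append(name)
    return drop

# Hypothetical toy records for 10 samples.
example_sites = [
    {"ref": "A", "alts": ["G"], "mean_dp": 20.0,
     "genotypes": [0, 1, 2, 0, 0, 0, 0, 0, 0, None]},           # passes (10% missing)
    {"ref": "AT", "alts": ["A"], "mean_dp": 20.0,
     "genotypes": [0] * 10},                                     # indel
    {"ref": "C", "alts": ["T", "G"], "mean_dp": 20.0,
     "genotypes": [0] * 10},                                     # multiallelic
    {"ref": "C", "alts": ["T"], "mean_dp": 50.0,
     "genotypes": [0] * 10},                                     # depth too high
    {"ref": "G", "alts": ["A"], "mean_dp": 20.0,
     "genotypes": [None, None, 0, 0, 0, 0, 0, 0, 0, None]},     # 30% missing
]
sample_names = [f"s{i}" for i in range(10)]

kept_sites = [s for s in example_sites if keep_site(s)]
dropped_samples = samples_to_drop(example_sites, sample_names)
```

Whether the sample-level filter is applied before or after the site-level filters changes the result slightly; the sketch computes sample missingness over all input sites, which is an assumption.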

The PCA analysis was done using all SNPs.

I have already found a significant effect of climatic, geographic, and social (i.e., ethnic and linguistic group) factors on the genetic structure. But I want to look at the data from a genetic perspective. Which processes caused the 8 samples from Togo to separate? Is it gene flow, genetic drift, etc.? What I want is a way to test these hypotheses.

I tried different approaches to get a hint, such as FST (checking the differentiation between the 8 samples from Togo and a random subset of the remaining individuals).

I looked at the allele frequency and the alternative allele frequency and didn't find a pattern specific to Togo's samples.

I also checked the SNP density of all alleles and of alternative alleles only, and no specific patterns were observed.
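For reference, the kind of FST comparison described above (the 8 Togo samples against another group) could be sketched like this. It uses Hudson's estimator with a ratio of averages across SNPs, which is one common choice and not necessarily what was used here. It assumes diploid-style genotype calls (0/1/2 alternative-allele counts, `None` for missing); all function names are hypothetical.

```python
def allele_freq(genotypes):
    """Alt-allele frequency and number of sampled chromosomes, skipping missing calls."""
    called = [g for g in genotypes if g is not None]
    n_chrom = 2 * len(called)  # assumes diploid-style calls
    if n_chrom == 0:
        return None, 0
    return sum(called) / n_chrom, n_chrom

def hudson_terms(gt_a, gt_b):
    """Numerator and denominator of Hudson's FST for one SNP, or None if not computable."""
    p1, n1 = allele_freq(gt_a)
    p2, n2 = allele_freq(gt_b)
    if n1 < 2 or n2 < 2:
        return None
    num = (p1 - p2) ** 2 - p1 * (1 - p1) / (n1 - 1) - p2 * (1 - p2) / (n2 - 1)
    den = p1 * (1 - p2) + p2 * (1 - p1)
    return num, den

def hudson_fst(sites_a, sites_b):
    """Ratio-of-averages FST over SNPs; sites_a[i] and sites_b[i] are the two groups' genotypes at SNP i."""
    num_sum = 0.0
    den_sum = 0.0
    for gt_a, gt_b in zip(sites_a, sites_b):
        terms = hudson_terms(gt_a, gt_b)
        if terms is None:
            continue  # skip SNPs with too few called genotypes
        num_sum += terms[0]
        den_sum += terms[1]
    return num_sum / den_sum if den_sum > 0 else float("nan")

# Hypothetical toy example: one fixed difference and one shared polymorphism.
togo = [[2, 2, 2], [1, 1, 1]]
rest = [[0, 0, 0], [1, 1, 1]]
fst = hudson_fst(togo, rest)
```

Applying `hudson_fst` to consecutive windows of SNPs along 3A and 9A, rather than to whole chromosomes, is one way to check whether the differentiation is concentrated in a few regions that a chromosome-wide average would dilute.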

modified 5 months ago • written 5 months ago by haneenih760

Interesting, but perhaps outside the scope of my experience. If you can trace back the origin of those particular samples, then that may explain their differences. Could it be that they are from some 'laboratory' lineage that has been cultivated / bred repeatedly over time in vitro? Or perhaps it is the 'social' aspect that explains it, i.e., the way this crop has been cultivated over the centuries has been quite specific.

written 5 months ago by Kevin Blighe56k